Re: [PATCH 16/19] cpuidle: Adjust includes to remove of_device.h

2023-03-29 Thread Sudeep Holla
On Wed, Mar 29, 2023 at 10:52:13AM -0500, Rob Herring wrote:
> Now that of_cpu_device_node_get() is defined in of.h, of_device.h is just
> implicitly including other includes, and is no longer needed. Adjust the
> include files with what was implicitly included by of_device.h (cpu.h,
> cpuhotplug.h, of.h, and of_platform.h) and drop including of_device.h.
> 
> Signed-off-by: Rob Herring 
> ---
> Please ack and I will take the series via the DT tree.
> ---
>  drivers/cpuidle/cpuidle-psci.c  | 1 -
>  drivers/cpuidle/cpuidle-qcom-spm.c  | 3 +--
>  drivers/cpuidle/cpuidle-riscv-sbi.c | 2 +-
>  drivers/cpuidle/dt_idle_states.c| 1 -
>  4 files changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
> index 6de027f9f6f5..bf68920d038a 100644
> --- a/drivers/cpuidle/cpuidle-psci.c
> +++ b/drivers/cpuidle/cpuidle-psci.c
> @@ -16,7 +16,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 

Acked-by: Sudeep Holla 

-- 
Regards,
Sudeep


Re: [PATCH 10/19] cacheinfo: Adjust includes to remove of_device.h

2023-03-29 Thread Sudeep Holla
On Wed, Mar 29, 2023 at 10:52:07AM -0500, Rob Herring wrote:
> Now that of_cpu_device_node_get() is defined in of.h, of_device.h is just
> implicitly including other includes, and is no longer needed. Update the
> includes to use of.h instead of of_device.h.
>

Acked-by: Sudeep Holla 

-- 
Regards,
Sudeep


Re: [PATCH 04/19] of: Move CPU node related functions to their own file

2023-03-29 Thread Sudeep Holla
On Wed, Mar 29, 2023 at 10:52:01AM -0500, Rob Herring wrote:
> drivers/of/base.c is quite long and we've accumulated a number of CPU
> node functions. Let's move them to a new file, cpu.c, along with the
> lone of_cpu_device_node_get() in of_device.h. Moving the declaration has
> no effect yet as of.h is included by of_device.h. This serves as
> preparation to disentangle the includes in of_device.h and
> of_platform.h.
>

It makes sense for the CPU functions to have their own file; I am sure there
will be more additions.

FWIW,

Reviewed-by: Sudeep Holla 

-- 
Regards,
Sudeep


Re: [PATCH v3 00/51] cpuidle,rcu: Clean up the mess

2023-01-17 Thread Sudeep Holla
On Tue, Jan 17, 2023 at 01:16:21PM +, Mark Rutland wrote:
> On Tue, Jan 17, 2023 at 11:26:29AM +0100, Peter Zijlstra wrote:
> > On Mon, Jan 16, 2023 at 04:59:04PM +, Mark Rutland wrote:
> > 
> > > I'm sorry to have to bear some bad news on that front. :(
> > 
> > Moo, something had to give..
> > 
> > 
> > > > IIUC what's happening here is the PSCI cpuidle driver has entered idle and RCU
> > > > is no longer watching when arm64's cpu_suspend() manipulates DAIF. Our
> > > local_daif_*() helpers poke lockdep and tracing, hence the call to
> > > trace_hardirqs_off() and the RCU usage.
> > 
> > Right, strictly speaking not needed at this point, IRQs should have been
> > traced off a long time ago.
> 
> True, but there are some other calls around here that *might* end up invoking
> RCU stuff (e.g. the MTE code).
> 
> That all needs a noinstr cleanup too, which I'll sort out as a follow-up.
> 
> > > > I think we need RCU to be watching all the way down to cpu_suspend(), and it's
> > > > cpu_suspend() that should actually enter/exit idle context. That and we need to
> > > > make cpu_suspend() and the low-level PSCI invocation noinstr.
> > > 
> > > I'm not sure whether 32-bit will have a similar issue or not.
> > 
> > > I'm not seeing 32bit or RISC-V have similar issues here, but who knows,
> > > maybe I missed something.
> 
> I reckon if they do, the core changes here give us the infrastructure to fix
> them if/when we get reports.
> 
> > In any case, the below ought to cure the ARM64 case and remove that last
> > known RCU_NONIDLE() user as a bonus.
> 
> The below works for me testing on a Juno R1 board with PSCI, using defconfig +
> CONFIG_PROVE_LOCKING=y + CONFIG_DEBUG_LOCKDEP=y + CONFIG_DEBUG_ATOMIC_SLEEP=y.
> I'm not sure how to test the LPI / FFH part, but it looks good to me.
> 
> FWIW:
> 
> Reviewed-by: Mark Rutland 
> Tested-by: Mark Rutland 
> 
> Sudeep, would you be able to give the LPI/FFH side a spin with the kconfig
> options above?
> 

I'm not sure whether I have messed up something in my mail setup, but I did
reply earlier. I tested both the DT/cpuidle-psci driver and the ACPI/LPI+FFH
driver with the fix Peter sent. I was seeing the same splat as you in both DT
and ACPI boot, and the patch fixed it. I used the same config that you
described above.

-- 
Regards,
Sudeep


Re: [PATCH v3 00/51] cpuidle,rcu: Clean up the mess

2023-01-17 Thread Sudeep Holla
On Tue, Jan 17, 2023 at 11:26:29AM +0100, Peter Zijlstra wrote:
> On Mon, Jan 16, 2023 at 04:59:04PM +, Mark Rutland wrote:
> 
> > I'm sorry to have to bear some bad news on that front. :(
> 
> Moo, something had to give..
> 
> 
> > IIUC what's happening here is the PSCI cpuidle driver has entered idle and RCU
> > is no longer watching when arm64's cpu_suspend() manipulates DAIF. Our
> > local_daif_*() helpers poke lockdep and tracing, hence the call to
> > trace_hardirqs_off() and the RCU usage.
> 
> Right, strictly speaking not needed at this point, IRQs should have been
> traced off a long time ago.
> 
> > I think we need RCU to be watching all the way down to cpu_suspend(), and it's
> > cpu_suspend() that should actually enter/exit idle context. That and we need to
> > make cpu_suspend() and the low-level PSCI invocation noinstr.
> > 
> > I'm not sure whether 32-bit will have a similar issue or not.
> 
> I'm not seeing 32bit or RISC-V have similar issues here, but who knows,
> maybe I missed something.
> 
> In any case, the below ought to cure the ARM64 case and remove that last
> known RCU_NONIDLE() user as a bonus.
>

Thanks for the fix. I tested the series and did observe the same splat
with both DT and ACPI boot (they enter idle in different code paths). Thanks
to Mark for reminding me about ACPI. With this fix, the splat is gone in
both DT (cpuidle-psci.c) and ACPI (acpi_processor_idle.c).

You can add:

Tested-by: Sudeep Holla 

--
Regards,
Sudeep


Re: [PATCH 12/12] cacheinfo: Set cache 'id' based on DT data

2021-10-18 Thread Sudeep Holla
On Wed, Oct 06, 2021 at 11:43:32AM -0500, Rob Herring wrote:
> Use the minimum CPU h/w id of the CPUs associated with the cache for the
> cache 'id'. This will provide a stable id value for a given system. As
> we need to check all possible CPUs, we can't use the shared_cpu_map
> which is just online CPUs. As there's not a cache to CPUs mapping in DT,
> we have to walk all CPU nodes and then walk cache levels.
> 

Acked-by: Sudeep Holla 
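
For readers following along, the scheme in the quoted commit message can be
sketched in user space roughly as below. This is only an illustration under
stated assumptions: struct toy_cpu and toy_cache_id() are hypothetical names,
and a flat array stands in for the walk over DT CPU nodes and cache levels
that the real patch performs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a DT CPU node: its hardware ID and the
 * index of the cache it uses at each level. Not a kernel structure. */
#define TOY_CACHE_LEVELS 2

struct toy_cpu {
	uint64_t hwid;
	int cache[TOY_CACHE_LEVELS];
};

/*
 * Return the smallest hwid among all possible CPUs that use cache
 * 'cache_idx' at 'level', or UINT64_MAX if no CPU uses it.
 */
static uint64_t toy_cache_id(const struct toy_cpu *cpus, size_t ncpus,
			     int level, int cache_idx)
{
	uint64_t id = UINT64_MAX;
	size_t i;

	for (i = 0; i < ncpus; i++) {
		if (cpus[i].cache[level] != cache_idx)
			continue;
		if (cpus[i].hwid < id)
			id = cpus[i].hwid;
	}
	return id;
}
```

Because the loop covers every possible CPU rather than only the online ones,
the result does not change when CPUs are hotplugged, which is the stability
property the commit message is after.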

-- 
Regards,
Sudeep


Re: [PATCH 11/12] cacheinfo: Allow for >32-bit cache 'id'

2021-10-18 Thread Sudeep Holla
On Wed, Oct 06, 2021 at 11:43:31AM -0500, Rob Herring wrote:
> In preparation to set the cache 'id' based on the CPU h/w ids, allow for
> a 64-bit 'id' value. The only case that needs this is arm64, so
> unsigned long is sufficient.
> 

Reviewed-by: Sudeep Holla 

-- 
Regards,
Sudeep


Re: [PATCH 04/12] arm64: Use of_get_cpu_hwid()

2021-10-18 Thread Sudeep Holla
On Wed, Oct 06, 2021 at 11:43:24AM -0500, Rob Herring wrote:
> Replace the open coded parsing of CPU nodes' 'reg' property with
> of_get_cpu_hwid().
> 
> This change drops an error message for missing 'reg' property, but that
> should not be necessary as the DT tools will ensure 'reg' is present.
> 

Reviewed-by: Sudeep Holla 

-- 
Regards,
Sudeep


Re: [PATCH 01/12] of: Add of_get_cpu_hwid() to read hardware ID from CPU nodes

2021-10-18 Thread Sudeep Holla
On Wed, Oct 06, 2021 at 11:43:21AM -0500, Rob Herring wrote:
> There are various open coded implementations parsing the CPU node 'reg'
> property which contains the CPU's hardware ID. Introduce a new function,
> of_get_cpu_hwid(), to read the hardware ID.
>
> All the callers should be DT only code, so no need for an empty
> function.
>

Thanks for doing this. I had planned to do it when I last touched the code
around here, but I postponed it and then forgot.

Reviewed-by: Sudeep Holla 
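
As a rough picture of what "parsing the CPU node 'reg' property" involves,
here is a user-space sketch. It is not the kernel helper and does not mirror
its signature: toy_be32_to_cpu() and toy_cpu_hwid() are hypothetical names,
and the sketch assumes the property is a sequence of big-endian 32-bit cells
with one or two address cells.

```c
#include <stdint.h>

/* Convert one big-endian 32-bit DT cell to host order, byte by byte. */
static uint32_t toy_be32_to_cpu(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical helper: combine 'ncells' address cells of a raw 'reg'
 * property into one 64-bit hardware ID. Returns UINT64_MAX when the
 * property is missing or the cell count is not 1 or 2.
 */
static uint64_t toy_cpu_hwid(const uint8_t *reg, int ncells)
{
	uint64_t id = 0;
	int i;

	if (!reg || ncells < 1 || ncells > 2)
		return UINT64_MAX;

	for (i = 0; i < ncells; i++)
		id = (id << 32) | toy_be32_to_cpu(reg + 4 * i);
	return id;
}
```

Having one such routine is what lets the per-architecture open coded copies
go away, as the later patches in the series do for arm64.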

--
Regards,
Sudeep


Re: [PATCH v4 5/5] bus: Make remove callback return void

2021-07-14 Thread Sudeep Holla
On Tue, Jul 13, 2021 at 09:35:22PM +0200, Uwe Kleine-König wrote:
> The driver core ignores the return value of this callback because there
> is only little it can do when a device disappears.
> 
> This is the final bit of a long lasting cleanup quest where several
> buses were converted to also return void from their remove callback.
> Additionally some resource leaks were fixed that were caused by drivers
> returning an error code in the expectation that the driver won't go
> away.
> 
> With struct bus_type::remove returning void it's prevented that newly
> implemented buses return an ignored error code and so don't anticipate
> wrong expectations for driver authors.
> 

[...]

> diff --git a/drivers/firmware/arm_scmi/bus.c b/drivers/firmware/arm_scmi/bus.c
> index 784cf0027da3..2682c3df651c 100644
> --- a/drivers/firmware/arm_scmi/bus.c
> +++ b/drivers/firmware/arm_scmi/bus.c
> @@ -116,15 +116,13 @@ static int scmi_dev_probe(struct device *dev)
>   return scmi_drv->probe(scmi_dev);
>  }
>  
> -static int scmi_dev_remove(struct device *dev)
> +static void scmi_dev_remove(struct device *dev)
>  {
>   struct scmi_driver *scmi_drv = to_scmi_driver(dev->driver);
>   struct scmi_device *scmi_dev = to_scmi_dev(dev);
>  
>   if (scmi_drv->remove)
>   scmi_drv->remove(scmi_dev);
> -
> - return 0;
>  }
>  
>  static struct bus_type scmi_bus_type = {

Acked-by: Sudeep Holla 
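
The shape of the change can also be seen in a small stand-alone model. All
toy_* names below are hypothetical and only mimic the callback layout from
the quoted hunk; this is not the driver core API.

```c
#include <stddef.h>

struct toy_device;

/* Hypothetical driver and bus types; only the callback shapes matter. */
struct toy_driver {
	void (*remove)(struct toy_device *dev);	/* optional */
};

struct toy_device {
	struct toy_driver *driver;
	int removed;
};

struct toy_bus_type {
	void (*remove)(struct toy_device *dev);	/* void: nothing to ignore */
};

/* Bus-level remove: forwards to the driver if it has a callback. */
static void toy_bus_remove(struct toy_device *dev)
{
	if (dev->driver && dev->driver->remove)
		dev->driver->remove(dev);
}

/* Example driver callback: just records that it ran. */
static void toy_drv_remove(struct toy_device *dev)
{
	dev->removed = 1;
}

static struct toy_bus_type toy_bus = { .remove = toy_bus_remove };
```

With a void callback there is no return value for a bus implementation to
set, so a driver author cannot be misled into thinking an error code would
keep the device bound.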

--
Regards,
Sudeep


Re: [PATCH 1/2] powerpc/64: drop redundant defination of spin_until_cond

2021-06-11 Thread Sudeep Holla
On Fri, Jun 11, 2021 at 07:10:57PM +, Christophe Leroy wrote:
> From: Sudeep Holla 
> 
> linux/processor.h has exactly the same definition for spin_until_cond.
> Drop the redundant definition in asm/processor.h
>

Wow, you must be really good at ML archaeology; this must be at least
3 years old. I found this when I wanted to use spin_until_cond. Thanks
anyway for digging up the original patch; nobody would have remembered even
if you had posted it fresh.

-- 
Regards,
Sudeep


Re: [PATCH v5] reboot: support offline CPUs before reboot

2020-01-15 Thread Sudeep Holla
On Wed, Jan 15, 2020 at 02:34:10PM +0800, Hsin-Yi Wang wrote:
> Currently system reboots uses architecture specific codes (smp_send_stop)
> to offline non reboot CPUs. Most architecture's implementation is looping
> through all non reboot online CPUs and call ipi function to each of them. Some
> architecture like arm64, arm, and x86... would set offline masks to cpu without
> really offline them. This causes some race condition and kernel warning comes
> out sometimes when system reboots.
>
> This patch adds a config ARCH_OFFLINE_CPUS_ON_REBOOT, which would offline cpus in
> migrate_to_reboot_cpu(). If non reboot cpus are all offlined here, the loop for
> checking online cpus would be an empty loop. If architecture don't enable this
> config, or some cpus somehow fails to offline, it would fallback to ipi
> function.
>

What's the timing impact on systems with a large number of CPUs (say 256 or
more)? I remember we added a change to reduce the wait times for offlining
CPUs in the system suspend path on arm64, but they are still not negligible.
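
To make the concern concrete, the flow described in the commit message can
be modelled as below. This is a toy sketch, not the patch: toy_cpu_down()
and toy_offline_for_reboot() are hypothetical, and a 'stuck' array stands in
for CPUs that fail to go offline. The point is that the CPUs are taken down
one after another, so any per-CPU wait adds up linearly with the CPU count.

```c
#include <stdbool.h>

#define TOY_NR_CPUS 8

/* Hypothetical hotplug hook: returns 0 on success. In this toy model
 * a CPU listed in 'stuck' refuses to go offline. */
static int toy_cpu_down(bool *online, const bool *stuck, int cpu)
{
	if (stuck[cpu])
		return -1;
	online[cpu] = false;
	return 0;
}

/*
 * Offline every online CPU except 'reboot_cpu'. Returns the number of
 * CPUs still online besides the reboot CPU, i.e. those the IPI-based
 * fallback would still have to stop.
 */
static int toy_offline_for_reboot(bool *online, const bool *stuck,
				  int reboot_cpu)
{
	int cpu, remaining = 0;

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		if (cpu == reboot_cpu || !online[cpu])
			continue;
		if (toy_cpu_down(online, stuck, cpu))
			remaining++;
	}
	return remaining;
}
```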

--
Regards,
Sudeep


Re: [PATCH v2 3/6] x86: clean up _TIF_SYSCALL_EMU handling using ptrace_syscall_enter hook

2019-05-01 Thread Sudeep Holla
On Wed, May 01, 2019 at 05:57:11PM +0200, Oleg Nesterov wrote:
> On 04/30, Sudeep Holla wrote:
> >
> > On Mon, Mar 18, 2019 at 04:33:22PM +0100, Oleg Nesterov wrote:
> > >
> > > And it seems that _TIF_WORK_SYSCALL_ENTRY needs some cleanups too... We don't need
> > > "& _TIF_WORK_SYSCALL_ENTRY" in syscall_trace_enter, and _TIF_WORK_SYSCALL_ENTRY
> > > should not include _TIF_NOHZ?
> > >
> >
> > I was about to post the updated version and checked this to make sure I have
> > covered everything or not. I had missed the above comment. All architectures
> > have _TIF_NOHZ in their mask that they check to do work. And from x86, I read
> > "...syscall_trace_enter(). Also includes TIF_NOHZ for enter_from_user_mode()"
> > So I don't understand why _TIF_NOHZ needs to be dropped.
>
> I have already forgot this discussion... But after I glanced at this code again
> I still think the same, and I don't understand why do you disagree.
>

Sorry, but I didn't have any disagreement, I just said I don't understand
the usage on all architectures at that moment.

> > Also if we need to drop, we can address that separately examining all archs.
>
> Sure, and I was only talking about x86. We can keep TIF_NOHZ and even
> set_tsk_thread_flag(TIF_NOHZ) in context_tracking_cpu_set() if some arch needs
> this but remove TIF_NOHZ from TIF_WORK_SYSCALL_ENTRY in arch/x86/include/asm/thread_info.h,
> afaics this shouldn't make any difference.
>

OK, it's just x86, then I understand your point. I was looking at all
the architectures, sorry for the confusion.

> And I see no reason why x86 needs to use TIF_WORK_SYSCALL_ENTRY in
> syscall_trace_enter().
>

Agreed
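
One way to read the point about the extra "& mask" is shown below. The bit
positions and TOY_* names are illustrative only, not the x86 values: when
the function goes on to test individual bits that are all part of the
combined mask, masking first does not change the outcome of those tests.

```c
#include <stdint.h>

/* Illustrative bit positions only; not the x86 values. */
#define TOY_TIF_SYSCALL_TRACE	(1u << 0)
#define TOY_TIF_SYSCALL_EMU	(1u << 1)
#define TOY_TIF_NOHZ		(1u << 2)

#define TOY_WORK_SYSCALL_ENTRY \
	(TOY_TIF_SYSCALL_TRACE | TOY_TIF_SYSCALL_EMU | TOY_TIF_NOHZ)

/* Test one bit after masking with the combined work mask. */
static int toy_trace_masked(uint32_t flags)
{
	uint32_t work = flags & TOY_WORK_SYSCALL_ENTRY;

	return (work & TOY_TIF_SYSCALL_TRACE) != 0;
}

/* Test the same bit directly on the flags word. */
static int toy_trace_direct(uint32_t flags)
{
	return (flags & TOY_TIF_SYSCALL_TRACE) != 0;
}
```

Both helpers give the same answer for every flags value, and a flags word
with only the NOHZ bit set never triggers the trace path either way.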

--
Regards,
Sudeep


Re: [PATCH v2 3/6] x86: clean up _TIF_SYSCALL_EMU handling using ptrace_syscall_enter hook

2019-04-30 Thread Sudeep Holla



On 30/04/2019 17:46, Andy Lutomirski wrote:
> On Mon, Mar 18, 2019 at 3:49 AM Sudeep Holla  wrote:
>>
>> Now that we have a new hook ptrace_syscall_enter that can be called from
>> syscall entry code and it handles PTRACE_SYSEMU in generic code, we
>> can do some cleanup using the same in syscall_trace_enter.
>>
>> Further the extra logic to find single stepping PTRACE_SYSEMU_SINGLESTEP
>> in syscall_slow_exit_work seems unnecessary. Let's remove the same.
>>
> 
> Unless the patch set contains a selftest that exercises all the
> interesting cases here, NAK.  To be clear, there needs to be a test
> that passes on an unmodified kernel and still passes on a patched
> kernel.  And that test case needs to *fail* if, for example, you force
> "emulated" to either true or false rather than reading out the actual
> value.
> 

Tested using tools/testing/selftests/x86/ptrace_syscall.c

Also, v3 doesn't change any logic or add a call to a new function as v2
did. It's just a simple cleanup as suggested by Oleg.

-- 
Regards,
Sudeep


Re: [PATCH v2 3/6] x86: clean up _TIF_SYSCALL_EMU handling using ptrace_syscall_enter hook

2019-04-30 Thread Sudeep Holla
On Mon, Mar 18, 2019 at 04:33:22PM +0100, Oleg Nesterov wrote:
> On 03/18, Sudeep Holla wrote:
> >
> > --- a/arch/x86/entry/common.c
> > +++ b/arch/x86/entry/common.c
> > @@ -70,22 +70,16 @@ static long syscall_trace_enter(struct pt_regs *regs)
> >
> > struct thread_info *ti = current_thread_info();
> > unsigned long ret = 0;
> > -   bool emulated = false;
> > u32 work;
> >
> > if (IS_ENABLED(CONFIG_DEBUG_ENTRY))
> > BUG_ON(regs != task_pt_regs(current));
> >
> > -   work = READ_ONCE(ti->flags) & _TIF_WORK_SYSCALL_ENTRY;
> > -
> > -   if (unlikely(work & _TIF_SYSCALL_EMU))
> > -   emulated = true;
> > -
> > -   if ((emulated || (work & _TIF_SYSCALL_TRACE)) &&
> > -   tracehook_report_syscall_entry(regs))
> > +   if (unlikely(ptrace_syscall_enter(regs)))
> > return -1L;
> >
> > -   if (emulated)
> > +   work = READ_ONCE(ti->flags) & _TIF_WORK_SYSCALL_ENTRY;
> > +   if ((work & _TIF_SYSCALL_TRACE) && tracehook_report_syscall_entry(regs))
> > return -1L;
>
[...]

>
> And it seems that _TIF_WORK_SYSCALL_ENTRY needs some cleanups too... We don't need
> "& _TIF_WORK_SYSCALL_ENTRY" in syscall_trace_enter, and _TIF_WORK_SYSCALL_ENTRY
> should not include _TIF_NOHZ?
>

I was about to post the updated version and checked this to make sure I have
covered everything or not. I had missed the above comment. All architectures
have _TIF_NOHZ in their mask that they check to do work. And from x86, I read
"...syscall_trace_enter(). Also includes TIF_NOHZ for enter_from_user_mode()"
So I don't understand why _TIF_NOHZ needs to be dropped.

Also if we need to drop, we can address that separately examining all archs.
I will post the cleanup as you suggested for now.

--
Regards,
Sudeep


Re: [PATCH v2 4/6] powerpc: use common ptrace_syscall_enter hook to handle _TIF_SYSCALL_EMU

2019-03-18 Thread Sudeep Holla
On Mon, Mar 18, 2019 at 06:33:41PM +0100, Oleg Nesterov wrote:
> On 03/18, Sudeep Holla wrote:
> >
> > On Mon, Mar 18, 2019 at 06:20:24PM +0100, Oleg Nesterov wrote:
> > >
> > > Again, to me this patch just makes the code look worse. Honestly, I don't
> > > think that the new (badly named) ptrace_syscall_enter() hook makes any sense.
> > >
> >
> > Worse because we end up reading current_thread_info->flags twice ?
>
> Mostly because in my opinion ptrace_syscall_enter() buys nothing but makes
> the caller's code less readable/understandable.
>
> Sure, this is subjective.
>

Based on what we have in that function today, I tend to agree. Will and
Richard were of the opinion that SYSEMU handling should be consolidated (in
the threads pointed to in my cover letter). If there's a better way to achieve
the same, I am all for it. I have just tried to put something together based
on what I could think of.

--
Regards,
Sudeep


Re: [PATCH v2 4/6] powerpc: use common ptrace_syscall_enter hook to handle _TIF_SYSCALL_EMU

2019-03-18 Thread Sudeep Holla
On Mon, Mar 18, 2019 at 06:20:24PM +0100, Oleg Nesterov wrote:
> On 03/18, Sudeep Holla wrote:
> >
> > --- a/arch/powerpc/kernel/ptrace.c
> > +++ b/arch/powerpc/kernel/ptrace.c
> > @@ -3278,35 +3278,29 @@ long do_syscall_trace_enter(struct pt_regs *regs)
> >
> > user_exit();
> >
> > -   flags = READ_ONCE(current_thread_info()->flags) &
> > -   (_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE);
> > -
> > -   if (flags) {
> > -   int rc = tracehook_report_syscall_entry(regs);
> > +   if (unlikely(ptrace_syscall_enter(regs))) {
> > +   /*
> > +* A nonzero return code from tracehook_report_syscall_entry()
> > +* tells us to prevent the syscall execution, but we are not
> > +* going to execute it anyway.
> > +*
> > +* Returning -1 will skip the syscall execution. We want to
> > +* avoid clobbering any registers, so we don't goto the skip
> > +* label below.
> > +*/
> > +   return -1;
> > +   }
> >
> > -   if (unlikely(flags & _TIF_SYSCALL_EMU)) {
> > -   /*
> > -* A nonzero return code from
> > -* tracehook_report_syscall_entry() tells us to prevent
> > -* the syscall execution, but we are not going to
> > -* execute it anyway.
> > -*
> > -* Returning -1 will skip the syscall execution. We want
> > -* to avoid clobbering any registers, so we don't goto
> > -* the skip label below.
> > -*/
> > -   return -1;
> > -   }
> > +   flags = READ_ONCE(current_thread_info()->flags) & _TIF_SYSCALL_TRACE;
>
> Why do we need READ_ONCE() with this change?
>
> And now that we change a single bit "flags" doesn't look like a good name.
>
> Again, to me this patch just makes the code look worse. Honestly, I don't
> think that the new (badly named) ptrace_syscall_enter() hook makes any sense.
>

Worse because we end up reading current_thread_info->flags twice ?

--
Regards,
Sudeep


Re: [PATCH v2 4/6] powerpc: use common ptrace_syscall_enter hook to handle _TIF_SYSCALL_EMU

2019-03-18 Thread Sudeep Holla
On Mon, Mar 18, 2019 at 05:26:18PM +0300, Dmitry V. Levin wrote:
> On Mon, Mar 18, 2019 at 10:49:23AM +0000, Sudeep Holla wrote:
> > Now that we have a new hook ptrace_syscall_enter that can be called from
> > syscall entry code and it handles PTRACE_SYSEMU in generic code, we
> > can do some cleanup using the same in do_syscall_trace_enter.
> > 
> > Cc: Oleg Nesterov 
> > Cc: Paul Mackerras 
> > Cc: Michael Ellerman 
> > Signed-off-by: Sudeep Holla 
> > ---
> >  arch/powerpc/kernel/ptrace.c | 48 
> >  1 file changed, 21 insertions(+), 27 deletions(-)
> > 
> > diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
> > index 2e2183b800a8..05579a5dcb12 100644
> > --- a/arch/powerpc/kernel/ptrace.c
> > +++ b/arch/powerpc/kernel/ptrace.c
> > @@ -3278,35 +3278,29 @@ long do_syscall_trace_enter(struct pt_regs *regs)
> >  
> > user_exit();
> >  
> > -   flags = READ_ONCE(current_thread_info()->flags) &
> > -   (_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE);
> > -
> > -   if (flags) {
> > -   int rc = tracehook_report_syscall_entry(regs);
> > +   if (unlikely(ptrace_syscall_enter(regs))) {
> > +   /*
> > +* A nonzero return code from tracehook_report_syscall_entry()
> > +* tells us to prevent the syscall execution, but we are not
> > +* going to execute it anyway.
> > +*
> > +* Returning -1 will skip the syscall execution. We want to
> > +* avoid clobbering any registers, so we don't goto the skip
> > +* label below.
> > +*/
> > +   return -1;
> > +   }
> 
> This comment is out of sync with the changed code.

It is still applicable indirectly, as ptrace_syscall_enter just executes
tracehook_report_syscall_entry, but I agree it needs rewording. Will update.

--
Regards,
Sudeep


Re: [PATCH v2 2/6] ptrace: introduce ptrace_syscall_enter to consolidate PTRACE_SYSEMU handling

2019-03-18 Thread Sudeep Holla
On Mon, Mar 18, 2019 at 05:41:15PM +0300, Dmitry V. Levin wrote:
> On Mon, Mar 18, 2019 at 10:49:21AM +0000, Sudeep Holla wrote:
> > Currently each architecture handles PTRACE_SYSEMU in very similar way.
> > It's completely arch independent and can be handled in the code helping
> > to consolidate PTRACE_SYSEMU handling.
> > 
> > Let's introduce a hook 'ptrace_syscall_enter' that arch specific syscall
> > entry code can call.
> 
> Sorry if I'm late for the party, but the new name looks confusing.
> If all it does is related to TIF_SYSCALL_EMU, why does it have a generic
> name 'ptrace_syscall_enter' without any hint of being specific to
> TIF_SYSCALL_EMU?
> 

Not at all late. In fact Haibo Xu pointed that out; I updated the name but
somehow failed to commit and lost those changes. I will rename it to
ptrace_sysemu_syscall_enter.

--
Regards,
Sudeep


Re: [PATCH v2 2/6] ptrace: introduce ptrace_syscall_enter to consolidate PTRACE_SYSEMU handling

2019-03-18 Thread Sudeep Holla
On Mon, Mar 18, 2019 at 05:31:47PM +0300, Dmitry V. Levin wrote:
> On Mon, Mar 18, 2019 at 10:49:21AM +0000, Sudeep Holla wrote:
> > Currently each architecture handles PTRACE_SYSEMU in very similar way.
> > It's completely arch independent and can be handled in the code helping
> > to consolidate PTRACE_SYSEMU handling.
> > 
> > Let's introduce a hook 'ptrace_syscall_enter' that arch specific syscall
> > entry code can call.
> > 
> > Cc: Oleg Nesterov 
> > Signed-off-by: Sudeep Holla 
> > ---
> >  include/linux/ptrace.h |  1 +
> >  kernel/ptrace.c| 22 ++
> >  2 files changed, 23 insertions(+)
> > 
> > diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
> > index edb9b040c94c..e30f51e3363e 100644
> > --- a/include/linux/ptrace.h
> > +++ b/include/linux/ptrace.h
> > @@ -407,6 +407,7 @@ static inline void user_single_step_report(struct 
> > pt_regs *regs)
> >  #define current_user_stack_pointer() user_stack_pointer(current_pt_regs())
> >  #endif
> >  
> > +extern long ptrace_syscall_enter(struct pt_regs *regs);
> >  extern int task_current_syscall(struct task_struct *target, long *callno,
> > unsigned long args[6], unsigned int maxargs,
> > unsigned long *sp, unsigned long *pc);
> > diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> > index 4fa3b7f4c3c7..c9c505c483df 100644
> > --- a/kernel/ptrace.c
> > +++ b/kernel/ptrace.c
> > @@ -29,6 +29,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> >  /*
> >   * Access another process' address space via ptrace.
> > @@ -557,6 +558,27 @@ static int ptrace_detach(struct task_struct *child, 
> > unsigned int data)
> > return 0;
> >  }
> >  
> > +/*
> > + * Hook to check and report for PTRACE_SYSEMU, can be called from arch
> > + * arch syscall entry code
> > + */
> > +long ptrace_syscall_enter(struct pt_regs *regs)
> > +{
> > +#ifdef TIF_SYSCALL_EMU
> > +   if (test_thread_flag(TIF_SYSCALL_EMU)) {
> > +   if (tracehook_report_syscall_entry(regs))
> > +   /*
> > +* We can ignore the return code here as we need
> > +* return -1 always for syscall emulation irrespective
> > +* of whether the tracehook report fails or succeed.
> > +*/
> > +   ;
> 
> This is problematic as it causes build errors with -Werror=empty-body,
> see https://lore.kernel.org/lkml/20181218205305.26647-1-ma...@debian.org/
> 

Thanks for the pointer, will update.
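
For illustration, one possible shape that avoids the empty 'if' body is
sketched below. This is a user-space model, not the updated patch: the toy_*
names are hypothetical stand-ins for the thread flag and the tracehook
report, and the sketch assumes the hook returns 0 when emulation is not
active.

```c
#include <stdbool.h>

/* Toy stand-ins for thread state and the tracehook report; these are
 * not the kernel's definitions. */
static bool toy_sysemu_flag;
static int toy_reports;

static int toy_report_syscall_entry(void)
{
	toy_reports++;
	return 1;	/* pretend the tracer asked to abort the syscall */
}

/*
 * Same intent as the quoted hook, without the empty 'if' body: the
 * report's return value is deliberately discarded and -1 is returned
 * whenever emulation is active.
 */
static long toy_sysemu_syscall_enter(void)
{
	if (toy_sysemu_flag) {
		(void)toy_report_syscall_entry();
		return -1;
	}
	return 0;
}
```

The real tracehook function may carry annotations that this model does not
reproduce, so the exact way of discarding its result in the kernel could
differ.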

--
Regards,
Sudeep


[PATCH v2 6/6] arm64: ptrace: add support for syscall emulation

2019-03-18 Thread Sudeep Holla
Add PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP support on arm64.
We can just make use of the generic ptrace_syscall_enter hook to
support PTRACE_SYSEMU. We don't need any special handling for
PTRACE_SYSEMU_SINGLESTEP.

Cc: Catalin Marinas 
Cc: Will Deacon 
Signed-off-by: Sudeep Holla 
---
 arch/arm64/include/asm/thread_info.h | 5 -
 arch/arm64/kernel/ptrace.c   | 3 +++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index eb3ef73e07cf..c285d1ce7186 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -75,6 +75,7 @@ void arch_release_task_struct(struct task_struct *tsk);
  *  TIF_SYSCALL_TRACE  - syscall trace active
  *  TIF_SYSCALL_TRACEPOINT - syscall tracepoint for ftrace
  *  TIF_SYSCALL_AUDIT  - syscall auditing
+ *  TIF_SYSCALL_EMU - syscall emulation active
  *  TIF_SECOMP - syscall secure computing
  *  TIF_SIGPENDING - signal pending
  *  TIF_NEED_RESCHED   - rescheduling necessary
@@ -91,6 +92,7 @@ void arch_release_task_struct(struct task_struct *tsk);
 #define TIF_SYSCALL_AUDIT  9
 #define TIF_SYSCALL_TRACEPOINT 10
 #define TIF_SECCOMP11
+#define TIF_SYSCALL_EMU12
 #define TIF_MEMDIE 18  /* is terminating due to OOM killer */
 #define TIF_FREEZE 19
 #define TIF_RESTORE_SIGMASK20
@@ -109,6 +111,7 @@ void arch_release_task_struct(struct task_struct *tsk);
 #define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT)
 #define _TIF_SYSCALL_TRACEPOINT(1 << TIF_SYSCALL_TRACEPOINT)
 #define _TIF_SECCOMP   (1 << TIF_SECCOMP)
+#define _TIF_SYSCALL_EMU   (1 << TIF_SYSCALL_EMU)
 #define _TIF_UPROBE(1 << TIF_UPROBE)
 #define _TIF_FSCHECK   (1 << TIF_FSCHECK)
 #define _TIF_32BIT (1 << TIF_32BIT)
@@ -120,7 +123,7 @@ void arch_release_task_struct(struct task_struct *tsk);
 
 #define _TIF_SYSCALL_WORK  (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
 _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
-_TIF_NOHZ)
+_TIF_NOHZ | _TIF_SYSCALL_EMU)
 
 #define INIT_THREAD_INFO(tsk)  \
 {  \
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index b82e0a9b3da3..cf29275cd4d9 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1819,6 +1819,9 @@ static void tracehook_report_syscall(struct pt_regs *regs,
 
 int syscall_trace_enter(struct pt_regs *regs)
 {
+   if (unlikely(ptrace_syscall_enter(regs)))
+   return -1;
+
if (test_thread_flag(TIF_SYSCALL_TRACE))
tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
 
-- 
2.17.1



[PATCH v2 5/6] arm64: add PTRACE_SYSEMU{, SINGLESTEP} definations to uapi headers

2019-03-18 Thread Sudeep Holla
x86 and um use 31 and 32 for PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP
while powerpc uses different values, maybe for legacy reasons.

Though handling of PTRACE_SYSEMU can be made architecture independent,
it's hard to make these definitions generic. To add to this existing
mess, a few architectures like arm, c6x and sh use 31 for PTRACE_GETFDPIC
(get the ELF fdpic loadmap address). It's not possible to move the
definitions to generic headers.

So we unfortunately have to duplicate the same definition to ARM64 if
we need to support PTRACE_SYSEMU.

Cc: Catalin Marinas 
Cc: Will Deacon 
Signed-off-by: Sudeep Holla 
---
 arch/arm64/include/uapi/asm/ptrace.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm64/include/uapi/asm/ptrace.h b/arch/arm64/include/uapi/asm/ptrace.h
index d78623acb649..627ac57c1581 100644
--- a/arch/arm64/include/uapi/asm/ptrace.h
+++ b/arch/arm64/include/uapi/asm/ptrace.h
@@ -62,6 +62,9 @@
 #define PSR_x  0xff00  /* Extension*/
 #define PSR_c  0x00ff  /* Control  */
 
+/* syscall emulation path in ptrace */
+#define PTRACE_SYSEMU31
+#define PTRACE_SYSEMU_SINGLESTEP  32
 
 #ifndef __ASSEMBLY__
 
-- 
2.17.1



[PATCH v2 4/6] powerpc: use common ptrace_syscall_enter hook to handle _TIF_SYSCALL_EMU

2019-03-18 Thread Sudeep Holla
Now that we have a new hook ptrace_syscall_enter that can be called from
syscall entry code and it handles PTRACE_SYSEMU in generic code, we
can do some cleanup using the same in do_syscall_trace_enter.

Cc: Oleg Nesterov 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Signed-off-by: Sudeep Holla 
---
 arch/powerpc/kernel/ptrace.c | 48 
 1 file changed, 21 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index 2e2183b800a8..05579a5dcb12 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -3278,35 +3278,29 @@ long do_syscall_trace_enter(struct pt_regs *regs)
 
user_exit();
 
-   flags = READ_ONCE(current_thread_info()->flags) &
-   (_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE);
-
-   if (flags) {
-   int rc = tracehook_report_syscall_entry(regs);
+   if (unlikely(ptrace_syscall_enter(regs))) {
+   /*
+* A nonzero return code from tracehook_report_syscall_entry()
+* tells us to prevent the syscall execution, but we are not
+* going to execute it anyway.
+*
+* Returning -1 will skip the syscall execution. We want to
+* avoid clobbering any registers, so we don't goto the skip
+* label below.
+*/
+   return -1;
+   }
 
-   if (unlikely(flags & _TIF_SYSCALL_EMU)) {
-   /*
-* A nonzero return code from
-* tracehook_report_syscall_entry() tells us to prevent
-* the syscall execution, but we are not going to
-* execute it anyway.
-*
-* Returning -1 will skip the syscall execution. We want
-* to avoid clobbering any registers, so we don't goto
-* the skip label below.
-*/
-   return -1;
-   }
+   flags = READ_ONCE(current_thread_info()->flags) & _TIF_SYSCALL_TRACE;
 
-   if (rc) {
-   /*
-* The tracer decided to abort the syscall. Note that
-* the tracer may also just change regs->gpr[0] to an
-* invalid syscall number, that is handled below on the
-* exit path.
-*/
-   goto skip;
-   }
+   if (flags && tracehook_report_syscall_entry(regs)) {
+   /*
+* The tracer decided to abort the syscall. Note that
+* the tracer may also just change regs->gpr[0] to an
+* invalid syscall number, that is handled below on the
+* exit path.
+*/
+   goto skip;
}
 
/* Run seccomp after ptrace; allow it to set gpr[3]. */
-- 
2.17.1



[PATCH v2 3/6] x86: clean up _TIF_SYSCALL_EMU handling using ptrace_syscall_enter hook

2019-03-18 Thread Sudeep Holla
Now that we have a new hook ptrace_syscall_enter that can be called from
syscall entry code and it handles PTRACE_SYSEMU in generic code, we
can do some cleanup using the same in syscall_trace_enter.

Further the extra logic to find single stepping PTRACE_SYSEMU_SINGLESTEP
in syscall_slow_exit_work seems unnecessary. Let's remove the same.

Cc: Andy Lutomirski 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Signed-off-by: Sudeep Holla 
---
 arch/x86/entry/common.c | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 7bc105f47d21..5d7590994964 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -70,22 +70,16 @@ static long syscall_trace_enter(struct pt_regs *regs)
 
struct thread_info *ti = current_thread_info();
unsigned long ret = 0;
-   bool emulated = false;
u32 work;
 
if (IS_ENABLED(CONFIG_DEBUG_ENTRY))
BUG_ON(regs != task_pt_regs(current));
 
-   work = READ_ONCE(ti->flags) & _TIF_WORK_SYSCALL_ENTRY;
-
-   if (unlikely(work & _TIF_SYSCALL_EMU))
-   emulated = true;
-
-   if ((emulated || (work & _TIF_SYSCALL_TRACE)) &&
-   tracehook_report_syscall_entry(regs))
+   if (unlikely(ptrace_syscall_enter(regs)))
return -1L;
 
-   if (emulated)
+   work = READ_ONCE(ti->flags) & _TIF_WORK_SYSCALL_ENTRY;
+   if ((work & _TIF_SYSCALL_TRACE) && tracehook_report_syscall_entry(regs))
return -1L;
 
 #ifdef CONFIG_SECCOMP
-- 
2.17.1
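
For readers following the reordering in syscall_trace_enter above, here is a small user-space sketch that models the old and the new entry logic and checks that they return the same value and notify the tracer the same number of times. Everything prefixed with DEMO_/demo_ (flag bit values, the tracer struct, the helper names) is made up for illustration and is not kernel code; the seccomp, audit and tracepoint parts of the real function are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the real _TIF_* values are arch-specific. */
#define DEMO_TIF_SYSCALL_TRACE (1u << 0)
#define DEMO_TIF_SYSCALL_EMU   (1u << 1)

struct demo_tracer {
	bool aborts;	/* value the (hypothetical) report call returns */
	int reports;	/* how many times the tracer was notified */
};

/* Hypothetical stand-in for tracehook_report_syscall_entry(). */
static int demo_report(struct demo_tracer *t)
{
	t->reports++;
	return t->aborts;
}

/* Model of the logic removed by the patch. */
static long enter_old(unsigned int work, struct demo_tracer *t)
{
	bool emulated = work & DEMO_TIF_SYSCALL_EMU;

	if ((emulated || (work & DEMO_TIF_SYSCALL_TRACE)) && demo_report(t))
		return -1L;
	if (emulated)
		return -1L;
	return 0;
}

/* Model of the new logic: emulation hook first, then the trace check. */
static long enter_new(unsigned int work, struct demo_tracer *t)
{
	if (work & DEMO_TIF_SYSCALL_EMU) {
		/* Result ignored: an emulated syscall is always skipped. */
		(void)demo_report(t);
		return -1L;
	}
	if ((work & DEMO_TIF_SYSCALL_TRACE) && demo_report(t))
		return -1L;
	return 0;
}

/* True if both models agree for every flag/tracer combination. */
static bool demo_entry_paths_agree(void)
{
	for (unsigned int work = 0; work < 4; work++) {
		for (int aborts = 0; aborts <= 1; aborts++) {
			struct demo_tracer a = { aborts, 0 };
			struct demo_tracer b = { aborts, 0 };

			if (enter_old(work, &a) != enter_new(work, &b) ||
			    a.reports != b.reports)
				return false;
		}
	}
	return true;
}
```

Calling demo_entry_paths_agree() from a test program covers all eight combinations of the two flags and the tracer's answer. It only says something about this simplified model, not about the real entry path.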



[PATCH v2 2/6] ptrace: introduce ptrace_syscall_enter to consolidate PTRACE_SYSEMU handling

2019-03-18 Thread Sudeep Holla
Currently each architecture handles PTRACE_SYSEMU in a very similar way.
It's completely arch independent and can be handled in the core code,
which helps to consolidate PTRACE_SYSEMU handling.

Let's introduce a hook 'ptrace_syscall_enter' that arch specific syscall
entry code can call.

Cc: Oleg Nesterov 
Signed-off-by: Sudeep Holla 
---
 include/linux/ptrace.h |  1 +
 kernel/ptrace.c| 22 ++
 2 files changed, 23 insertions(+)

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index edb9b040c94c..e30f51e3363e 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -407,6 +407,7 @@ static inline void user_single_step_report(struct pt_regs *regs)
 #define current_user_stack_pointer() user_stack_pointer(current_pt_regs())
 #endif
 
+extern long ptrace_syscall_enter(struct pt_regs *regs);
 extern int task_current_syscall(struct task_struct *target, long *callno,
unsigned long args[6], unsigned int maxargs,
unsigned long *sp, unsigned long *pc);
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 4fa3b7f4c3c7..c9c505c483df 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Access another process' address space via ptrace.
@@ -557,6 +558,27 @@ static int ptrace_detach(struct task_struct *child, unsigned int data)
return 0;
 }
 
+/*
+ * Hook to check and report for PTRACE_SYSEMU, can be called from
+ * arch syscall entry code
+ */
+long ptrace_syscall_enter(struct pt_regs *regs)
+{
+#ifdef TIF_SYSCALL_EMU
+   if (test_thread_flag(TIF_SYSCALL_EMU)) {
+   if (tracehook_report_syscall_entry(regs))
+   /*
+* We can ignore the return code here as we need to
+* return -1 always for syscall emulation, irrespective
+* of whether the tracehook report fails or succeeds.
+*/
+   ;
+   return -1L;
+   }
+#endif
+   return 0;
+}
+
 /*
  * Detach all tasks we were using ptrace on. Called with tasklist held
  * for writing.
-- 
2.17.1



[PATCH v2 1/6] ptrace: move clearing of TIF_SYSCALL_EMU flag to core

2019-03-18 Thread Sudeep Holla
While the TIF_SYSCALL_EMU is set in ptrace_resume independent of any
architecture, currently only powerpc and x86 unset the TIF_SYSCALL_EMU
flag in ptrace_disable which gets called from ptrace_detach.

Let's move the clearing of TIF_SYSCALL_EMU flag to ptrace_detach after
we return from ptrace_disable to ensure there's no change in the flow.

Cc: Oleg Nesterov 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Signed-off-by: Sudeep Holla 
---
 arch/powerpc/kernel/ptrace.c | 1 -
 arch/x86/kernel/ptrace.c | 3 ---
 kernel/ptrace.c  | 4 
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index d9ac7d94656e..2e2183b800a8 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -2520,7 +2520,6 @@ void ptrace_disable(struct task_struct *child)
 {
/* make sure the single step bit is not set. */
user_disable_single_step(child);
-   clear_tsk_thread_flag(child, TIF_SYSCALL_EMU);
 }
 
 #ifdef CONFIG_PPC_ADV_DEBUG_REGS
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 4b8ee05dd6ad..45792dbd2443 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -746,9 +746,6 @@ static int ioperm_get(struct task_struct *target,
 void ptrace_disable(struct task_struct *child)
 {
user_disable_single_step(child);
-#ifdef TIF_SYSCALL_EMU
-   clear_tsk_thread_flag(child, TIF_SYSCALL_EMU);
-#endif
 }
 
 #if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 771e93f9c43f..4fa3b7f4c3c7 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -534,6 +534,10 @@ static int ptrace_detach(struct task_struct *child, unsigned int data)
/* Architecture-specific hardware disable .. */
ptrace_disable(child);
 
+#ifdef TIF_SYSCALL_EMU
+   clear_tsk_thread_flag(child, TIF_SYSCALL_EMU);
+#endif
+
write_lock_irq(&tasklist_lock);
/*
 * We rely on ptrace_freeze_traced(). It can't be killed and
-- 
2.17.1



[PATCH v2 0/6] ptrace: consolidate PTRACE_SYSEMU handling and add support for arm64

2019-03-18 Thread Sudeep Holla
Hi,

This patchset evolved from the discussion in the thread[0][1]. When we
wanted to add PTRACE_SYSEMU support to ARM64, we thought that instead of
duplicating what other architectures like x86 and powerpc have done,
we should consolidate the existing support and move it to the core, as
there's nothing arch specific in it.

v1->v2:
- added comment for empty statement after tracehook_report_syscall_entry
- dropped x86 change in syscall_slow_exit_work as I had ended
  up changing logic unintentionally
- removed spurious change in powerpc moving user_exit()

Regards,
Sudeep

[0] https://patchwork.kernel.org/patch/10585505/
[1] https://patchwork.kernel.org/patch/10675237/


Sudeep Holla (6):
  ptrace: move clearing of TIF_SYSCALL_EMU flag to core
  ptrace: introduce ptrace_syscall_enter to consolidate PTRACE_SYSEMU handling
  x86: clean up _TIF_SYSCALL_EMU handling using ptrace_syscall_enter hook
  powerpc: use common ptrace_syscall_enter hook to handle _TIF_SYSCALL_EMU
  arm64: add PTRACE_SYSEMU{,SINGLESTEP} definations to uapi headers
  arm64: ptrace: add support for syscall emulation

 arch/arm64/include/asm/thread_info.h |  5 ++-
 arch/arm64/include/uapi/asm/ptrace.h |  3 ++
 arch/arm64/kernel/ptrace.c   |  3 ++
 arch/powerpc/kernel/ptrace.c | 49 
 arch/x86/entry/common.c  | 12 ++-
 arch/x86/kernel/ptrace.c |  3 --
 include/linux/ptrace.h   |  1 +
 kernel/ptrace.c  | 26 +++
 8 files changed, 61 insertions(+), 41 deletions(-)

--
2.17.1



Re: [PATCH 3/6] x86: clean up _TIF_SYSCALL_EMU handling using ptrace_syscall_enter hook

2019-03-14 Thread Sudeep Holla
On Wed, Mar 13, 2019 at 01:03:18AM +, Haibo Xu (Arm Technology China) wrote:
[...]

> Since ptrace() system call do have so many request type, I'm not sure
> whether the test cases have covered all of that. But here we'd better make
> sure the PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP requests are work
> correctly. May be you can verify them with tests from Bin Lu(bin...@arm.com).

Sure, happy to try them. Can you point me to them?
I did end up writing a few more tests.

--
Regards,
Sudeep


Re: [PATCH 3/6] x86: clean up _TIF_SYSCALL_EMU handling using ptrace_syscall_enter hook

2019-03-12 Thread Sudeep Holla
On Mon, Mar 11, 2019 at 08:04:39PM -0700, Andy Lutomirski wrote:
> On Mon, Mar 11, 2019 at 6:35 PM Haibo Xu (Arm Technology China)
>  wrote:
> >

[...]

> > For the PTRACE_SYSEMU_SINGLESTEP request, ptrace only need to report(send
> > SIGTRAP) at the entry of a system call, no need to report at the exit of a
> > system call. That's why the old logic-{step = ((flags & (_TIF_SINGLESTEP |
> > _TIF_SYSCALL_EMU)) == _TIF_SINGLESTEP)} here try to filter out the special
> > case(PTRACE_SYSEMU_SINGLESTEP).
> >
> > Another way to make sure the logic is fine, you can run some tests with
> > respect to both logic, and to check whether they have the same behavior.
>
> tools/testing/selftests/x86/ptrace_syscall.c has a test intended to
> exercise this.  Can one of you either confirm that it does exercise it
> and that it still passes or can you improve the test?
>
I did run the tests, which didn't flag anything. I haven't looked at the
details of the test implementation, but it seems to miss this case. I will
see what can be improved (if it's possible). Also, I think
single_step_syscall is the one I need to look at for this particular case.
Both single_step_syscall and ptrace_syscall reported no errors.

--
Regards,
Sudeep


Re: [PATCH 3/6] x86: clean up _TIF_SYSCALL_EMU handling using ptrace_syscall_enter hook

2019-03-12 Thread Sudeep Holla
On Tue, Mar 12, 2019 at 01:34:44AM +, Haibo Xu (Arm Technology China) wrote:
> On 2019/3/12 2:34, Sudeep Holla wrote:
> > (I thought I had sent this email, last Tuesday itself, but saw this in my
> > draft today, something went wrong, sorry for the delay)
> > 
> > On Tue, Mar 05, 2019 at 02:14:47AM +, Haibo Xu (Arm Technology China) wrote:
> >> On 2019/3/4 18:12, Sudeep Holla wrote:
> >>> On Mon, Mar 04, 2019 at 08:25:28AM +, Haibo Xu (Arm Technology China) wrote:
> >>>> On 2019/3/1 2:32, Sudeep Holla wrote:
> >>>>> Now that we have a new hook ptrace_syscall_enter that can be called from
> >>>>> syscall entry code and it handles PTRACE_SYSEMU in generic code, we
> >>>>> can do some cleanup using the same in syscall_trace_enter.
> >>>>>
> >>>>> Further the extra logic to find single stepping PTRACE_SYSEMU_SINGLESTEP
> >>>>> in syscall_slow_exit_work seems unnecessary. Let's remove the same.
> >>>>
> >>>> I think we should not change the logic here. If so, it will double the
> >>>> report of syscall when PTRACE_SYSEMU_SINGLESTEP is enabled.
> >>>>
> >>>
> >>> I don't think that should happen, but I may be missing something.
> >>> Can you explain how ?
> >>>
> >>
> >> When PTRACE_SYSEMU_SINGLESTEP is enabled, both the _TIF_SYSCALL_EMU and
> >> _TIF_SINGLESTEP flags are set, but ptrace only need to report(send SIGTRAP)
> >> at the entry of a system call, no need to report at the exit of a system
> >> call.
> >>
> > Sorry, but I still not get it, we have:
> > 
> > step = ((flags & (_TIF_SINGLESTEP | _TIF_SYSCALL_EMU)) == _TIF_SINGLESTEP);
> > 
> > For me, this is same as:
> > step = ((flags & _TIF_SINGLESTEP) == _TIF_SINGLESTEP)
> > or
> > if (flags & _TIF_SINGLESTEP)
> > step = true;
> > 
> 
> I don't think so! As I mentioned in the last email loop, when
> PTRACE_SYSEMU_SINGLESTEP is enabled, both the _TIF_SYSCALL_EMU and
> _TIF_SINGLESTEP flags are set, in which case the step should be "false" for
> the old logic. But with the new logic, the step is "true".
> 

Ah right, sorry I missed that.

> > So when PTRACE_SYSEMU_SINGLESTEP, _TIF_SYSCALL_EMU and _TIF_SINGLESTEP
> > are set and step evaluates to true.
> > 
> > So dropping _TIF_SYSCALL_EMU here should be fine. Am I still missing
> > something ?
> > 
> > --
> > Regards,
> > Sudeep
> > 
> 
> For the PTRACE_SYSEMU_SINGLESTEP request, ptrace only need to report(send
> SIGTRAP) at the entry of a system call, no need to report at the exit of a
> system call. That's why the old logic-{step = ((flags & (_TIF_SINGLESTEP |
> _TIF_SYSCALL_EMU)) == _TIF_SINGLESTEP)} here try to filter out the special
> case(PTRACE_SYSEMU_SINGLESTEP).
> 

Understood

> Another way to make sure the logic is fine, you can run some tests with
> respect to both logic, and to check whether they have the same behavior.
>

I did run the selftests after Andy Lutomirski pointed them out. Nothing got
flagged. I haven't looked at the tests themselves yet, but they clearly
miss this case.

--
Regards,
Sudeep


Re: [PATCH 3/6] x86: clean up _TIF_SYSCALL_EMU handling using ptrace_syscall_enter hook

2019-03-11 Thread Sudeep Holla
(I thought I had sent this email, last Tuesday itself, but saw this in my
draft today, something went wrong, sorry for the delay)

On Tue, Mar 05, 2019 at 02:14:47AM +, Haibo Xu (Arm Technology China) wrote:
> On 2019/3/4 18:12, Sudeep Holla wrote:
> > On Mon, Mar 04, 2019 at 08:25:28AM +, Haibo Xu (Arm Technology China) wrote:
> >> On 2019/3/1 2:32, Sudeep Holla wrote:
> >>> Now that we have a new hook ptrace_syscall_enter that can be called from
> >>> syscall entry code and it handles PTRACE_SYSEMU in generic code, we
> >>> can do some cleanup using the same in syscall_trace_enter.
> >>>
> >>> Further the extra logic to find single stepping PTRACE_SYSEMU_SINGLESTEP
> >>> in syscall_slow_exit_work seems unnecessary. Let's remove the same.
> >>
> >> I think we should not change the logic here. If so, it will double the
> >> report of syscall when PTRACE_SYSEMU_SINGLESTEP is enabled.
> >>
> >
> > I don't think that should happen, but I may be missing something.
> > Can you explain how ?
> >
>
> When PTRACE_SYSEMU_SINGLESTEP is enabled, both the _TIF_SYSCALL_EMU and
> _TIF_SINGLESTEP flags are set, but ptrace only need to report(send SIGTRAP)
> at the entry of a system call, no need to report at the exit of a system
> call.
>
Sorry, but I still don't get it. We have:

step = ((flags & (_TIF_SINGLESTEP | _TIF_SYSCALL_EMU)) == _TIF_SINGLESTEP);

For me, this is same as:
step = ((flags & _TIF_SINGLESTEP) == _TIF_SINGLESTEP)
or
if (flags & _TIF_SINGLESTEP)
step = true;

So with PTRACE_SYSEMU_SINGLESTEP, both _TIF_SYSCALL_EMU and _TIF_SINGLESTEP
are set and step evaluates to true.

So dropping _TIF_SYSCALL_EMU here should be fine. Am I still missing
something ?

--
Regards,
Sudeep
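
To compare the two expressions discussed in this message, a small user-space sketch can enumerate the flag combinations. The DEMO_* bit values and the helper names are made up for illustration and are not the kernel's definitions. Under these assumptions the two tests disagree in exactly one case, when both bits are set: the masked comparison yields false while the plain single-step test yields true.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the real _TIF_* values are arch-specific. */
#define DEMO_TIF_SINGLESTEP  (1u << 0)
#define DEMO_TIF_SYSCALL_EMU (1u << 1)

/* Masked comparison: single-step is set AND syscall emulation is not. */
static bool step_old(unsigned int flags)
{
	return (flags & (DEMO_TIF_SINGLESTEP | DEMO_TIF_SYSCALL_EMU)) ==
	       DEMO_TIF_SINGLESTEP;
}

/* Plain test: single-step is set, the emulation bit is ignored. */
static bool step_new(unsigned int flags)
{
	return (flags & DEMO_TIF_SINGLESTEP) != 0;
}

/* Number of flag combinations on which the two tests disagree. */
static int count_step_mismatches(void)
{
	int mismatches = 0;

	for (unsigned int flags = 0; flags < 4; flags++)
		if (step_old(flags) != step_new(flags))
			mismatches++;
	return mismatches;
}
```

The both-bits-set case is the one PTRACE_SYSEMU_SINGLESTEP produces, so that is where an exit-path report would be affected.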


Re: [PATCH 2/6] ptrace: introduce ptrace_syscall_enter to consolidate PTRACE_SYSEMU handling

2019-03-04 Thread Sudeep Holla
On Mon, Mar 04, 2019 at 06:23:32AM -0600, Segher Boessenkool wrote:
> On Mon, Mar 04, 2019 at 10:46:43AM +0000, Sudeep Holla wrote:
> > On Mon, Mar 04, 2019 at 08:03:47AM +, Haibo Xu (Arm Technology China) wrote:
> > > On 2019/3/1 2:32, Sudeep Holla wrote:
> > > > +long ptrace_syscall_enter(struct pt_regs *regs)
> > > > +{
> > > > +#ifdef TIF_SYSCALL_EMU
> > > > +   if (test_thread_flag(TIF_SYSCALL_EMU)) {
> > > > +   if (tracehook_report_syscall_entry(regs));
> > >
> > > Shall we remove the semi-colon at end of the above line?
> >
> > Added intentionally to keep GCC happy.
>
> GCC warns because the user explicitly asked for it, with __must_check.
> If you want to do things with an "if" like this, you should write e.g.
>
>   if (tracehook_report_syscall_entry(regs))
>   /*
>* We can ignore the return code here, because of
>* X and Y and Z.
>*/
>   ;
>
> Or it probably is nicer to use a block:
>
>   if (tracehook_report_syscall_entry(regs)) {
>   /*
>* We can ignore the return code here, because of
>* X and Y and Z.
>*/
>   }
>
> The point is, you *always* should have a nice fat comment if you are
> ignoring the return code of a __must_check function.
>

Agreed, will add the comment.

--
Regards,
Sudeep
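
The block form suggested above can be tried outside the kernel. The following is only a sketch: demo_must_check is defined here with the GCC/Clang warn_unused_result attribute (which is what the kernel's __must_check typically expands to), and demo_report_syscall_entry() is a hypothetical stand-in for tracehook_report_syscall_entry(), not the real function.

```c
#include <assert.h>

/* Assumption: a user-space equivalent of the kernel's __must_check. */
#define demo_must_check __attribute__((warn_unused_result))

/* Counts how often the stand-in report function was called. */
static int demo_report_calls;

/* Hypothetical stand-in for tracehook_report_syscall_entry(). */
static demo_must_check int demo_report_syscall_entry(int tracer_aborts)
{
	demo_report_calls++;
	return tracer_aborts;
}

/* Emulation path: always skip the syscall, whatever the report returns. */
static long demo_syscall_emu_enter(int tracer_aborts)
{
	if (demo_report_syscall_entry(tracer_aborts)) {
		/*
		 * The return code is deliberately ignored: an emulated
		 * syscall is never executed, so -1 is returned either way.
		 */
	}
	return -1L;
}
```

Wrapping the call in an "if" with a commented block consumes the return value, so the compiler has nothing to warn about, and the comment records why the value is not acted upon.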


Re: [PATCH 2/6] ptrace: introduce ptrace_syscall_enter to consolidate PTRACE_SYSEMU handling

2019-03-04 Thread Sudeep Holla
On Mon, Mar 04, 2019 at 08:03:47AM +, Haibo Xu (Arm Technology China) wrote:
> On 2019/3/1 2:32, Sudeep Holla wrote:
> > Currently each architecture handles PTRACE_SYSEMU in very similar way.
> > It's completely arch independent and can be handled in the code helping
> > to consolidate PTRACE_SYSEMU handling.
> > 
> > Let's introduce a hook 'ptrace_syscall_enter' that arch specific syscall
> > entry code can call.
> > 
> 
> The 'ptrace_syscall_enter' is dedicated for PTRACE_SYSEMU flag,
> So I suggest to rename the function to something like
> 'ptrace_syscall_emu_enter'.
> 

I am fine to rename.

> > +/*
> > + * Hook to check and report for PTRACE_SYSEMU, can be called from arch
> > + * arch syscall entry code
> > + */
> > +long ptrace_syscall_enter(struct pt_regs *regs)
> > +{
> > +#ifdef TIF_SYSCALL_EMU
> > +   if (test_thread_flag(TIF_SYSCALL_EMU)) {
> > +   if (tracehook_report_syscall_entry(regs));
> 
> Shall we remove the semi-colon at end of the above line?
> 

Added intentionally to keep GCC happy.

--
Regards,
Sudeep


Re: [PATCH 4/6] powerpc: use common ptrace_syscall_enter hook to handle _TIF_SYSCALL_EMU

2019-03-04 Thread Sudeep Holla
On Mon, Mar 04, 2019 at 09:36:27AM +, Haibo Xu (Arm Technology China) wrote:
> On 2019/3/1 2:32, Sudeep Holla wrote:
> > Now that we have a new hook ptrace_syscall_enter that can be called from
> > syscall entry code and it handles PTRACE_SYSEMU in generic code, we
> > can do some cleanup using the same in do_syscall_trace_enter.
> >
> > Cc: Oleg Nesterov 
> > Cc: Paul Mackerras 
> > Cc: Michael Ellerman 
> > Signed-off-by: Sudeep Holla 
> > ---
> >  arch/powerpc/kernel/ptrace.c | 50 
> >  1 file changed, 22 insertions(+), 28 deletions(-)
> >
> > diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
> > index cb7e1439cafb..978cd2aac29e 100644
> > --- a/arch/powerpc/kernel/ptrace.c
> > +++ b/arch/powerpc/kernel/ptrace.c
> > @@ -3264,37 +3264,31 @@ long do_syscall_trace_enter(struct pt_regs *regs)
> >  {
> > u32 flags;
> >
> > -   user_exit();
>
> We'd better keep the user_exit() at here in case both context tracking and
> SYSCALL_EMU are enabled.
>

Ah right, that's a spurious change; I will fix it.

--
Regards,
Sudeep


Re: [PATCH 3/6] x86: clean up _TIF_SYSCALL_EMU handling using ptrace_syscall_enter hook

2019-03-04 Thread Sudeep Holla
On Mon, Mar 04, 2019 at 08:25:28AM +, Haibo Xu (Arm Technology China) wrote:
> On 2019/3/1 2:32, Sudeep Holla wrote:
> > Now that we have a new hook ptrace_syscall_enter that can be called from
> > syscall entry code and it handles PTRACE_SYSEMU in generic code, we
> > can do some cleanup using the same in syscall_trace_enter.
> > 
> > Further the extra logic to find single stepping PTRACE_SYSEMU_SINGLESTEP
> > in syscall_slow_exit_work seems unnecessary. Let's remove the same.
> 
> I think we should not change the logic here. If so, it will double the
> report of syscall when PTRACE_SYSEMU_SINGLESTEP is enabled.
>

I don't think that should happen, but I may be missing something.
Can you explain how ?

--
Regards,
Sudeep



Re: [PATCH 3/6] x86: clean up _TIF_SYSCALL_EMU handling using ptrace_syscall_enter hook

2019-03-04 Thread Sudeep Holla
On Sat, Mar 02, 2019 at 05:11:40PM -0800, Andy Lutomirski wrote:
> On Thu, Feb 28, 2019 at 10:32 AM Sudeep Holla  wrote:
> >
> > Now that we have a new hook ptrace_syscall_enter that can be called from
> > syscall entry code and it handles PTRACE_SYSEMU in generic code, we
> > can do some cleanup using the same in syscall_trace_enter.
> >
> > Further the extra logic to find single stepping PTRACE_SYSEMU_SINGLESTEP
> > in syscall_slow_exit_work seems unnecessary. Let's remove the same.
>
> I wasn't cc'd on the whole series, so I can't easily review this.  Do
> you have a test case to make sure that emulation still works?  Are
> there adequate tests in tools/testing/selftests/x86?  Do they still
> pass after this patch?
>

I will ensure you are cc'ed on the whole series, sorry for missing that.
I remember seeing some selftests, but I haven't run them yet.

--
Regards,
Sudeep


[PATCH 6/6] arm64: ptrace: add support for syscall emulation

2019-02-28 Thread Sudeep Holla
Add PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP support on arm64.
We can just make use of the generic ptrace_syscall_enter hook to
support PTRACE_SYSEMU. We don't need any special handling for
PTRACE_SYSEMU_SINGLESTEP.

Cc: Catalin Marinas 
Cc: Will Deacon 
Signed-off-by: Sudeep Holla 
---
 arch/arm64/include/asm/thread_info.h | 5 -
 arch/arm64/kernel/ptrace.c   | 3 +++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index bbca68b54732..c86aeb6379d5 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -75,6 +75,7 @@ void arch_release_task_struct(struct task_struct *tsk);
  *  TIF_SYSCALL_TRACE  - syscall trace active
  *  TIF_SYSCALL_TRACEPOINT - syscall tracepoint for ftrace
  *  TIF_SYSCALL_AUDIT  - syscall auditing
+ *  TIF_SYSCALL_EMU - syscall emulation active
  *  TIF_SECOMP - syscall secure computing
  *  TIF_SIGPENDING - signal pending
  *  TIF_NEED_RESCHED   - rescheduling necessary
@@ -92,6 +93,7 @@ void arch_release_task_struct(struct task_struct *tsk);
 #define TIF_SYSCALL_AUDIT  9
 #define TIF_SYSCALL_TRACEPOINT 10
 #define TIF_SECCOMP11
+#define TIF_SYSCALL_EMU12
 #define TIF_MEMDIE 18  /* is terminating due to OOM killer */
 #define TIF_FREEZE 19
 #define TIF_RESTORE_SIGMASK20
@@ -110,6 +112,7 @@ void arch_release_task_struct(struct task_struct *tsk);
 #define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT)
 #define _TIF_SYSCALL_TRACEPOINT(1 << TIF_SYSCALL_TRACEPOINT)
 #define _TIF_SECCOMP   (1 << TIF_SECCOMP)
+#define _TIF_SYSCALL_EMU   (1 << TIF_SYSCALL_EMU)
 #define _TIF_UPROBE(1 << TIF_UPROBE)
 #define _TIF_FSCHECK   (1 << TIF_FSCHECK)
 #define _TIF_32BIT (1 << TIF_32BIT)
@@ -121,7 +124,7 @@ void arch_release_task_struct(struct task_struct *tsk);
 
 #define _TIF_SYSCALL_WORK  (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
 _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
-_TIF_NOHZ)
+_TIF_NOHZ | _TIF_SYSCALL_EMU)
 
 #define INIT_THREAD_INFO(tsk)  \
 {  \
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index ddaea0fd2fa4..c377ce597f92 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1672,6 +1672,9 @@ static void tracehook_report_syscall(struct pt_regs *regs,
 
 int syscall_trace_enter(struct pt_regs *regs)
 {
+   if (unlikely(ptrace_syscall_enter(regs)))
+   return -1;
+
if (test_thread_flag(TIF_SYSCALL_TRACE))
tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
 
-- 
2.17.1



[PATCH 5/6] arm64: add PTRACE_SYSEMU{, SINGLESTEP} definations to uapi headers

2019-02-28 Thread Sudeep Holla
x86 and um use 31 and 32 for PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP,
while powerpc uses different values, maybe for legacy reasons.

Though handling of PTRACE_SYSEMU can be made architecture independent,
it's hard to make these definitions generic. To add to this existing
mess, a few architectures like arm, c6x and sh use 31 for PTRACE_GETFDPIC
(get the ELF fdpic loadmap address). It's not possible to move the
definitions to generic headers.

So we unfortunately have to duplicate the same definitions to ARM64 if
we need to support PTRACE_SYSEMU.

Cc: Catalin Marinas 
Cc: Will Deacon 
Signed-off-by: Sudeep Holla 
---
 arch/arm64/include/uapi/asm/ptrace.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm64/include/uapi/asm/ptrace.h b/arch/arm64/include/uapi/asm/ptrace.h
index 28d77c9ed531..8478b9007f9e 100644
--- a/arch/arm64/include/uapi/asm/ptrace.h
+++ b/arch/arm64/include/uapi/asm/ptrace.h
@@ -62,6 +62,9 @@
 #define PSR_x  0xff00  /* Extension*/
 #define PSR_c  0x00ff  /* Control  */
 
+/* syscall emulation path in ptrace */
+#define PTRACE_SYSEMU31
+#define PTRACE_SYSEMU_SINGLESTEP  32
 
 #ifndef __ASSEMBLY__
 
-- 
2.17.1



[PATCH 4/6] powerpc: use common ptrace_syscall_enter hook to handle _TIF_SYSCALL_EMU

2019-02-28 Thread Sudeep Holla
Now that we have a new hook ptrace_syscall_enter that can be called from
syscall entry code and it handles PTRACE_SYSEMU in generic code, we
can do some cleanup using the same in do_syscall_trace_enter.

Cc: Oleg Nesterov 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Signed-off-by: Sudeep Holla 
---
 arch/powerpc/kernel/ptrace.c | 50 
 1 file changed, 22 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index cb7e1439cafb..978cd2aac29e 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -3264,37 +3264,31 @@ long do_syscall_trace_enter(struct pt_regs *regs)
 {
u32 flags;
 
-   user_exit();
-
-   flags = READ_ONCE(current_thread_info()->flags) &
-   (_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE);
+   if (unlikely(ptrace_syscall_enter(regs))) {
+   /*
+* A nonzero return code from tracehook_report_syscall_entry()
+* tells us to prevent the syscall execution, but we are not
+* going to execute it anyway.
+*
+* Returning -1 will skip the syscall execution. We want to
+* avoid clobbering any registers, so we don't goto the skip
+* label below.
+*/
+   return -1;
+   }
 
-   if (flags) {
-   int rc = tracehook_report_syscall_entry(regs);
+   user_exit();
 
-   if (unlikely(flags & _TIF_SYSCALL_EMU)) {
-   /*
-* A nonzero return code from
-* tracehook_report_syscall_entry() tells us to prevent
-* the syscall execution, but we are not going to
-* execute it anyway.
-*
-* Returning -1 will skip the syscall execution. We want
-* to avoid clobbering any registers, so we don't goto
-* the skip label below.
-*/
-   return -1;
-   }
+   flags = READ_ONCE(current_thread_info()->flags) & _TIF_SYSCALL_TRACE;
 
-   if (rc) {
-   /*
-* The tracer decided to abort the syscall. Note that
-* the tracer may also just change regs->gpr[0] to an
-* invalid syscall number, that is handled below on the
-* exit path.
-*/
-   goto skip;
-   }
+   if (flags && tracehook_report_syscall_entry(regs)) {
+   /*
+* The tracer decided to abort the syscall. Note that
+* the tracer may also just change regs->gpr[0] to an
+* invalid syscall number, that is handled below on the
+* exit path.
+*/
+   goto skip;
}
 
/* Run seccomp after ptrace; allow it to set gpr[3]. */
-- 
2.17.1



[PATCH 1/6] ptrace: move clearing of TIF_SYSCALL_EMU flag to core

2019-02-28 Thread Sudeep Holla
While the TIF_SYSCALL_EMU is set in ptrace_resume independent of any
architecture, currently only powerpc and x86 unset the TIF_SYSCALL_EMU
flag in ptrace_disable which gets called from ptrace_detach.

Let's move the clearing of TIF_SYSCALL_EMU flag to ptrace_detach after
we return from ptrace_disable to ensure there's no change in the flow.

Cc: Oleg Nesterov 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Signed-off-by: Sudeep Holla 
---
 arch/powerpc/kernel/ptrace.c | 1 -
 arch/x86/kernel/ptrace.c | 3 ---
 kernel/ptrace.c  | 4 
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index cdd5d1d3ae41..cb7e1439cafb 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -2508,7 +2508,6 @@ void ptrace_disable(struct task_struct *child)
 {
/* make sure the single step bit is not set. */
user_disable_single_step(child);
-   clear_tsk_thread_flag(child, TIF_SYSCALL_EMU);
 }
 
 #ifdef CONFIG_PPC_ADV_DEBUG_REGS
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 4b8ee05dd6ad..45792dbd2443 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -746,9 +746,6 @@ static int ioperm_get(struct task_struct *target,
 void ptrace_disable(struct task_struct *child)
 {
user_disable_single_step(child);
-#ifdef TIF_SYSCALL_EMU
-   clear_tsk_thread_flag(child, TIF_SYSCALL_EMU);
-#endif
 }
 
 #if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 771e93f9c43f..4fa3b7f4c3c7 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -534,6 +534,10 @@ static int ptrace_detach(struct task_struct *child, unsigned int data)
/* Architecture-specific hardware disable .. */
ptrace_disable(child);
 
+#ifdef TIF_SYSCALL_EMU
+   clear_tsk_thread_flag(child, TIF_SYSCALL_EMU);
+#endif
+
write_lock_irq(&tasklist_lock);
/*
 * We rely on ptrace_freeze_traced(). It can't be killed and
-- 
2.17.1



[PATCH 0/6] ptrace: consolidate PTRACE_SYSEMU handling and add support for arm64

2019-02-28 Thread Sudeep Holla
Hi,

This patchset evolved from the discussion in the thread[0][1]. When we
wanted to add PTRACE_SYSEMU support to ARM64, we thought that instead of
duplicating what other architectures like x86 and powerpc have done,
we should consolidate the existing support and move it to the core, as
there's nothing arch specific in it.

So this is the first attempt at doing that.

Regards,
Sudeep

[0] https://patchwork.kernel.org/patch/10585505/
[1] https://patchwork.kernel.org/patch/10675237/

Sudeep Holla (6):
  ptrace: move clearing of TIF_SYSCALL_EMU flag to core
  ptrace: introduce ptrace_syscall_enter to consolidate PTRACE_SYSEMU handling
  x86: clean up _TIF_SYSCALL_EMU handling using ptrace_syscall_enter hook
  powerpc: use common ptrace_syscall_enter hook to handle _TIF_SYSCALL_EMU
  arm64: add PTRACE_SYSEMU{,SINGLESTEP} definations to uapi headers
  arm64: ptrace: add support for syscall emulation

 arch/arm64/include/asm/thread_info.h |  5 ++-
 arch/arm64/include/uapi/asm/ptrace.h |  3 ++
 arch/arm64/kernel/ptrace.c   |  3 ++
 arch/powerpc/kernel/ptrace.c | 51 
 arch/x86/entry/common.c  | 22 +++-
 arch/x86/kernel/ptrace.c |  3 --
 include/linux/ptrace.h   |  1 +
 kernel/ptrace.c  | 20 +++
 8 files changed, 57 insertions(+), 51 deletions(-)

--
2.17.1



[PATCH 3/6] x86: clean up _TIF_SYSCALL_EMU handling using ptrace_syscall_enter hook

2019-02-28 Thread Sudeep Holla
Now that we have a new hook ptrace_syscall_enter that can be called from
syscall entry code and it handles PTRACE_SYSEMU in generic code, we
can do some cleanup using the same in syscall_trace_enter.

Further the extra logic to find single stepping PTRACE_SYSEMU_SINGLESTEP
in syscall_slow_exit_work seems unnecessary. Let's remove the same.

Cc: Andy Lutomirski 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Signed-off-by: Sudeep Holla 
---
 arch/x86/entry/common.c | 22 --
 1 file changed, 4 insertions(+), 18 deletions(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 7bc105f47d21..36457c1f87d2 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -70,22 +70,16 @@ static long syscall_trace_enter(struct pt_regs *regs)
 
struct thread_info *ti = current_thread_info();
unsigned long ret = 0;
-   bool emulated = false;
u32 work;
 
if (IS_ENABLED(CONFIG_DEBUG_ENTRY))
BUG_ON(regs != task_pt_regs(current));
 
-   work = READ_ONCE(ti->flags) & _TIF_WORK_SYSCALL_ENTRY;
-
-   if (unlikely(work & _TIF_SYSCALL_EMU))
-   emulated = true;
-
-   if ((emulated || (work & _TIF_SYSCALL_TRACE)) &&
-   tracehook_report_syscall_entry(regs))
+   if (unlikely(ptrace_syscall_enter(regs)))
return -1L;
 
-   if (emulated)
+   work = READ_ONCE(ti->flags) & _TIF_WORK_SYSCALL_ENTRY;
+   if ((work & _TIF_SYSCALL_TRACE) && tracehook_report_syscall_entry(regs))
return -1L;
 
 #ifdef CONFIG_SECCOMP
@@ -227,15 +221,7 @@ static void syscall_slow_exit_work(struct pt_regs *regs, u32 cached_flags)
if (cached_flags & _TIF_SYSCALL_TRACEPOINT)
trace_sys_exit(regs, regs->ax);
 
-   /*
-* If TIF_SYSCALL_EMU is set, we only get here because of
-* TIF_SINGLESTEP (i.e. this is PTRACE_SYSEMU_SINGLESTEP).
-* We already reported this syscall instruction in
-* syscall_trace_enter().
-*/
-   step = unlikely(
-   (cached_flags & (_TIF_SINGLESTEP | _TIF_SYSCALL_EMU))
-   == _TIF_SINGLESTEP);
+   step = unlikely((cached_flags & _TIF_SINGLESTEP));
if (step || cached_flags & _TIF_SYSCALL_TRACE)
tracehook_report_syscall_exit(regs, step);
 }
-- 
2.17.1



[PATCH 2/6] ptrace: introduce ptrace_syscall_enter to consolidate PTRACE_SYSEMU handling

2019-02-28 Thread Sudeep Holla
Currently each architecture handles PTRACE_SYSEMU in a very similar way.
It's completely arch independent and can be handled in the core code,
which helps to consolidate PTRACE_SYSEMU handling.

Let's introduce a hook 'ptrace_syscall_enter' that arch specific syscall
entry code can call.

Cc: Oleg Nesterov 
Signed-off-by: Sudeep Holla 
---
 include/linux/ptrace.h |  1 +
 kernel/ptrace.c| 16 
 2 files changed, 17 insertions(+)

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index edb9b040c94c..e30f51e3363e 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -407,6 +407,7 @@ static inline void user_single_step_report(struct pt_regs *regs)
 #define current_user_stack_pointer() user_stack_pointer(current_pt_regs())
 #endif
 
+extern long ptrace_syscall_enter(struct pt_regs *regs);
 extern int task_current_syscall(struct task_struct *target, long *callno,
unsigned long args[6], unsigned int maxargs,
unsigned long *sp, unsigned long *pc);
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 4fa3b7f4c3c7..6724eaf98e79 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Access another process' address space via ptrace.
@@ -557,6 +558,21 @@ static int ptrace_detach(struct task_struct *child, unsigned int data)
return 0;
 }
 
+/*
+ * Hook to check and report for PTRACE_SYSEMU, can be called from
+ * arch syscall entry code
+ */
+long ptrace_syscall_enter(struct pt_regs *regs)
+{
+#ifdef TIF_SYSCALL_EMU
+   if (test_thread_flag(TIF_SYSCALL_EMU)) {
+   if (tracehook_report_syscall_entry(regs));
+   return -1L;
+   }
+#endif
+   return 0;
+}
+
 /*
  * Detach all tasks we were using ptrace on. Called with tasklist held
  * for writing.
-- 
2.17.1
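
A rough userspace sketch of the hook's control flow, read literally from the hunk above. Note that the `if (...)` line in the patch ends in a semicolon, so as written the report's return value is ignored and -1 is returned whenever the flag is set. Everything named `model_*` below is a hypothetical stand-in invented for illustration, not a kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct model_task {
	bool sysemu_flag;	/* models TIF_SYSCALL_EMU being set */
	int tracer_result;	/* value the modelled report call returns */
	int reports;		/* how many times the tracer was notified */
};

/* Models tracehook_report_syscall_entry(): notify, then return a status. */
static int model_report_syscall_entry(struct model_task *t)
{
	t->reports++;
	return t->tracer_result;
}

/*
 * Models ptrace_syscall_enter() as posted: the report's return value is
 * discarded, so -1 comes back whenever the flag is set, and 0 otherwise.
 */
long model_ptrace_syscall_enter(struct model_task *t)
{
	assert(t != NULL);
	if (t->sysemu_flag) {
		(void)model_report_syscall_entry(t);
		return -1L;
	}
	return 0;
}
```

An arch entry path would presumably skip the real syscall when the hook returns -1 and continue with normal tracing otherwise; that caller is not part of this patch.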



Re: [PATCH 4/5] arm64: dts: add QorIQ LX2160A SoC support

2018-08-21 Thread Sudeep Holla
On Mon, Aug 20, 2018 at 12:17:15PM +0530, Vabhav Sharma wrote:
> LX2160A SoC is based on Layerscape Chassis Generation 3.2 Architecture.
> 
> LX2160A features an advanced 16 64-bit ARM v8 CortexA72 processor cores
> in 8 cluster, CCN508, GICv3,two 64-bit DDR4 memory controller, 8 I2C
> controllers, 3 dspi, 2 esdhc,2 USB 3.0, mmu 500, 3 SATA, 4 PL011 SBSA
> UARTs etc.
> 
> Signed-off-by: Ramneek Mehresh 
> Signed-off-by: Zhang Ying-22455 
> Signed-off-by: Nipun Gupta 
> Signed-off-by: Priyanka Jain 
> Signed-off-by: Yogesh Gaur 
> Signed-off-by: Sriram Dash 
> Signed-off-by: Vabhav Sharma 
> ---
>  arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi | 572 
> +
>  1 file changed, 572 insertions(+)
>  create mode 100644 arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi
> 
> diff --git a/arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi 
> b/arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi
> new file mode 100644
> index 000..e35e494
> --- /dev/null
> +++ b/arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi
> @@ -0,0 +1,572 @@
> +// SPDX-License-Identifier: (GPL-2.0 OR MIT)
> +//
> +// Device Tree Include file for Layerscape-LX2160A family SoC.
> +//
> +// Copyright 2018 NXP
> +
> +#include 
> +
> +/memreserve/ 0x8000 0x0001;
> +
> +/ {
> + compatible = "fsl,lx2160a";
> + interrupt-parent = <>;
> + #address-cells = <2>;
> + #size-cells = <2>;
> +
> + cpus {
> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + // 8 clusters having 2 Cortex-A72 cores each
> + cpu@0 {
> + device_type = "cpu";
> + compatible = "arm,cortex-a72";
> + reg = <0x0>;
> + clocks = < 1 0>;
> + next-level-cache = <_l2>;

If you expect to get cache properties in sysfs entries, you need to populate
them here and for each L2 cache.

[...]

> +
> + rstcr: syscon@1e6 {
> + compatible = "syscon";
> + reg = <0x0 0x1e6 0x0 0x4>;
> + };
> +
> + reboot {
> + compatible ="syscon-reboot";
> + regmap = <>;
> + offset = <0x0>;
> + mask = <0x2>;

Is this disabled in bootloader ? With PSCI, it's preferred to use
SYSTEM_RESET/OFF. EL3 f/w may need to do some housekeeping on poweroff.

> + };
> +
> + timer {
> + compatible = "arm,armv8-timer";
> + interrupts = <1 13 4>, // Physical Secure PPI, active-low

The comment says active low but the value 4 indicates it's HIGH from
"include/dt-bindings/interrupt-controller/irq.h"
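
For reference, a small C sketch of how the trigger cell decodes. The `IRQ_TYPE_*` values are the ones from that header; the helper names `irq_flags_active_low()` and `irq_flags_is_level()` are made up here for illustration, and masking with 0xf assumes only the low four bits carry the trigger type.

```c
#include <assert.h>
#include <stdbool.h>

/* Trigger types as defined in include/dt-bindings/interrupt-controller/irq.h */
#define IRQ_TYPE_NONE		0
#define IRQ_TYPE_EDGE_RISING	1
#define IRQ_TYPE_EDGE_FALLING	2
#define IRQ_TYPE_LEVEL_HIGH	4
#define IRQ_TYPE_LEVEL_LOW	8

/* Hypothetical helper: true if the trigger is level-low or falling-edge. */
bool irq_flags_active_low(unsigned int flags)
{
	unsigned int type = flags & 0xf;

	return type == IRQ_TYPE_LEVEL_LOW || type == IRQ_TYPE_EDGE_FALLING;
}

/* Hypothetical helper: true if the trigger is level-sensitive. */
bool irq_flags_is_level(unsigned int flags)
{
	unsigned int type = flags & 0xf;

	return type == IRQ_TYPE_LEVEL_HIGH || type == IRQ_TYPE_LEVEL_LOW;
}
```

With these, a cell value of 4 is level-sensitive but not active-low, while 8 would match the "active-low" comment.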

> +  <1 14 4>, // Physical Non-Secure PPI, active-low
> +  <1 11 4>, // Virtual PPI, active-low
> +  <1 10 4>; // Hypervisor PPI, active-low
> + };
> +
> + pmu {
> + compatible = "arm,armv8-pmuv3";

More specific compatible preferably "arm,cortex-a72-pmu" ?

--
Regards,
Sudeep


[RESEND][PATCH 1/2] powerpc/64: drop redundant defination of spin_until_cond

2018-02-26 Thread Sudeep Holla
linux/processor.h has exactly the same definition of spin_until_cond.
Drop the redundant definition in asm/processor.h

Cc: Nicholas Piggin <npig...@gmail.com>
Cc: Michael Ellerman <m...@ellerman.id.au>
Signed-off-by: Sudeep Holla <sudeep.ho...@arm.com>
---
 arch/powerpc/include/asm/processor.h | 11 ---
 1 file changed, 11 deletions(-)

Hi,
(Resending as I got some errors from the mail server today)

When I was planning to use spin_until_cond, I came across the same definition
in 2 different headers, one of which includes the other and takes care of
providing the definition if it is not already defined.

I found it redundant, but I may be wrong.

Regards,
Sudeep
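
To illustrate the layering described above, here is a userspace sketch, assuming the generic header wraps each helper in an #ifndef guard so that an arch header included first can override it. The stub bodies and the counter `relax_calls` are invented for the example; only the shape of spin_until_cond follows the macro removed by this patch (minus unlikely()).

```c
#include <assert.h>

/* Invented for the sketch: counts how often the relax stub runs. */
static int relax_calls;

/* Fallbacks apply only if an earlier (arch) header did not define these. */
#ifndef spin_begin
#define spin_begin()
#endif

#ifndef spin_cpu_relax
#define spin_cpu_relax()	(relax_calls++)
#endif

#ifndef spin_end
#define spin_end()
#endif

#ifndef spin_until_cond
#define spin_until_cond(cond)				\
do {							\
	if (!(cond)) {					\
		spin_begin();				\
		do {					\
			spin_cpu_relax();		\
		} while (!(cond));			\
		spin_end();				\
	}						\
} while (0)
#endif

/* Spin until the stub has been called 'target' times; return the count. */
int spins_until_reached(int target)
{
	relax_calls = 0;
	spin_until_cond(relax_calls >= target);
	return relax_calls;
}
```

If the arch header supplies spin_begin(), spin_cpu_relax() and spin_end() but not spin_until_cond(), the guarded generic definition picks up the arch helpers, which is why a second copy in the arch header adds nothing.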

diff --git a/arch/powerpc/include/asm/processor.h 
b/arch/powerpc/include/asm/processor.h
index 01299cdc9806..4816be3be02c 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -438,17 +438,6 @@ static inline unsigned long __pack_fe01(unsigned int 
fpmode)

 #define spin_end() HMT_medium()

-#define spin_until_cond(cond)  \
-do {   \
-   if (unlikely(!(cond))) {\
-   spin_begin();   \
-   do {\
-   spin_cpu_relax();   \
-   } while (!(cond));  \
-   spin_end(); \
-   }   \
-} while (0)
-
 #else
 #define cpu_relax()barrier()
 #endif
--
2.7.4



[RESEND][PATCH 2/2] powerpc/watchdog: include linux/processor.h for spin_until_cond

2018-02-26 Thread Sudeep Holla
This implementation uses spin_until_cond in wd_smp_lock without including
either linux/processor.h or asm/processor.h

This patch includes linux/processor.h here for spin_until_cond usage.

Cc: Nicholas Piggin <npig...@gmail.com>
Cc: Michael Ellerman <m...@ellerman.id.au>
Signed-off-by: Sudeep Holla <sudeep.ho...@arm.com>
---
 arch/powerpc/kernel/watchdog.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
index 6256dc3b0087..f4359fde99f3 100644
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
-- 
2.7.4



[PATCH 2/2] powerpc/watchdog: include linux/processor.h for spin_until_cond

2018-02-19 Thread Sudeep Holla
This implementation uses spin_until_cond in wd_smp_lock without including
either linux/processor.h or asm/processor.h

This patch includes linux/processor.h here for spin_until_cond usage.

Cc: Nicholas Piggin <npig...@gmail.com>
Cc: Michael Ellerman <m...@ellerman.id.au>
Signed-off-by: Sudeep Holla <sudeep.ho...@arm.com>
---
 arch/powerpc/kernel/watchdog.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
index 6256dc3b0087..f4359fde99f3 100644
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
-- 
2.7.4



[RFC PATCH 1/2] powerpc/64: drop redundant defination of spin_until_cond

2018-02-19 Thread Sudeep Holla
linux/processor.h has exactly the same definition of spin_until_cond.
Drop the redundant definition in asm/processor.h

Cc: Nicholas Piggin <npig...@gmail.com>
Cc: Michael Ellerman <m...@ellerman.id.au>
Signed-off-by: Sudeep Holla <sudeep.ho...@arm.com>
---
 arch/powerpc/include/asm/processor.h | 11 ---
 1 file changed, 11 deletions(-)

Hi,

When I was planning to use spin_until_cond, I came across the same definition
in 2 different headers, one of which includes the other and takes care of
providing the definition if it is not already defined.

I found it redundant, but I may be wrong.

Regards,
Sudeep

diff --git a/arch/powerpc/include/asm/processor.h 
b/arch/powerpc/include/asm/processor.h
index 01299cdc9806..4816be3be02c 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -438,17 +438,6 @@ static inline unsigned long __pack_fe01(unsigned int 
fpmode)

 #define spin_end() HMT_medium()

-#define spin_until_cond(cond)  \
-do {   \
-   if (unlikely(!(cond))) {\
-   spin_begin();   \
-   do {\
-   spin_cpu_relax();   \
-   } while (!(cond));  \
-   spin_end(); \
-   }   \
-} while (0)
-
 #else
 #define cpu_relax()barrier()
 #endif
--
2.7.4



Re: [RFC PATCH] of: base: add support to get machine model name

2016-11-17 Thread Sudeep Holla



On 17/11/16 14:13, Arnd Bergmann wrote:

On Thursday, November 17, 2016 2:08:30 PM CET Sudeep Holla wrote:

On 17/11/16 13:50, Arnd Bergmann wrote:

On Thursday, November 17, 2016 11:50:50 AM CET Sudeep Holla wrote:

Currently platforms/drivers needing to get the machine model name are
replicating the same snippet of code. In some cases, the OF reference
counting is either missing or incorrect.

This patch adds support to read the machine model name either using
the "model" or the "compatible" property in the device tree root node.

Signed-off-by: Sudeep Holla <sudeep.ho...@arm.com>


I like the idea. One small comment:



Thanks. I prefer it as single patch but it can't be applied to any tree.
Any suggestions on handling this patch to fix the warning in -next ?


The patch that causes the warning is currently in the mmc tree, and I
don't think it would be good to have your entire patch in there too.

It's probably best to just fix the warning there now by adding another
open-coded copy of that function, and then apply your patch on top
for v4.11.


Sure, that's much simpler to deal with for now.

--
Regards,
Sudeep


Re: [RFC PATCH] of: base: add support to get machine model name

2016-11-17 Thread Sudeep Holla



On 17/11/16 13:50, Arnd Bergmann wrote:

On Thursday, November 17, 2016 11:50:50 AM CET Sudeep Holla wrote:

Currently platforms/drivers needing to get the machine model name are
replicating the same snippet of code. In some cases, the OF reference
counting is either missing or incorrect.

This patch adds support to read the machine model name either using
the "model" or the "compatible" property in the device tree root node.

Signed-off-by: Sudeep Holla <sudeep.ho...@arm.com>


I like the idea. One small comment:



Thanks. I prefer it as single patch but it can't be applied to any tree.
Any suggestions on handling this patch to fix the warning in -next ?


+int of_machine_get_model_name(const char **model)
+{
+   int error;
+   struct device_node *root;
+
+   root = of_find_node_by_path("/");
+   if (!root)
+   return -EINVAL;


The global of_root variable points to this already, and is defined
in the same file, so I think we can just skip the lookup.



Ah right, will fix it.

--
Regards,
Sudeep


[RFC PATCH] of: base: add support to get machine model name

2016-11-17 Thread Sudeep Holla
Currently platforms/drivers needing to get the machine model name are
replicating the same snippet of code. In some cases, the OF reference
counting is either missing or incorrect.

This patch adds support to read the machine model name either using
the "model" or the "compatible" property in the device tree root node.

Signed-off-by: Sudeep Holla <sudeep.ho...@arm.com>
---
 arch/arm/mach-imx/cpu.c   |  4 +---
 arch/arm/mach-mxs/mach-mxs.c  |  3 +--
 arch/mips/cavium-octeon/setup.c   | 12 ++--
 arch/mips/generic/proc.c  | 15 +++
 arch/sh/boards/of-generic.c   |  6 +-
 drivers/of/base.c | 34 ++
 drivers/soc/fsl/guts.c|  3 +--
 drivers/soc/renesas/renesas-soc.c |  4 +---
 include/linux/of.h|  6 ++
 9 files changed, 50 insertions(+), 37 deletions(-)

Hi,

While trying to fix a simple build warning (as below) in -next for fsl/guts.c,
I came across this code duplication in multiple places.

WARNING: modpost: Found 1 section mismatch(es).
To see full details build your kernel with:
'make CONFIG_DEBUG_SECTION_MISMATCH=y'

With CONFIG_DEBUG_SECTION_MISMATCH enabled, the details are reported:

WARNING: vmlinux.o(.text+0x55d014): Section mismatch in reference from the
function fsl_guts_probe() to the function
.init.text:of_flat_dt_get_machine_name()
The function fsl_guts_probe() references
the function __init of_flat_dt_get_machine_name().
This is often because fsl_guts_probe lacks a __init
annotation or the annotation of of_flat_dt_get_machine_name is wrong.

I can split the patch if needed if people are OK with the idea.

Regards,
Sudeep
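
The "model" then "compatible" fallback described in the commit message could be sketched in userspace roughly as follows. `struct model_root` and `model_get_machine_name()` are hypothetical names for this illustration, and returning -EINVAL when neither property exists is an assumption, since only the missing-root case is visible in the posted code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the device tree root node's properties. */
struct model_root {
	const char *model;		/* "model" property, or NULL */
	const char *compatible0;	/* first "compatible" string, or NULL */
};

/*
 * Sketch of the lookup order: prefer "model", fall back to the first
 * "compatible" entry. The error code for "nothing found" is assumed.
 */
int model_get_machine_name(const struct model_root *root, const char **name)
{
	assert(name != NULL);

	if (!root)
		return -EINVAL;

	if (root->model) {
		*name = root->model;
		return 0;
	}

	if (root->compatible0) {
		*name = root->compatible0;
		return 0;
	}

	return -EINVAL;
}
```

Callers then reduce to a single call plus an error check, which is the shape the converted call sites in the diff below take.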

diff --git a/arch/arm/mach-imx/cpu.c b/arch/arm/mach-imx/cpu.c
index b3347d32349f..846f40008752 100644
--- a/arch/arm/mach-imx/cpu.c
+++ b/arch/arm/mach-imx/cpu.c
@@ -85,9 +85,7 @@ struct device * __init imx_soc_device_init(void)

soc_dev_attr->family = "Freescale i.MX";

-   root = of_find_node_by_path("/");
-   ret = of_property_read_string(root, "model", &soc_dev_attr->machine);
-   of_node_put(root);
+   ret = of_machine_get_model_name(&soc_dev_attr->machine);
if (ret)
goto free_soc;

diff --git a/arch/arm/mach-mxs/mach-mxs.c b/arch/arm/mach-mxs/mach-mxs.c
index e4f21086b42b..ed9af3a894f0 100644
--- a/arch/arm/mach-mxs/mach-mxs.c
+++ b/arch/arm/mach-mxs/mach-mxs.c
@@ -391,8 +391,7 @@ static void __init mxs_machine_init(void)
if (!soc_dev_attr)
return;

-   root = of_find_node_by_path("/");
-   ret = of_property_read_string(root, "model", &soc_dev_attr->machine);
+   ret = of_machine_get_model_name(&soc_dev_attr->machine);
if (ret)
return;

diff --git a/arch/mips/cavium-octeon/setup.c b/arch/mips/cavium-octeon/setup.c
index 9a2db1c013d9..2e2b1b5befa4 100644
--- a/arch/mips/cavium-octeon/setup.c
+++ b/arch/mips/cavium-octeon/setup.c
@@ -498,16 +498,8 @@ static void __init init_octeon_system_type(void)
char const *board_type;

board_type = cvmx_board_type_to_string(octeon_bootinfo->board_type);
-   if (board_type == NULL) {
-   struct device_node *root;
-   int ret;
-
-   root = of_find_node_by_path("/");
-   ret = of_property_read_string(root, "model", &board_type);
-   of_node_put(root);
-   if (ret)
-   board_type = "Unsupported Board";
-   }
+   if (!board_type && of_machine_get_model_name(&board_type))
+   board_type = "Unsupported Board";

snprintf(octeon_system_type, sizeof(octeon_system_type), "%s (%s)",
 board_type, octeon_model_get_string(read_c0_prid()));
diff --git a/arch/mips/generic/proc.c b/arch/mips/generic/proc.c
index 42b33250a4a2..f7fc067bf908 100644
--- a/arch/mips/generic/proc.c
+++ b/arch/mips/generic/proc.c
@@ -10,20 +10,11 @@

 #include 

-#include 
-
 const char *get_system_type(void)
 {
const char *str;
-   int err;
-
-   err = of_property_read_string(of_root, "model", &str);
-   if (!err)
-   return str;
-
-   err = of_property_read_string_index(of_root, "compatible", 0, &str);
-   if (!err)
-   return str;

-   return "Unknown";
+   if (of_machine_get_model_name(&str))
+   return "Unknown";
+   return str;
 }
diff --git a/arch/sh/boards/of-generic.c b/arch/sh/boards/of-generic.c
index 1fb6d5714bae..938a14499298 100644
--- a/arch/sh/boards/of-generic.c
+++ b/arch/sh/boards/of-generic.c
@@ -135,11 +135,7 @@ static void __init sh_of_setup(char **cmdline_p)
board_time_init = sh_of_time_init;

sh_mv.mv_name = "Unknown SH model";
-   root = of_find_node_by_path("/");
-   if (root) {
-   of_pr

Re: [PATCH 1/2] soc: fsl: make it explicitly non-modular

2016-11-17 Thread Sudeep Holla



On 16/11/16 16:39, Sudeep Holla wrote:

The Kconfig currently controlling compilation of this code is:

drivers/soc/fsl/Kconfig:config FSL_GUTS
drivers/soc/fsl/Kconfig:   bool

...meaning that it currently is not being built as a module by anyone.

Let's remove the modular code that is essentially orphaned, so that
when reading the driver there is no doubt it is builtin-only.

Since module_init was not in use by this code, the init ordering
remains unchanged with this commit.

Cc: Scott Wood <o...@buserror.net>
Cc: Yangbo Lu <yangbo...@nxp.com>
Cc: Arnd Bergmann <a...@arndb.de>
Signed-off-by: Sudeep Holla <sudeep.ho...@arm.com>
---


I saw Paul Gortmaker had sent similar patch on Nov 15, so drop/ignore this.

--
Regards,
Sudeep


Re: [PATCH 2/2] soc: fsl: fix section mismatch build warnings

2016-11-16 Thread Sudeep Holla



On 16/11/16 17:07, Arnd Bergmann wrote:

On Wednesday, November 16, 2016 4:39:27 PM CET Sudeep Holla wrote:

@@ -223,6 +222,7 @@ static struct platform_driver fsl_guts_driver = {

 static int __init fsl_guts_init(void)
 {
+   machine = of_flat_dt_get_machine_name();
return platform_driver_register(&fsl_guts_driver);
 }
 core_initcall(fsl_guts_init);


I think we simply need to use the normal DT API rather than the of_flat_* one.



I thought so, yes that will be better. I will respin accordingly.

--
Regards,
Sudeep


Re: [PATCH 1/2] soc: fsl: make it explicitly non-modular

2016-11-16 Thread Sudeep Holla


On 16/11/16 16:39, Sudeep Holla wrote:

The Kconfig currently controlling compilation of this code is:

drivers/soc/fsl/Kconfig:config FSL_GUTS
drivers/soc/fsl/Kconfig:   bool

...meaning that it currently is not being built as a module by anyone.

Let's remove the modular code that is essentially orphaned, so that
when reading the driver there is no doubt it is builtin-only.

Since module_init was not in use by this code, the init ordering
remains unchanged with this commit.



Sorry I forgot to append -next as these are applicable only to linux-next


Cc: Scott Wood <o...@buserror.net>
Cc: Yangbo Lu <yangbo...@nxp.com>
Cc: Arnd Bergmann <a...@arndb.de>
Signed-off-by: Sudeep Holla <sudeep.ho...@arm.com>
---
 drivers/soc/fsl/guts.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/drivers/soc/fsl/guts.c b/drivers/soc/fsl/guts.c
index 0ac88263c2d7..885409d84eb2 100644
--- a/drivers/soc/fsl/guts.c
+++ b/drivers/soc/fsl/guts.c
@@ -11,7 +11,6 @@

 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -212,7 +211,6 @@ static const struct of_device_id fsl_guts_of_match[] = {
{ .compatible = "fsl,ls2080a-dcfg", },
{}
 };
-MODULE_DEVICE_TABLE(of, fsl_guts_of_match);

 static struct platform_driver fsl_guts_driver = {
.driver = {
@@ -228,9 +226,3 @@ static int __init fsl_guts_init(void)
return platform_driver_register(&fsl_guts_driver);
 }
 core_initcall(fsl_guts_init);
-
-static void __exit fsl_guts_exit(void)
-{
-   platform_driver_unregister(&fsl_guts_driver);
-}
-module_exit(fsl_guts_exit);



--
Regards,
Sudeep


[PATCH 1/2] soc: fsl: make it explicitly non-modular

2016-11-16 Thread Sudeep Holla
The Kconfig currently controlling compilation of this code is:

drivers/soc/fsl/Kconfig:config FSL_GUTS
drivers/soc/fsl/Kconfig:   bool

...meaning that it currently is not being built as a module by anyone.

Let's remove the modular code that is essentially orphaned, so that
when reading the driver there is no doubt it is builtin-only.

Since module_init was not in use by this code, the init ordering
remains unchanged with this commit.

Cc: Scott Wood <o...@buserror.net>
Cc: Yangbo Lu <yangbo...@nxp.com>
Cc: Arnd Bergmann <a...@arndb.de>
Signed-off-by: Sudeep Holla <sudeep.ho...@arm.com>
---
 drivers/soc/fsl/guts.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/drivers/soc/fsl/guts.c b/drivers/soc/fsl/guts.c
index 0ac88263c2d7..885409d84eb2 100644
--- a/drivers/soc/fsl/guts.c
+++ b/drivers/soc/fsl/guts.c
@@ -11,7 +11,6 @@
 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -212,7 +211,6 @@ static const struct of_device_id fsl_guts_of_match[] = {
{ .compatible = "fsl,ls2080a-dcfg", },
{}
 };
-MODULE_DEVICE_TABLE(of, fsl_guts_of_match);
 
 static struct platform_driver fsl_guts_driver = {
.driver = {
@@ -228,9 +226,3 @@ static int __init fsl_guts_init(void)
return platform_driver_register(&fsl_guts_driver);
 }
 core_initcall(fsl_guts_init);
-
-static void __exit fsl_guts_exit(void)
-{
-   platform_driver_unregister(&fsl_guts_driver);
-}
-module_exit(fsl_guts_exit);
-- 
2.7.4



[PATCH 2/2] soc: fsl: fix section mismatch build warnings

2016-11-16 Thread Sudeep Holla
We get the following warning when the driver is compiled in:

WARNING: modpost: Found 1 section mismatch(es).
To see full details build your kernel with:
'make CONFIG_DEBUG_SECTION_MISMATCH=y'

With CONFIG_DEBUG_SECTION_MISMATCH enabled, the details are reported:

WARNING: vmlinux.o(.text+0x55d014): Section mismatch in reference from the
function fsl_guts_probe() to the function
.init.text:of_flat_dt_get_machine_name()
The function fsl_guts_probe() references
the function __init of_flat_dt_get_machine_name().
This is often because fsl_guts_probe lacks a __init
annotation or the annotation of of_flat_dt_get_machine_name is wrong.

This patch stashes the machine name during fsl_guts_init initcall to
fix the above warnings.

Cc: Scott Wood <o...@buserror.net>
Cc: Yangbo Lu <yangbo...@nxp.com>
Cc: Arnd Bergmann <a...@arndb.de>
Signed-off-by: Sudeep Holla <sudeep.ho...@arm.com>
---
 drivers/soc/fsl/guts.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/soc/fsl/guts.c b/drivers/soc/fsl/guts.c
index 885409d84eb2..5513a2b3448f 100644
--- a/drivers/soc/fsl/guts.c
+++ b/drivers/soc/fsl/guts.c
@@ -31,6 +31,7 @@ struct fsl_soc_die_attr {
 static struct guts *guts;
 static struct soc_device_attribute soc_dev_attr;
 static struct soc_device *soc_dev;
+static const char *machine;
 
 
 /* SoC die attribute definition for QorIQ platform */
@@ -135,7 +136,6 @@ static int fsl_guts_probe(struct platform_device *pdev)
struct device *dev = &pdev->dev;
struct resource *res;
const struct fsl_soc_die_attr *soc_die;
-   const char *machine;
u32 svr;
 
/* Initialize guts */
@@ -151,7 +151,6 @@ static int fsl_guts_probe(struct platform_device *pdev)
return PTR_ERR(guts->regs);
 
/* Register soc device */
-   machine = of_flat_dt_get_machine_name();
if (machine)
soc_dev_attr.machine = devm_kstrdup(dev, machine, GFP_KERNEL);
 
@@ -223,6 +222,7 @@ static struct platform_driver fsl_guts_driver = {
 
 static int __init fsl_guts_init(void)
 {
+   machine = of_flat_dt_get_machine_name();
return platform_driver_register(&fsl_guts_driver);
 }
 core_initcall(fsl_guts_init);
-- 
2.7.4



Re: [PATCH 04/17] powerpc: mpic: use IRQCHIP_SKIP_SET_WAKE instead of redundant mpic_irq_set_wake

2015-10-19 Thread Sudeep Holla

Hi Ben,

On 23/09/15 05:06, Scott Wood wrote:

On Mon, 2015-09-21 at 16:47 +0100, Sudeep Holla wrote:

mpic_irq_set_wake returns -ENXIO for non FSL MPIC and sets IRQF_NO_SUSPEND
flag for FSL ones. enable_irq_wake already returns -ENXIO if irq_set_wake
is not implemented. Also there's no need to set the IRQF_NO_SUSPEND flag
as it doesn't guarantee wakeup for that interrupt.

This patch removes the redundant mpic_irq_set_wake and sets the
IRQCHIP_SKIP_SET_WAKE for only FSL MPIC.

Cc: Benjamin Herrenschmidt <b...@kernel.crashing.org>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Scott Wood <scottw...@freescale.com>
Cc: Hongtao Jia <hongtao@freescale.com>
Cc: Marc Zyngier <marc.zyng...@arm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Sudeep Holla <sudeep.ho...@arm.com>
---
  arch/powerpc/sysdev/mpic.c | 23 ---
  1 file changed, 4 insertions(+), 19 deletions(-)


Acked-by: Scott Wood <scottw...@freescale.com>



Can you pick this up via your tree ?

--
Regards,
Sudeep
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH 04/17] powerpc: mpic: use IRQCHIP_SKIP_SET_WAKE instead of redundant mpic_irq_set_wake

2015-09-21 Thread Sudeep Holla
mpic_irq_set_wake returns -ENXIO for non FSL MPIC and sets IRQF_NO_SUSPEND
flag for FSL ones. enable_irq_wake already returns -ENXIO if irq_set_wake
is not implemented. Also there's no need to set the IRQF_NO_SUSPEND flag
as it doesn't guarantee wakeup for that interrupt.

This patch removes the redundant mpic_irq_set_wake and sets the
IRQCHIP_SKIP_SET_WAKE for only FSL MPIC.

Cc: Benjamin Herrenschmidt <b...@kernel.crashing.org>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Scott Wood <scottw...@freescale.com>
Cc: Hongtao Jia <hongtao@freescale.com>
Cc: Marc Zyngier <marc.zyng...@arm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Sudeep Holla <sudeep.ho...@arm.com>
---
 arch/powerpc/sysdev/mpic.c | 23 ---
 1 file changed, 4 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/sysdev/mpic.c b/arch/powerpc/sysdev/mpic.c
index 537e5db85a06..123e43612f0a 100644
--- a/arch/powerpc/sysdev/mpic.c
+++ b/arch/powerpc/sysdev/mpic.c
@@ -924,22 +924,6 @@ int mpic_set_irq_type(struct irq_data *d, unsigned int 
flow_type)
return IRQ_SET_MASK_OK_NOCOPY;
 }
 
-static int mpic_irq_set_wake(struct irq_data *d, unsigned int on)
-{
-   struct irq_desc *desc = container_of(d, struct irq_desc, irq_data);
-   struct mpic *mpic = mpic_from_irq_data(d);
-
-   if (!(mpic->flags & MPIC_FSL))
-   return -ENXIO;
-
-   if (on)
-   desc->action->flags |= IRQF_NO_SUSPEND;
-   else
-   desc->action->flags &= ~IRQF_NO_SUSPEND;
-
-   return 0;
-}
-
 void mpic_set_vector(unsigned int virq, unsigned int vector)
 {
struct mpic *mpic = mpic_from_irq(virq);
@@ -977,7 +961,6 @@ static struct irq_chip mpic_irq_chip = {
.irq_unmask = mpic_unmask_irq,
.irq_eoi= mpic_end_irq,
.irq_set_type   = mpic_set_irq_type,
-   .irq_set_wake   = mpic_irq_set_wake,
 };
 
 #ifdef CONFIG_SMP
@@ -992,7 +975,6 @@ static struct irq_chip mpic_tm_chip = {
.irq_mask   = mpic_mask_tm,
.irq_unmask = mpic_unmask_tm,
.irq_eoi= mpic_end_irq,
-   .irq_set_wake   = mpic_irq_set_wake,
 };
 
 #ifdef CONFIG_MPIC_U3_HT_IRQS
@@ -1283,8 +1265,11 @@ struct mpic * __init mpic_alloc(struct device_node *node,
flags |= MPIC_NO_RESET;
if (of_get_property(node, "single-cpu-affinity", NULL))
flags |= MPIC_SINGLE_DEST_CPU;
-   if (of_device_is_compatible(node, "fsl,mpic"))
+   if (of_device_is_compatible(node, "fsl,mpic")) {
flags |= MPIC_FSL | MPIC_LARGE_VECTORS;
+   mpic_irq_chip.flags |= IRQCHIP_SKIP_SET_WAKE;
+   mpic_tm_chip.flags |= IRQCHIP_SKIP_SET_WAKE;
+   }
 
mpic = kzalloc(sizeof(struct mpic), GFP_KERNEL);
if (mpic == NULL)
-- 
1.9.1
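
As a rough model of why dropping the callback is safe, the sketch below mimics the decision the commit message relies on: a chip flagged to skip set_wake reports success without calling anything, a chip with a callback defers to it, and a chip with neither yields -ENXIO. `struct model_chip`, `MODEL_SKIP_SET_WAKE` and `model_set_irq_wake()` are hypothetical userspace names, not the genirq API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the IRQCHIP_SKIP_SET_WAKE chip flag. */
#define MODEL_SKIP_SET_WAKE	0x1u

/* Hypothetical, stripped-down model of an irq chip. */
struct model_chip {
	unsigned int flags;
	int (*set_wake)(unsigned int on);	/* models .irq_set_wake */
};

/* Models how a wake request is resolved for a given chip. */
int model_set_irq_wake(const struct model_chip *chip, unsigned int on)
{
	assert(chip != NULL);

	if (chip->flags & MODEL_SKIP_SET_WAKE)
		return 0;		/* nothing to program, report success */

	if (chip->set_wake)
		return chip->set_wake(on);

	return -ENXIO;			/* no callback and no skip flag */
}
```

In this model the non-FSL case still gets -ENXIO with no callback installed, and the FSL case succeeds via the flag alone, matching the two behaviours the removed function provided.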


Re: [PATCH 0/3] cpuidle: updates related to tick_broadcast_enter() failures

2015-05-11 Thread Sudeep Holla



On 10/05/15 00:15, Rafael J. Wysocki wrote:

On Saturday, May 09, 2015 10:33:05 PM Rafael J. Wysocki wrote:

On Saturday, May 09, 2015 10:11:41 PM Rafael J. Wysocki wrote:

On Saturday, May 09, 2015 11:19:16 AM Preeti U Murthy wrote:

Hi Rafael,

On 05/08/2015 07:48 PM, Rafael J. Wysocki wrote:


[cut]



+   /* Take note of the planned idle state. */
+   idle_set_state(smp_processor_id(), target_state);


And I wouldn't do this either.

The behavior here is pretty much as though the driver demoted the state chosen
by the governor and we don't call idle_set_state() again in those cases.


Why is this wrong?


It is not wrong, but incomplete, because demotions done by the cpuidle driver
should also be taken into account in the same way.

But I'm seeing that the recent patch of mine that made cpuidle_enter_state()
call default_idle_call() was a mistake, because it might confuse 
find_idlest_cpu()
significantly as to what state the CPU is in.  I'll drop that one for now.


OK, done.

So after I've dropped it I think we need to do three things:
(1) Move the idle_set_state() calls to cpuidle_enter_state().
(2) Make cpuidle_enter_state() call default_idle_call() again, but this time
 do that *before* it has called idle_set_state() for target_state.
(3) Introduce demotion as per my last patch.

Let me cut patches for that.


Done as per the above and the patches follow in replies to this message.

All on top of the current linux-next branch of the linux-pm.git tree.



Tested on ARM Vexpress platforms with one of the CPUs in broadcast mode
and also with broadcast timer. So, you can add:

Tested-by: Sudeep Holla sudeep.ho...@arm.com

Regards,
Sudeep

Re: [PATCH V3] cpuidle: Handle tick_broadcast_enter() failure gracefully

2015-05-08 Thread Sudeep Holla



On 08/05/15 08:35, Preeti U Murthy wrote:

When a CPU has to enter an idle state where tick stops, it makes a call
to tick_broadcast_enter(). The call will fail if this CPU is the
broadcast CPU. Today, under such a circumstance, the arch cpuidle code
handles this CPU.  This is not convincing because not only do we not
know what the arch cpuidle code does, but we also do not account for the
idle state residency time and usage of such a CPU.

This scenario can be handled better by simply choosing an idle state
in which ticks do not stop. To accommodate this change, move the setting
of the runqueue idle state from the core to the cpuidle driver; otherwise
rq->idle_state will be set wrong.

Signed-off-by: Preeti U Murthy pre...@linux.vnet.ibm.com


I gave it a spin on ARM64 Juno platform with one of the CPUs in broadcast
mode and Vexpress TC2 with broadcast timer. I found no issues in both
the cases. So, you can add:

Tested-by: Sudeep Holla sudeep.ho...@arm.com

Regards,
Sudeep
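
For readers following the thread, the demotion idea in the quoted commit message can be sketched in userspace as below: when entering broadcast mode fails, fall back to the deepest state at or below the chosen one whose timer keeps running. `struct model_idle_state` and `model_demote_state()` are hypothetical names, and the states are assumed to be ordered from shallowest to deepest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an idle state; only the timer behaviour matters. */
struct model_idle_state {
	bool timer_stops;	/* models a state in which the local tick stops */
};

/*
 * Return the index of the deepest state at or below 'chosen' whose timer
 * keeps running, or -1 if there is none (or 'chosen' is out of range).
 * States are assumed to be sorted from shallowest (0) to deepest.
 */
int model_demote_state(const struct model_idle_state *states, size_t count,
		       size_t chosen)
{
	assert(states != NULL);

	if (chosen >= count)
		return -1;

	for (size_t i = chosen + 1; i-- > 0; ) {
		if (!states[i].timer_stops)
			return (int)i;
	}

	return -1;
}
```

A return of -1 corresponds to the case where no suitable state exists and the caller has to fall back to some default idle path.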

Re: [PATCH] cpuidle: Handle tick_broadcast_enter() failure gracefully

2015-05-07 Thread Sudeep Holla

Hi Preeti,

On 07/05/15 06:26, Preeti U Murthy wrote:

When a CPU has to enter an idle state where tick stops, it makes a call
to tick_broadcast_enter(). The call will fail if this CPU is the
broadcast CPU. Today, under such a circumstance, the arch cpuidle code
handles this CPU.  This is not convincing because not only are we not
aware what the arch cpuidle code does, but we also do not account for
the idle state residency time and usage of such a CPU.

This scenario can be handled better by simply asking the cpuidle
governor to choose an idle state in which ticks do not stop. To
accommodate this change, move the setting of the runqueue idle state from
the core to the cpuidle driver; otherwise rq->idle_state will be set wrong.

Signed-off-by: Preeti U Murthy pre...@linux.vnet.ibm.com
---
Based on linux-pm/bleeding-edge


I am unable to apply this patch cleanly on linux-pm/bleeding-edge.
I think it conflicts with a few patches that Rafael posted recently
which are in the branch now.

Regards,
Sudeep

Re: [RFC/RFT, RESEND] powerpc: move cacheinfo sysfs to generic cacheinfo infrastructure

2015-04-01 Thread Sudeep Holla



On 01/04/15 05:40, Michael Ellerman wrote:

On Tue, 2015-03-31 at 18:14 +0100, Sudeep Holla wrote:


On 31/03/15 11:56, Michael Ellerman wrote:

On Mon, 2015-23-02 at 18:18:20 UTC, Sudeep Holla wrote:

This patch removes the redundant sysfs cacheinfo code by reusing
the newly introduced generic cacheinfo infrastructure through the
commit 246246cbde5e (drivers: base: support cpu cache information
interface to userspace via sysfs)



Removing the include doesn't fix it, it needs cacheinfo_cpu_on/offline().



I agree. I had a quick look at that, and it requires some rework; I'm not
sure if that should be in generic code or ppc specific.


Yeah OK.

Also if I just remove the references from the suspend code, it still causes
changes to the result, some of which look wrong:

--- cpu0.before 2015-04-01 15:34:58.985470973 +1100
+++ cpu0.after-no-power 2015-04-01 15:36:31.313435304 +1100
@@ -3,22 +3,24 @@
  ./cpu0/cache/index0/level:1
  ./cpu0/cache/index0/number_of_sets:8
  ./cpu0/cache/index0/shared_cpu_map:,00ff
+./cpu0/cache/index0/shared_cpu_list:0-7- additional, OK
  ./cpu0/cache/index0/coherency_line_size:128
  ./cpu0/cache/index0/ways_of_associativity:64
-./cpu0/cache/index1/size:32K   - we lost the size of 
the Icache?
  ./cpu0/cache/index1/type:Instruction
  ./cpu0/cache/index1/level:1
-./cpu0/cache/index1/number_of_sets:4   }-.
-./cpu0/cache/index1/shared_cpu_map:,00ff .
-./cpu0/cache/index1/coherency_line_size:128  .   These changes are 
no good
-./cpu0/cache/index1/ways_of_associativity:64 .
+./cpu0/cache/index1/shared_cpu_map:, .
+./cpu0/cache/index1/shared_cpu_list:0-47   }-
  ./cpu0/cache/index2/size:512K
  ./cpu0/cache/index2/type:Unified
  ./cpu0/cache/index2/level:2
  ./cpu0/cache/index2/number_of_sets:8
  ./cpu0/cache/index2/shared_cpu_map:,00ff
+./cpu0/cache/index2/shared_cpu_list:0-7- additional, OK
+./cpu0/cache/index2/ways_of_associativity:0- this is new but 
wrong I think
  ./cpu0/cache/index3/size:8192K
  ./cpu0/cache/index3/type:Unified
  ./cpu0/cache/index3/level:3
  ./cpu0/cache/index3/number_of_sets:8
  ./cpu0/cache/index3/shared_cpu_map:,00ff
+./cpu0/cache/index3/shared_cpu_list:0-7
+./cpu0/cache/index3/ways_of_associativity:0- ditto



Thanks for the log. It's been a long time since I looked at this code.
It would be good to know if Anshuman had looked at this issue. If not
I will start looking at this in a couple of days.

Regards,
Sudeep

Re: [RFC/RFT, RESEND] powerpc: move cacheinfo sysfs to generic cacheinfo infrastructure

2015-03-31 Thread Sudeep Holla



On 31/03/15 11:56, Michael Ellerman wrote:

On Mon, 2015-23-02 at 18:18:20 UTC, Sudeep Holla wrote:

This patch removes the redundant sysfs cacheinfo code by reusing
the newly introduced generic cacheinfo infrastructure through the
commit 246246cbde5e (drivers: base: support cpu cache information
interface to userspace via sysfs)

Signed-off-by: Sudeep Holla sudeep.ho...@arm.com
Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
Cc: Paul Mackerras pau...@samba.org
Cc: Michael Ellerman m...@ellerman.id.au
Cc: Anshuman Khandual khand...@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
---
  arch/powerpc/kernel/cacheinfo.c | 811 +---
  arch/powerpc/kernel/cacheinfo.h |   8 -
  arch/powerpc/kernel/sysfs.c |  12 +-
  3 files changed, 91 insertions(+), 740 deletions(-)
  delete mode 100644 arch/powerpc/kernel/cacheinfo.h

Hi,

This patch is not tested. Last time Anshuman tested, he had seen issues.
The core driver has changed a lot after that. Since PPC depends a lot
on DT for cache information, there might be issues in the core later
which I could not identify with ARM/ARM64. It would be much appreciated
if someone could help me test and fix those so that PPC can migrate
to the new common cacheinfo infrastructure. This resend is rebased on v4.0-rc1.


Doesn't build for me.

   arch/powerpc/platforms/pseries/suspend.c:29:36: fatal error: ../../kernel/cacheinfo.h: No such file or directory
   #include "../../kernel/cacheinfo.h"



Right, sorry for missing this. I only tested corenet32_smp_defconfig.
Also, after some digging I found that this patch was written before
commit 6b36ba8492ab (powerpc/pseries: Update dynamic cache nodes for
suspend/resume operation). I had a quick look at that patch and I am
not sure if we can support that with the generic cacheinfo implementation.


Anshuman must have worked around that somehow to test it previously?



I think he tried before the above mentioned commit was introduced.


Removing the include doesn't fix it, it needs cacheinfo_cpu_on/offline().



I agree. I had a quick look at that, and it requires some rework; I am not
sure whether that should be in generic code or be ppc specific.

Regards,
Sudeep

[PATCH RFC/RFT][RESEND] powerpc: move cacheinfo sysfs to generic cacheinfo infrastructure

2015-02-23 Thread Sudeep Holla
This patch removes the redundant sysfs cacheinfo code by reusing
the newly introduced generic cacheinfo infrastructure through the
commit 246246cbde5e (drivers: base: support cpu cache information
interface to userspace via sysfs)

Signed-off-by: Sudeep Holla sudeep.ho...@arm.com
Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
Cc: Paul Mackerras pau...@samba.org
Cc: Michael Ellerman m...@ellerman.id.au
Cc: Anshuman Khandual khand...@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/cacheinfo.c | 811 +---
 arch/powerpc/kernel/cacheinfo.h |   8 -
 arch/powerpc/kernel/sysfs.c |  12 +-
 3 files changed, 91 insertions(+), 740 deletions(-)
 delete mode 100644 arch/powerpc/kernel/cacheinfo.h

Hi,

This patch is not tested. Last time Anshuman tested, he had seen issues.
The core driver has changed a lot after that. Since PPC depends a lot
on DT for cache information, there might be issues in the core later
which I could not identify with ARM/ARM64. It would be much appreciated
if someone could help me test and fix those so that PPC can migrate
to the new common cacheinfo infrastructure. This resend is rebased on v4.0-rc1.

Regards,
Sudeep

diff --git a/arch/powerpc/kernel/cacheinfo.c b/arch/powerpc/kernel/cacheinfo.c
index ae77b7e59889..6845eb7fcc18 100644
--- a/arch/powerpc/kernel/cacheinfo.c
+++ b/arch/powerpc/kernel/cacheinfo.c
@@ -10,38 +10,10 @@
  * 2 as published by the Free Software Foundation.
  */
 
+#include <linux/cacheinfo.h>
 #include <linux/cpu.h>
-#include <linux/cpumask.h>
 #include <linux/kernel.h>
-#include <linux/kobject.h>
-#include <linux/list.h>
-#include <linux/notifier.h>
 #include <linux/of.h>
-#include <linux/percpu.h>
-#include <linux/slab.h>
-#include <asm/prom.h>
-
-#include "cacheinfo.h"
-
-/* per-cpu object for tracking:
- * - a cache kobject for the top-level directory
- * - a list of index objects representing the cpu's local cache hierarchy
- */
-struct cache_dir {
-   struct kobject *kobj; /* bare (not embedded) kobject for cache
-  * directory */
-   struct cache_index_dir *index; /* list of index objects */
-};
-
-/* index object: each cpu's cache directory has an index
- * subdirectory corresponding to a cache object associated with the
- * cpu.  This object's lifetime is managed via the embedded kobject.
- */
-struct cache_index_dir {
-   struct kobject kobj;
-   struct cache_index_dir *next; /* next index in parent directory */
-   struct cache *cache;
-};
 
 /* Template for determining which OF properties to query for a given
  * cache type */
@@ -60,11 +32,6 @@ struct cache_type_info {
const char *nr_sets_prop;
 };
 
-/* These are used to index the cache_type_info array. */
-#define CACHE_TYPE_UNIFIED     0
-#define CACHE_TYPE_INSTRUCTION 1
-#define CACHE_TYPE_DATA        2
-
 static const struct cache_type_info cache_type_info[] = {
{
/* PowerPC Processor binding says the [di]-cache-*
@@ -92,231 +59,82 @@ static const struct cache_type_info cache_type_info[] = {
},
 };
 
-/* Cache object: each instance of this corresponds to a distinct cache
- * in the system.  There are separate objects for Harvard caches: one
- * each for instruction and data, and each refers to the same OF node.
- * The refcount of the OF node is elevated for the lifetime of the
- * cache object.  A cache object is released when its shared_cpu_map
- * is cleared (see cache_cpu_clear).
- *
- * A cache object is on two lists: an unsorted global list
- * (cache_list) of cache objects; and a singly-linked list
- * representing the local cache hierarchy, which is ordered by level
- * (e.g. L1d -> L1i -> L2 -> L3).
- */
-struct cache {
-   struct device_node *ofnode;    /* OF node for this cache, may be cpu */
-   struct cpumask shared_cpu_map; /* online CPUs using this cache */
-   int type;                      /* split cache disambiguation */
-   int level;                     /* level not explicit in device tree */
-   struct list_head list;         /* global list of cache objects */
-   struct cache *next_local;      /* next cache of >= level */
-};
-
-static DEFINE_PER_CPU(struct cache_dir *, cache_dir_pcpu);
-
-/* traversal/modification of this list occurs only at cpu hotplug time;
- * access is serialized by cpu hotplug locking
- */
-static LIST_HEAD(cache_list);
-
-static struct cache_index_dir *kobj_to_cache_index_dir(struct kobject *k)
-{
-   return container_of(k, struct cache_index_dir, kobj);
-}
-
-static const char *cache_type_string(const struct cache *cache)
+static inline int get_cacheinfo_idx(enum cache_type type)
 {
-   return cache_type_info[cache->type].name;
-}
-
-static void cache_init(struct cache *cache, int type, int level,
-  struct device_node *ofnode)
-{
-   cache->type = type;
-   cache->level = level;
-   cache->ofnode = of_node_get(ofnode);
-   INIT_LIST_HEAD(&cache->list

[PATCH RFT/RFC] powerpc: move cacheinfo sysfs to generic cacheinfo infrastructure

2015-01-08 Thread Sudeep Holla
This patch removes the redundant sysfs cacheinfo code by reusing
the newly introduced generic cacheinfo infrastructure through the
commit 246246cbde5e (drivers: base: support cpu cache information
interface to userspace via sysfs)

Signed-off-by: Sudeep Holla sudeep.ho...@arm.com
Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
Cc: Paul Mackerras pau...@samba.org
Cc: Michael Ellerman m...@ellerman.id.au
Cc: Anshuman Khandual khand...@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/cacheinfo.c | 812 +---
 arch/powerpc/kernel/cacheinfo.h |   8 -
 arch/powerpc/kernel/sysfs.c |  12 +-
 3 files changed, 90 insertions(+), 742 deletions(-)
 delete mode 100644 arch/powerpc/kernel/cacheinfo.h

Hi,

This patch is not tested. Last time Anshuman tested, he had seen issues.
The core driver has changed a lot after that. Since PPC depends a lot
on DT for cache information, there might be issues in the core later
which I could not identify with ARM/ARM64. It would be much appreciated
if someone could help me test and fix those so that PPC can migrate
to the new common cacheinfo infrastructure.

Regards,
Sudeep

diff --git a/arch/powerpc/kernel/cacheinfo.c b/arch/powerpc/kernel/cacheinfo.c
index 40198d50b4c2..6845eb7fcc18 100644
--- a/arch/powerpc/kernel/cacheinfo.c
+++ b/arch/powerpc/kernel/cacheinfo.c
@@ -10,38 +10,10 @@
  * 2 as published by the Free Software Foundation.
  */
 
+#include <linux/cacheinfo.h>
 #include <linux/cpu.h>
-#include <linux/cpumask.h>
 #include <linux/kernel.h>
-#include <linux/kobject.h>
-#include <linux/list.h>
-#include <linux/notifier.h>
 #include <linux/of.h>
-#include <linux/percpu.h>
-#include <linux/slab.h>
-#include <asm/prom.h>
-
-#include "cacheinfo.h"
-
-/* per-cpu object for tracking:
- * - a cache kobject for the top-level directory
- * - a list of index objects representing the cpu's local cache hierarchy
- */
-struct cache_dir {
-   struct kobject *kobj; /* bare (not embedded) kobject for cache
-  * directory */
-   struct cache_index_dir *index; /* list of index objects */
-};
-
-/* index object: each cpu's cache directory has an index
- * subdirectory corresponding to a cache object associated with the
- * cpu.  This object's lifetime is managed via the embedded kobject.
- */
-struct cache_index_dir {
-   struct kobject kobj;
-   struct cache_index_dir *next; /* next index in parent directory */
-   struct cache *cache;
-};
 
 /* Template for determining which OF properties to query for a given
  * cache type */
@@ -60,11 +32,6 @@ struct cache_type_info {
const char *nr_sets_prop;
 };
 
-/* These are used to index the cache_type_info array. */
-#define CACHE_TYPE_UNIFIED     0
-#define CACHE_TYPE_INSTRUCTION 1
-#define CACHE_TYPE_DATA        2
-
 static const struct cache_type_info cache_type_info[] = {
{
/* PowerPC Processor binding says the [di]-cache-*
@@ -92,231 +59,82 @@ static const struct cache_type_info cache_type_info[] = {
},
 };
 
-/* Cache object: each instance of this corresponds to a distinct cache
- * in the system.  There are separate objects for Harvard caches: one
- * each for instruction and data, and each refers to the same OF node.
- * The refcount of the OF node is elevated for the lifetime of the
- * cache object.  A cache object is released when its shared_cpu_map
- * is cleared (see cache_cpu_clear).
- *
- * A cache object is on two lists: an unsorted global list
- * (cache_list) of cache objects; and a singly-linked list
- * representing the local cache hierarchy, which is ordered by level
- * (e.g. L1d -> L1i -> L2 -> L3).
- */
-struct cache {
-   struct device_node *ofnode;    /* OF node for this cache, may be cpu */
-   struct cpumask shared_cpu_map; /* online CPUs using this cache */
-   int type;                      /* split cache disambiguation */
-   int level;                     /* level not explicit in device tree */
-   struct list_head list;         /* global list of cache objects */
-   struct cache *next_local;      /* next cache of >= level */
-};
-
-static DEFINE_PER_CPU(struct cache_dir *, cache_dir_pcpu);
-
-/* traversal/modification of this list occurs only at cpu hotplug time;
- * access is serialized by cpu hotplug locking
- */
-static LIST_HEAD(cache_list);
-
-static struct cache_index_dir *kobj_to_cache_index_dir(struct kobject *k)
-{
-   return container_of(k, struct cache_index_dir, kobj);
-}
-
-static const char *cache_type_string(const struct cache *cache)
+static inline int get_cacheinfo_idx(enum cache_type type)
 {
-   return cache_type_info[cache->type].name;
-}
-
-static void cache_init(struct cache *cache, int type, int level,
-  struct device_node *ofnode)
-{
-   cache->type = type;
-   cache->level = level;
-   cache->ofnode = of_node_get(ofnode);
-   INIT_LIST_HEAD(&cache->list);
-   list_add(&cache->list, &cache_list

[PATCH v5 08/11] powerpc: move cacheinfo sysfs to generic cacheinfo infrastructure

2014-09-30 Thread Sudeep Holla
This patch removes the redundant sysfs cacheinfo code by making use of
the newly introduced generic cacheinfo infrastructure.

Signed-off-by: Sudeep Holla sudeep.ho...@arm.com
Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
Cc: Paul Mackerras pau...@samba.org
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/cacheinfo.c | 812 +---
 arch/powerpc/kernel/cacheinfo.h |   8 -
 arch/powerpc/kernel/sysfs.c |  12 +-
 3 files changed, 90 insertions(+), 742 deletions(-)
 delete mode 100644 arch/powerpc/kernel/cacheinfo.h

diff --git a/arch/powerpc/kernel/cacheinfo.c b/arch/powerpc/kernel/cacheinfo.c
index 40198d50b4c2..6845eb7fcc18 100644
--- a/arch/powerpc/kernel/cacheinfo.c
+++ b/arch/powerpc/kernel/cacheinfo.c
@@ -10,38 +10,10 @@
  * 2 as published by the Free Software Foundation.
  */
 
+#include <linux/cacheinfo.h>
 #include <linux/cpu.h>
-#include <linux/cpumask.h>
 #include <linux/kernel.h>
-#include <linux/kobject.h>
-#include <linux/list.h>
-#include <linux/notifier.h>
 #include <linux/of.h>
-#include <linux/percpu.h>
-#include <linux/slab.h>
-#include <asm/prom.h>
-
-#include "cacheinfo.h"
-
-/* per-cpu object for tracking:
- * - a cache kobject for the top-level directory
- * - a list of index objects representing the cpu's local cache hierarchy
- */
-struct cache_dir {
-   struct kobject *kobj; /* bare (not embedded) kobject for cache
-  * directory */
-   struct cache_index_dir *index; /* list of index objects */
-};
-
-/* index object: each cpu's cache directory has an index
- * subdirectory corresponding to a cache object associated with the
- * cpu.  This object's lifetime is managed via the embedded kobject.
- */
-struct cache_index_dir {
-   struct kobject kobj;
-   struct cache_index_dir *next; /* next index in parent directory */
-   struct cache *cache;
-};
 
 /* Template for determining which OF properties to query for a given
  * cache type */
@@ -60,11 +32,6 @@ struct cache_type_info {
const char *nr_sets_prop;
 };
 
-/* These are used to index the cache_type_info array. */
-#define CACHE_TYPE_UNIFIED     0
-#define CACHE_TYPE_INSTRUCTION 1
-#define CACHE_TYPE_DATA        2
-
 static const struct cache_type_info cache_type_info[] = {
{
/* PowerPC Processor binding says the [di]-cache-*
@@ -92,231 +59,82 @@ static const struct cache_type_info cache_type_info[] = {
},
 };
 
-/* Cache object: each instance of this corresponds to a distinct cache
- * in the system.  There are separate objects for Harvard caches: one
- * each for instruction and data, and each refers to the same OF node.
- * The refcount of the OF node is elevated for the lifetime of the
- * cache object.  A cache object is released when its shared_cpu_map
- * is cleared (see cache_cpu_clear).
- *
- * A cache object is on two lists: an unsorted global list
- * (cache_list) of cache objects; and a singly-linked list
- * representing the local cache hierarchy, which is ordered by level
- * (e.g. L1d -> L1i -> L2 -> L3).
- */
-struct cache {
-   struct device_node *ofnode;    /* OF node for this cache, may be cpu */
-   struct cpumask shared_cpu_map; /* online CPUs using this cache */
-   int type;                      /* split cache disambiguation */
-   int level;                     /* level not explicit in device tree */
-   struct list_head list;         /* global list of cache objects */
-   struct cache *next_local;      /* next cache of >= level */
-};
-
-static DEFINE_PER_CPU(struct cache_dir *, cache_dir_pcpu);
-
-/* traversal/modification of this list occurs only at cpu hotplug time;
- * access is serialized by cpu hotplug locking
- */
-static LIST_HEAD(cache_list);
-
-static struct cache_index_dir *kobj_to_cache_index_dir(struct kobject *k)
-{
-   return container_of(k, struct cache_index_dir, kobj);
-}
-
-static const char *cache_type_string(const struct cache *cache)
+static inline int get_cacheinfo_idx(enum cache_type type)
 {
-   return cache_type_info[cache->type].name;
-}
-
-static void cache_init(struct cache *cache, int type, int level,
-  struct device_node *ofnode)
-{
-   cache->type = type;
-   cache->level = level;
-   cache->ofnode = of_node_get(ofnode);
-   INIT_LIST_HEAD(&cache->list);
-   list_add(&cache->list, &cache_list);
-}
-
-static struct cache *new_cache(int type, int level, struct device_node *ofnode)
-{
-   struct cache *cache;
-
-   cache = kzalloc(sizeof(*cache), GFP_KERNEL);
-   if (cache)
-   cache_init(cache, type, level, ofnode);
-
-   return cache;
-}
-
-static void release_cache_debugcheck(struct cache *cache)
-{
-   struct cache *iter;
-
-   list_for_each_entry(iter, &cache_list, list)
-   WARN_ONCE(iter->next_local == cache,
- "cache for %s(%s) refers to cache for %s(%s)\n",
- iter->ofnode->full_name

Re: [PATCH v4 04/11] drivers: base: support cpu cache information interface to userspace via sysfs

2014-09-30 Thread Sudeep Holla

Hi Greg,

On 24/09/14 07:35, Greg Kroah-Hartman wrote:

On Wed, Sep 17, 2014 at 12:00:48PM -0700, Greg Kroah-Hartman wrote:

On Wed, Sep 17, 2014 at 06:25:10PM +0100, Sudeep Holla wrote:

Hi Greg,

On 03/09/14 18:00, Sudeep Holla wrote:


[...]


Can you review the first 4 patches in this series, please?


It's in my todo queue, which is really long at the moment due to me
going to conferences (at one right now...)  Will be working on this
soon, thanks for your patience.


Based on the review comments, I think you are going to change at least
the first patch, right?  Please resend the latest version of this
series, with all of the accumulated tested-by and acked lines and
resend.



I have posted the new version as you suggested. I was holding off
assuming the merge window would open this week and hence the delay.

Regards,
Sudeep


[PATCH v5 04/11] drivers: base: support cpu cache information interface to userspace via sysfs

2014-09-30 Thread Sudeep Holla
This patch adds initial support for providing processor cache information
to userspace through the sysfs interface. It is based on the already existing
implementations (x86, ia64, s390 and powerpc), and hence the interface is
intended to be fully compatible.

The main purpose of this generic support is to avoid further code
duplication to support new architectures and also to unify all the existing
different implementations.

This implementation maintains the hierarchy of cache objects which reflects
the system's cache topology. Cache devices are instantiated as needed as
CPUs come online. The cache information is replicated per-cpu even if the
caches are shared. A per-cpu array of cache information is maintained, mainly
for sysfs-related bookkeeping.

It also implements the shared_cpu_map attribute, which is essential for
enabling both kernel and user-space to discover the system's overall cache
topology.

This patch also adds the missing ABI documentation for the cacheinfo sysfs
interface, which is already well defined and widely used.

Signed-off-by: Sudeep Holla sudeep.ho...@arm.com
Reviewed-by: Stephen Boyd sb...@codeaurora.org
Tested-by: Stephen Boyd sb...@codeaurora.org
Cc: Greg Kroah-Hartman gre...@linuxfoundation.org
Cc: linux-...@vger.kernel.org
Cc: linux...@de.ibm.com
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-i...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s...@vger.kernel.org
Cc: x...@kernel.org
---
 Documentation/ABI/testing/sysfs-devices-system-cpu |  47 ++
 drivers/base/Makefile  |   2 +-
 drivers/base/cacheinfo.c   | 541 +
 include/linux/cacheinfo.h  | 100 
 4 files changed, 689 insertions(+), 1 deletion(-)
 create mode 100644 drivers/base/cacheinfo.c
 create mode 100644 include/linux/cacheinfo.h

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index acb9bfc89b48..99983e67c13c 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -224,3 +224,50 @@ Description:   Parameters for the Intel P-state driver
frequency range.
 
More details can be found in Documentation/cpu-freq/intel-pstate.txt
+
+What:  /sys/devices/system/cpu/cpu*/cache/index*/set_of_attributes_mentioned_below
+Date:  July 2014 (documented, existed before August 2008)
+Contact:   Sudeep Holla sudeep.ho...@arm.com
+   Linux kernel mailing list linux-ker...@vger.kernel.org
+Description:   Parameters for the CPU cache attributes
+
+   allocation_policy:
+   - WriteAllocate: allocate a memory location to a cache line
+on a cache miss because of a write
+   - ReadAllocate: allocate a memory location to a cache line
+   on a cache miss because of a read
+   - ReadWriteAllocate: both writeallocate and readallocate
+
+   attributes: LEGACY used only on IA64 and is same as write_policy
+
+   coherency_line_size: the minimum amount of data in bytes that gets
+transferred from memory to cache
+
+   level: the cache hierarchy in the multi-level cache configuration
+
+   number_of_sets: total number of sets in the cache, a set is a
+   collection of cache lines with the same cache index
+
+   physical_line_partition: number of physical cache lines per cache tag
+
+   shared_cpu_list: the list of logical cpus sharing the cache
+
+   shared_cpu_map: logical cpu mask containing the list of cpus sharing
+   the cache
+
+   size: the total cache size in kB
+
+   type:
+   - Instruction: cache that only holds instructions
+   - Data: cache that only caches data
+   - Unified: cache that holds both data and instructions
+
+   ways_of_associativity: degree of freedom in placing a particular block
+   of memory in the cache
+
+   write_policy:
+   - WriteThrough: data is written to both the cache line
+   and to the block in the lower-level memory
+   - WriteBack: data is written only to the cache line and
+the modified cache line is written to main
+memory only when it is replaced
diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 4aab26ec0292..f901bc1cffc8 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -4,7 +4,7 @@ obj-y   := component.o core.o bus.o dd.o 
syscore.o
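
The shared_cpu_list and shared_cpu_map attributes documented in this patch describe the same set of CPUs in two notations: a range list such as 0-7, and a hex mask such as 00ff. As a rough illustration of how a userspace tool might turn the list form into a mask, here is a minimal sketch. The function name cpulist_to_mask is hypothetical (it is not a kernel or libc API), and the 64-CPU limit is an assumption made to keep the example short.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userspace helper: parse a cpulist string such as "0-7" or
 * "0-3,8-11" (optionally newline-terminated) into a 64-bit mask.
 * Returns 0 on success, -1 on malformed input or CPU numbers >= 64.
 */
int cpulist_to_mask(const char *s, uint64_t *mask)
{
	uint64_t m = 0;

	while (*s && *s != '\n') {
		char *end;
		unsigned long lo, hi, cpu;

		if (*s < '0' || *s > '9')
			return -1;
		lo = strtoul(s, &end, 10);
		hi = lo;
		if (*end == '-') {
			/* a range "lo-hi" needs a number after the dash */
			s = end + 1;
			if (*s < '0' || *s > '9')
				return -1;
			hi = strtoul(s, &end, 10);
		}
		if (hi < lo || hi >= 64)
			return -1;
		for (cpu = lo; cpu <= hi; cpu++)
			m |= (uint64_t)1 << cpu;
		if (*end == ',') {
			/* a comma must be followed by another entry */
			end++;
			if (*end < '0' || *end > '9')
				return -1;
		} else if (*end != '\0' && *end != '\n') {
			return -1;
		}
		s = end;
	}
	*mask = m;
	return 0;
}
```

With this sketch, the string "0-7" yields the mask 0xff, which corresponds to the 00ff tail of the shared_cpu_map values seen elsewhere in this thread.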

Re: [PATCH v4 04/11] drivers: base: support cpu cache information interface to userspace via sysfs

2014-09-22 Thread Sudeep Holla

Hi Stephen,

On 19/09/14 23:24, Stephen Boyd wrote:

On 09/03/14 10:00, Sudeep Holla wrote:

From: Sudeep Holla sudeep.ho...@arm.com

This patch adds initial support for providing processor cache information
to userspace through sysfs interface. This is based on already existing
implementations(x86, ia64, s390 and powerpc) and hence the interface is
intended to be fully compatible.

The main purpose of this generic support is to avoid further code
duplication to support new architectures and also to unify all the existing
different implementations.

This implementation maintains the hierarchy of cache objects which reflects
the system's cache topology. Cache devices are instantiated as needed as
CPUs come online. The cache information is replicated per-cpu even if the
caches are shared. A per-cpu array of cache information is maintained, mainly
for sysfs-related bookkeeping.

It also implements the shared_cpu_map attribute, which is essential for
enabling both kernel and user-space to discover the system's overall cache
topology.

This patch also adds the missing ABI documentation for the cacheinfo sysfs
interface, which is already well defined and widely used.

Signed-off-by: Sudeep Holla sudeep.ho...@arm.com
Cc: Greg Kroah-Hartman gre...@linuxfoundation.org
Cc: Stephen Boyd sb...@codeaurora.org
Cc: linux-...@vger.kernel.org
Cc: linux...@de.ibm.com
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-i...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s...@vger.kernel.org
Cc: x...@kernel.org



Reviewed-by: Stephen Boyd sb...@codeaurora.org
Tested-by: Stephen Boyd sb...@codeaurora.org



Thanks for all the reviews and testing of the series.

Regards,
Sudeep


Re: [PATCH v4 04/11] drivers: base: support cpu cache information interface to userspace via sysfs

2014-09-17 Thread Sudeep Holla

Hi Greg,

On 03/09/14 18:00, Sudeep Holla wrote:

From: Sudeep Holla sudeep.ho...@arm.com

This patch adds initial support for providing processor cache information
to userspace through sysfs interface. This is based on already existing
implementations(x86, ia64, s390 and powerpc) and hence the interface is
intended to be fully compatible.

The main purpose of this generic support is to avoid further code
duplication to support new architectures and also to unify all the existing
different implementations.

This implementation maintains the hierarchy of cache objects which reflects
the system's cache topology. Cache devices are instantiated as needed as
CPUs come online. The cache information is replicated per-cpu even if the
caches are shared. A per-cpu array of cache information is maintained, mainly
for sysfs-related bookkeeping.

It also implements the shared_cpu_map attribute, which is essential for
enabling both kernel and user-space to discover the system's overall cache
topology.

This patch also adds the missing ABI documentation for the cacheinfo sysfs
interface, which is already well defined and widely used.



Can you review the first 4 patches in this series, please?

Regards,
Sudeep


[PATCH v4 08/11] powerpc: move cacheinfo sysfs to generic cacheinfo infrastructure

2014-09-03 Thread Sudeep Holla
From: Sudeep Holla sudeep.ho...@arm.com

This patch removes the redundant sysfs cacheinfo code by making use of
the newly introduced generic cacheinfo infrastructure.

Signed-off-by: Sudeep Holla sudeep.ho...@arm.com
Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
Cc: Paul Mackerras pau...@samba.org
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/cacheinfo.c | 812 +---
 arch/powerpc/kernel/cacheinfo.h |   8 -
 arch/powerpc/kernel/sysfs.c |  12 +-
 3 files changed, 90 insertions(+), 742 deletions(-)
 delete mode 100644 arch/powerpc/kernel/cacheinfo.h

diff --git a/arch/powerpc/kernel/cacheinfo.c b/arch/powerpc/kernel/cacheinfo.c
index 40198d50b4c2..6845eb7fcc18 100644
--- a/arch/powerpc/kernel/cacheinfo.c
+++ b/arch/powerpc/kernel/cacheinfo.c
@@ -10,38 +10,10 @@
  * 2 as published by the Free Software Foundation.
  */
 
+#include <linux/cacheinfo.h>
 #include <linux/cpu.h>
-#include <linux/cpumask.h>
 #include <linux/kernel.h>
-#include <linux/kobject.h>
-#include <linux/list.h>
-#include <linux/notifier.h>
 #include <linux/of.h>
-#include <linux/percpu.h>
-#include <linux/slab.h>
-#include <asm/prom.h>
-
-#include "cacheinfo.h"
-
-/* per-cpu object for tracking:
- * - a cache kobject for the top-level directory
- * - a list of index objects representing the cpu's local cache hierarchy
- */
-struct cache_dir {
-   struct kobject *kobj; /* bare (not embedded) kobject for cache
-  * directory */
-   struct cache_index_dir *index; /* list of index objects */
-};
-
-/* index object: each cpu's cache directory has an index
- * subdirectory corresponding to a cache object associated with the
- * cpu.  This object's lifetime is managed via the embedded kobject.
- */
-struct cache_index_dir {
-   struct kobject kobj;
-   struct cache_index_dir *next; /* next index in parent directory */
-   struct cache *cache;
-};
 
 /* Template for determining which OF properties to query for a given
  * cache type */
@@ -60,11 +32,6 @@ struct cache_type_info {
const char *nr_sets_prop;
 };
 
-/* These are used to index the cache_type_info array. */
-#define CACHE_TYPE_UNIFIED     0
-#define CACHE_TYPE_INSTRUCTION 1
-#define CACHE_TYPE_DATA        2
-
 static const struct cache_type_info cache_type_info[] = {
{
/* PowerPC Processor binding says the [di]-cache-*
@@ -92,231 +59,82 @@ static const struct cache_type_info cache_type_info[] = {
},
 };
 
-/* Cache object: each instance of this corresponds to a distinct cache
- * in the system.  There are separate objects for Harvard caches: one
- * each for instruction and data, and each refers to the same OF node.
- * The refcount of the OF node is elevated for the lifetime of the
- * cache object.  A cache object is released when its shared_cpu_map
- * is cleared (see cache_cpu_clear).
- *
- * A cache object is on two lists: an unsorted global list
- * (cache_list) of cache objects; and a singly-linked list
- * representing the local cache hierarchy, which is ordered by level
- * (e.g. L1d -> L1i -> L2 -> L3).
- */
-struct cache {
-   struct device_node *ofnode;    /* OF node for this cache, may be cpu */
-   struct cpumask shared_cpu_map; /* online CPUs using this cache */
-   int type;                      /* split cache disambiguation */
-   int level;                     /* level not explicit in device tree */
-   struct list_head list;         /* global list of cache objects */
-   struct cache *next_local;      /* next cache of >= level */
-};
-
-static DEFINE_PER_CPU(struct cache_dir *, cache_dir_pcpu);
-
-/* traversal/modification of this list occurs only at cpu hotplug time;
- * access is serialized by cpu hotplug locking
- */
-static LIST_HEAD(cache_list);
-
-static struct cache_index_dir *kobj_to_cache_index_dir(struct kobject *k)
-{
-   return container_of(k, struct cache_index_dir, kobj);
-}
-
-static const char *cache_type_string(const struct cache *cache)
+static inline int get_cacheinfo_idx(enum cache_type type)
 {
-   return cache_type_info[cache->type].name;
-}
-
-static void cache_init(struct cache *cache, int type, int level,
-  struct device_node *ofnode)
-{
-   cache->type = type;
-   cache->level = level;
-   cache->ofnode = of_node_get(ofnode);
-   INIT_LIST_HEAD(&cache->list);
-   list_add(&cache->list, &cache_list);
-}
-
-static struct cache *new_cache(int type, int level, struct device_node *ofnode)
-{
-   struct cache *cache;
-
-   cache = kzalloc(sizeof(*cache), GFP_KERNEL);
-   if (cache)
-   cache_init(cache, type, level, ofnode);
-
-   return cache;
-}
-
-static void release_cache_debugcheck(struct cache *cache)
-{
-   struct cache *iter;
-
-   list_for_each_entry(iter, &cache_list, list)
-   WARN_ONCE(iter->next_local == cache,
- "cache for %s(%s) refers to cache for %s(%s)\n",
- iter

[PATCH v4 00/11] drivers: cacheinfo support

2014-09-03 Thread Sudeep Holla
From: Sudeep Holla sudeep.ho...@arm.com

This series adds generic cacheinfo support, similar to the existing topology
support. The implementation is based on the x86 cacheinfo support. Currently
x86, powerpc, ia64 and s390 each have their own implementation. While adding
similar support to ARM and ARM64, this is an attempt to make it generic, much
like the topology info support. It also adds the missing ABI documentation
for the cacheinfo sysfs interface, which is already in use.
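
For readers unfamiliar with the sysfs layout referred to above, each attribute is a small text file under /sys/devices/system/cpu/cpu*/cache/index*/. The following is a minimal userspace sketch of reading one attribute; the helper names cache_attr_path and read_cache_attr are hypothetical and only illustrate the path layout, they are not part of this series.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the sysfs path of one cache attribute.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int cache_attr_path(char *buf, size_t len, unsigned int cpu,
                    unsigned int index, const char *attr)
{
	int n = snprintf(buf, len,
			 "/sys/devices/system/cpu/cpu%u/cache/index%u/%s",
			 cpu, index, attr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: read the first line of a cache attribute into out,
 * without the trailing newline. Returns 0 on success, -1 on any error
 * (for example when the attribute does not exist on this system).
 */
int read_cache_attr(unsigned int cpu, unsigned int index, const char *attr,
                    char *out, size_t len)
{
	char path[128];
	FILE *f;

	if (len == 0 || cache_attr_path(path, sizeof(path), cpu, index, attr))
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(out, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	out[strcspn(out, "\n")] = '\0';
	return 0;
}
```

A caller would use something like read_cache_attr(0, 1, "type", buf, sizeof(buf)) and treat a -1 return as "attribute not available", since not every architecture exposes every attribute.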

It moves all the existing implementations on x86, ia64, powerpc
and s390 to the generic cacheinfo infrastructure introduced here.
The changes on non-ARM platforms are only compile-tested, apart from x86,
where they were also tested.

This series also adds support for ARM and ARM64 architectures based on
the generic support.

The code can be fetched from:
 git://linux-arm.org/linux-skn cacheinfo

Changes v3->v4:
- since userspace tools can't handle class and bus with same name,
  removed creating new cpu class and reused existing cpu bus with
  new cpu_device_create function
- (no changes in the arch specific port)

Changes v2->v3:
- Added {allocation,write}_policy instead of single attributes sysfs
  (attributes retained on ia64 privately as it was used only on that)
- factored out show_cpumap into separate helper in cpumask.h
- populate cpu_{map,list} for non-DT system if they are not populated
  by arch specific callbacks
- removed use of sysfs *_show callback in cache_attrs_is_visible
- all the review comments from Stephen Boyd implemented

Changes v1->v2:
- removed custom device_{add,remove}_attrs, using is_visible callback
  instead (suggested by GregKH)
- arm64: changes as per MarkR review comments
- Moved smp_call_function_single to architectures using it (arm, arm64,
  x86) (suggested by Stephen Boyd)
- arm (mostly changes as per RMK's review comments)
- fixed to allow v7 + v6 build
- l2 cache changes to remove extra structure
- populated CTR for few StrongARM CPU's not implementing CTR
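
The v2->v3 notes above mention factoring out a helper that prints a CPU mask in both the map and the list form. To illustrate what the list form looks like, here is a minimal userspace sketch that formats a 64-bit mask as a range list. The name mask_to_cpulist and the 64-CPU limit are assumptions for this example; this is not the kernel helper from the series.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical userspace helper: format the set bits of mask as a cpulist
 * string, e.g. 0xff -> "0-7", 0xf0f -> "0-3,8-11".
 * Returns 0 on success, -1 if buf is too small.
 */
int mask_to_cpulist(uint64_t mask, char *buf, size_t len)
{
	size_t used = 0;
	int cpu = 0;

	if (len == 0)
		return -1;
	buf[0] = '\0';

	while (cpu < 64) {
		int first, n;

		if (!((mask >> cpu) & 1)) {
			cpu++;
			continue;
		}
		/* extend the run while the next bit is also set */
		first = cpu;
		while (cpu + 1 < 64 && ((mask >> (cpu + 1)) & 1))
			cpu++;

		if (first == cpu)
			n = snprintf(buf + used, len - used, "%s%d",
				     used ? "," : "", first);
		else
			n = snprintf(buf + used, len - used, "%s%d-%d",
				     used ? "," : "", first, cpu);
		if (n < 0 || (size_t)n >= len - used)
			return -1;
		used += (size_t)n;
		cpu++;
	}
	return 0;
}
```

An empty mask produces an empty string, and isolated CPUs are printed as single numbers rather than one-element ranges.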

Regards,
Sudeep

[v1] https://lkml.org/lkml/2014/6/25/603
[v2] https://lkml.org/lkml/2014/7/25/467
[v3] https://lkml.org/lkml/2014/8/21/175

Cc: linux-i...@vger.kernel.org
Cc: linux...@de.ibm.com
Cc: linux-s...@vger.kernel.org
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-ker...@lists.infradead.org


Sudeep Holla (11):
  cpumask: factor out show_cpumap into separate helper function
  topology: replace custom attribute macros with standard DEVICE_ATTR*
  drivers: base: add cpu_device_create to support per-cpu devices
  drivers: base: support cpu cache information interface to userspace
via sysfs
  ia64: move cacheinfo sysfs to generic cacheinfo infrastructure
  s390: move cacheinfo sysfs to generic cacheinfo infrastructure
  x86: move cacheinfo sysfs to generic cacheinfo infrastructure
  powerpc: move cacheinfo sysfs to generic cacheinfo infrastructure
  ARM64: kernel: add support for cpu cache information
  ARM: kernel: add support for cpu cache information
  ARM: kernel: add outer cache support for cacheinfo implementation

 Documentation/ABI/testing/sysfs-devices-system-cpu |  47 ++
 arch/arm/include/asm/outercache.h  |   9 +
 arch/arm/kernel/Makefile   |   1 +
 arch/arm/kernel/cacheinfo.c| 287 
 arch/arm/mm/Kconfig|  13 +
 arch/arm/mm/cache-l2x0.c   |  35 +-
 arch/arm/mm/cache-tauros2.c|  36 +
 arch/arm/mm/cache-xsc3l2.c |  17 +
 arch/arm64/kernel/Makefile |   2 +-
 arch/arm64/kernel/cacheinfo.c  | 142 
 arch/ia64/kernel/topology.c| 421 +++
 arch/powerpc/kernel/cacheinfo.c| 812 +++--
 arch/powerpc/kernel/cacheinfo.h|   8 -
 arch/powerpc/kernel/sysfs.c|  12 +-
 arch/s390/kernel/cache.c   | 388 +++---
 arch/x86/kernel/cpu/intel_cacheinfo.c  | 709 +-
 arch/x86/kernel/cpu/perf_event_amd_iommu.c |   5 +-
 arch/x86/kernel/cpu/perf_event_amd_uncore.c|   6 +-
 arch/x86/kernel/cpu/perf_event_intel_rapl.c|   6 +-
 arch/x86/kernel/cpu/perf_event_intel_uncore.c  |   6 +-
 drivers/acpi/acpi_pad.c|   6 +-
 drivers/base/Makefile  |   2 +-
 drivers/base/cacheinfo.c   | 541 ++
 drivers/base/cpu.c |  59 +-
 drivers/base/node.c|  14 +-
 drivers/base/topology.c|  71 +-
 drivers/pci/pci-sysfs.c|  39 +-
 include/linux/cacheinfo.h  | 100 +++
 include

[PATCH v4 04/11] drivers: base: support cpu cache information interface to userspace via sysfs

2014-09-03 Thread Sudeep Holla
From: Sudeep Holla sudeep.ho...@arm.com

This patch adds initial support for providing processor cache information
to userspace through the sysfs interface. It is based on the existing
implementations (x86, ia64, s390 and powerpc), so the interface is
intended to be fully compatible with them.

The main purpose of this generic support is to avoid further code
duplication to support new architectures and also to unify all the existing
different implementations.

This implementation maintains a hierarchy of cache objects that reflects
the system's cache topology. Cache devices are instantiated as needed as
CPUs come online. The cache information is replicated per CPU even when a
cache is shared. A per-cpu array of cache information is maintained, mainly
for sysfs-related bookkeeping.

It also implements the shared_cpu_map attribute, which is essential for
enabling both kernel and user-space to discover the system's overall cache
topology.
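
As an illustration of how user space might consume shared_cpu_list, here is
a minimal sketch that parses a cpulist string such as "0-3,8" into a bitmask
and checks whether two CPUs share the cache. The helper names parse_cpulist()
and cpus_share_cache() are hypothetical and not part of this series, and the
sketch assumes at most 64 CPUs for simplicity.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a sysfs cpulist string such as "0-3,8" into a 64-bit mask.
 * Returns 0 on success, -1 on malformed input or a CPU number >= 64.
 * Hypothetical helper for illustration; not a kernel or libc API.
 */
int parse_cpulist(const char *s, uint64_t *mask)
{
	assert(s != NULL && mask != NULL);
	*mask = 0;

	while (*s != '\0' && *s != '\n') {
		char *end;
		unsigned long lo, hi;

		if (*s < '0' || *s > '9')
			return -1;
		lo = strtoul(s, &end, 10);
		hi = lo;

		/* A '-' after the first number introduces a range "lo-hi". */
		if (*end == '-') {
			s = end + 1;
			if (*s < '0' || *s > '9')
				return -1;
			hi = strtoul(s, &end, 10);
		}
		if (hi < lo || hi >= 64)
			return -1;

		for (unsigned long cpu = lo; cpu <= hi; cpu++)
			*mask |= (uint64_t)1 << cpu;

		/* Entries are separated by commas; the file ends in a newline. */
		if (*end == ',')
			end++;
		else if (*end != '\0' && *end != '\n')
			return -1;
		s = end;
	}
	return 0;
}

/* Return non-zero if both CPUs are set in the parsed mask. */
int cpus_share_cache(uint64_t mask, unsigned int a, unsigned int b)
{
	assert(a < 64 && b < 64);
	return ((mask >> a) & 1) && ((mask >> b) & 1);
}
```

A caller would read the contents of
/sys/devices/system/cpu/cpuN/cache/indexM/shared_cpu_list into a buffer and
pass it to parse_cpulist(); shared_cpu_map carries the same information as a
hexadecimal mask.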

This patch also adds the missing ABI documentation for the cacheinfo sysfs
interface, which is already well defined and widely used.

Signed-off-by: Sudeep Holla sudeep.ho...@arm.com
Cc: Greg Kroah-Hartman gre...@linuxfoundation.org
Cc: Stephen Boyd sb...@codeaurora.org
Cc: linux-...@vger.kernel.org
Cc: linux...@de.ibm.com
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-i...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s...@vger.kernel.org
Cc: x...@kernel.org
---
 Documentation/ABI/testing/sysfs-devices-system-cpu |  47 ++
 drivers/base/Makefile  |   2 +-
 drivers/base/cacheinfo.c   | 541 +
 include/linux/cacheinfo.h  | 100 
 4 files changed, 689 insertions(+), 1 deletion(-)
 create mode 100644 drivers/base/cacheinfo.c
 create mode 100644 include/linux/cacheinfo.h

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index acb9bfc89b48..99983e67c13c 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -224,3 +224,50 @@ Description:   Parameters for the Intel P-state driver
frequency range.
 
More details can be found in Documentation/cpu-freq/intel-pstate.txt
+
+What:  /sys/devices/system/cpu/cpu*/cache/index*/set_of_attributes_mentioned_below
+Date:  July 2014(documented, existed before August 2008)
+Contact:   Sudeep Holla sudeep.ho...@arm.com
+   Linux kernel mailing list linux-ker...@vger.kernel.org
+Description:   Parameters for the CPU cache attributes
+
+   allocation_policy:
+   - WriteAllocate: allocate a memory location to a cache line
+on a cache miss because of a write
+   - ReadAllocate: allocate a memory location to a cache line
+   on a cache miss because of a read
+   - ReadWriteAllocate: both writeallocate and readallocate
+
+   attributes: LEGACY used only on IA64 and is same as write_policy
+
+   coherency_line_size: the minimum amount of data in bytes that gets
+transferred from memory to cache
+
+   level: the cache hierarchy in the multi-level cache configuration
+
+   number_of_sets: total number of sets in the cache, a set is a
+   collection of cache lines with the same cache index
+
+   physical_line_partition: number of physical cache line per cache tag
+
+   shared_cpu_list: the list of logical cpus sharing the cache
+
+   shared_cpu_map: logical cpu mask containing the list of cpus sharing
+   the cache
+
+   size: the total cache size in kB
+
+   type:
+   - Instruction: cache that only holds instructions
+   - Data: cache that only caches data
+   - Unified: cache that holds both data and instructions
+
+   ways_of_associativity: degree of freedom in placing a particular block
+   of memory in the cache
+
+   write_policy:
+   - WriteThrough: data is written to both the cache line
+   and to the block in the lower-level memory
+   - WriteBack: data is written only to the cache line and
+the modified cache line is written to main
+memory only when it is replaced
diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 4aab26ec0292..f901bc1cffc8 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -4,7 +4,7 @@ obj-y   := component.o core.o bus.o dd.o syscore.o

[PATCH v3 08/11] powerpc: move cacheinfo sysfs to generic cacheinfo infrastructure

2014-08-21 Thread Sudeep Holla
From: Sudeep Holla sudeep.ho...@arm.com

This patch removes the redundant sysfs cacheinfo code by making use of
the newly introduced generic cacheinfo infrastructure.

Signed-off-by: Sudeep Holla sudeep.ho...@arm.com
Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
Cc: Paul Mackerras pau...@samba.org
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/cacheinfo.c | 812 +---
 arch/powerpc/kernel/cacheinfo.h |   8 -
 arch/powerpc/kernel/sysfs.c |  12 +-
 3 files changed, 90 insertions(+), 742 deletions(-)
 delete mode 100644 arch/powerpc/kernel/cacheinfo.h

diff --git a/arch/powerpc/kernel/cacheinfo.c b/arch/powerpc/kernel/cacheinfo.c
index 40198d50b4c2..6845eb7fcc18 100644
--- a/arch/powerpc/kernel/cacheinfo.c
+++ b/arch/powerpc/kernel/cacheinfo.c
@@ -10,38 +10,10 @@
  * 2 as published by the Free Software Foundation.
  */
 
+#include <linux/cacheinfo.h>
 #include <linux/cpu.h>
-#include <linux/cpumask.h>
 #include <linux/kernel.h>
-#include <linux/kobject.h>
-#include <linux/list.h>
-#include <linux/notifier.h>
 #include <linux/of.h>
-#include <linux/percpu.h>
-#include <linux/slab.h>
-#include <asm/prom.h>
-
-#include "cacheinfo.h"
-
-/* per-cpu object for tracking:
- * - a cache kobject for the top-level directory
- * - a list of index objects representing the cpu's local cache hierarchy
- */
-struct cache_dir {
-   struct kobject *kobj; /* bare (not embedded) kobject for cache
-  * directory */
-   struct cache_index_dir *index; /* list of index objects */
-};
-
-/* index object: each cpu's cache directory has an index
- * subdirectory corresponding to a cache object associated with the
- * cpu.  This object's lifetime is managed via the embedded kobject.
- */
-struct cache_index_dir {
-   struct kobject kobj;
-   struct cache_index_dir *next; /* next index in parent directory */
-   struct cache *cache;
-};
 
 /* Template for determining which OF properties to query for a given
  * cache type */
@@ -60,11 +32,6 @@ struct cache_type_info {
const char *nr_sets_prop;
 };
 
-/* These are used to index the cache_type_info array. */
-#define CACHE_TYPE_UNIFIED 0
-#define CACHE_TYPE_INSTRUCTION 1
-#define CACHE_TYPE_DATA2
-
 static const struct cache_type_info cache_type_info[] = {
{
/* PowerPC Processor binding says the [di]-cache-*
@@ -92,231 +59,82 @@ static const struct cache_type_info cache_type_info[] = {
},
 };
 
-/* Cache object: each instance of this corresponds to a distinct cache
- * in the system.  There are separate objects for Harvard caches: one
- * each for instruction and data, and each refers to the same OF node.
- * The refcount of the OF node is elevated for the lifetime of the
- * cache object.  A cache object is released when its shared_cpu_map
- * is cleared (see cache_cpu_clear).
- *
- * A cache object is on two lists: an unsorted global list
- * (cache_list) of cache objects; and a singly-linked list
- * representing the local cache hierarchy, which is ordered by level
- * (e.g. L1d -> L1i -> L2 -> L3).
- */
-struct cache {
-   struct device_node *ofnode;/* OF node for this cache, may be cpu */
-   struct cpumask shared_cpu_map; /* online CPUs using this cache */
-   int type;  /* split cache disambiguation */
-   int level; /* level not explicit in device tree */
-   struct list_head list; /* global list of cache objects */
-   struct cache *next_local;  /* next cache of >= level */
-};
-
-static DEFINE_PER_CPU(struct cache_dir *, cache_dir_pcpu);
-
-/* traversal/modification of this list occurs only at cpu hotplug time;
- * access is serialized by cpu hotplug locking
- */
-static LIST_HEAD(cache_list);
-
-static struct cache_index_dir *kobj_to_cache_index_dir(struct kobject *k)
-{
-   return container_of(k, struct cache_index_dir, kobj);
-}
-
-static const char *cache_type_string(const struct cache *cache)
+static inline int get_cacheinfo_idx(enum cache_type type)
 {
-   return cache_type_info[cache->type].name;
-}
-
-static void cache_init(struct cache *cache, int type, int level,
-  struct device_node *ofnode)
-{
-   cache->type = type;
-   cache->level = level;
-   cache->ofnode = of_node_get(ofnode);
-   INIT_LIST_HEAD(&cache->list);
-   list_add(&cache->list, &cache_list);
-}
-
-static struct cache *new_cache(int type, int level, struct device_node *ofnode)
-{
-   struct cache *cache;
-
-   cache = kzalloc(sizeof(*cache), GFP_KERNEL);
-   if (cache)
-   cache_init(cache, type, level, ofnode);
-
-   return cache;
-}
-
-static void release_cache_debugcheck(struct cache *cache)
-{
-   struct cache *iter;
-
-   list_for_each_entry(iter, &cache_list, list)
-   WARN_ONCE(iter->next_local == cache,
- "cache for %s(%s) refers to cache for %s(%s)\n",
- iter

[PATCH v3 00/11] drivers: cacheinfo support

2014-08-21 Thread Sudeep Holla
From: Sudeep Holla sudeep.ho...@arm.com

This series adds generic cacheinfo support, similar to the existing topology
support. The implementation is based on the x86 cacheinfo support. Currently
x86, powerpc, ia64 and s390 each have their own implementation. While adding
similar support to ARM and ARM64, this series attempts to make cacheinfo
generic, much like the topology info support. It also adds the missing ABI
documentation for the cacheinfo sysfs interface, which is already in use.

It moves the existing implementations on x86, ia64, powerpc and s390 over to
the generic cacheinfo infrastructure introduced here. These changes on
non-ARM platforms are only compile-tested, apart from x86, which was also tested.

This series also adds support for ARM and ARM64 architectures based on
the generic support.

The code can be fetched from:
 git://linux-arm.org/linux-skn cacheinfo

Changes v2-v3:
- Added {allocation,write}_policy instead of single attributes sysfs
  (attributes retained on ia64 privately as it was used only on that)
- factored out show_cpumap into separate helper in cpumask.h
- populate cpu_{map,list} for non-DT system if they are not populated
  by arch specific callbacks
- removed use of sysfs *_show callback in cache_attrs_is_visible
- all the review comments from Stephen Boyd implemented

Changes v1-v2:
- removed custom device_{add,remove}_attrs, using is_visible callback
  instead(suggested by GregKH)
- arm64: changes as per MarkR review comments
- Moved smp_call_function_single to architectures using it(arm, arm64,
  x86) (suggested by Stephen Boyd)
- arm (mostly changes as per RMK's review comments)
- fixed to allow v7 + v6 build
- l2 cache changes to remove extra structure
- populated CTR for few StrongARM CPU's not implementing CTR

[v1] https://lkml.org/lkml/2014/6/25/603
[v2] https://lkml.org/lkml/2014/7/25/467

Cc: Greg Kroah-Hartman gre...@linuxfoundation.org
Cc: linux-i...@vger.kernel.org
Cc: linux...@de.ibm.com
Cc: linux-s...@vger.kernel.org
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-ker...@lists.infradead.org

Sudeep Holla (11):
  cpumask: factor out show_cpumap into separate helper function
  topology: replace custom attribute macros with standard DEVICE_ATTR*
  drivers: base: add new class cpu to group cpu devices
  drivers: base: support cpu cache information interface to userspace
via sysfs
  ia64: move cacheinfo sysfs to generic cacheinfo infrastructure
  s390: move cacheinfo sysfs to generic cacheinfo infrastructure
  x86: move cacheinfo sysfs to generic cacheinfo infrastructure
  powerpc: move cacheinfo sysfs to generic cacheinfo infrastructure
  ARM64: kernel: add support for cpu cache information
  ARM: kernel: add support for cpu cache information
  ARM: kernel: add outer cache support for cacheinfo implementation

 Documentation/ABI/testing/sysfs-devices-system-cpu |  47 ++
 arch/arm/include/asm/outercache.h  |   9 +
 arch/arm/kernel/Makefile   |   1 +
 arch/arm/kernel/cacheinfo.c| 287 
 arch/arm/mm/Kconfig|  13 +
 arch/arm/mm/cache-l2x0.c   |  35 +-
 arch/arm/mm/cache-tauros2.c|  36 +
 arch/arm/mm/cache-xsc3l2.c |  17 +
 arch/arm64/kernel/Makefile |   2 +-
 arch/arm64/kernel/cacheinfo.c  | 142 
 arch/ia64/kernel/topology.c| 421 +++
 arch/powerpc/kernel/cacheinfo.c| 812 +++--
 arch/powerpc/kernel/cacheinfo.h|   8 -
 arch/powerpc/kernel/sysfs.c|  12 +-
 arch/s390/kernel/cache.c   | 388 +++---
 arch/x86/kernel/cpu/intel_cacheinfo.c  | 709 +-
 arch/x86/kernel/cpu/perf_event_amd_iommu.c |   5 +-
 arch/x86/kernel/cpu/perf_event_amd_uncore.c|   6 +-
 arch/x86/kernel/cpu/perf_event_intel_rapl.c|   6 +-
 arch/x86/kernel/cpu/perf_event_intel_uncore.c  |   6 +-
 drivers/acpi/acpi_pad.c|   6 +-
 drivers/base/Makefile  |   2 +-
 drivers/base/cacheinfo.c   | 543 ++
 drivers/base/core.c|  39 +-
 drivers/base/cpu.c |  12 +-
 drivers/base/node.c|  14 +-
 drivers/base/topology.c|  71 +-
 drivers/pci/pci-sysfs.c|  39 +-
 include/linux/cacheinfo.h  | 100 +++
 include/linux/cpu.h|   2 +
 include/linux/cpumask.h|  27 +
 31 files changed, 1826 insertions(+), 1991 deletions(-)
 create mode 100644 arch/arm/kernel/cacheinfo.c

[PATCH v2 0/9] drivers: cacheinfo support

2014-07-25 Thread Sudeep Holla
From: Sudeep Holla sudeep.ho...@arm.com

This series adds generic cacheinfo support, similar to the existing topology
support. The implementation is based on the x86 cacheinfo support. Currently
x86, powerpc, ia64 and s390 each have their own implementation. While adding
similar support to ARM and ARM64, this series attempts to make cacheinfo
generic, much like the topology info support. It also adds the missing ABI
documentation for the cacheinfo sysfs interface, which is already in use.

It moves the existing implementations on x86, ia64, powerpc and s390 over to
the generic cacheinfo infrastructure introduced here. These changes on
non-ARM platforms are only compile-tested, apart from x86, which was also tested.

This series also adds support for ARM and ARM64 architectures based on
the generic support.

Since there was no objection to the idea in RFC, I am posting non-RFC
version here.

The code can be fetched from:
 git://linux-arm.org/linux-skn cacheinfo


Changes v1-v2:
- removed custom device_{add,remove}_attrs, using is_visible callback
  instead(suggested by GregKH)
- arm64: changes as per MarkR review comments
- Moved smp_call_function_single to architectures using it(arm, arm64,
  x86) (suggested by Stephen Boyd)
- arm (mostly changes as per RMK's review comments)
- fixed to allow v7 + v6 build
- l2 cache changes to remove extra structure
- populated CTR for few StrongARM CPU's not implementing CTR

Previous RFCs:
[1] https://lkml.org/lkml/2014/1/8/523
[2] https://lkml.org/lkml/2014/2/7/654
[3] https://lkml.org/lkml/2014/2/19/391

Cc: Greg Kroah-Hartman gre...@linuxfoundation.org
Cc: linux-i...@vger.kernel.org
Cc: linux...@de.ibm.com
Cc: linux-s...@vger.kernel.org
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org

Sudeep Holla (9):
  drivers: base: add new class cpu to group cpu devices
  drivers: base: support cpu cache information interface to userspace
via sysfs
  ia64: move cacheinfo sysfs to generic cacheinfo infrastructure
  s390: move cacheinfo sysfs to generic cacheinfo infrastructure
  x86: move cacheinfo sysfs to generic cacheinfo infrastructure
  powerpc: move cacheinfo sysfs to generic cacheinfo infrastructure
  ARM64: kernel: add support for cpu cache information
  ARM: kernel: add support for cpu cache information
  ARM: kernel: add outer cache support for cacheinfo implementation

 Documentation/ABI/testing/sysfs-devices-system-cpu |  41 ++
 arch/arm/include/asm/outercache.h  |   9 +
 arch/arm/kernel/Makefile   |   1 +
 arch/arm/kernel/cacheinfo.c| 284 +++
 arch/arm/mm/Kconfig|  13 +
 arch/arm/mm/cache-l2x0.c   |  35 +-
 arch/arm/mm/cache-tauros2.c|  35 +
 arch/arm/mm/cache-xsc3l2.c |  16 +
 arch/arm64/kernel/Makefile |   3 +-
 arch/arm64/kernel/cacheinfo.c  | 142 
 arch/ia64/kernel/topology.c| 401 ++
 arch/powerpc/kernel/cacheinfo.c| 813 +++--
 arch/powerpc/kernel/cacheinfo.h|   8 -
 arch/powerpc/kernel/sysfs.c|  12 +-
 arch/s390/kernel/cache.c   | 388 +++---
 arch/x86/kernel/cpu/intel_cacheinfo.c  | 680 +
 drivers/base/Makefile  |   2 +-
 drivers/base/cacheinfo.c   | 539 ++
 drivers/base/core.c|  39 +-
 drivers/base/cpu.c |   7 +
 include/linux/cacheinfo.h  |  73 ++
 include/linux/cpu.h|   2 +
 22 files changed, 1660 insertions(+), 1883 deletions(-)
 create mode 100644 arch/arm/kernel/cacheinfo.c
 create mode 100644 arch/arm64/kernel/cacheinfo.c
 delete mode 100644 arch/powerpc/kernel/cacheinfo.h
 create mode 100644 drivers/base/cacheinfo.c
 create mode 100644 include/linux/cacheinfo.h

-- 
1.8.3.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v2 6/9] powerpc: move cacheinfo sysfs to generic cacheinfo infrastructure

2014-07-25 Thread Sudeep Holla
From: Sudeep Holla sudeep.ho...@arm.com

This patch removes the redundant sysfs cacheinfo code by making use of
the newly introduced generic cacheinfo infrastructure.

Signed-off-by: Sudeep Holla sudeep.ho...@arm.com
Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
Cc: Paul Mackerras pau...@samba.org
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/cacheinfo.c | 813 +---
 arch/powerpc/kernel/cacheinfo.h |   8 -
 arch/powerpc/kernel/sysfs.c |  12 +-
 3 files changed, 91 insertions(+), 742 deletions(-)
 delete mode 100644 arch/powerpc/kernel/cacheinfo.h

diff --git a/arch/powerpc/kernel/cacheinfo.c b/arch/powerpc/kernel/cacheinfo.c
index 40198d50b4c2..b871c246d945 100644
--- a/arch/powerpc/kernel/cacheinfo.c
+++ b/arch/powerpc/kernel/cacheinfo.c
@@ -10,38 +10,10 @@
  * 2 as published by the Free Software Foundation.
  */
 
+#include <linux/cacheinfo.h>
 #include <linux/cpu.h>
-#include <linux/cpumask.h>
 #include <linux/kernel.h>
-#include <linux/kobject.h>
-#include <linux/list.h>
-#include <linux/notifier.h>
 #include <linux/of.h>
-#include <linux/percpu.h>
-#include <linux/slab.h>
-#include <asm/prom.h>
-
-#include "cacheinfo.h"
-
-/* per-cpu object for tracking:
- * - a cache kobject for the top-level directory
- * - a list of index objects representing the cpu's local cache hierarchy
- */
-struct cache_dir {
-   struct kobject *kobj; /* bare (not embedded) kobject for cache
-  * directory */
-   struct cache_index_dir *index; /* list of index objects */
-};
-
-/* index object: each cpu's cache directory has an index
- * subdirectory corresponding to a cache object associated with the
- * cpu.  This object's lifetime is managed via the embedded kobject.
- */
-struct cache_index_dir {
-   struct kobject kobj;
-   struct cache_index_dir *next; /* next index in parent directory */
-   struct cache *cache;
-};
 
 /* Template for determining which OF properties to query for a given
  * cache type */
@@ -60,11 +32,6 @@ struct cache_type_info {
const char *nr_sets_prop;
 };
 
-/* These are used to index the cache_type_info array. */
-#define CACHE_TYPE_UNIFIED 0
-#define CACHE_TYPE_INSTRUCTION 1
-#define CACHE_TYPE_DATA2
-
 static const struct cache_type_info cache_type_info[] = {
{
/* PowerPC Processor binding says the [di]-cache-*
@@ -92,231 +59,83 @@ static const struct cache_type_info cache_type_info[] = {
},
 };
 
-/* Cache object: each instance of this corresponds to a distinct cache
- * in the system.  There are separate objects for Harvard caches: one
- * each for instruction and data, and each refers to the same OF node.
- * The refcount of the OF node is elevated for the lifetime of the
- * cache object.  A cache object is released when its shared_cpu_map
- * is cleared (see cache_cpu_clear).
- *
- * A cache object is on two lists: an unsorted global list
- * (cache_list) of cache objects; and a singly-linked list
- * representing the local cache hierarchy, which is ordered by level
- * (e.g. L1d -> L1i -> L2 -> L3).
- */
-struct cache {
-   struct device_node *ofnode;/* OF node for this cache, may be cpu */
-   struct cpumask shared_cpu_map; /* online CPUs using this cache */
-   int type;  /* split cache disambiguation */
-   int level; /* level not explicit in device tree */
-   struct list_head list; /* global list of cache objects */
-   struct cache *next_local;  /* next cache of >= level */
-};
-
-static DEFINE_PER_CPU(struct cache_dir *, cache_dir_pcpu);
-
-/* traversal/modification of this list occurs only at cpu hotplug time;
- * access is serialized by cpu hotplug locking
- */
-static LIST_HEAD(cache_list);
-
-static struct cache_index_dir *kobj_to_cache_index_dir(struct kobject *k)
-{
-   return container_of(k, struct cache_index_dir, kobj);
-}
-
-static const char *cache_type_string(const struct cache *cache)
+static inline int get_cacheinfo_idx(enum cache_type type)
 {
-   return cache_type_info[cache->type].name;
-}
-
-static void cache_init(struct cache *cache, int type, int level,
-  struct device_node *ofnode)
-{
-   cache->type = type;
-   cache->level = level;
-   cache->ofnode = of_node_get(ofnode);
-   INIT_LIST_HEAD(&cache->list);
-   list_add(&cache->list, &cache_list);
-}
-
-static struct cache *new_cache(int type, int level, struct device_node *ofnode)
-{
-   struct cache *cache;
-
-   cache = kzalloc(sizeof(*cache), GFP_KERNEL);
-   if (cache)
-   cache_init(cache, type, level, ofnode);
-
-   return cache;
-}
-
-static void release_cache_debugcheck(struct cache *cache)
-{
-   struct cache *iter;
-
-   list_for_each_entry(iter, &cache_list, list)
-   WARN_ONCE(iter->next_local == cache,
- "cache for %s(%s) refers to cache for %s(%s)\n",
- iter

Re: [PATCH 2/9] drivers: base: support cpu cache information interface to userspace via sysfs

2014-07-10 Thread Sudeep Holla

Hi Greg,

Thanks for reviewing this.

On 10/07/14 01:09, Greg Kroah-Hartman wrote:

On Wed, Jun 25, 2014 at 06:30:37PM +0100, Sudeep Holla wrote:

+static const struct device_attribute *cache_optional_attrs[] = {
+   &dev_attr_coherency_line_size,
+   &dev_attr_ways_of_associativity,
+   &dev_attr_number_of_sets,
+   &dev_attr_size,
+   &dev_attr_attributes,
+   &dev_attr_physical_line_partition,
+   NULL
+};
+
+static int device_add_attrs(struct device *dev,
+   const struct device_attribute **dev_attrs)
+{
+   int i, error = 0;
+   struct device_attribute *dev_attr;
+   char *buf;
+
+   if (!dev_attrs)
+   return 0;
+
+   buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
+   if (!buf)
+   return -ENOMEM;
+
+   for (i = 0; dev_attrs[i]; i++) {
+   dev_attr = (struct device_attribute *)dev_attrs[i];
+
+   /* create attributes that provides meaningful value */
+   if (dev_attr->show(dev, dev_attr, buf) < 0)
+   continue;
+
+   error = device_create_file(dev, dev_attrs[i]);
+   if (error) {
+   while (--i >= 0)
+   device_remove_file(dev, dev_attrs[i]);
+   break;
+   }
+   }
+
+   kfree(buf);
+   return error;
+}


Ick, why create your own function for this when the driver core has this
functionality built into it?  Look at the is_visible() callback, and how
it is used for an attribute group please.



I agree. I added this function hesitantly, as I didn't realize that I could
use is_visible for this purpose. Thanks for pointing that out; I will have a
look at it.
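
To illustrate the is_visible() approach suggested above, here is a minimal
user-space model of the idea; it is not kernel code. In the kernel, struct
attribute_group carries an is_visible callback that returns the file mode
for each attribute, and a return value of 0 means the attribute is not
created. All names below (model_attr, model_cache, model_attr_group,
model_create_group) are hypothetical stand-ins chosen for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a sysfs attribute (assumption, not the kernel type). */
struct model_attr {
	const char *name;
	unsigned int mode;	/* file mode, e.g. 0444 */
};

/* Per-cache data; a value of 0 is treated as "not known on this platform". */
struct model_cache {
	unsigned int coherency_line_size;
	unsigned int number_of_sets;
	unsigned int size;
};

/* Simplified stand-in for an attribute group with a visibility callback. */
struct model_attr_group {
	unsigned int (*is_visible)(const struct model_cache *c,
				   const struct model_attr *a, int idx);
	const struct model_attr *const *attrs;	/* NULL-terminated */
};

/* Indices must match the order of cache_attrs[] below. */
enum { ATTR_LINE_SIZE, ATTR_NR_SETS, ATTR_SIZE };

static const struct model_attr attr_line_size = { "coherency_line_size", 0444 };
static const struct model_attr attr_nr_sets = { "number_of_sets", 0444 };
static const struct model_attr attr_size = { "size", 0444 };

static const struct model_attr *const cache_attrs[] = {
	&attr_line_size, &attr_nr_sets, &attr_size, NULL
};

/* Return the attribute's mode if it has a meaningful value, else 0 (hidden). */
unsigned int cache_attr_is_visible(const struct model_cache *c,
				   const struct model_attr *a, int idx)
{
	assert(c != NULL && a != NULL);

	switch (idx) {
	case ATTR_LINE_SIZE:
		return c->coherency_line_size ? a->mode : 0;
	case ATTR_NR_SETS:
		return c->number_of_sets ? a->mode : 0;
	case ATTR_SIZE:
		return c->size ? a->mode : 0;
	}
	return 0;
}

const struct model_attr_group cache_group = {
	.is_visible = cache_attr_is_visible,
	.attrs = cache_attrs,
};

/* Walk the group once and count the attributes that would be created. */
int model_create_group(const struct model_attr_group *g,
		       const struct model_cache *c)
{
	int created = 0;

	assert(g != NULL && c != NULL);
	for (int i = 0; g->attrs[i]; i++)
		if (g->is_visible(c, g->attrs[i], i))
			created++;
	return created;
}
```

The point of the pattern is that the visibility decision lives in one
callback and is applied to the whole group at once, so no hand-written
add/remove loop over individual attribute files is needed.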


+static void device_remove_attrs(struct device *dev,
+   const struct device_attribute **dev_attrs)
+{
+   int i;
+
+   if (!dev_attrs)
+   return;
+
+   for (i = 0; dev_attrs[i]; dev_attrs++, i++)
+   device_remove_file(dev, dev_attrs[i]);
+}


You should just remove a whole group at once, not individually.



Right, I should be able to get rid of these two functions once I use the
is_visible callback.


+
+const struct device_attribute **
+__weak cache_get_priv_attr(struct device *cache_idx_dev)
+{
+   return NULL;
+}
+
+/* Add/Remove cache interface for CPU device */
+static void cpu_cache_sysfs_exit(unsigned int cpu)
+{
+   int i;
+   struct device *tmp_dev;
+   const struct device_attribute **ci_priv_attr;
+
+   if (per_cpu_index_dev(cpu)) {
+   for (i = 0; i < cache_leaves(cpu); i++) {
+   tmp_dev = per_cache_index_dev(cpu, i);
+   if (!tmp_dev)
+   continue;
+   ci_priv_attr = cache_get_priv_attr(tmp_dev);
+   device_remove_attrs(tmp_dev, ci_priv_attr);
+   device_remove_attrs(tmp_dev, cache_optional_attrs);
+   device_unregister(tmp_dev);
+   }
+   kfree(per_cpu_index_dev(cpu));
+   per_cpu_index_dev(cpu) = NULL;
+   }
+   device_unregister(per_cpu_cache_dev(cpu));
+   per_cpu_cache_dev(cpu) = NULL;
+}
+
+static int cpu_cache_sysfs_init(unsigned int cpu)
+{
+   struct device *dev = get_cpu_device(cpu);
+
+   if (per_cpu_cacheinfo(cpu) == NULL)
+   return -ENOENT;
+
+   per_cpu_cache_dev(cpu) = device_create(dev->class, dev, cpu,
+  NULL, "cache");
+   if (IS_ERR_OR_NULL(per_cpu_cache_dev(cpu)))
+   return PTR_ERR(per_cpu_cache_dev(cpu));
+
+   /* Allocate all required memory */
+   per_cpu_index_dev(cpu) = kzalloc(sizeof(struct device *) *
+cache_leaves(cpu), GFP_KERNEL);
+   if (unlikely(per_cpu_index_dev(cpu) == NULL))
+   goto err_out;
+
+   return 0;
+
+err_out:
+   cpu_cache_sysfs_exit(cpu);
+   return -ENOMEM;
+}
+
+static int cache_add_dev(unsigned int cpu)
+{
+   unsigned short i;
+   int rc;
+   struct device *tmp_dev, *parent;
+   struct cacheinfo *this_leaf;
+   const struct device_attribute **ci_priv_attr;
+   struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+
+   rc = cpu_cache_sysfs_init(cpu);
+   if (unlikely(rc < 0))
+   return rc;
+
+   parent = per_cpu_cache_dev(cpu);
+   for (i = 0; i < cache_leaves(cpu); i++) {
+   this_leaf = this_cpu_ci->info_list + i;
+   if (this_leaf->disable_sysfs)
+   continue;
+   tmp_dev = device_create_with_groups(parent->class, parent, i,
+   this_leaf,
+   cache_default_groups,
+   "index%1u", i);
+   if (IS_ERR_OR_NULL(tmp_dev

Re: [PATCH 2/9] drivers: base: support cpu cache information interface to userspace via sysfs

2014-06-26 Thread Sudeep Holla

Hi,

On 25/06/14 23:23, Russell King - ARM Linux wrote:

On Wed, Jun 25, 2014 at 06:30:37PM +0100, Sudeep Holla wrote:

+   coherency_line_size: the minimum amount of data that gets transferred


So, what value do you envision this taking for a CPU where the cache
line size is 32 bytes, but each cache line has two dirty bits which
allow it to only evict either the upper or lower 16 bytes depending
on which are dirty?



IIUC, most of the existing cacheinfo implementations on the various
architectures report the cache line size as coherency_line_size, in which
case I need to fix the definition in this file.

BTW, is there any architectural way of discovering such a configuration?

Regards,
Sudeep


Re: [PATCH 2/9] drivers: base: support cpu cache information interface to userspace via sysfs

2014-06-26 Thread Sudeep Holla



On 26/06/14 19:50, Russell King - ARM Linux wrote:

On Thu, Jun 26, 2014 at 07:41:32PM +0100, Sudeep Holla wrote:

Hi,

On 25/06/14 23:23, Russell King - ARM Linux wrote:

On Wed, Jun 25, 2014 at 06:30:37PM +0100, Sudeep Holla wrote:

+   coherency_line_size: the minimum amount of data that gets transferred


So, what value do you envision this taking for a CPU where the cache
line size is 32 bytes, but each cache line has two dirty bits which
allow it to only evict either the upper or lower 16 bytes depending
on which are dirty?



IIUC, most of the existing cacheinfo implementations on the various
architectures report the cache line size as coherency_line_size, in which
case I need to fix the definition in this file.


As an example, here's an extract from the SA110 TRM:

StrongARM contains a 16KByte writeback data cache. The DC has 512 lines
of 32 bytes (8 words), arranged as a 32 way set associative cache, and
uses the virtual addresses generated by the processor. A line also
contains the physical address the block was fetched from and two dirty
bits. There is a dirty bit associated with both the first and second
half of the block. When a store hits in the cache the dirty bit
associated with it is set. When a block is evicted from the cache the
dirty bits are used to decide if all, half, or none of the block will
be written back to memory using the physical address stored with the
block. The DC is always reloaded a line at a time (8 words).
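
The figures in the extract above can be cross-checked with a little
arithmetic, sketched below. The function names are hypothetical and only
restate the usual relation size = sets * ways * line size; the write-back
granule is simply the line size divided by the number of dirty bits, which
is an interpretation of the TRM text rather than something it states as a
formula.

```c
#include <assert.h>

/* Total number of cache lines for a given cache size and line size. */
unsigned int cache_lines(unsigned int size_bytes, unsigned int line_size)
{
	assert(line_size != 0);
	return size_bytes / line_size;
}

/* number_of_sets derived from size = sets * ways * line_size. */
unsigned int cache_number_of_sets(unsigned int size_bytes,
				  unsigned int line_size,
				  unsigned int ways)
{
	assert(line_size != 0 && ways != 0);
	return size_bytes / (line_size * ways);
}

/* Smallest unit written back when each line carries several dirty bits. */
unsigned int writeback_granule(unsigned int line_size, unsigned int dirty_bits)
{
	assert(dirty_bits != 0);
	return line_size / dirty_bits;
}
```

For the SA-110 data cache (16 KByte, 32-byte lines, 32 ways, two dirty bits
per line) this gives 512 lines, 16 sets, and a 16-byte write-back granule,
which matches the 512 lines quoted in the TRM and the 16-byte halves
mentioned earlier in the thread.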



Thanks for the information. It's interesting that the line is referred to as
a block when describing the two dirty bits. I am not sure whether this can be
mapped to physical_line_partition = 2. Thoughts?



BTW will there be any architectural way of finding such configuration ?


Not that I know of.


That's bad :)

Regards,
Sudeep


[PATCH 6/9] powerpc: move cacheinfo sysfs to generic cacheinfo infrastructure

2014-06-25 Thread Sudeep Holla
From: Sudeep Holla sudeep.ho...@arm.com

This patch removes the redundant sysfs cacheinfo code by making use of
the newly introduced generic cacheinfo infrastructure.

Signed-off-by: Sudeep Holla sudeep.ho...@arm.com
Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
Cc: Paul Mackerras pau...@samba.org
Cc: Anshuman Khandual khand...@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/cacheinfo.c | 813 +---
 arch/powerpc/kernel/cacheinfo.h |   8 -
 arch/powerpc/kernel/sysfs.c |  12 +-
 3 files changed, 91 insertions(+), 742 deletions(-)
 delete mode 100644 arch/powerpc/kernel/cacheinfo.h

diff --git a/arch/powerpc/kernel/cacheinfo.c b/arch/powerpc/kernel/cacheinfo.c
index 40198d5..b871c24 100644
--- a/arch/powerpc/kernel/cacheinfo.c
+++ b/arch/powerpc/kernel/cacheinfo.c
@@ -10,38 +10,10 @@
  * 2 as published by the Free Software Foundation.
  */
 
+#include <linux/cacheinfo.h>
 #include <linux/cpu.h>
-#include <linux/cpumask.h>
 #include <linux/kernel.h>
-#include <linux/kobject.h>
-#include <linux/list.h>
-#include <linux/notifier.h>
 #include <linux/of.h>
-#include <linux/percpu.h>
-#include <linux/slab.h>
-#include <asm/prom.h>
-
-#include "cacheinfo.h"
-
-/* per-cpu object for tracking:
- * - a cache kobject for the top-level directory
- * - a list of index objects representing the cpu's local cache hierarchy
- */
-struct cache_dir {
-   struct kobject *kobj; /* bare (not embedded) kobject for cache
-  * directory */
-   struct cache_index_dir *index; /* list of index objects */
-};
-
-/* index object: each cpu's cache directory has an index
- * subdirectory corresponding to a cache object associated with the
- * cpu.  This object's lifetime is managed via the embedded kobject.
- */
-struct cache_index_dir {
-   struct kobject kobj;
-   struct cache_index_dir *next; /* next index in parent directory */
-   struct cache *cache;
-};
 
 /* Template for determining which OF properties to query for a given
  * cache type */
@@ -60,11 +32,6 @@ struct cache_type_info {
const char *nr_sets_prop;
 };
 
-/* These are used to index the cache_type_info array. */
-#define CACHE_TYPE_UNIFIED 0
-#define CACHE_TYPE_INSTRUCTION 1
-#define CACHE_TYPE_DATA2
-
 static const struct cache_type_info cache_type_info[] = {
{
/* PowerPC Processor binding says the [di]-cache-*
@@ -92,231 +59,83 @@ static const struct cache_type_info cache_type_info[] = {
},
 };
 
-/* Cache object: each instance of this corresponds to a distinct cache
- * in the system.  There are separate objects for Harvard caches: one
- * each for instruction and data, and each refers to the same OF node.
- * The refcount of the OF node is elevated for the lifetime of the
- * cache object.  A cache object is released when its shared_cpu_map
- * is cleared (see cache_cpu_clear).
- *
- * A cache object is on two lists: an unsorted global list
- * (cache_list) of cache objects; and a singly-linked list
- * representing the local cache hierarchy, which is ordered by level
- * (e.g. L1d -> L1i -> L2 -> L3).
- */
-struct cache {
-   struct device_node *ofnode;/* OF node for this cache, may be cpu */
-   struct cpumask shared_cpu_map; /* online CPUs using this cache */
-   int type;  /* split cache disambiguation */
-   int level; /* level not explicit in device tree */
-   struct list_head list; /* global list of cache objects */
-   struct cache *next_local;  /* next cache of >= level */
-};
-
-static DEFINE_PER_CPU(struct cache_dir *, cache_dir_pcpu);
-
-/* traversal/modification of this list occurs only at cpu hotplug time;
- * access is serialized by cpu hotplug locking
- */
-static LIST_HEAD(cache_list);
-
-static struct cache_index_dir *kobj_to_cache_index_dir(struct kobject *k)
-{
-   return container_of(k, struct cache_index_dir, kobj);
-}
-
-static const char *cache_type_string(const struct cache *cache)
+static inline int get_cacheinfo_idx(enum cache_type type)
 {
-   return cache_type_info[cache->type].name;
-}
-
-static void cache_init(struct cache *cache, int type, int level,
-  struct device_node *ofnode)
-{
-   cache->type = type;
-   cache->level = level;
-   cache->ofnode = of_node_get(ofnode);
-   INIT_LIST_HEAD(&cache->list);
-   list_add(&cache->list, &cache_list);
-}
-
-static struct cache *new_cache(int type, int level, struct device_node *ofnode)
-{
-   struct cache *cache;
-
-   cache = kzalloc(sizeof(*cache), GFP_KERNEL);
-   if (cache)
-   cache_init(cache, type, level, ofnode);
-
-   return cache;
-}
-
-static void release_cache_debugcheck(struct cache *cache)
-{
-   struct cache *iter;
-
-   list_for_each_entry(iter, &cache_list, list)
-   WARN_ONCE(iter->next_local == cache,
- cache for %s(%s) refers to cache for %s

[PATCH 0/9] drivers: cacheinfo support

2014-06-25 Thread Sudeep Holla
From: Sudeep Holla sudeep.ho...@arm.com

This series adds generic cacheinfo support similar to topology. The
implementation is based on x86 cacheinfo support. Currently x86, powerpc,
ia64 and s390 have their own implementations. While adding similar support
to ARM and ARM64, here is the attempt to make it generic quite similar to
topology info support. It also adds the missing ABI documentation for
the cacheinfo sysfs which is already being used.

It moves all the existing different implementations on x86, ia64, powerpc
and s390 to use the generic cacheinfo infrastructure introduced here.
These changes on non-ARM platforms are only compile tested and tested on x86.

This series also adds support for ARM and ARM64 architectures based on
the generic support.

Since there was no objection to the idea in RFC, I am posting non-RFC
version here.

The code can be fetched from:
 git://linux-arm.org/linux-skn cacheinfo

Previous RFCs:
[1] https://lkml.org/lkml/2014/1/8/523
[2] https://lkml.org/lkml/2014/2/7/654
[3] https://lkml.org/lkml/2014/2/19/391

Cc: Greg Kroah-Hartman gre...@linuxfoundation.org
Cc: linux-i...@vger.kernel.org
Cc: linux...@de.ibm.com
Cc: linux-s...@vger.kernel.org
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-ker...@lists.infradead.org

---
Sudeep Holla (9):
  drivers: base: add new class cpu to group cpu devices
  drivers: base: support cpu cache information interface to userspace
via sysfs
  ia64: move cacheinfo sysfs to generic cacheinfo infrastructure
  s390: move cacheinfo sysfs to generic cacheinfo infrastructure
  x86: move cacheinfo sysfs to generic cacheinfo infrastructure
  powerpc: move cacheinfo sysfs to generic cacheinfo infrastructure
  ARM64: kernel: add support for cpu cache information
  ARM: kernel: add support for cpu cache information
  ARM: kernel: add outer cache support for cacheinfo implementation

 Documentation/ABI/testing/sysfs-devices-system-cpu |  41 ++
 arch/arm/include/asm/outercache.h  |  13 +
 arch/arm/kernel/Makefile   |   1 +
 arch/arm/kernel/cacheinfo.c| 249 +++
 arch/arm/mm/Kconfig|  13 +
 arch/arm/mm/cache-l2x0.c   |  10 +
 arch/arm/mm/cache-tauros2.c|  34 +
 arch/arm/mm/cache-xsc3l2.c |  15 +
 arch/arm64/kernel/Makefile |   3 +-
 arch/arm64/kernel/cacheinfo.c  | 135 
 arch/ia64/kernel/topology.c| 401 ++
 arch/powerpc/kernel/cacheinfo.c| 813 +++--
 arch/powerpc/kernel/cacheinfo.h|   8 -
 arch/powerpc/kernel/sysfs.c|  12 +-
 arch/s390/kernel/cache.c   | 388 +++---
 arch/x86/kernel/cpu/intel_cacheinfo.c  | 655 -
 drivers/base/Makefile  |   2 +-
 drivers/base/cacheinfo.c   | 564 ++
 drivers/base/core.c|  39 +-
 drivers/base/cpu.c |   7 +
 include/linux/cacheinfo.h  |  56 ++
 include/linux/cpu.h|   2 +
 22 files changed, 1590 insertions(+), 1871 deletions(-)
 create mode 100644 arch/arm/kernel/cacheinfo.c
 create mode 100644 arch/arm64/kernel/cacheinfo.c
 delete mode 100644 arch/powerpc/kernel/cacheinfo.h
 create mode 100644 drivers/base/cacheinfo.c
 create mode 100644 include/linux/cacheinfo.h

-- 
1.8.3.2


[PATCH 2/9] drivers: base: support cpu cache information interface to userspace via sysfs

2014-06-25 Thread Sudeep Holla
From: Sudeep Holla sudeep.ho...@arm.com

This patch adds initial support for providing processor cache information
to userspace through the sysfs interface. It is based on the existing
implementations (x86, ia64, s390 and powerpc), and hence the interface is
intended to be fully compatible.

The main purpose of this generic support is to avoid further code
duplication to support new architectures and also to unify all the existing
different implementations.

This implementation maintains a hierarchy of cache objects that reflects
the system's cache topology. Cache devices are instantiated as needed as
CPUs come online. The cache information is replicated per CPU even when a
cache is shared. The per-CPU array of cache information is used mainly for
sysfs-related bookkeeping.

It also implements the shared_cpu_map attribute, which is essential for
enabling both kernel and user-space to discover the system's overall cache
topology.
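
As an illustration of how user-space might consume shared_cpu_map, here is a minimal sketch that tests whether a given CPU is set in such a mask. It assumes the mask is printed as comma-separated hexadecimal groups with the most significant group first (for example "00000000,0000000f"); the helper names are made up for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return the value of one hex digit, or -1 if `c` is not a hex digit. */
static int hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	c = (char)tolower((unsigned char)c);
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	return -1;
}

/*
 * Hypothetical helper: report whether `cpu` is set in a mask string.
 * Digits are scanned from the end, so each hex digit covers four CPUs
 * starting at `base`.
 */
static bool mask_has_cpu(const char *mask, unsigned int cpu)
{
	size_t i;
	unsigned int base = 0;

	assert(mask != NULL);
	i = strlen(mask);
	while (i-- > 0) {
		int v = hex_val(mask[i]);

		if (v < 0)
			continue;	/* skip ',' separators and the newline */
		if (cpu >= base && cpu < base + 4)
			return (v >> (cpu - base)) & 1;
		base += 4;
	}
	return false;		/* cpu lies beyond the end of the mask */
}
```

A caller would typically read the string from the shared_cpu_map file of a cache index directory and pass it in unchanged, trailing newline included.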

This patch also adds the missing ABI documentation for the cacheinfo sysfs
interface, which is already well defined and widely used.

Signed-off-by: Sudeep Holla sudeep.ho...@arm.com
Cc: Greg Kroah-Hartman gre...@linuxfoundation.org
Cc: Rob Herring r...@kernel.org
Cc: linux-...@vger.kernel.org
Cc: linux-i...@vger.kernel.org
Cc: linux...@de.ibm.com
Cc: linux-s...@vger.kernel.org
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-ker...@lists.infradead.org
---
 Documentation/ABI/testing/sysfs-devices-system-cpu |  41 ++
 drivers/base/Makefile  |   2 +-
 drivers/base/cacheinfo.c   | 564 +
 include/linux/cacheinfo.h  |  56 ++
 4 files changed, 662 insertions(+), 1 deletion(-)
 create mode 100644 drivers/base/cacheinfo.c
 create mode 100644 include/linux/cacheinfo.h

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index acb9bfc..5827f4e 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -224,3 +224,44 @@ Description:   Parameters for the Intel P-state driver
frequency range.
 
More details can be found in 
Documentation/cpu-freq/intel-pstate.txt
+
+What:          /sys/devices/system/cpu/cpu*/cache/index*/<set_of_attributes_mentioned_below>
+Date:          June 2014(documented, existed before August 2008)
+Contact:       Sudeep Holla sudeep.ho...@arm.com
+               Linux kernel mailing list linux-ker...@vger.kernel.org
+Description:   Parameters for the CPU cache attributes
+
+               attributes:
+                       - writethrough: data is written to both the cache line
+                                       and to the block in the lower-level memory
+                       - writeback: data is written only to the cache line and
+                                    the modified cache line is written to main
+                                    memory only when it is replaced
+                       - writeallocate: allocate a memory location to a cache line
+                                        on a cache miss because of a write
+                       - readallocate: allocate a memory location to a cache line
+                                       on a cache miss because of a read
+
+               coherency_line_size: the minimum amount of data that gets transferred
+
+               level: the cache hierarchy in the multi-level cache configuration
+
+               number_of_sets: total number of sets in the cache, a set is a
+                               collection of cache lines with the same cache index
+
+               physical_line_partition: number of physical cache lines per cache tag
+
+               shared_cpu_list: the list of cpus sharing the cache
+
+               shared_cpu_map: logical cpu mask containing the list of cpus sharing
+                               the cache
+
+               size: the total cache size in kB
+
+               type:
+                       - instruction: cache that only holds instructions
+                       - data: cache that only caches data
+                       - unified: cache that holds both data and instructions
+
+               ways_of_associativity: degree of freedom in placing a particular block
+                                      of memory in the cache
diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 04b314e..bad2ff8 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -4,7 +4,7 @@ obj-y   := component.o core.o bus.o dd.o 
syscore.o \
   driver.o class.o platform.o \
   cpu.o firmware.o init.o map.o devres.o \
   attribute_container.o transport_class.o \
-  topology.o container.o
+  topology.o container.o cacheinfo.o
 obj

Re: [PATCH RFC/RFT v3 6/9] powerpc: move cacheinfo sysfs to generic cacheinfo infrastructure

2014-03-21 Thread Sudeep Holla
Hi Anshuman,

On 21/03/14 03:44, Anshuman Khandual wrote:
 On 03/10/2014 04:42 PM, Sudeep Holla wrote:
 Hi Anshuman,

 On 07/03/14 06:14, Anshuman Khandual wrote:
 On 03/07/2014 09:36 AM, Anshuman Khandual wrote:
 On 02/19/2014 09:36 PM, Sudeep Holla wrote:
 From: Sudeep Holla sudeep.ho...@arm.com

 This patch removes the redundant sysfs cacheinfo code by making use of
 the newly introduced generic cacheinfo infrastructure.

[...]

 When it is UNIFIED we return index 0, which is correct. But the indices
 for the instruction and data caches seem to be swapped, which is wrong. This
 will fetch invalid properties for any given cache type.


 Ah, that's a silly mistake on my side, will fix it.
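
For reference, a minimal sketch of an index mapping that agrees with the array order discussed in this thread (Unified, Data, Instruction). The enum below is a hypothetical stand-in written for this example; the real enum cache_type values live in include/linux/cacheinfo.h and may differ, and cache_type_name() is an illustrative helper, not part of the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for the generic enum; values are assumptions. */
enum cache_type {
	CACHE_TYPE_NOCACHE = 0,
	CACHE_TYPE_INST,
	CACHE_TYPE_DATA,
	CACHE_TYPE_UNIFIED,
};

/* Same order as the cache_type_info array: Unified, Data, Instruction. */
static const char *const cache_type_names[] = {
	"Unified", "Data", "Instruction",
};

/* Map a cache type to its array index, or -1 if it has no entry. */
static int get_cacheinfo_idx(enum cache_type type)
{
	switch (type) {
	case CACHE_TYPE_UNIFIED:
		return 0;
	case CACHE_TYPE_DATA:
		return 1;
	case CACHE_TYPE_INST:
		return 2;
	default:
		return -1;
	}
}

/* Illustrative lookup that keeps the index within the array bounds. */
static const char *cache_type_name(enum cache_type type)
{
	int idx = get_cacheinfo_idx(type);

	if (idx < 0)
		return "Unknown";
	assert(idx < (int)(sizeof(cache_type_names) / sizeof(cache_type_names[0])));
	return cache_type_names[idx];
}
```

Using an explicit switch rather than arithmetic on the enum keeps the mapping correct even if the array order or the enum values change independently.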

 I have done some initial review and testing for this patch's impact on
 PowerPC (ppc64 POWER specifically). I am trying to do some code clean-up
 and re-arrangements. Will post out soon. Thanks !

 Thanks for taking time for testing and reviewing these patches.
 
 Now that you have some problems to work on before resending the patches, I
 will hold on to the clean-up patches I had.
 

I have made most of the changes but am still unable to find out why the
shared_cpu_map is incorrect on PPC. All the other wrong entries are fixed. Any
clue about shared_cpu_map?

Regards,
Sudeep


Re: [PATCH RFC/RFT v3 6/9] powerpc: move cacheinfo sysfs to generic cacheinfo infrastructure

2014-03-10 Thread Sudeep Holla

Hi Anshuman,

On 07/03/14 06:14, Anshuman Khandual wrote:

On 03/07/2014 09:36 AM, Anshuman Khandual wrote:

On 02/19/2014 09:36 PM, Sudeep Holla wrote:

From: Sudeep Holla sudeep.ho...@arm.com

This patch removes the redundant sysfs cacheinfo code by making use of
the newly introduced generic cacheinfo infrastructure.

Signed-off-by: Sudeep Holla sudeep.ho...@arm.com
Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
Cc: Paul Mackerras pau...@samba.org
Cc: linuxppc-dev@lists.ozlabs.org
---
  arch/powerpc/kernel/cacheinfo.c | 831 ++--
  arch/powerpc/kernel/cacheinfo.h |   8 -
  arch/powerpc/kernel/sysfs.c |   4 -
  3 files changed, 109 insertions(+), 734 deletions(-)
  delete mode 100644 arch/powerpc/kernel/cacheinfo.h

diff --git a/arch/powerpc/kernel/cacheinfo.c b/arch/powerpc/kernel/cacheinfo.c
index 2912b87..05b7580 100644
--- a/arch/powerpc/kernel/cacheinfo.c
+++ b/arch/powerpc/kernel/cacheinfo.c
@@ -10,38 +10,10 @@
   * 2 as published by the Free Software Foundation.
   */

+#include <linux/cacheinfo.h>
  #include <linux/cpu.h>
-#include <linux/cpumask.h>
  #include <linux/kernel.h>
-#include <linux/kobject.h>
-#include <linux/list.h>
-#include <linux/notifier.h>
  #include <linux/of.h>
-#include <linux/percpu.h>
-#include <linux/slab.h>
-#include <asm/prom.h>
-
-#include "cacheinfo.h"
-
-/* per-cpu object for tracking:
- * - a cache kobject for the top-level directory
- * - a list of index objects representing the cpu's local cache hierarchy
- */
-struct cache_dir {
-   struct kobject *kobj; /* bare (not embedded) kobject for cache
-  * directory */
-   struct cache_index_dir *index; /* list of index objects */
-};
-
-/* index object: each cpu's cache directory has an index
- * subdirectory corresponding to a cache object associated with the
- * cpu.  This object's lifetime is managed via the embedded kobject.
- */
-struct cache_index_dir {
-   struct kobject kobj;
-   struct cache_index_dir *next; /* next index in parent directory */
-   struct cache *cache;
-};

  /* Template for determining which OF properties to query for a given
   * cache type */
@@ -60,11 +32,6 @@ struct cache_type_info {
const char *nr_sets_prop;
  };

-/* These are used to index the cache_type_info array. */
-#define CACHE_TYPE_UNIFIED 0
-#define CACHE_TYPE_INSTRUCTION 1
-#define CACHE_TYPE_DATA2
-
  static const struct cache_type_info cache_type_info[] = {
{
/* PowerPC Processor binding says the [di]-cache-*
@@ -77,246 +44,115 @@ static const struct cache_type_info cache_type_info[] = {
    .nr_sets_prop    = "d-cache-sets",
    },
    {
-   .name            = "Instruction",
-   .size_prop       = "i-cache-size",
-   .line_size_props = { "i-cache-line-size",
-                        "i-cache-block-size", },
-   .nr_sets_prop    = "i-cache-sets",
-   },
-   {
    .name            = "Data",
    .size_prop       = "d-cache-size",
    .line_size_props = { "d-cache-line-size",
                         "d-cache-block-size", },
    .nr_sets_prop    = "d-cache-sets",
    },
+   {
+   .name            = "Instruction",
+   .size_prop       = "i-cache-size",
+   .line_size_props = { "i-cache-line-size",
+                        "i-cache-block-size", },
+   .nr_sets_prop    = "i-cache-sets",
+   },
  };



Hey Sudeep,

After applying this patch, the cache_type_info array looks like this.

static const struct cache_type_info cache_type_info[] = {
        {
                /*
                 * PowerPC Processor binding says the [di]-cache-*
                 * must be equal on unified caches, so just use
                 * d-cache properties.
                 */
                .name            = "Unified",
                .size_prop       = "d-cache-size",
                .line_size_props = { "d-cache-line-size",
                                     "d-cache-block-size", },
                .nr_sets_prop    = "d-cache-sets",
        },
        {
                .name            = "Data",
                .size_prop       = "d-cache-size",
                .line_size_props = { "d-cache-line-size",
                                     "d-cache-block-size", },
                .nr_sets_prop    = "d-cache-sets",
        },
        {
                .name            = "Instruction",
                .size_prop       = "i-cache-size",
                .line_size_props = { "i-cache-line-size",
                                     "i-cache-block-size", },
                .nr_sets_prop    = "i-cache-sets",
        },
};

and this function computes the array index for any given cache type
defined for PowerPC.

static inline int get_cacheinfo_idx(enum cache_type type)
{
 if (type == CACHE_TYPE_UNIFIED)
 return 0;
 else

[PATCH RFC/RFT v3 6/9] powerpc: move cacheinfo sysfs to generic cacheinfo infrastructure

2014-02-19 Thread Sudeep Holla
From: Sudeep Holla sudeep.ho...@arm.com

This patch removes the redundant sysfs cacheinfo code by making use of
the newly introduced generic cacheinfo infrastructure.

Signed-off-by: Sudeep Holla sudeep.ho...@arm.com
Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
Cc: Paul Mackerras pau...@samba.org
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/cacheinfo.c | 831 ++--
 arch/powerpc/kernel/cacheinfo.h |   8 -
 arch/powerpc/kernel/sysfs.c |   4 -
 3 files changed, 109 insertions(+), 734 deletions(-)
 delete mode 100644 arch/powerpc/kernel/cacheinfo.h

diff --git a/arch/powerpc/kernel/cacheinfo.c b/arch/powerpc/kernel/cacheinfo.c
index 2912b87..05b7580 100644
--- a/arch/powerpc/kernel/cacheinfo.c
+++ b/arch/powerpc/kernel/cacheinfo.c
@@ -10,38 +10,10 @@
  * 2 as published by the Free Software Foundation.
  */
 
+#include <linux/cacheinfo.h>
 #include <linux/cpu.h>
-#include <linux/cpumask.h>
 #include <linux/kernel.h>
-#include <linux/kobject.h>
-#include <linux/list.h>
-#include <linux/notifier.h>
 #include <linux/of.h>
-#include <linux/percpu.h>
-#include <linux/slab.h>
-#include <asm/prom.h>
-
-#include "cacheinfo.h"
-
-/* per-cpu object for tracking:
- * - a cache kobject for the top-level directory
- * - a list of index objects representing the cpu's local cache hierarchy
- */
-struct cache_dir {
-   struct kobject *kobj; /* bare (not embedded) kobject for cache
-  * directory */
-   struct cache_index_dir *index; /* list of index objects */
-};
-
-/* index object: each cpu's cache directory has an index
- * subdirectory corresponding to a cache object associated with the
- * cpu.  This object's lifetime is managed via the embedded kobject.
- */
-struct cache_index_dir {
-   struct kobject kobj;
-   struct cache_index_dir *next; /* next index in parent directory */
-   struct cache *cache;
-};
 
 /* Template for determining which OF properties to query for a given
  * cache type */
@@ -60,11 +32,6 @@ struct cache_type_info {
const char *nr_sets_prop;
 };
 
-/* These are used to index the cache_type_info array. */
-#define CACHE_TYPE_UNIFIED 0
-#define CACHE_TYPE_INSTRUCTION 1
-#define CACHE_TYPE_DATA2
-
 static const struct cache_type_info cache_type_info[] = {
{
/* PowerPC Processor binding says the [di]-cache-*
@@ -77,246 +44,115 @@ static const struct cache_type_info cache_type_info[] = {
    .nr_sets_prop    = "d-cache-sets",
    },
    {
-   .name            = "Instruction",
-   .size_prop       = "i-cache-size",
-   .line_size_props = { "i-cache-line-size",
-                        "i-cache-block-size", },
-   .nr_sets_prop    = "i-cache-sets",
-   },
-   {
    .name            = "Data",
    .size_prop       = "d-cache-size",
    .line_size_props = { "d-cache-line-size",
                         "d-cache-block-size", },
    .nr_sets_prop    = "d-cache-sets",
    },
+   {
+   .name            = "Instruction",
+   .size_prop       = "i-cache-size",
+   .line_size_props = { "i-cache-line-size",
+                        "i-cache-block-size", },
+   .nr_sets_prop    = "i-cache-sets",
+   },
 };
 
-/* Cache object: each instance of this corresponds to a distinct cache
- * in the system.  There are separate objects for Harvard caches: one
- * each for instruction and data, and each refers to the same OF node.
- * The refcount of the OF node is elevated for the lifetime of the
- * cache object.  A cache object is released when its shared_cpu_map
- * is cleared (see cache_cpu_clear).
- *
- * A cache object is on two lists: an unsorted global list
- * (cache_list) of cache objects; and a singly-linked list
- * representing the local cache hierarchy, which is ordered by level
- * (e.g. L1d -> L1i -> L2 -> L3).
- */
-struct cache {
-   struct device_node *ofnode;/* OF node for this cache, may be cpu */
-   struct cpumask shared_cpu_map; /* online CPUs using this cache */
-   int type;  /* split cache disambiguation */
-   int level; /* level not explicit in device tree */
-   struct list_head list; /* global list of cache objects */
-   struct cache *next_local;  /* next cache of >= level */
-};
-
-static DEFINE_PER_CPU(struct cache_dir *, cache_dir_pcpu);
-
-/* traversal/modification of this list occurs only at cpu hotplug time;
- * access is serialized by cpu hotplug locking
- */
-static LIST_HEAD(cache_list);
-
-static struct cache_index_dir *kobj_to_cache_index_dir(struct kobject *k)
-{
-   return container_of(k, struct cache_index_dir, kobj);
-}
-
-static const char *cache_type_string(const struct cache *cache)
+static inline int get_cacheinfo_idx(enum cache_type type)
 {
-   return cache_type_info[cache->type].name

[PATCH RFC/RFT v3 0/9] drivers: cacheinfo support

2014-02-19 Thread Sudeep Holla
From: Sudeep Holla sudeep.ho...@arm.com

Hi,

This series adds generic cacheinfo support similar to topology. The
implementation is based on x86 cacheinfo support. Currently x86, powerpc,
ia64 and s390 have their own implementations. While adding similar support
to ARM and ARM64, here is the attempt to make it generic quite similar to
topology info support. It also adds the missing ABI documentation for
the cacheinfo sysfs which is already being used.

It moves all the existing different implementations on x86, ia64, powerpc
and s390 to use the generic cacheinfo infrastructure introduced here.
These changes on non-ARM platforms are only compile tested and hence
the request for testing too.

This series also adds support for ARM and ARM64 architectures based on
the generic support.

Changes v2[2]-v3:
- Added new class cpu to group all cpu devices
- Converted all raw kobjects used in cacheinfo to device_attr
  by creating cache index devices
- Added back s390 show_cacheinfo for /proc/cpuinfo
- Added disable_sysfs to cache_info for preventing a cache node
  to be exposed through sysfs if required(used on s390)

Changes v1[1]-v2[2]:
- Extended the generic cacheinfo support to accommodate all
  the existing implementations
- Moved all the existing implementations to use this new
  generic infrastructure
- Added missing ABI documentation as suggested by Greg KH
- Added support for unimplemented CTR on pre-ARMv6 implementations
  as suggested by Russell. However the ctr_info_list is not yet
  populated
- Not yet changed to device_attr as suggested by Greg KH;
  registering the cache as a device won't eliminate the need for a kobject
  unless each cache index is registered as a device, which doesn't
  seem to be a good idea. Now that it's unified, this can be done easily
  in one place if needed

[1] https://lkml.org/lkml/2014/1/8/523
[2] https://lkml.org/lkml/2014/2/7/654

Cc: Greg Kroah-Hartman gre...@linuxfoundation.org
Cc: linux-i...@vger.kernel.org
Cc: linux...@de.ibm.com
Cc: linux-s...@vger.kernel.org
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-ker...@lists.infradead.org
---
Sudeep Holla (9):
  drivers: base: add new class cpu to group cpu devices
  drivers: base: support cpu cache information interface to userspace
via sysfs
  ia64: move cacheinfo sysfs to generic cacheinfo infrastructure
  s390: move cacheinfo sysfs to generic cacheinfo infrastructure
  x86: move cacheinfo sysfs to generic cacheinfo infrastructure
  powerpc: move cacheinfo sysfs to generic cacheinfo infrastructure
  ARM64: kernel: add support for cpu cache information
  ARM: kernel: add support for cpu cache information
  ARM: kernel: add outer cache support for cacheinfo implementation

 Documentation/ABI/testing/sysfs-devices-system-cpu |  40 +
 arch/arm/include/asm/outercache.h  |  13 +
 arch/arm/kernel/Makefile   |   1 +
 arch/arm/kernel/cacheinfo.c| 248 ++
 arch/arm/mm/Kconfig|  13 +
 arch/arm/mm/cache-l2x0.c   |  14 +
 arch/arm/mm/cache-tauros2.c|  35 +
 arch/arm/mm/cache-xsc3l2.c |  15 +
 arch/arm64/kernel/Makefile |   2 +-
 arch/arm64/kernel/cacheinfo.c  | 134 
 arch/ia64/kernel/topology.c| 399 ++
 arch/powerpc/kernel/cacheinfo.c| 831 +++--
 arch/powerpc/kernel/cacheinfo.h|   8 -
 arch/powerpc/kernel/sysfs.c|   4 -
 arch/s390/kernel/cache.c   | 388 +++---
 arch/x86/kernel/cpu/intel_cacheinfo.c  | 647 
 drivers/base/Makefile  |   2 +-
 drivers/base/cacheinfo.c   | 485 
 drivers/base/core.c|  35 +-
 drivers/base/cpu.c |   7 +
 include/linux/cacheinfo.h  |  55 ++
 include/linux/cpu.h|   2 +
 22 files changed, 1523 insertions(+), 1855 deletions(-)
 create mode 100644 arch/arm/kernel/cacheinfo.c
 create mode 100644 arch/arm64/kernel/cacheinfo.c
 delete mode 100644 arch/powerpc/kernel/cacheinfo.h
 create mode 100644 drivers/base/cacheinfo.c
 create mode 100644 include/linux/cacheinfo.h

-- 
1.8.3.2



[PATCH RFC/RFT v2 5/8] powerpc: move cacheinfo sysfs to generic cacheinfo infrastructure

2014-02-08 Thread Sudeep Holla
From: Sudeep Holla sudeep.ho...@arm.com

This patch removes the redundant sysfs cacheinfo code by making use of
the newly introduced generic cacheinfo infrastructure.

Signed-off-by: Sudeep Holla sudeep.ho...@arm.com
Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
Cc: Paul Mackerras pau...@samba.org
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/cacheinfo.c | 828 ++--
 arch/powerpc/kernel/cacheinfo.h |   8 -
 arch/powerpc/kernel/sysfs.c |   4 -
 3 files changed, 109 insertions(+), 731 deletions(-)
 delete mode 100644 arch/powerpc/kernel/cacheinfo.h

diff --git a/arch/powerpc/kernel/cacheinfo.c b/arch/powerpc/kernel/cacheinfo.c
index abfa011..05b7580 100644
--- a/arch/powerpc/kernel/cacheinfo.c
+++ b/arch/powerpc/kernel/cacheinfo.c
@@ -10,38 +10,10 @@
  * 2 as published by the Free Software Foundation.
  */
 
+#include <linux/cacheinfo.h>
 #include <linux/cpu.h>
-#include <linux/cpumask.h>
 #include <linux/kernel.h>
-#include <linux/kobject.h>
-#include <linux/list.h>
-#include <linux/notifier.h>
 #include <linux/of.h>
-#include <linux/percpu.h>
-#include <linux/slab.h>
-#include <asm/prom.h>
-
-#include "cacheinfo.h"
-
-/* per-cpu object for tracking:
- * - a cache kobject for the top-level directory
- * - a list of index objects representing the cpu's local cache hierarchy
- */
-struct cache_dir {
-   struct kobject *kobj; /* bare (not embedded) kobject for cache
-  * directory */
-   struct cache_index_dir *index; /* list of index objects */
-};
-
-/* index object: each cpu's cache directory has an index
- * subdirectory corresponding to a cache object associated with the
- * cpu.  This object's lifetime is managed via the embedded kobject.
- */
-struct cache_index_dir {
-   struct kobject kobj;
-   struct cache_index_dir *next; /* next index in parent directory */
-   struct cache *cache;
-};
 
 /* Template for determining which OF properties to query for a given
  * cache type */
@@ -60,11 +32,6 @@ struct cache_type_info {
const char *nr_sets_prop;
 };
 
-/* These are used to index the cache_type_info array. */
-#define CACHE_TYPE_UNIFIED 0
-#define CACHE_TYPE_INSTRUCTION 1
-#define CACHE_TYPE_DATA2
-
 static const struct cache_type_info cache_type_info[] = {
{
/* PowerPC Processor binding says the [di]-cache-*
@@ -77,246 +44,115 @@ static const struct cache_type_info cache_type_info[] = {
    .nr_sets_prop    = "d-cache-sets",
    },
    {
-   .name            = "Instruction",
-   .size_prop       = "i-cache-size",
-   .line_size_props = { "i-cache-line-size",
-                        "i-cache-block-size", },
-   .nr_sets_prop    = "i-cache-sets",
-   },
-   {
    .name            = "Data",
    .size_prop       = "d-cache-size",
    .line_size_props = { "d-cache-line-size",
                         "d-cache-block-size", },
    .nr_sets_prop    = "d-cache-sets",
    },
+   {
+   .name            = "Instruction",
+   .size_prop       = "i-cache-size",
+   .line_size_props = { "i-cache-line-size",
+                        "i-cache-block-size", },
+   .nr_sets_prop    = "i-cache-sets",
+   },
 };
 
-/* Cache object: each instance of this corresponds to a distinct cache
- * in the system.  There are separate objects for Harvard caches: one
- * each for instruction and data, and each refers to the same OF node.
- * The refcount of the OF node is elevated for the lifetime of the
- * cache object.  A cache object is released when its shared_cpu_map
- * is cleared (see cache_cpu_clear).
- *
- * A cache object is on two lists: an unsorted global list
- * (cache_list) of cache objects; and a singly-linked list
- * representing the local cache hierarchy, which is ordered by level
- * (e.g. L1d -> L1i -> L2 -> L3).
- */
-struct cache {
-   struct device_node *ofnode;/* OF node for this cache, may be cpu */
-   struct cpumask shared_cpu_map; /* online CPUs using this cache */
-   int type;  /* split cache disambiguation */
-   int level; /* level not explicit in device tree */
-   struct list_head list; /* global list of cache objects */
-   struct cache *next_local;  /* next cache of >= level */
-};
-
-static DEFINE_PER_CPU(struct cache_dir *, cache_dir_pcpu);
-
-/* traversal/modification of this list occurs only at cpu hotplug time;
- * access is serialized by cpu hotplug locking
- */
-static LIST_HEAD(cache_list);
-
-static struct cache_index_dir *kobj_to_cache_index_dir(struct kobject *k)
-{
-   return container_of(k, struct cache_index_dir, kobj);
-}
-
-static const char *cache_type_string(const struct cache *cache)
+static inline int get_cacheinfo_idx(enum cache_type type)
 {
-   return cache_type_info[cache->type].name

Re: [PATCH RFC 1/3] drivers: base: support cpu cache information interface to userspace via sysfs

2014-01-09 Thread Sudeep Holla
On 08/01/14 20:28, Greg Kroah-Hartman wrote:
 On Wed, Jan 08, 2014 at 07:26:06PM +, Sudeep Holla wrote:
 From: Sudeep Holla sudeep.ho...@arm.com
 +#define define_one_ro(_name) \
 +static struct cache_attr _name = \
 +__ATTR(_name, 0444, show_##_name, NULL)
 
 In the future, we do have __ATTR_RO(), which should be used instead.
 You should never use __ATTR() on its own, if at all possible.  I'm
 sweeping the tree for all usages and fixing them slowly up over time.
 

Understood, will fix it.
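
To illustrate what the switch to __ATTR_RO() implies, here is a small user-space mock. The kernel's __ATTR_RO(_name) fixes the mode to 0444 and wires .show to _name##_show, so a show function named show_<name> (as in the quoted macro) would need renaming to <name>_show. Everything below (struct cache_attr fields, MOCK_ATTR_RO, level_show) is a simplified stand-in for illustration, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified user-space stand-in for a read-only sysfs attribute. */
struct cache_attr {
	const char *name;
	unsigned int mode;
	int (*show)(char *buf, size_t len);
};

#define MOCK_STRINGIFY(x) #x

/* Same shape as __ATTR_RO(): mode 0444 and show = <name>_show. */
#define MOCK_ATTR_RO(_name) \
	{ .name = MOCK_STRINGIFY(_name), .mode = 0444, .show = _name##_show }

/* The show routine must follow the <name>_show naming convention. */
static int level_show(char *buf, size_t len)
{
	assert(buf != NULL);
	return snprintf(buf, len, "%d\n", 1);	/* pretend this is an L1 cache */
}

static struct cache_attr level = MOCK_ATTR_RO(level);
```

Calling level.show(buf, sizeof(buf)) on a sufficiently large buffer fills it with the level followed by a newline, which is the general shape sysfs show routines produce.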

Regards,
Sudeep




Re: [PATCH RFC 1/3] drivers: base: support cpu cache information interface to userspace via sysfs

2014-01-09 Thread Sudeep Holla
On 08/01/14 20:26, Greg Kroah-Hartman wrote:
 On Wed, Jan 08, 2014 at 07:26:06PM +, Sudeep Holla wrote:
 From: Sudeep Holla sudeep.ho...@arm.com

 This patch adds initial support for providing processor cache information
 to userspace through sysfs interface. This is based on x86 implementation
 and hence the interface is intended to be fully compatible.

 A per-cpu array of cache information maintained is used mainly for
 sysfs-related book keeping.

 Signed-off-by: Sudeep Holla sudeep.ho...@arm.com
 ---
  drivers/base/Makefile |   2 +-
  drivers/base/cacheinfo.c  | 296 
 ++
  include/linux/cacheinfo.h |  43 +++
  3 files changed, 340 insertions(+), 1 deletion(-)
  create mode 100644 drivers/base/cacheinfo.c
  create mode 100644 include/linux/cacheinfo.h
 
 You are creating sysfs files, yet you didn't add Documentation/ABI/
 information, which is required.  Please fix that.
 
Ah, I overlooked it. But I am not creating any new sysfs files in this series.
I am just trying to unify duplicated code in various architectures.

Since these sysfs files are already created in:
1. arch/ia64/kernel/topology.c
2. arch/powerpc/kernel/cacheinfo.c
3. arch/s390/kernel/cache.c
4. arch/x86/kernel/cpu/intel_cacheinfo.c and
also already used by user-space tools like `lscpu` I assumed it's already
documented.

I will add it in next version.

Regards,
Sudeep



Re: [PATCH RFC 1/3] drivers: base: support cpu cache information interface to userspace via sysfs

2014-01-09 Thread Sudeep Holla
On 08/01/14 20:27, Greg Kroah-Hartman wrote:
 On Wed, Jan 08, 2014 at 07:26:06PM +, Sudeep Holla wrote:
 From: Sudeep Holla sudeep.ho...@arm.com

 This patch adds initial support for providing processor cache information
 to userspace through sysfs interface. This is based on x86 implementation
 and hence the interface is intended to be fully compatible.

 A per-cpu array of cache information maintained is used mainly for
 sysfs-related book keeping.

 Signed-off-by: Sudeep Holla sudeep.ho...@arm.com
 ---
  drivers/base/Makefile |   2 +-
  drivers/base/cacheinfo.c  | 296 ++
  include/linux/cacheinfo.h |  43 +++
  3 files changed, 340 insertions(+), 1 deletion(-)
  create mode 100644 drivers/base/cacheinfo.c
  create mode 100644 include/linux/cacheinfo.h

 diff --git a/drivers/base/Makefile b/drivers/base/Makefile
 index 94e8a80..76f07c8 100644
 --- a/drivers/base/Makefile
 +++ b/drivers/base/Makefile
 @@ -4,7 +4,7 @@ obj-y:= core.o bus.o dd.o syscore.o \
 driver.o class.o platform.o \
 cpu.o firmware.o init.o map.o devres.o \
 attribute_container.o transport_class.o \
 -   topology.o
 +   topology.o cacheinfo.o
  obj-$(CONFIG_DEVTMPFS)  += devtmpfs.o
  obj-$(CONFIG_DMA_CMA) += dma-contiguous.o
  obj-y   += power/
 diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
 new file mode 100644
 index 000..f436c31
 --- /dev/null
 +++ b/drivers/base/cacheinfo.c
 @@ -0,0 +1,296 @@
 +/*
 + * cacheinfo support - processor cache information via sysfs
 + *
 + * Copyright (C) 2013 ARM Ltd.
 + * All Rights Reserved
 + *
 + * Author: Sudeep Holla sudeep.ho...@arm.com
 + *
 + * This program is free software; you can redistribute it and/or modify
 + * it under the terms of the GNU General Public License version 2 as
 + * published by the Free Software Foundation.
 + *
 + * This program is distributed as is WITHOUT ANY WARRANTY of any
 + * kind, whether express or implied; without even the implied warranty
 + * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 + * GNU General Public License for more details.
 + */
 +#include <linux/bitops.h>
 +#include <linux/cacheinfo.h>
 +#include <linux/compiler.h>
 +#include <linux/cpu.h>
 +#include <linux/device.h>
 +#include <linux/init.h>
 +#include <linux/kobject.h>
 +#include <linux/of.h>
 +#include <linux/sched.h>
 +#include <linux/slab.h>
 +#include <linux/smp.h>
 +#include <linux/sysfs.h>
 +
 +struct cache_attr {
 +struct attribute attr;
 + ssize_t(*show) (unsigned int, unsigned short, char *);
 + ssize_t(*store) (unsigned int, unsigned short, const char *, size_t);
 +};
 +
 +/* pointer to kobject for cpuX/cache */
 +static DEFINE_PER_CPU(struct kobject *, ci_cache_kobject);
 +#define per_cpu_cache_kobject(cpu) (per_cpu(ci_cache_kobject, cpu))
 +
 +struct index_kobject {
 +struct kobject kobj;
 +unsigned int cpu;
 +unsigned short index;
 +};
 +
 +static cpumask_t cache_dev_map;
 +
 +/* pointer to array of kobjects for cpuX/cache/indexY */
 
 Please don't use raw kobjects for this, use the device attribute
 groups, that's what they are there for.  Bonus is that your code should
 get a lot simpler when you do that.
 

Yes, I now understand that device attribute groups simplify the code, but I think
kobjects are still needed as we need to track both the cpu and the cache index.
By reusing only the cpu device kobject, we can track only the cpu.
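
To illustrate what I mean by tracking both values, here is a minimal
userspace sketch of the usual embedded-kobject idiom. struct kobject below is
only a stand-in so the sketch builds outside the kernel, and
container_of_sketch() and kobj_to_cpu_index() are hypothetical names; the
struct index_kobject layout is the one from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct kobject; only here so this compiles. */
struct kobject {
	const char *name;
};

/* Same fields as index_kobject in the patch: the kobject is embedded. */
struct index_kobject {
	struct kobject kobj;
	unsigned int cpu;
	unsigned short index;
};

/* Userspace equivalent of the kernel's container_of(). */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * A sysfs callback is handed only the kobject pointer. Walking back to the
 * enclosing index_kobject recovers the (cpu, index) pair for that attribute.
 */
void kobj_to_cpu_index(struct kobject *kobj, unsigned int *cpu,
		       unsigned short *index)
{
	struct index_kobject *ik;

	assert(kobj != NULL && cpu != NULL && index != NULL);
	ik = container_of_sketch(kobj, struct index_kobject, kobj);
	*cpu = ik->cpu;
	*index = ik->index;
}
```

The cpu device kobject on its own has nothing equivalent to the index field,
which is the gap I am trying to describe.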

Please correct me if I am misunderstanding something here.

One thought I have is to make the cache_info structure common to all architectures
(for now it's ARM-specific) and introduce a kobject in it, similar to the ia64
implementation. That would even eliminate a lot of the weak functions defined.

Regards,
Sudeep





Re: [PATCH RFC 2/3] ARM: kernel: add support for cpu cache information

2014-01-09 Thread Sudeep Holla
On 08/01/14 20:57, Russell King - ARM Linux wrote:
 On Wed, Jan 08, 2014 at 07:26:07PM +, Sudeep Holla wrote:
 +#if __LINUX_ARM_ARCH__ < 7 /* pre ARMv7 */
 +
 +#define MAX_CACHE_LEVEL 1   /* Only 1 level supported */
 +#define CTR_CTYPE_SHIFT 24
 +#define CTR_CTYPE_MASK  (1 << CTR_CTYPE_SHIFT)
 +
 +static inline unsigned int get_ctr(void)
 +{
 +	unsigned int ctr;
 +	asm volatile ("mrc p15, 0, %0, c0, c0, 1" : "=r" (ctr));
 +	return ctr;
 +}
 +
 +static enum cache_type get_cache_type(int level)
 +{
 +	if (level > MAX_CACHE_LEVEL)
 +		return CACHE_TYPE_NOCACHE;
 +	return get_ctr() & CTR_CTYPE_MASK ?
 +		CACHE_TYPE_SEPARATE : CACHE_TYPE_UNIFIED;
 
 So, what do we do for CPUs that don't implement the CTR?  Just return
 random rubbish based on decoding the CPU Identity register as if it
 were the cache type register?
 

I assume you are referring to some particular CPUs which don't implement this.
I could not find it described as optional or IMPLEMENTATION DEFINED in the ARM ARM.
I may have missed it, or there may be exceptions.
Can you please provide more information on that?
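
If the concern is that such CPUs hand back the CPU identity register for the
CTR read, one possible guard is sketched below. This is only a sketch under
that assumption: decode_cache_type() is a hypothetical helper that takes both
register values as plain arguments so it can be built and tried in userspace,
and the enum values are stand-ins for the ones in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the enum used by the patch. */
enum cache_type {
	CACHE_TYPE_NOCACHE = 0,
	CACHE_TYPE_SEPARATE,
	CACHE_TYPE_UNIFIED,
};

#define MAX_CACHE_LEVEL	1	/* only 1 level supported pre ARMv7 */
#define CTR_CTYPE_SHIFT	24
#define CTR_CTYPE_MASK	(UINT32_C(1) << CTR_CTYPE_SHIFT)

static_assert(CTR_CTYPE_SHIFT < 32, "shift must fit in a 32-bit register");

/*
 * Decode the pre-ARMv7 cache type from already-read register values.
 * Assumption: a CPU without a CTR returns the main ID register for the CTR
 * read, so ctr == midr is treated as "no cache type information".
 */
enum cache_type decode_cache_type(uint32_t midr, uint32_t ctr, int level)
{
	if (level > MAX_CACHE_LEVEL)
		return CACHE_TYPE_NOCACHE;
	if (ctr == midr)
		return CACHE_TYPE_NOCACHE;
	return (ctr & CTR_CTYPE_MASK) ?
		CACHE_TYPE_SEPARATE : CACHE_TYPE_UNIFIED;
}
```

Whether reporting CACHE_TYPE_NOCACHE is the right answer for such CPUs is a
separate question; the point is only that the identity value is not decoded
as if it were the cache type register.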

Regards,
Sudeep



Re: [PATCH RFC 1/3] drivers: base: support cpu cache information interface to userspace via sysfs

2014-01-09 Thread Sudeep Holla
On 09/01/14 19:31, Greg Kroah-Hartman wrote:
 On Thu, Jan 09, 2014 at 07:19:00PM +, Sudeep Holla wrote:
 On 08/01/14 20:27, Greg Kroah-Hartman wrote:
 On Wed, Jan 08, 2014 at 07:26:06PM +, Sudeep Holla wrote:
 From: Sudeep Holla sudeep.ho...@arm.com

 This patch adds initial support for providing processor cache information
 to userspace through sysfs interface. This is based on x86 implementation
 and hence the interface is intended to be fully compatible.

 A per-cpu array of cache information maintained is used mainly for
 sysfs-related book keeping.

 Signed-off-by: Sudeep Holla sudeep.ho...@arm.com
 ---
  drivers/base/Makefile |   2 +-
  drivers/base/cacheinfo.c  | 296 ++
  include/linux/cacheinfo.h |  43 +++
  3 files changed, 340 insertions(+), 1 deletion(-)
  create mode 100644 drivers/base/cacheinfo.c
  create mode 100644 include/linux/cacheinfo.h

 diff --git a/drivers/base/Makefile b/drivers/base/Makefile
 index 94e8a80..76f07c8 100644
 --- a/drivers/base/Makefile
 +++ b/drivers/base/Makefile
 @@ -4,7 +4,7 @@ obj-y  := core.o bus.o dd.o syscore.o \
   driver.o class.o platform.o \
   cpu.o firmware.o init.o map.o devres.o \
   attribute_container.o transport_class.o \
 - topology.o
 + topology.o cacheinfo.o
  obj-$(CONFIG_DEVTMPFS)+= devtmpfs.o
  obj-$(CONFIG_DMA_CMA) += dma-contiguous.o
  obj-y += power/
 diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
 new file mode 100644
 index 000..f436c31
 --- /dev/null
 +++ b/drivers/base/cacheinfo.c
 @@ -0,0 +1,296 @@
 +/*
 + * cacheinfo support - processor cache information via sysfs
 + *
 + * Copyright (C) 2013 ARM Ltd.
 + * All Rights Reserved
 + *
 + * Author: Sudeep Holla sudeep.ho...@arm.com
 + *
 + * This program is free software; you can redistribute it and/or modify
 + * it under the terms of the GNU General Public License version 2 as
 + * published by the Free Software Foundation.
 + *
 + * This program is distributed as is WITHOUT ANY WARRANTY of any
 + * kind, whether express or implied; without even the implied warranty
 + * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 + * GNU General Public License for more details.
 + */
 +#include <linux/bitops.h>
 +#include <linux/cacheinfo.h>
 +#include <linux/compiler.h>
 +#include <linux/cpu.h>
 +#include <linux/device.h>
 +#include <linux/init.h>
 +#include <linux/kobject.h>
 +#include <linux/of.h>
 +#include <linux/sched.h>
 +#include <linux/slab.h>
 +#include <linux/smp.h>
 +#include <linux/sysfs.h>
 +
 +struct cache_attr {
 +  struct attribute attr;
 +   ssize_t(*show) (unsigned int, unsigned short, char *);
 +   ssize_t(*store) (unsigned int, unsigned short, const char *, size_t);
 +};
 +
 +/* pointer to kobject for cpuX/cache */
 +static DEFINE_PER_CPU(struct kobject *, ci_cache_kobject);
 +#define per_cpu_cache_kobject(cpu) (per_cpu(ci_cache_kobject, cpu))
 +
 +struct index_kobject {
 +  struct kobject kobj;
 +  unsigned int cpu;
 +  unsigned short index;
 +};
 +
 +static cpumask_t cache_dev_map;
 +
 +/* pointer to array of kobjects for cpuX/cache/indexY */

 Please don't use raw kobjects for this, use the device attribute
 groups, that's what they are there for.  Bonus is that your code should
 get a lot simpler when you do that.


 Yes, I now understand that device attribute groups simplify the code, but I think
 kobjects are still needed as we need to track both the cpu and the cache index.
 By reusing only the cpu device kobject, we can track only the cpu.
 
 I don't understand, you are putting things under the cpu device object,
 why do you care about a cache kobject?
 
Yes, though the cache attributes are under the cpu objects, the layout is
hierarchical, something like:
/sys/devices/system/cpu/cpun/cache/indexm/attribute_x
attribute_x is unique for each pair (cpun, indexm).
The index is more like the cache level, but a level with separate caches
(I$, D$) gets 2 indices.
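
Seen from user space, each (cpu, index) pair is simply its own directory. A
small sketch of how a tool might address one attribute is below;
cache_attr_path() and read_cache_attr() are hypothetical helper names, and
"type" in the usage note is one of the attribute files the existing x86
interface exposes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build /sys/devices/system/cpu/cpu<N>/cache/index<M>/<attr> into buf.
 * Returns 0 on success, -1 if the path does not fit in len bytes.
 */
int cache_attr_path(char *buf, size_t len, unsigned int cpu,
		    unsigned int index, const char *attr)
{
	int n;

	assert(buf != NULL && attr != NULL);
	n = snprintf(buf, len,
		     "/sys/devices/system/cpu/cpu%u/cache/index%u/%s",
		     cpu, index, attr);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Read the first line of one cache attribute into out, without the trailing
 * newline. Returns 0 on success, -1 if the file is missing or unreadable.
 */
int read_cache_attr(unsigned int cpu, unsigned int index, const char *attr,
		    char *out, size_t len)
{
	char path[128];
	FILE *f;

	assert(out != NULL);
	if (len == 0 || cache_attr_path(path, sizeof(path), cpu, index, attr) != 0)
		return -1;
	f = fopen(path, "r");
	if (f == NULL)
		return -1;
	if (fgets(out, (int)len, f) == NULL) {
		fclose(f);
		return -1;
	}
	fclose(f);
	out[strcspn(out, "\n")] = '\0';
	return 0;
}
```

For example, read_cache_attr(0, 1, "type", buf, sizeof(buf)) would look up
the type of index1 on cpu0, and fail cleanly on systems that do not provide
the file.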

 One thought I have is to make the cache_info structure common to all architectures
 (for now it's ARM-specific) and introduce a kobject in it, similar to the ia64
 implementation. That would even eliminate a lot of the weak functions defined.
 
 Please don't use raw kobjects if at all possible, it's not good for a
 variety of reasons (no userspace events, have to roll your own code,
 etc.)
 
Yes I understand, will try to explore other feasible solutions.

Regards,
Sudeep

