date:20150225

Re: [patch] perf_event_open.2: 3.19 PERF_SAMPLE_REGS_INTR support

2015-02-25 Thread Michael Kerrisk (man-pages)

Hi Stephane (and Jiri),

Ping!

Cheers,

Michael

On 02/17/2015 06:33 AM, Michael Kerrisk (man-pages) wrote:
> Hi Stephane (and Jiri),
> 
> Would you be willing to review/comment on Vince's patch, please.
> 
> Cheers,
> 
> Michael
> 
> 
> On 02/12/2015 06:33 AM, Vince Weaver wrote:
>>
>> This manpage patch relates to the addition of PERF_SAMPLE_REGS_INTR
>> support added in the following commit:
>>
>> perf_sample_regs_intr; Linux 3.19
>>  commit 60e2364e60e86e81bc6377f49779779e6120977f
>>  Author: Stephane Eranian 
>>
>> perf: Add ability to sample machine state on interrupt
>>
>>  Reviewed-by: Jiri Olsa 
>>  Signed-off-by: Stephane Eranian 
>>  Signed-off-by: Peter Zijlstra (Intel) 
>>  Cc: cebbert.l...@gmail.com
>>  Cc: Arnaldo Carvalho de Melo 
>>  Cc: Linus Torvalds 
>>  Cc: linux-...@vger.kernel.org
>>  Link: 
>> http://lkml.kernel.org/r/1411559322-16548-2-git-send-email-eran...@google.com
>>  Signed-off-by: Ingo Molnar 
>>
>> From what I can tell the primary difference between 
>> PERF_SAMPLE_REGS_INTR and the existing PERF_SAMPLE_REGS_USER
>> is that the new support will return kernel register values
>> (I assume that's not some sort of info leak?).
>>
>> In theory also when precise_ip is set high enough you should
>> get the PEBS register state rather than the PMU interrupt
>> register state, but I was unable to construct a test case
>> on a Haswell system where I got different values with
>> precise_ip=0, precise_ip=2, or by using PERF_SAMPLE_REGS_USER
>> instead.  Am I missing something about how to use this new 
>> interface?
>>
>> Signed-off-by: Vince Weaver 
>>
>> diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
>> index 39c8d8c..ca03928 100644
>> --- a/man2/perf_event_open.2
>> +++ b/man2/perf_event_open.2
>> @@ -256,7 +256,7 @@ struct perf_event_attr {
>>  __u32 sample_stack_user;/* size of stack to dump on
>> samples */
>>  __u32 __reserved_2; /* Align to u64 */
>> -
>> +__u64 sample_regs_intr; /* regs to dump on samples */
>>  };
>>  .fi
>>  .in
>> @@ -350,6 +350,11 @@ and
>>  .I sample_stack_user
>>  in Linux 3.7.
>>  .\" commit 1659d129ed014b715b0b2120e6fd929bdd33ed03
>> +.B PERF_ATTR_SIZE_VER4
>> +is 104 corresponding to the addition of
>> +.I sample_regs_intr
>> +in Linux 3.19.
>> +.\" commit 60e2364e60e86e81bc6377f49779779e6120977f
>>  .TP
>>  .I "config"
>>  This specifies which event you want, in conjunction with
>> @@ -752,6 +757,23 @@ event must be measured or no values will be recorded.
>>  Also note that some perf_event measurements, such as sampled
>>  cycle counting, may cause extraneous aborts (by causing an
>>  interrupt during a transaction).
>> +.TP
>> +.BR PERF_SAMPLE_REGS_INTR " (since Linux 3.19)"
>> +.\" commit 60e2364e60e86e81bc6377f49779779e6120977f
>> +Records a subset of the current CPU register state
>> +as specified by
>> +.IR sample_regs_intr .
>> +Unlike
>> +.B PERF_SAMPLE_REGS_USER
>> +the register values will return kernel register
>> +state if the overflow happened while kernel
>> +code is running.
>> +If the CPU supports hardware sampling of
>> +register state (as does PEBS on x86) and
>> +.I precise_ip
>> +is set higher than zero then the register
>> +values returned are those captured by
>> +hardware.
>>  .RE
>>  .TP
>>  .IR "read_format"
>> @@ -1855,6 +1877,9 @@ struct {
>>  u64   weight; /* if PERF_SAMPLE_WEIGHT */
>>  u64   data_src;   /* if PERF_SAMPLE_DATA_SRC */
>>  u64   transaction;/* if PERF_SAMPLE_TRANSACTION */
>> +u64   abi;/* if PERF_SAMPLE_REGS_INTR */
>> +u64   regs[weight(mask)];
>> +  /* if PERF_SAMPLE_REGS_INTR */
>>  };
>>  .fi
>>  .RS 4
>> @@ -2242,6 +2267,27 @@ the high 32 bits of the field by shifting right by
>>  .B PERF_TXN_ABORT_SHIFT
>>  and masking with
>>  .BR PERF_TXN_ABORT_MASK .
>> +.TP
>> +.IR abi ", " regs[weight(mask)]
>> +If
>> +.B PERF_SAMPLE_REGS_INTR
>> +is enabled, then the user CPU registers are recorded.
>> +
>> +The
>> +.I abi
>> +field is one of
>> +.BR PERF_SAMPLE_REGS_ABI_NONE ", " PERF_SAMPLE_REGS_ABI_32 " or "
>> +.BR PERF_SAMPLE_REGS_ABI_64 .
>> +
>> +The
>> +.I regs
>> +field is an array of the CPU registers that were specified by
>> +the
>> +.I sample_regs_intr
>> +attr field.
>> +The number of values is the number of bits set in the
>> +.I sample_regs_intr
>> +bit mask.
>>  .RE
>>  .TP
>>  .B PERF_RECORD_MMAP2
>>
>>
> 
> 


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
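
The proposed man-page text above says the number of values in regs[] equals
the number of bits set in the sample_regs_intr mask. A minimal sketch of that
arithmetic follows. The helper names regs_weight() and intr_regs_bytes() are
hypothetical, and the size helper assumes the abi field is not
PERF_SAMPLE_REGS_ABI_NONE, so that register values are actually present.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the bits set in the register mask: one u64 value per set bit. */
static unsigned int regs_weight(uint64_t mask)
{
    unsigned int n = 0;

    while (mask) {
        mask &= mask - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/*
 * Bytes occupied by the abi word plus regs[] in one sample.
 * Assumption: registers were recorded (abi is not ..._ABI_NONE).
 */
static size_t intr_regs_bytes(uint64_t sample_regs_intr)
{
    return sizeof(uint64_t) * (1 + (size_t)regs_weight(sample_regs_intr));
}
```

A parser walking a sample record could use such a helper to step over the
register block before reading whatever field follows it.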

Re: [patch] perf_event_open.2: Exclude_hv clarification

2015-02-25 Thread Michael Kerrisk (man-pages)

Hello Paul,

Ping!

Cheers,

Michael


On 02/17/2015 06:32 AM, Michael Kerrisk (man-pages) wrote:
> Hi Paul Mackerras,
> 
> Would you be willing to review/comment on Vince's patch, please.
> 
> Cheers,
> 
> Michael
> 
> On 02/11/2015 08:04 PM, Vince Weaver wrote:
>>
>> This manpage patch relates to the exclude_hv bit added to the kernel
>> in the following commit:
>>
>> exclude_hv; Linux 2.6.31
>> commit 0475f9ea8e2cc030298908949e0d5da9f2fc2cfe
>> Author: Paul Mackerras 
>>
>> perf_counters: allow users to count user, kernel and/or 
>> hypervisor events
>>
>> Signed-off-by: Paul Mackerras 
>>
>> The updated manpage text points out that the exclude_hv 
>> "exclude hypervisor" bit only applies on hardware that 
>> supports this  feature (such as PowerPC)
>> and is silently ignored on other platforms such as x86.
>>
>> This is a resend of the patch; the previous time I sent it
>> (http://thread.gmane.org/gmane.linux.man/7500) it did not
>> receive any comments.
>>
>>
>> Signed-off-by: Vince Weaver 
>>
>> diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
>> index 39c8d8c..665aa31 100644
>> --- a/man2/perf_event_open.2
>> +++ b/man2/perf_event_open.2
>> @@ -856,10 +856,9 @@ If this bit is set, the count excludes events that 
>> happen in kernel-space.
>>  .IR "exclude_hv"
>>  If this bit is set, the count excludes events that happen in the
>>  hypervisor.
>> -This is mainly for PMUs that have built-in support for handling this
>> -(such as POWER).
>> -Extra support is needed for handling hypervisor measurements on most
>> -machines.
>> +This is mainly for PMUs that have built-in hardware support
>> +for this feature (such as POWER; this setting is silently
>> +ignored on x86).
>>  .TP
>>  .IR "exclude_idle"
>>  If set, don't count when the CPU is idle.
>>
> 
> 


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

Re: [PATCH 1/2] perf probe: export get_real_path

2015-02-25 Thread Masami Hiramatsu

(2015/02/26 16:12), Naohiro Aota wrote:
> Export it to use from util/probe-finder.c

Please fold this in to the next patch, since this exported symbol
is not used until applying the next one.

BTW, since get_real_path is compiled only when HAVE_DWARF_SUPPORT=y,
we can also move it into probe-finder.c.
Could you also move it into probe-finder.c and export it at probe-finder.h?

Thank you,

> 
> Signed-off-by: Naohiro Aota 
> ---
>  tools/perf/util/probe-event.c | 2 +-
>  tools/perf/util/probe-event.h | 2 ++
>  2 files changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index 919937e..1d0d505 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -520,7 +520,7 @@ static int try_to_find_probe_trace_events(struct 
> perf_probe_event *pev,
>   * a newly allocated path on success.
>   * Return 0 if file was found and readable, -errno otherwise.
>   */
> -static int get_real_path(const char *raw_path, const char *comp_dir,
> +int get_real_path(const char *raw_path, const char *comp_dir,
>char **new_path)
>  {
>   const char *prefix = symbol_conf.source_prefix;
> diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
> index e01e994..30a3391 100644
> --- a/tools/perf/util/probe-event.h
> +++ b/tools/perf/util/probe-event.h
> @@ -135,6 +135,8 @@ extern int show_available_vars(struct perf_probe_event 
> *pevs, int npevs,
>  struct strfilter *filter, bool externs);
>  extern int show_available_funcs(const char *module, struct strfilter *filter,
>   bool user);
> +extern int get_real_path(const char *raw_path, const char *comp_dir,
> +  char **new_path);
>  
>  /* Maximum index number of event-name postfix */
>  #define MAX_EVENT_INDEX  1024
> 


-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com



Re: [patch] perf_event_open.2: Exclude_host/exclude_guest clarification

2015-02-25 Thread Michael Kerrisk (man-pages)

Hello Joerg,

Ping!

Cheers,

Michael

On 02/17/2015 06:32 AM, Michael Kerrisk (man-pages) wrote:
> Hi Joerg,
> 
> Would you be willing to review/comment on Vince's patch, please.
> 
> Cheers,
> 
> Michael
> 
> On 02/11/2015 08:06 PM, Vince Weaver wrote:
>>
>> This patch relates to the exclude_host and exclude_guest bits added 
>> by the  following commit:
>>
>>exclude_host, exclude_guest; Linux 3.2
>> commit a240f76165e6255384d4bdb8139895fac7988799
>> Author: Joerg Roedel 
>> Date:   Wed Oct 5 14:01:16 2011 +0200
>>
>> perf, core: Introduce attrs to count in either host or guest mode
>>
>> Signed-off-by: Joerg Roedel 
>> Signed-off-by: Gleb Natapov 
>> Signed-off-by: Peter Zijlstra 
>> Link: 
>> http://lkml.kernel.org/r/1317816084-18026-2-git-send-email-g...@redhat.com
>> Signed-off-by: Ingo Molnar 
>>
>> The updated manpage text clarifies that the "exclude_host" and
>> "exclude_guest" perf_event_open() attr bits only apply in the 
>> context of a KVM environment and are currently x86 only.
>>
>> This is a resend of the patch; the previous time I sent it
>> (http://thread.gmane.org/gmane.linux.man/7500) it did not
>> receive any comments.
>>
>> Signed-off-by: Vince Weaver 
>>
>> diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
>> index 39c8d8c..1ea56c9 100644
>> --- a/man2/perf_event_open.2
>> +++ b/man2/perf_event_open.2
>> @@ -1006,11 +1006,25 @@ struct sample_id {
>>  .TP
>>  .IR "exclude_host" " (since Linux 3.2)"
>>  .\" commit a240f76165e6255384d4bdb8139895fac7988799
>> -Do not measure time spent in VM host.
>> +When conducting measurements that include processes running
>> +VM instances (i.e. have executed a
>> +.I KVM_RUN
>> +.BR ioctl (2)
>> +) only measure events happening inside a guest instance.
>> +This is only meaningful outside the guests; this setting does
>> +not change counts gathered inside of a guest.
>> +Currently this functionality is x86 only.
>>  .TP
>>  .IR "exclude_guest" " (since Linux 3.2)"
>>  .\" commit a240f76165e6255384d4bdb8139895fac7988799
>> -Do not measure time spent in VM guest.
>> +When conducting measurements that include processes running
>> +VM instances (i.e. have executed a
>> +.I KVM_RUN
>> +.BR ioctl (2)
>> +) do not measure events happening inside guest instances.
>> +This is only meaningful outside the guests; this setting does
>> +not change counts gathered inside of a guest.
>> +Currently this functionality is x86 only.
>>  .TP
>>  .IR "exclude_callchain_kernel" " (since Linux 3.7)"
>>  .\" commit d077526485d5c9b12fe85d0b2b3b7041e6bc5f91
>>
> 
> 


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
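
For readers unfamiliar with these bits, here is a minimal sketch of how a
host-side tool might request a cycle counter that skips guest activity. The
function name init_host_only_cycles() is hypothetical; the struct fields and
constants come from <linux/perf_event.h>. Whether the bits have any effect
depends on the architecture, as the patch text notes.

```c
#include <linux/perf_event.h>
#include <string.h>

/* Fill 'attr' for a CPU-cycles counter that ignores guest activity. */
static void init_host_only_cycles(struct perf_event_attr *attr)
{
    memset(attr, 0, sizeof(*attr));
    attr->type = PERF_TYPE_HARDWARE;
    attr->size = sizeof(*attr);
    attr->config = PERF_COUNT_HW_CPU_CYCLES;
    attr->disabled = 1;        /* start disabled; enable via ioctl later */
    attr->exclude_guest = 1;   /* do not count events inside guests */
    attr->exclude_host = 0;    /* keep counting host-side events */
}
```

The filled structure would then be passed to the perf_event_open() system
call; swapping the two exclude bits would count only guest-side events.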

Re: [PATCH 0/4] x86: use correct early_[mem,io][re,un]map pairs

2015-02-25 Thread Dave Young

On 02/24/15 at 10:13am, Juergen Gross wrote:
> Areas mapped via early_memremap() should be unmapped via
> early_memunmap(), while I/O-areas should be mapped via early_ioremap()
> and unmapped via early_iounmap().
> 
> There are multiple spots where an area is mapped via the mem variant
> and unmapped via the io variant. This series corrects this by using
> the appropriate variants.
> 
> Juergen Gross (4):
>   x86: use early_memunmap in arch/x86/kernel/devicetree.c
>   x86: use early_memunmap in arch/x86/kernel/e820.c
>   x86: use early_memunmap in arch/x86/kernel/setup.c
>   x86, efi: use early_ioremap in arch/x86/platform/efi/efi-bgrt.c
> 
>  arch/x86/kernel/devicetree.c | 4 ++--
>  arch/x86/kernel/e820.c   | 2 +-
>  arch/x86/kernel/setup.c  | 8 
>  arch/x86/platform/efi/efi-bgrt.c | 4 ++--
>  4 files changed, 9 insertions(+), 9 deletions(-)

Acked-by: Dave Young 

Thanks
Dave

Re: [RFC][PATCH v2] sched/rt: Use IPI to trigger RT task push migration instead of pulling

2015-02-25 Thread Peter Zijlstra

On Wed, Feb 25, 2015 at 12:50:15PM -0500, Steven Rostedt wrote:
> > Well, the problem with it is one of collisions. So the 'easy' solution I
> > proposed would be something like:
> > 
> > int ips_next(struct ipi_pull_struct *ips)
> > {
> > int cpu = ips->src_cpu;
> > cpu = cpumask_next(cpu, rto_mask);
> > if (cpu >= nr_cpu_ids) {
> 
> Do we really need to loop? Just start with the first one, and go to the
> end.
> 
> > cpu = 0;
> > ips->flags |= IPS_LOOPED;
> > cpu = cpumask_next(cpu, rto_mask);
> > if (cpu >= nr_cpu_ids) /* empty mask */
> > return cpu;
> > }
> > if (ips->flags & IPS_LOOPED && cpu >= ips->stop_cpu)
> > return nr_cpu_ids;
> > return cpu;
> > }

Yes, notice that we don't start iterating at the beginning; this is on
purpose. If we start iterating at the beginning, _every_ cpu will again
pile up on the first one.

By starting at the current cpu, each cpu will start iteration some place
else and hopefully, with a big enough system, different CPUs end up on a
different rto cpu.

> > 
> > 
> > struct ipi_pull_struct *ips = __this_cpu_ptr(ips);
> > 
> > raw_spin_lock(&ips->lock);
> > if (ips->flags & IPS_BUSY) {
> > /* there is an IPI active; update state */
> > ips->dst_prio = current->prio;
> > ips->stop_cpu = ips->src_cpu;
> > ips->flags &= ~IPS_LOOPED;
> 
> I guess the loop is needed for continuing the work, in case the
> scheduling changed?

That too.

> > } else {
> > /* no IPI active, make one go */
> > ips->dst_cpu = smp_processor_id();
> > ips->dst_prio = current->prio;
> > ips->src_cpu = ips->dst_cpu;
> > ips->stop_cpu = ips->dst_cpu;
> > ips->flags = IPS_BUSY;
> > 
> > cpu = ips_next(ips);
> > ips->src_cpu = cpu;
> > if (cpu < nr_cpu_ids)
> > irq_work_queue_on(&ips->work, cpu);
> > }
> > raw_spin_unlock(&ips->lock);
> 
> I'll have to spend some time comprehending this.

:-)

> > Where you would simply start walking the RTO mask from the current
> > position -- it also includes some restart logic, and you'd only take
> > ips->lock when your ipi handler starts and when it needs to migrate to
> > another cpu.
> > 
> > This way, on big systems, there's at least some chance different CPUs
> > find different targets to pull from.
> 
> OK, makes sense. I can try that.
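
The walk discussed in this thread can be modelled in userspace. This is only
a sketch under stated assumptions: a 64-bit integer stands in for rto_mask,
NR_CPUS is fixed at 64, and struct ips_model, mask_next() and
ips_next_model() are hypothetical names. Unlike the quoted pseudocode, the
wrap-around restarts the search from -1 so that CPU 0 is also considered.

```c
#include <stdint.h>

#define NR_CPUS    64
#define IPS_LOOPED 0x1u

struct ips_model {
    int src_cpu;          /* CPU visited most recently */
    int stop_cpu;         /* CPU at which the walk started */
    unsigned int flags;
};

/* Next set bit strictly after 'cpu', or NR_CPUS if there is none. */
static int mask_next(int cpu, uint64_t mask)
{
    for (int i = cpu + 1; i < NR_CPUS; i++)
        if (mask & (UINT64_C(1) << i))
            return i;
    return NR_CPUS;
}

/* Return the next CPU to visit, or NR_CPUS once the walk is complete. */
static int ips_next_model(struct ips_model *ips, uint64_t rto_mask)
{
    int cpu = mask_next(ips->src_cpu, rto_mask);

    if (cpu >= NR_CPUS) {
        /* Ran off the end: wrap around once and remember that we did. */
        ips->flags |= IPS_LOOPED;
        cpu = mask_next(-1, rto_mask);
        if (cpu >= NR_CPUS)   /* empty mask */
            return cpu;
    }
    /* After wrapping, stop when we reach the starting position again. */
    if ((ips->flags & IPS_LOOPED) && cpu >= ips->stop_cpu)
        return NR_CPUS;
    return cpu;
}
```

With bits 1, 3 and 5 set and a walk started on CPU 3, the model visits CPU 5,
wraps to CPU 1, and then reports completion, which illustrates why walks
started on different CPUs tend to land on different targets first.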

Re: [RFC][PATCH v2] sched/rt: Use IPI to trigger RT task push migration instead of pulling

2015-02-25 Thread Peter Zijlstra

On Wed, Feb 25, 2015 at 12:50:15PM -0500, Steven Rostedt wrote:
> It can't be used for state?
> 
> If one CPU writes "zero", and the other CPU wants to decide if the
> system is in the state to do something, isn't a rmb() fine to use?
> 
> 
> CPU 1:
> 
>   x = 0;
>   /* Tell other CPUs they can now do something */
>   smp_wmb();
> 
> CPU 2:
>   /* Make sure we see current state of x */
>   smp_rmb();
>   if (x == 0)
>   do_something();
> 
> The above situation is not acceptable?

Acceptable is just not the word. It plain doesn't work that way.

> Otherwise, we fail to be able to do_something() when it is perfectly
> fine to do so.

Can't be helped.
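
As background for the exchange above: a memory barrier orders accesses
relative to one another; it does not by itself make a reader observe the most
recent store. The usual pattern pairs a write-side ordering with a read-side
ordering around two locations. The sketch below uses userspace C11 atomics
rather than the kernel's smp_wmb()/smp_rmb(), and the names payload, ready,
publish() and try_consume() are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;         /* data being handed over */
static atomic_bool ready;   /* flag that publishes the data */

/* Writer: store the data, then set the flag with release ordering. */
static void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/*
 * Reader: if the flag is seen set (acquire ordering), the payload store
 * that preceded the release store is guaranteed to be visible too.
 * If the flag is not seen yet, nothing can be concluded.
 */
static bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

Note that try_consume() may legitimately return false even after publish()
has run on another CPU, which matches the "can't be helped" answer: the
ordering says what else is visible once the flag is seen, not when it is seen.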

Re: [PATCH v5 1/6] clk: add of_clk_get_parent_rate function

2015-02-25 Thread Ray Jui

On 2/25/2015 10:51 PM, Sascha Hauer wrote:
> On Wed, Feb 25, 2015 at 10:13:15PM -0800, Ray Jui wrote:
>> Hi Sascha,
>>
>> On 2/25/2015 9:54 PM, Sascha Hauer wrote:
>>> Hi Ray,
>>>
>>> On Wed, Feb 04, 2015 at 04:55:00PM -0800, Ray Jui wrote:
>>>> Sometimes a clock needs to know the rate of its parent before itself is
>>>> registered to the framework. An example is that a PLL may need to
>>>> initialize itself to a specific VCO frequency, before registering to the
>>>> framework. The parent rate needs to be known, for PLL multipliers and
>>>> divisors to be configured properly.
>>>>
>>>> Introduce helper function of_clk_get_parent_rate, which can be used to
>>>> obtain the parent rate of a clock, given a device node and index.
>>>
>>> I can't see how this patch helps you. First it's not guaranteed that
>>> the parent is already registered, what do you do in this case?
>>
>> In the case when clock parent is not found, as you can see from the
>> code, it simply returns zero, just like other clk get rate APIs.
> 
> Yes, but what do you do with the 0 result then in your PLL initialization?
> 

As of the current code, it fails the PLL frequency initialization and
bails out. Thinking about it more, it actually makes more sense to just
warn and still go ahead to register the clock, in which case it will use
whatever default frequency after chip power on reset or a frequency
configured in the bootloader.

>>
>> I thought the order of clock registration is based on order of the clock
>> nodes in device tree. It makes sense to me to declare the parent clock
>> before a child clock, so it's guaranteed that the parent is registered
>> before the child.
> 
> No, you can't rely on that. The order of the device nodes may happen to
> define the order of clock initialization now, but that may change.
> device nodes are usually ordered by bus addresses, not by intended
> initialization order. Even if you reorder them everything must still
> work.
> 

Okay I get your point that the order of device nodes may not be relied
on for device initialization order. But then another mechanism should be
deployed to give developers the option to decide on the clock
initialization sequence. It can be optional but it should be there.

>>
>>> Then the clock framework doesn't require that you initialize the PLL
>>> before registering. That can be done in the clk ops later.
>>
>> Sure it's not mandatory. But what's wrong with me choosing to initialize
>> the PLL clock to a known frequency before registering it to the framework?
> 
> Apparently you don't know the (input) frequency of the PLL when
> registering it to the framework, so the question must be: What's wrong
> with keeping it uninitialized?
> 
> If the PLL is unused then you don't care about its initialization
> status. If it happens to be enabled by a bootloader and still unused
> at late_initcall time the clock framework will disable it so you
> have a known state then. If a consumer for the PLL appears it's its
> job to initialize it through the clk api.
> 
> Sascha
> 

Okay, what we need here is to initialize the PLL to a desired frequency,
based on device tree settings (since it will be configured differently,
among different boards). This is a PLL that 1) has limited options of
frequencies which it can be configured to, and 2) has multiple child
clocks. Where is a more suitable place to initialize it to the desired
frequency than right before registering it to the framework? I know a
lot of people do it in the bootloader, but I thought we should be given
the flexibility of configuring it in the kernel.

When you say "consumers", do you mean 1) the device driver that uses the
PLL; or 2) the device driver that uses the child clock of the PLL? If
it's case 1), then we don't really have a device driver that directly
uses the PLL, and I thought that's quite normal, as most PLLs don't
directly feed into any peripherals.

We do have multiple device drivers that use the child clocks of the PLL,
but it makes no sense to configure the PLL clock in any of those drivers.

Ray

[PATCH 1/2] perf probe: export get_real_path

2015-02-25 Thread Naohiro Aota

Export it to use from util/probe-finder.c

Signed-off-by: Naohiro Aota 
---
 tools/perf/util/probe-event.c | 2 +-
 tools/perf/util/probe-event.h | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 919937e..1d0d505 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -520,7 +520,7 @@ static int try_to_find_probe_trace_events(struct 
perf_probe_event *pev,
  * a newly allocated path on success.
  * Return 0 if file was found and readable, -errno otherwise.
  */
-static int get_real_path(const char *raw_path, const char *comp_dir,
+int get_real_path(const char *raw_path, const char *comp_dir,
 char **new_path)
 {
const char *prefix = symbol_conf.source_prefix;
diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
index e01e994..30a3391 100644
--- a/tools/perf/util/probe-event.h
+++ b/tools/perf/util/probe-event.h
@@ -135,6 +135,8 @@ extern int show_available_vars(struct perf_probe_event 
*pevs, int npevs,
   struct strfilter *filter, bool externs);
 extern int show_available_funcs(const char *module, struct strfilter *filter,
bool user);
+extern int get_real_path(const char *raw_path, const char *comp_dir,
+char **new_path);
 
 /* Maximum index number of event-name postfix */
 #define MAX_EVENT_INDEX1024
-- 
2.3.0


[PATCH 2/2] perf probe: Find compilation directory path for lazy matching

2015-02-25 Thread Naohiro Aota

With lazy matching, perf fails to open a source file if the perf command
is invoked outside of the compilation directory:

$ perf probe -a '__schedule;clear_*'
Failed to open kernel/sched/core.c: No such file or directory
  Error: Failed to add events. (-2)

OTOH, other commands like "probe -L" can resolve the source directory by
themselves. Let's make that possible for lazy matching too!

Signed-off-by: Naohiro Aota 
---
 tools/perf/util/probe-finder.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index b5247d7..8e0714c 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -39,6 +39,7 @@
 #include "util.h"
 #include "symbol.h"
 #include "probe-finder.h"
+#include "probe-event.h"
 
 /* Kprobe tracer basic type is up to u64 */
 #define MAX_BASIC_TYPE_BITS64
@@ -849,11 +850,23 @@ static int probe_point_lazy_walker(const char *fname, int 
lineno,
 static int find_probe_point_lazy(Dwarf_Die *sp_die, struct probe_finder *pf)
 {
int ret = 0;
+   char *fpath;
 
if (intlist__empty(pf->lcache)) {
+   const char *comp_dir;
+
+   comp_dir = cu_get_comp_dir(&pf->cu_die);
+   ret = get_real_path(pf->fname, comp_dir, &fpath);
+   if (ret < 0) {
+   free(fpath);
+   pr_warning("Failed to find source file path.\n");
+   return ret;
+   }
+
/* Matching lazy line pattern */
-   ret = find_lazy_match_lines(pf->lcache, pf->fname,
+   ret = find_lazy_match_lines(pf->lcache, fpath,
pf->pev->point.lazy_line);
+   free(fpath);
if (ret <= 0)
return ret;
}
-- 
2.3.0
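
The idea behind the patch, resolving a relative source path against the
compilation directory before opening it, can be sketched in userspace as
follows. This is not perf's get_real_path(), which also consults
symbol_conf.source_prefix; resolve_source_path() is a hypothetical
simplification that only joins the two parts and checks readability.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * On success, store a newly allocated path in *new_path and return 0.
 * On failure, leave *new_path NULL and return -errno.
 */
static int resolve_source_path(const char *raw_path, const char *comp_dir,
                               char **new_path)
{
    int absolute = raw_path[0] == '/' || comp_dir == NULL;
    size_t len = strlen(raw_path) + 1 +
                 (absolute ? 0 : strlen(comp_dir) + 1);
    char *candidate = malloc(len);

    *new_path = NULL;
    if (!candidate)
        return -ENOMEM;

    if (absolute)
        snprintf(candidate, len, "%s", raw_path);
    else
        snprintf(candidate, len, "%s/%s", comp_dir, raw_path);

    if (access(candidate, R_OK) != 0) {
        int err = errno;   /* save before free() can disturb it */

        free(candidate);
        return -err;
    }
    *new_path = candidate;
    return 0;
}
```

Because *new_path is set to NULL before any failure return, a caller can
call free() on it unconditionally, which is the pattern the patch relies on.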


[PATCH] x86, traps: maps all IDTs to fixmap area.

2015-02-25 Thread Wang Nan

The reasons for mapping idt_table into the fixmap area also apply to
debug_idt_table and trace_idt_table. This patch does the same thing for
all IDTs.

Signed-off-by: Wang Nan 
---

I believe trace_idt_table and debug_idt_table should be symmetrical with
idt_table. However, as with my previous patch 'x86, traps: install gates
using IST after cpu_init()', I'm not sure whether this is a practical
fix.

---
 arch/x86/include/asm/fixmap.h |  6 ++
 arch/x86/kernel/tracepoint.c  |  2 +-
 arch/x86/kernel/traps.c   | 13 +++--
 arch/x86/xen/mmu.c|  6 ++
 4 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
index f80d700..79550f4 100644
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -90,6 +90,12 @@ enum fixed_addresses {
FIX_IO_APIC_BASE_END = FIX_IO_APIC_BASE_0 + MAX_IO_APICS - 1,
 #endif
FIX_RO_IDT, /* Virtual mapping for read-only IDT */
+#ifdef CONFIG_X86_64
+   FIX_RO_DEBUG_IDT,   /* Virtual mapping for read-only 
debug_idt_table */
+#endif
+#ifdef CONFIG_TRACING
+   FIX_RO_TRACE_IDT,   /* Virtual mapping for read-only 
trace_idt_table */
+#endif
 #ifdef CONFIG_X86_32
FIX_KMAP_BEGIN, /* reserved pte's for temporary kernel mappings */
FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1,
diff --git a/arch/x86/kernel/tracepoint.c b/arch/x86/kernel/tracepoint.c
index 1c113db..296e130 100644
--- a/arch/x86/kernel/tracepoint.c
+++ b/arch/x86/kernel/tracepoint.c
@@ -12,7 +12,7 @@ atomic_t trace_idt_ctr = ATOMIC_INIT(0);
 struct desc_ptr trace_idt_descr = { NR_VECTORS * 16 - 1,
(unsigned long) trace_idt_table };
 
-/* No need to be aligned, but done to keep all IDTs defined the same way. */
+/* Must be page-aligned because the real IDT is used in a fixmap. */
 gate_desc trace_idt_table[NR_VECTORS] __page_aligned_bss;
 
 static int trace_irq_vector_refcount;
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index cf7898e..6d88c37 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -67,7 +67,7 @@
 #include 
 #include 
 
-/* No need to be aligned, but done to keep all IDTs defined the same way. */
+/* Must be page-aligned because the real IDT is used in a fixmap. */
 gate_desc debug_idt_table[NR_VECTORS] __page_aligned_bss;
 #else
 #include 
@@ -998,9 +998,18 @@ void __init trap_init(void)
 * Set the IDT descriptor to a fixed read-only location, so that the
 * "sidt" instruction will not leak the location of the kernel, and
 * to defend the IDT against arbitrary memory write vulnerabilities.
-* It will be reloaded in cpu_init() */
+* It will be reloaded in cpu_init()
+*/
__set_fixmap(FIX_RO_IDT, __pa_symbol(idt_table), PAGE_KERNEL_RO);
idt_descr.address = fix_to_virt(FIX_RO_IDT);
+#ifdef CONFIG_X86_64
+   __set_fixmap(FIX_RO_DEBUG_IDT, __pa_symbol(debug_idt_table), 
PAGE_KERNEL_RO);
+   debug_idt_descr.address = fix_to_virt(FIX_RO_DEBUG_IDT);
+#endif
+#ifdef CONFIG_TRACING
+   __set_fixmap(FIX_RO_TRACE_IDT, __pa_symbol(trace_idt_table), 
PAGE_KERNEL_RO);
+   trace_idt_descr.address = fix_to_virt(FIX_RO_TRACE_IDT);
+#endif
 
/*
 * Should be a barrier for any external CPU state:
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index adca9e2..1fd4a4c 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -1984,6 +1984,12 @@ static void xen_set_fixmap(unsigned idx, phys_addr_t 
phys, pgprot_t prot)
switch (idx) {
case FIX_BTMAP_END ... FIX_BTMAP_BEGIN:
case FIX_RO_IDT:
+#ifdef CONFIG_X86_64
+   case FIX_RO_DEBUG_IDT:
+#endif
+#ifdef CONFIG_TRACING
+   case FIX_RO_TRACE_IDT:
+#endif
 #ifdef CONFIG_X86_32
case FIX_WP_TEST:
 # ifdef CONFIG_HIGHMEM
-- 
1.8.4


[PATCH 4/5] tty/serial: at91: set ops in property init each time

2015-02-25 Thread Leilei Zhao

The device tree properties are read each time the tty is opened, so the
serial port ops should be set after that instead of only once in probe.
Otherwise, the ops can be inconsistent with the port's current working
mode. For example, the atmel serial driver does not work after switching
to PIO mode because a DMA channel is not available.

Signed-off-by: Leilei Zhao 
Acked-by: Nicolas Ferre 
---
 drivers/tty/serial/atmel_serial.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/tty/serial/atmel_serial.c 
b/drivers/tty/serial/atmel_serial.c
index 30a62cd..8d28210 100644
--- a/drivers/tty/serial/atmel_serial.c
+++ b/drivers/tty/serial/atmel_serial.c
@@ -1759,6 +1759,7 @@ static int atmel_startup(struct uart_port *port)
 * Initialize DMA (if necessary)
 */
atmel_init_property(atmel_port, pdev);
+   atmel_set_ops(port);
 
if (atmel_port->prepare_rx) {
retval = atmel_port->prepare_rx(port);
-- 
1.7.9.5


[PATCH 3/5] tty/serial: at91: revise the return type of atmel_init_property

2015-02-25 Thread Leilei Zhao

atmel_init_property() sets the working mode of atmel serial ports
according to the properties in the device tree. If neither DMA nor PDC
is requested, or if reading a property fails, the port falls back to
plain PIO mode, so the function has no failure case. It is effectively
a procedure, so change the return type from int to void.

Signed-off-by: Leilei Zhao 
Acked-by: Nicolas Ferre 
---
 drivers/tty/serial/atmel_serial.c |7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/tty/serial/atmel_serial.c 
b/drivers/tty/serial/atmel_serial.c
index 6a4d44d..30a62cd 100644
--- a/drivers/tty/serial/atmel_serial.c
+++ b/drivers/tty/serial/atmel_serial.c
@@ -1534,7 +1534,7 @@ static void atmel_tasklet_func(unsigned long data)
spin_unlock(&port->lock);
 }
 
-static int atmel_init_property(struct atmel_uart_port *atmel_port,
+static void atmel_init_property(struct atmel_uart_port *atmel_port,
struct platform_device *pdev)
 {
struct device_node *np = pdev->dev.of_node;
@@ -1575,7 +1575,6 @@ static int atmel_init_property(struct atmel_uart_port 
*atmel_port,
atmel_port->use_dma_tx  = false;
}
 
-   return 0;
 }
 
 static void atmel_init_rs485(struct uart_port *port,
@@ -2235,8 +2234,8 @@ static int atmel_init_port(struct atmel_uart_port *atmel_port,
struct uart_port *port = &atmel_port->uart;
struct atmel_uart_data *pdata = dev_get_platdata(&pdev->dev);
 
-   if (!atmel_init_property(atmel_port, pdev))
-   atmel_set_ops(port);
+   atmel_init_property(atmel_port, pdev);
+   atmel_set_ops(port);
 
atmel_init_rs485(port, pdev);
 
-- 
1.7.9.5
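
The reasoning in the commit message above (every path ends in a valid mode, so there is nothing to return) can be sketched in plain C. This is only an illustration: the struct, the function name, and the three boolean inputs are hypothetical stand-ins, not the driver's real fields or device-tree properties.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-port state; not the driver's struct. */
struct port_mode {
	bool use_dma_rx;
	bool use_pdc_rx;
};

/*
 * Pick an RX mode from three hypothetical inputs. Every path ends in a
 * valid mode (DMA, PDC, or PIO), so there is no error to report and the
 * function can return void.
 */
static void init_rx_mode(struct port_mode *m, bool want_dma,
			 bool have_dma_channel, bool want_pdc)
{
	m->use_dma_rx = false;
	m->use_pdc_rx = false;

	if (want_dma && have_dma_channel)
		m->use_dma_rx = true;
	else if (want_pdc)
		m->use_pdc_rx = true;
	/* Otherwise both flags stay false, which means plain PIO. */
}
```

A caller then has no return value to test and can go straight on to selecting the ops for the chosen mode, which is the shape the patch gives atmel_init_port.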


[PATCH 2/5] tty/serial: at91: correct buffer size used in DMA

2015-02-25 Thread Leilei Zhao

The buffer size passed to DMA is inconsistent with the size that was
allocated, so make them consistent. The structure atmel_uart_char
only carries its per-field meaning in PIO mode; in DMA mode the whole
buffer is treated as plain chars.

Signed-off-by: Leilei Zhao 
Acked-by: Nicolas Ferre 
---
 drivers/tty/serial/atmel_serial.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/serial/atmel_serial.c b/drivers/tty/serial/atmel_serial.c
index 460903c..6a4d44d 100644
--- a/drivers/tty/serial/atmel_serial.c
+++ b/drivers/tty/serial/atmel_serial.c
@@ -1030,7 +1030,7 @@ static int atmel_prepare_rx_dma(struct uart_port *port)
BUG_ON((int)ring->buf & ~PAGE_MASK);
sg_set_page(&atmel_port->sg_rx,
virt_to_page(ring->buf),
-   ATMEL_SERIAL_RINGSIZE,
+   sizeof(struct atmel_uart_char) * ATMEL_SERIAL_RINGSIZE,
(int)ring->buf & ~PAGE_MASK);
nent = dma_map_sg(port->dev,
  &atmel_port->sg_rx,
-- 
1.7.9.5
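
The size mismatch fixed above can be illustrated with a little arithmetic. This is a sketch under assumptions: the three-field layout of the struct and the ring size of 1024 are assumed here for illustration and should be checked against the driver source.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_ENTRIES 1024	/* assumed value of ATMEL_SERIAL_RINGSIZE */

/* Assumed layout of struct atmel_uart_char: three 16-bit fields. */
struct uart_char {
	uint16_t status;
	uint16_t overrun;
	uint16_t ch;
};

/* Bytes allocated for the ring: one struct per entry. */
static size_t ring_alloc_bytes(void)
{
	return sizeof(struct uart_char) * RING_ENTRIES;
}

/* Length given to the scatterlist before the patch: entries, not bytes. */
static size_t sg_len_before(void)
{
	return RING_ENTRIES;
}

/* Length given to the scatterlist after the patch: the full allocation. */
static size_t sg_len_after(void)
{
	return sizeof(struct uart_char) * RING_ENTRIES;
}
```

Under these assumptions the old length covers only a fraction of the allocated buffer, while the new length matches the allocation exactly.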


[PATCH 5/5] tty/serial: at91: correct the usage of tasklet

2015-02-25 Thread Leilei Zhao

The tasklet may be scheduled and executed after the serial port has
been shut down. For example, the DMA rx callback can schedule the
tasklet while the port is shutting down, especially when the port is
sending and receiving data at a high baud rate and the program using
it is killed externally. In this case tasklet_kill only clears the
scheduling that is currently pending, so the tasklet must also be
disabled to keep a later scheduling from running it. Otherwise, a
tasklet that runs after the port was shut down can crash the kernel.

Signed-off-by: Leilei Zhao 
Acked-by: Nicolas Ferre 
---
 drivers/tty/serial/atmel_serial.c |4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/tty/serial/atmel_serial.c b/drivers/tty/serial/atmel_serial.c
index 8d28210..39ec278 100644
--- a/drivers/tty/serial/atmel_serial.c
+++ b/drivers/tty/serial/atmel_serial.c
@@ -1755,6 +1755,8 @@ static int atmel_startup(struct uart_port *port)
if (retval)
goto free_irq;
 
+   tasklet_enable(&atmel_port->tasklet);
+
/*
 * Initialize DMA (if necessary)
 */
@@ -1858,6 +1860,7 @@ static void atmel_shutdown(struct uart_port *port)
 * Clear out any scheduled tasklets before
 * we destroy the buffers
 */
+   tasklet_disable(&atmel_port->tasklet);
tasklet_kill(&atmel_port->tasklet);
 
/*
@@ -2251,6 +2254,7 @@ static int atmel_init_port(struct atmel_uart_port *atmel_port,
 
tasklet_init(&atmel_port->tasklet, atmel_tasklet_func,
(unsigned long)port);
+   tasklet_disable(&atmel_port->tasklet);
 
memset(&atmel_port->rx_ring, 0, sizeof(atmel_port->rx_ring));
 
-- 
1.7.9.5
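
The ordering problem described in the commit message above can be shown with a toy, single-threaded model. It is only a sketch: the names are hypothetical, and it ignores the locking, waiting, and SMP behaviour of the real tasklet API. It keeps just the two facts the message relies on: a kill clears only what is pending at that moment, and a disabled tasklet is not run.

```c
#include <stdbool.h>

/* Toy model of a tasklet; not the kernel's struct tasklet_struct. */
struct tasklet_model {
	int disable_count;	/* > 0 means disabled */
	bool scheduled;		/* a run is pending */
	int runs;		/* how many times the handler ran */
};

static void model_schedule(struct tasklet_model *t) { t->scheduled = true; }

/* Clears only the scheduling that is pending right now. */
static void model_kill(struct tasklet_model *t) { t->scheduled = false; }

static void model_disable(struct tasklet_model *t) { t->disable_count++; }
static void model_enable(struct tasklet_model *t) { t->disable_count--; }

/* One softirq pass: the handler runs only if scheduled and enabled. */
static void model_softirq(struct tasklet_model *t)
{
	if (t->scheduled && t->disable_count == 0) {
		t->scheduled = false;
		t->runs++;
	}
}
```

In this model, a late schedule after a bare kill still runs the handler, whereas the same sequence with a preceding disable leaves it pending until the matching enable, which the patch places in atmel_startup.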


[PATCH 1/5] tty/serial: at91: correct check of buf used in DMA

2015-02-25 Thread Leilei Zhao

The DMA rx function only uses the buf of the rx ring, while the DMA tx
function uses the buf of xmit. So the check here must be made on the
buf of ring, which is the one the DMA rx function uses.

Signed-off-by: Leilei Zhao 
Acked-by: Nicolas Ferre 
---
 drivers/tty/serial/atmel_serial.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/serial/atmel_serial.c b/drivers/tty/serial/atmel_serial.c
index 846552b..460903c 100644
--- a/drivers/tty/serial/atmel_serial.c
+++ b/drivers/tty/serial/atmel_serial.c
@@ -1027,7 +1027,7 @@ static int atmel_prepare_rx_dma(struct uart_port *port)
spin_lock_init(&atmel_port->lock_rx);
sg_init_table(&atmel_port->sg_rx, 1);
/* UART circular rx buffer is an aligned page. */
-   BUG_ON((int)port->state->xmit.buf & ~PAGE_MASK);
+   BUG_ON((int)ring->buf & ~PAGE_MASK);
sg_set_page(&atmel_port->sg_rx,
virt_to_page(ring->buf),
ATMEL_SERIAL_RINGSIZE,
-- 
1.7.9.5
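
The expression inside the BUG_ON above, an address ANDed with ~PAGE_MASK, yields the offset of that address within its page, so the check fires when the buffer is not page aligned. A minimal userspace sketch follows; the 4096-byte page size and the MODEL_ names are assumptions for illustration.

```c
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096UL				/* assumed page size */
#define MODEL_PAGE_MASK (~(MODEL_PAGE_SIZE - 1))	/* mirrors PAGE_MASK */

/* Offset of a pointer within its page; 0 means page aligned. */
static unsigned long page_offset_of(const void *p)
{
	return (unsigned long)(uintptr_t)p & ~MODEL_PAGE_MASK;
}
```

A non-zero result corresponds to the condition that trips the BUG_ON, which is why the check has to be made on the buffer that is actually mapped for DMA rx.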


[PATCH 0/5] tty/serial: at91: fix bugs when using multiple serials

2015-02-25 Thread Leilei Zhao

This series of patches fixes bugs seen when using multiple serial ports at the
same time:
 - The rx ring buffer size used in DMA is inconsistent with its allocation.
 - The serial port can't send or receive data from the second open onwards
   if it switched to PIO because no DMA channel was available.
 - The serial port can crash the kernel when the program is killed while data
   is being sent and received.

The patches are based on the tty-next branch of the gregkh/tty.git repository
and were tested on a SAMA5D36 VB board with 5 serial ports enabled.

Leilei Zhao (5):
  tty/serial: at91: correct check of buf used in DMA
  tty/serial: at91: correct buffer size used in DMA
  tty/serial: at91: revise the return type of atmel_init_property
  tty/serial: at91: set ops in property init each time
  tty/serial: at91: correct the usage of tasklet

 drivers/tty/serial/atmel_serial.c |   16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

-- 
1.7.9.5


[perf/core PATCH v5 2/4] perf buildid-cache: Add --purge FILE to remove all caches of FILE

2015-02-25 Thread Masami Hiramatsu

Add --purge FILE to remove all caches of FILE.
The current --remove FILE removes only the cache that has the
same build-id as the given FILE. Since the command takes a
FILE path, this can confuse a user who expects it to remove
the cache for that path.

  -
  # ./perf buildid-cache -v --add ./perf
  Adding 133b7b5486d987a5ab5c3ebf4ea14941f45d4d4f ./perf: Ok
  # (update the ./perf binary)
  # ./perf buildid-cache -v --remove ./perf
  Removing 305bbd1be68f66eca7e2d78db294653031edfa79 ./perf: FAIL
  ./perf wasn't in the cache
  -
Actually, the FAIL from --remove is not shown; it just fails silently.

So this patch adds a --purge FILE action for this use case.
perf buildid-cache --purge FILE removes all caches that
have the same FILE path.
In other words, it removes all caches, including those of old binaries.

  -
  # ./perf buildid-cache -v --add ./perf
  Adding 133b7b5486d987a5ab5c3ebf4ea14941f45d4d4f ./perf: Ok
  # (update the ./perf binary)
  # ./perf buildid-cache -v --purge ./perf
  Removing 133b7b5486d987a5ab5c3ebf4ea14941f45d4d4f ./perf: Ok
  -

BTW, if you want to purge all the caches, remove ~/.debug/*.

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/Documentation/perf-buildid-cache.txt |   13 ++--
 tools/perf/builtin-buildid-cache.c  |   44 
 tools/perf/util/build-id.c  |   85 ++-
 tools/perf/util/build-id.h  |1 
 4 files changed, 122 insertions(+), 21 deletions(-)

diff --git a/tools/perf/Documentation/perf-buildid-cache.txt b/tools/perf/Documentation/perf-buildid-cache.txt
index cec6b57..dd07b55 100644
--- a/tools/perf/Documentation/perf-buildid-cache.txt
+++ b/tools/perf/Documentation/perf-buildid-cache.txt
@@ -12,9 +12,9 @@ SYNOPSIS
 
 DESCRIPTION
 ---
-This command manages the build-id cache. It can add and remove files to/from
-the cache. In the future it should as well purge older entries, set upper
-limits for the space used by the cache, etc.
+This command manages the build-id cache. It can add, remove, update and purge
+files to/from the cache. In the future it should as well set upper limits for
+the space used by the cache, etc.
 
 OPTIONS
 ---
@@ -36,7 +36,12 @@ OPTIONS
 actually made.
 -r::
 --remove=::
-Remove specified file from the cache.
+Remove a cached binary which has same build-id of specified file
+from the cache.
+-p::
+--purge=::
+Purge all cached binaries including older caches which have specified
+   path from the cache.
 -M::
 --missing=::
List missing build ids in the cache for the specified file.
diff --git a/tools/perf/builtin-buildid-cache.c b/tools/perf/builtin-buildid-cache.c
index e7568f5..37182bb 100644
--- a/tools/perf/builtin-buildid-cache.c
+++ b/tools/perf/builtin-buildid-cache.c
@@ -223,6 +223,29 @@ static int build_id_cache__remove_file(const char *filename)
return err;
 }
 
+static int build_id_cache__purge_path(const char *pathname)
+{
+   struct strlist *list;
+   struct str_node *pos;
+   int err;
+
+   list = build_id_cache__list_build_ids(pathname);
+   if (!list)
+   return 0;
+
+   strlist__for_each(pos, list) {
+   err = build_id_cache__remove_s(pos->s);
+   if (verbose)
+   pr_info("Removing %s %s: %s\n", pos->s, pathname,
+   err ? "FAIL" : "Ok");
+   if (err)
+   break;
+   }
+   strlist__delete(list);
+
+   return err;
+}
+
 static bool dso__missing_buildid_cache(struct dso *dso, int parm __maybe_unused)
 {
char filename[PATH_MAX];
@@ -285,6 +308,7 @@ int cmd_buildid_cache(int argc, const char **argv,
bool force = false;
char const *add_name_list_str = NULL,
   *remove_name_list_str = NULL,
+  *purge_name_list_str = NULL,
   *missing_filename = NULL,
   *update_name_list_str = NULL,
   *kcore_filename = NULL;
@@ -302,6 +326,8 @@ int cmd_buildid_cache(int argc, const char **argv,
   "file", "kcore file to add"),
OPT_STRING('r', "remove", &remove_name_list_str, "file list",
"file(s) to remove"),
+   OPT_STRING('p', "purge", &purge_name_list_str, "path list",
+   "path(s) to remove (remove old caches too)"),
OPT_STRING('M', "missing", &missing_filename, "file",
   "to find missing build ids in the cache"),
OPT_BOOLEAN('f', "force", &force, "don't complain, do it"),
@@ -368,6 +394,24 @@ int cmd_buildid_cache(int argc, const char **argv,
}
}
 
+   if (purge_name_list_str) {
+   list = strlist__new(true, purge_name_list_str);
+   if (list) {
+   strlist__for_each(pos, list)
+   if (build_id_cache__purge_path(pos->s)) {
+   if (e
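
The difference between --remove (keyed on the build-id of the file as it is now) and --purge (keyed on the path) can be sketched with a toy in-memory cache. This is an illustration only: the struct and both functions are hypothetical and do not reflect how perf stores its cache on disk.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy cache entry; perf's real cache lives on disk under ~/.debug. */
struct cache_entry {
	const char *build_id;
	const char *path;
	bool present;
};

/* Model of --remove: drop entries whose build-id matches. */
static int remove_by_build_id(struct cache_entry *e, size_t n,
			      const char *build_id)
{
	int removed = 0;

	for (size_t i = 0; i < n; i++) {
		if (e[i].present && strcmp(e[i].build_id, build_id) == 0) {
			e[i].present = false;
			removed++;
		}
	}
	return removed;
}

/* Model of --purge: drop every entry recorded for the given path. */
static int purge_by_path(struct cache_entry *e, size_t n, const char *path)
{
	int removed = 0;

	for (size_t i = 0; i < n; i++) {
		if (e[i].present && strcmp(e[i].path, path) == 0) {
			e[i].present = false;
			removed++;
		}
	}
	return removed;
}
```

With a cached entry for the old binary, removing by the rebuilt binary's build-id finds nothing, while purging by path removes the stale entry, which matches the two transcripts in the commit message.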

[perf/core PATCH v5 4/4] perf-buildid-cache: Show usage with incorrect params

2015-02-25 Thread Masami Hiramatsu

Show usage if no action is specified or unexpected parameter
is given. In other words, be more user friendly.

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/builtin-buildid-cache.c |5 +
 1 file changed, 5 insertions(+)

diff --git a/tools/perf/builtin-buildid-cache.c b/tools/perf/builtin-buildid-cache.c
index 20be743..5eaa9bf 100644
--- a/tools/perf/builtin-buildid-cache.c
+++ b/tools/perf/builtin-buildid-cache.c
@@ -340,6 +340,11 @@ int cmd_buildid_cache(int argc, const char **argv,
argc = parse_options(argc, argv, buildid_cache_options,
 buildid_cache_usage, 0);
 
+   if (argc || (!add_name_list_str && !kcore_filename &&
+!remove_name_list_str && !purge_name_list_str &&
+!missing_filename && !update_name_list_str))
+   usage_with_options(buildid_cache_usage, buildid_cache_options);
+
if (missing_filename) {
file.path = missing_filename;
file.force = force;



[perf/core PATCH v5 3/4] perf-buildid-cache: Use pr_debug instead of verbose && pr_info

2015-02-25 Thread Masami Hiramatsu

Use pr_debug instead of the combination of verbose and pr_info.

"if (verbose) pr_info(...)" is the same as "pr_debug(...)", so replace it.

Suggested-by: Namhyung Kim 
Signed-off-by: Masami Hiramatsu 
---
 tools/perf/builtin-buildid-cache.c |   20 
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/tools/perf/builtin-buildid-cache.c b/tools/perf/builtin-buildid-cache.c
index 37182bb..20be743 100644
--- a/tools/perf/builtin-buildid-cache.c
+++ b/tools/perf/builtin-buildid-cache.c
@@ -196,9 +196,8 @@ static int build_id_cache__add_file(const char *filename)
build_id__sprintf(build_id, sizeof(build_id), sbuild_id);
err = build_id_cache__add_s(sbuild_id, filename,
false, false);
-   if (verbose)
-   pr_info("Adding %s %s: %s\n", sbuild_id, filename,
-   err ? "FAIL" : "Ok");
+   pr_debug("Adding %s %s: %s\n", sbuild_id, filename,
+err ? "FAIL" : "Ok");
return err;
 }
 
@@ -216,9 +215,8 @@ static int build_id_cache__remove_file(const char *filename)
 
build_id__sprintf(build_id, sizeof(build_id), sbuild_id);
err = build_id_cache__remove_s(sbuild_id);
-   if (verbose)
-   pr_info("Removing %s %s: %s\n", sbuild_id, filename,
-   err ? "FAIL" : "Ok");
+   pr_debug("Removing %s %s: %s\n", sbuild_id, filename,
+err ? "FAIL" : "Ok");
 
return err;
 }
@@ -235,9 +233,8 @@ static int build_id_cache__purge_path(const char *pathname)
 
strlist__for_each(pos, list) {
err = build_id_cache__remove_s(pos->s);
-   if (verbose)
-   pr_info("Removing %s %s: %s\n", pos->s, pathname,
-   err ? "FAIL" : "Ok");
+   pr_debug("Removing %s %s: %s\n", pos->s, pathname,
+err ? "FAIL" : "Ok");
if (err)
break;
}
@@ -292,9 +289,8 @@ static int build_id_cache__update_file(const char *filename)
if (!err)
err = build_id_cache__add_s(sbuild_id, filename, false, false);
 
-   if (verbose)
-   pr_info("Updating %s %s: %s\n", sbuild_id, filename,
-   err ? "FAIL" : "Ok");
+   pr_debug("Updating %s %s: %s\n", sbuild_id, filename,
+err ? "FAIL" : "Ok");
 
return err;
 }
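
The equivalence this patch relies on, that a debug print is a print gated on the verbosity level, can be sketched in userspace C. The names below are hypothetical; this models the idea and is not perf's actual pr_debug implementation.

```c
#include <stdarg.h>
#include <stdio.h>

static int verbose;	/* stands in for perf's global verbosity level */

/*
 * Print to stderr only when verbose is at least 1.
 * Returns the number of characters printed, or 0 when suppressed.
 */
static int model_pr_debug(const char *fmt, ...)
{
	va_list ap;
	int n;

	if (verbose < 1)
		return 0;

	va_start(ap, fmt);
	n = vfprintf(stderr, fmt, ap);
	va_end(ap);
	return n;
}
```

Folding the level check into the print helper is what lets each call site in the patch shrink from three lines to two.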



[perf/core PATCH v5 1/4] perf buildid-cache: Add new buildid cache if update target is not cached

2015-02-25 Thread Masami Hiramatsu

Add a new buildid cache if the update target file is not cached.
This can happen when an old binary is replaced by a new one
after the old one was cached. In this case, the user sees the
operation simply fail. That is not intuitive, since the user
passes only the binary "path", not the "build-id".

  
  # ./perf buildid-cache --add ./perf
  (update ./perf to new binary)
  # ./perf buildid-cache --update ./perf
  ./perf wasn't in the cache
  #
  

This patch adds the given new binary to the cache if it is
not cached yet, so the above error no longer appears.

  
  # ./perf buildid-cache --add ./perf
  (update ./perf to new binary)
  # ./perf buildid-cache --update ./perf
  #
  

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/Documentation/perf-buildid-cache.txt |   11 ---
 tools/perf/builtin-buildid-cache.c  |6 --
 tools/perf/util/build-id.c  |   12 
 tools/perf/util/build-id.h  |1 +
 4 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/tools/perf/Documentation/perf-buildid-cache.txt b/tools/perf/Documentation/perf-buildid-cache.txt
index 0294c57..cec6b57 100644
--- a/tools/perf/Documentation/perf-buildid-cache.txt
+++ b/tools/perf/Documentation/perf-buildid-cache.txt
@@ -41,9 +41,14 @@ OPTIONS
 --missing=::
List missing build ids in the cache for the specified file.
 -u::
---update::
-   Update specified file of the cache. It can be used to update kallsyms
-   kernel dso to vmlinux in order to support annotation.
+--update=::
+   Update specified file of the cache. Note that this doesn't remove
+   older entires since those may be still needed for annotating old
+   (or remote) perf.data. Only if there is already a cache which has
+   exactly same build-id, that is replaced by new one. It can be used
+   to update kallsyms and kernel dso to vmlinux in order to support
+   annotation.
+
 -v::
 --verbose::
Be more verbose.
diff --git a/tools/perf/builtin-buildid-cache.c b/tools/perf/builtin-buildid-cache.c
index d929d95..e7568f5 100644
--- a/tools/perf/builtin-buildid-cache.c
+++ b/tools/perf/builtin-buildid-cache.c
@@ -255,7 +255,7 @@ static int build_id_cache__update_file(const char *filename)
u8 build_id[BUILD_ID_SIZE];
char sbuild_id[BUILD_ID_SIZE * 2 + 1];
 
-   int err;
+   int err = 0;
 
if (filename__read_build_id(filename, &build_id, sizeof(build_id)) < 0) {
pr_debug("Couldn't read a build-id in %s\n", filename);
@@ -263,7 +263,9 @@ static int build_id_cache__update_file(const char *filename)
}
 
build_id__sprintf(build_id, sizeof(build_id), sbuild_id);
-   err = build_id_cache__remove_s(sbuild_id);
+   if (build_id_cache__cached(sbuild_id))
+   err = build_id_cache__remove_s(sbuild_id);
+
if (!err)
err = build_id_cache__add_s(sbuild_id, filename, false, false);
 
diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index adbc360..0bc33be 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -352,6 +352,18 @@ static int build_id_cache__add_b(const u8 *build_id, size_t build_id_size,
return build_id_cache__add_s(sbuild_id, name, is_kallsyms, is_vdso);
 }
 
+bool build_id_cache__cached(const char *sbuild_id)
+{
+   bool ret = false;
+   char *filename = build_id__filename(sbuild_id, NULL, 0);
+
+   if (filename && !access(filename, F_OK))
+   ret = true;
+   free(filename);
+
+   return ret;
+}
+
 int build_id_cache__remove_s(const char *sbuild_id)
 {
const size_t size = PATH_MAX;
diff --git a/tools/perf/util/build-id.h b/tools/perf/util/build-id.h
index 31b3c63..2a09498 100644
--- a/tools/perf/util/build-id.h
+++ b/tools/perf/util/build-id.h
@@ -22,6 +22,7 @@ bool perf_session__read_build_ids(struct perf_session *session, bool with_hits);
 int perf_session__write_buildid_table(struct perf_session *session, int fd);
 int perf_session__cache_build_ids(struct perf_session *session);
 
+bool build_id_cache__cached(const char *sbuild_id);
 int build_id_cache__add_s(const char *sbuild_id,
  const char *name, bool is_kallsyms, bool is_vdso);
 int build_id_cache__remove_s(const char *sbuild_id);
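
The control flow this patch gives --update (remove only when an entry exists, then always add) can be sketched in userspace C. The existence test uses the POSIX access() call, as the patch does; everything else, including the stub callbacks and their counters, is hypothetical and only there to make the flow observable.

```c
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* True if a file exists at the path; NULL counts as "not cached". */
static bool path_is_cached(const char *filename)
{
	return filename && access(filename, F_OK) == 0;
}

/* Hypothetical stubs that stand in for the real remove/add helpers. */
static int removes, adds;
static int stub_remove(const char *p) { (void)p; removes++; return 0; }
static int stub_add(const char *p) { (void)p; adds++; return 0; }

/*
 * Update flow after the patch: skip the remove step when nothing is
 * cached, so a missing entry is no longer an error, then add.
 */
static int update_entry(const char *cache_file,
			int (*remove_fn)(const char *),
			int (*add_fn)(const char *))
{
	int err = 0;

	if (path_is_cached(cache_file))
		err = remove_fn(cache_file);
	if (!err)
		err = add_fn(cache_file);
	return err;
}
```

Initialising err to 0, as the patch also does, is what allows the add step to run when the remove step is skipped.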



[perf/core PATCH v5 0/4] perf-buildid-cache: Enhance --update and add --purge

2015-02-25 Thread Masami Hiramatsu

Hi,

Here is the 5th version of the perf buildid-cache update.
This version updates the 2nd patch and adds 2 patches for
cleanup and a small usability improvement.
Here are the changes in v5.

 - [2/4] Remove NULL check before calling strlist__delete()
 (Thanks to Hemant!)
 - [3/4] Use pr_debug instead of verbose && pr_info
 (Thanks to Namhyung!)
 - [4/4] Show usage with incorrect params 

Thank you,


---

Masami Hiramatsu (4):
  perf buildid-cache: Add new buildid cache if update target is not cached
  perf buildid-cache: Add --purge FILE to remove all caches of FILE
  perf-buildid-cache: Use pr_debug instead of verbose && pr_info
  perf-buildid-cache: Show usage with incorrect params


 tools/perf/Documentation/perf-buildid-cache.txt |   24 --
 tools/perf/builtin-buildid-cache.c  |   69 ++--
 tools/perf/util/build-id.c  |   97 +++
 tools/perf/util/build-id.h  |2 
 4 files changed, 157 insertions(+), 35 deletions(-)

--
Masami HIRAMATSU
Software Platform Research Dpt. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com


Re: [PATCH v5 1/6] clk: add of_clk_get_parent_rate function

2015-02-25 Thread Sascha Hauer

On Wed, Feb 25, 2015 at 10:13:15PM -0800, Ray Jui wrote:
> Hi Sascha,
> 
> On 2/25/2015 9:54 PM, Sascha Hauer wrote:
> > Hi Ray,
> > 
> > On Wed, Feb 04, 2015 at 04:55:00PM -0800, Ray Jui wrote:
> >> Sometimes a clock needs to know the rate of its parent before itself is
> >> registered to the framework. An example is that a PLL may need to
> >> initialize itself to a specific VCO frequency, before registering to the
> >> framework. The parent rate needs to be known, for PLL multipliers and
> >> divisors to be configured properly.
> >>
> >> Introduce helper function of_clk_get_parent_rate, which can be used to
> >> obtain the parent rate of a clock, given a device node and index.
> > 
> > I can't see how this patch helps you. First it's not guaranteed that
> > the parent is already registered, what do you do in this case?
> 
> In the case when clock parent is not found, as you can see from the
> code, it simply returns zero, just like other clk get rate APIs.

Yes, but what do you do with the 0 result then in your PLL initialization?

> 
> I thought the order of clock registration is based on order of the clock
> nodes in device tree. It makes sense to me to declare the parent clock
> before a child clock, so it's guaranteed that the parent is registered
> before the child.

No, you can't rely on that. The order of the device nodes may happen to
define the order of clock initialization now, but that may change.
Device nodes are usually ordered by bus addresses, not by intended
initialization order. Even if you reorder them everything must still
work.

> 
> > Then the clock framework doesn't require that you initialize the PLL
> > before registering. That can be done in the clk ops later.
> 
> Sure it's not mandatory. But what's wrong with me choosing to initialize
> the PLL clock to a known frequency before registering it to the framework?

Apparently you don't know the (input) frequency of the PLL when
registering it to the framework, so the question must be: What's wrong
with keeping it uninitialized?

If the PLL is unused then you don't care about its initialization
status. If it happens to be enabled by a bootloader and still unused
at late_initcall time the clock framework will disable it so you
have a known state then. If a consumer for the PLL appears it's its
job to initialize it through the clk api.

Sascha

-- 
Pengutronix e.K.   | |
Industrial Linux Solutions | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0|
Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |
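
The point made in the reply above, that the PLL can be configured from a clk op once a consumer asks for a rate instead of at registration time, can be sketched with a toy model. None of this is the common clock framework's API: the struct, the callback, and the integer-multiplier rule are hypothetical and only illustrate the ordering argument.

```c
#include <errno.h>

/* Toy PLL; the callback returns 0 while the parent is not registered. */
struct model_pll {
	unsigned long (*parent_rate)(void);
	unsigned long multiplier;
};

/* Hypothetical parents used to exercise the model. */
static unsigned long no_parent(void) { return 0; }
static unsigned long parent_25mhz(void) { return 25000000UL; }

/*
 * Configure the PLL when a rate is requested. The parent rate is read
 * here, at use time, so nothing depends on registration order.
 */
static int model_pll_set_rate(struct model_pll *pll, unsigned long target)
{
	unsigned long prate = pll->parent_rate();

	if (prate == 0 || target % prate != 0)
		return -EINVAL;

	pll->multiplier = target / prate;
	return 0;
}
```

In this model a zero parent rate is reported as an error at the moment a consumer actually needs the clock, rather than being silently baked into an early initialization.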

Re: [PATCH] x86, boot: skip relocs when load address unchanged

2015-02-25 Thread Baoquan He

On 02/26/15 at 07:29am, MegaBrutal wrote:
> Thanks for this patch, and good to see it in mainline!
> 
> This actually fixes the problem I reported here:
> https://lkml.org/lkml/2014/12/1/15
> 
> I wish it to be backported into the Ubuntu Utopic kernel asap.
> 
> > This patch works for me. And good to see it's being merged. About the
> > patch log, I would say relocations do unexpected things when the kernel
> > is above 1G since randomization is done from 16M to 1G, namely
> > CONFIG_RANDOMIZE_BASE_MAX_OFFSET. So above 1G kernel text mapping will
> > step into kernel module mapping region.
> 
> I'm just speculating, but is it the reason why I only get problems
> when I boot with kexec? Maybe it's only then when the kernel gets
> above 1G. Otherwise, the kernels boot properly when they are booted
> from GRUB.

Yeah, kexec loads the kernel at the top of physical memory. Above 1G, the
kernel mapping steps into the module mapping area and crashes the system.
Grub doesn't run into this problem.
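
The condition added by the patch under discussion, `!IS_ENABLED(CONFIG_X86_64) || output != output_orig`, can be read as a small predicate. The sketch below restates it in userspace C; the function name and the boolean parameter that replaces the config test are hypothetical.

```c
#include <stdbool.h>

/*
 * Relocations are always handled on 32-bit. On 64-bit they are only
 * needed when the output address differs from the original one, that
 * is, when kASLR picked a different load address.
 */
static bool needs_relocations(bool is_x86_64, const void *output,
			      const void *output_orig)
{
	return !is_x86_64 || output != output_orig;
}
```

For a 64-bit kernel whose load address was left unchanged, the predicate is false and relocation handling is skipped, which is the case the thread ties to the kexec failures.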

Re: [PATCH] x86, boot: skip relocs when load address unchanged

2015-02-25 Thread MegaBrutal

Thanks for this patch, and good to see it in mainline!

This actually fixes the problem I reported here:
https://lkml.org/lkml/2014/12/1/15

I wish it to be backported into the Ubuntu Utopic kernel asap.

> This patch works for me. And good to see it's being merged. About the
> patch log, I would say relocations do unexpected things when the kernel
> is above 1G since randomization is done from 16M to 1G, namely
> CONFIG_RANDOMIZE_BASE_MAX_OFFSET. So above 1G kernel text mapping will
> step into kernel module mapping region.

I'm just speculating, but is it the reason why I only get problems
when I boot with kexec? Maybe it's only then when the kernel gets
above 1G. Otherwise, the kernels boot properly when they are booted
from GRUB.


2015-01-20 14:04 GMT+01:00 Baoquan He :
>
> On 01/15/15 at 04:51pm, Kees Cook wrote:
> > On 64-bit, relocation is not required unless the load address gets
> > changed. Without this, relocations do unexpected things when the kernel
> > is above 4G.
>
> This patch works for me. And good to see it's being merged. About the
> patch log, I would say relocations do unexpected things when the kernel
> is above 1G since randomization is done from 16M to 1G, namely
> CONFIG_RANDOMIZE_BASE_MAX_OFFSET. So above 1G kernel text mapping will
> step into kernel module mapping region.
>
> BTW, I am working on separate randomization of kernel physical and virtual
> address , will post it. But it won't conflict with this because I don't
> think it can be accepted in a short time. Before that this patch truly
> fix the kexec/kdump bug when kaslr is compiled in.
>
> Thanks
> Baoquan
>
> >
> > Reported-by: Baoquan He 
> > Signed-off-by: Kees Cook 
> > Cc: sta...@vger.kernel.org
> > ---
> > This is a reimplementation of Baoquan's "kaslr: check if kernel location is
> > changed", which performs the check without needing to change the function
> > declaration. This should have exactly the same effect, but I dropped Vivek's
> > Ack and Thomas's Test, since it's technically a different patch.
> > ---
> >  arch/x86/boot/compressed/misc.c | 9 -
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> > index dcc1c536cc21..a950864a64da 100644
> > --- a/arch/x86/boot/compressed/misc.c
> > +++ b/arch/x86/boot/compressed/misc.c
> > @@ -373,6 +373,8 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
> > unsigned long output_len,
> > unsigned long run_size)
> >  {
> > + unsigned char *output_orig = output;
> > +
> >   real_mode = rmode;
> >
> >   sanitize_boot_params(real_mode);
> > @@ -421,7 +423,12 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
> >   debug_putstr("\nDecompressing Linux... ");
> >   decompress(input_data, input_len, NULL, NULL, output, NULL, error);
> >   parse_elf(output);
> > - handle_relocations(output, output_len);
> > + /*
> > +  * 32-bit always performs relocations. 64-bit relocations are only
> > +  * needed if kASLR has chosen a different load address.
> > +  */
> > + if (!IS_ENABLED(CONFIG_X86_64) || output != output_orig)
> > + handle_relocations(output, output_len);
> >   debug_putstr("done.\nBooting the kernel.\n");
> >   return output;
> >  }
> > --
> > 1.9.1
> >
> >
> > --
> > Kees Cook
> > Chrome OS Security

Re: [PATCH 2/3 v3] x86: entry_64.S: always allocate complete "struct pt_regs"

2015-02-25 Thread Stephen Rothwell

Hi all,

On Wed, 25 Feb 2015 21:18:52 -0800 Andrew Morton wrote:
>
> On Thu, 26 Feb 2015 02:12:57 +0100 Denys Vlasenko wrote:
> 
> > On Thu, Feb 26, 2015 at 12:34 AM, Sabrina Dubroca wrote:
> > > 2015-02-25, 23:40:55 +0100, Sabrina Dubroca wrote:
> > >> I can run some userspace programs, but I have no idea what would be
> > >> helpful.
> > >> I can also try booting a real machine with archlinux/systemd tomorrow.
> > >
> > > I got a good boot out of kernels that normally fail.  I booted
> > > systemd's emergency shell and enabled a few services, in the same
> > > order they normally start.  journald started cleanly, but after that,
> > > every single command produced a "traps:" output and an "audit:" line.
> > >
> > > I disabled systemd-journald (chmod -x, because `systemctl disable`
> > > didn't really disable it), and now it boots, no "traps:" in the log.
> > > If I run it, everything fails again (zsh has traps for simply pressing
> > > enter on an empty cmd).
> > 
> > That's some progress!
> > 
> > It's strange how one process manages to affect everything else.
> > 
> > "If I run it, everything fails again". How do you run it? Directly,
> > or via systemd services mechanism?
> > If you just run it directly, can you try running it under
> > "strace -f -tt -oLOG"? Does it have the same effect? What's in the LOG?
> 
> I'm hitting this bug as well, bisected to this commit.  On an old
> x86_64 box, no vms, paravirt, etc.  Running FC6 userspace (heh).
> 
> Quite late in initscripts, binaries start getting segmentation faults
> and init gives up.  Seems to only affect /usr/bin/rhgb-client.  There's
> one instance where /bin/rm is said to segfault, but I suspect that's
> init lying to me.

I note that that commit has been removed from today's version of the
luto-misc tree and thus linux-next.

-- 
Cheers,
Stephen Rothwell s...@canb.auug.org.au



[PATCH 1/2] cpumask: Properly calculate cpumask values

2015-02-25 Thread green

From: Oleg Drokin 

With CONFIG_CPUMASK_OFFSTACK enabled there seems to be some disparity
between the theoretical maximum number of CPUs in the system (NR_CPUS,
which is huge) and the actual value that is calculated at runtime
(nr_cpu_ids).

Functions like cpus_weight should only check up to nr_cpu_ids bits
in the cpu mask, as there's no point in going all the way to 8192 bits
of theoretically possible CPUs.
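
The effect of bounding the scan can be sketched with a plain bit-counting loop. This is a userspace illustration with hypothetical MODEL_ names; the kernel's bitmap helpers are implemented differently, and the value 8192 is taken from the description above.

```c
#include <limits.h>

#define MODEL_NR_CPUS 8192	/* compile-time maximum, as in the text */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define MODEL_WORDS ((MODEL_NR_CPUS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Count the set bits among the first nbits bits of the mask. */
static unsigned int model_weight(const unsigned long *bits,
				 unsigned int nbits)
{
	unsigned int w = 0;

	for (unsigned int i = 0; i < nbits; i++)
		if (bits[i / BITS_PER_WORD] & (1UL << (i % BITS_PER_WORD)))
			w++;
	return w;
}
```

When no bit is set beyond the runtime CPU count, scanning only that many bits gives the same weight as scanning all 8192 while examining far fewer bits. The two results differ only if stray bits are set above the runtime limit.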

Signed-off-by: Oleg Drokin 
---
 include/linux/cpumask.h | 25 +++--
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 086549a..f0599e1 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -902,21 +902,24 @@ static inline int __cpu_test_and_set(int cpu, cpumask_t *addr)
return test_and_set_bit(cpu, addr->bits);
 }
 
-#define cpus_and(dst, src1, src2) __cpus_and(&(dst), &(src1), &(src2), NR_CPUS)
+#define cpus_and(dst, src1, src2) __cpus_and(&(dst), &(src1), &(src2), \
+nr_cpumask_bits)
 static inline int __cpus_and(cpumask_t *dstp, const cpumask_t *src1p,
const cpumask_t *src2p, unsigned int nbits)
 {
return bitmap_and(dstp->bits, src1p->bits, src2p->bits, nbits);
 }
 
-#define cpus_or(dst, src1, src2) __cpus_or(&(dst), &(src1), &(src2), NR_CPUS)
+#define cpus_or(dst, src1, src2) __cpus_or(&(dst), &(src1), &(src2), \
+  nr_cpumask_bits)
 static inline void __cpus_or(cpumask_t *dstp, const cpumask_t *src1p,
const cpumask_t *src2p, unsigned int nbits)
 {
bitmap_or(dstp->bits, src1p->bits, src2p->bits, nbits);
 }
 
-#define cpus_xor(dst, src1, src2) __cpus_xor(&(dst), &(src1), &(src2), NR_CPUS)
+#define cpus_xor(dst, src1, src2) __cpus_xor(&(dst), &(src1), &(src2), \
+nr_cpumask_bits)
 static inline void __cpus_xor(cpumask_t *dstp, const cpumask_t *src1p,
const cpumask_t *src2p, unsigned int nbits)
 {
@@ -924,48 +927,50 @@ static inline void __cpus_xor(cpumask_t *dstp, const cpumask_t *src1p,
 }
 
 #define cpus_andnot(dst, src1, src2) \
-   __cpus_andnot(&(dst), &(src1), &(src2), NR_CPUS)
+   __cpus_andnot(&(dst), &(src1), &(src2), \
+ nr_cpumask_bits)
 static inline int __cpus_andnot(cpumask_t *dstp, const cpumask_t *src1p,
const cpumask_t *src2p, unsigned int nbits)
 {
return bitmap_andnot(dstp->bits, src1p->bits, src2p->bits, nbits);
 }
 
-#define cpus_equal(src1, src2) __cpus_equal(&(src1), &(src2), NR_CPUS)
+#define cpus_equal(src1, src2) __cpus_equal(&(src1), &(src2), nr_cpumask_bits)
 static inline int __cpus_equal(const cpumask_t *src1p,
const cpumask_t *src2p, unsigned int nbits)
 {
return bitmap_equal(src1p->bits, src2p->bits, nbits);
 }
 
-#define cpus_intersects(src1, src2) __cpus_intersects(&(src1), &(src2), NR_CPUS)
+#define cpus_intersects(src1, src2) __cpus_intersects(&(src1), &(src2), \
+ nr_cpumask_bits)
 static inline int __cpus_intersects(const cpumask_t *src1p,
const cpumask_t *src2p, unsigned int nbits)
 {
return bitmap_intersects(src1p->bits, src2p->bits, nbits);
 }
 
-#define cpus_subset(src1, src2) __cpus_subset(&(src1), &(src2), NR_CPUS)
+#define cpus_subset(src1, src2) __cpus_subset(&(src1), &(src2), nr_cpumask_bits)
 static inline int __cpus_subset(const cpumask_t *src1p,
const cpumask_t *src2p, unsigned int nbits)
 {
return bitmap_subset(src1p->bits, src2p->bits, nbits);
 }
 
-#define cpus_empty(src) __cpus_empty(&(src), NR_CPUS)
+#define cpus_empty(src) __cpus_empty(&(src), nr_cpumask_bits)
 static inline int __cpus_empty(const cpumask_t *srcp, unsigned int nbits)
 {
return bitmap_empty(srcp->bits, nbits);
 }
 
-#define cpus_weight(cpumask) __cpus_weight(&(cpumask), NR_CPUS)
+#define cpus_weight(cpumask) __cpus_weight(&(cpumask), nr_cpumask_bits)
 static inline int __cpus_weight(const cpumask_t *srcp, unsigned int nbits)
 {
return bitmap_weight(srcp->bits, nbits);
 }
 
 #define cpus_shift_left(dst, src, n) \
-   __cpus_shift_left(&(dst), &(src), (n), NR_CPUS)
+   __cpus_shift_left(&(dst), &(src), (n), nr_cpumask_bits)
 static inline void __cpus_shift_left(cpumask_t *dstp,
const cpumask_t *srcp, int n, int nbits)
 {
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 0/2] incorrect cpumask behavior with CPUMASK_OFFSTACK

2015-02-25 Thread green

From: Oleg Drokin 

I just got a report today from Tyson Whitehead 
that Lustre crashes when CPUMASK_OFFSTACK is enabled.

A little investigation revealed that this code:
cpumask_t   mask;
...
cpumask_copy(&mask, topology_thread_cpumask(0));
weight = cpus_weight(mask);

that was supposed to calculate number of cpu siblings/partitions returns
a crazy high number over 3000 which is impossible as I only have
8 cpu cores on my system.

So after a bit of digging I found out that
cpumask_copy only copies up to nr_cpumask_bits (the actual number of cpus I
have in the system),
whereas cpus_weight actually tries to count bits up to NR_CPUS.

Counting up to NR_CPUS is not only wasteful in this case (we know how many
cpus we have in the system, so it only makes sense to count that many
anyway), it is also wrong, because the copy only copied
8 bits to our variable and the rest of it is some random stack garbage.

So I propose two patches here. The first one, which I am more certain about,
converts operations that operate on the current set of cpus, like cpus_weight
but also cpus_empty, cpus_$LOGICALop and cpus_$BINARYop, from NR_CPUS to
nr_cpumask_bits (this is ok when CONFIG_CPUMASK_OFFSTACK is not set, as it's
then defined to NR_CPUS anyway).
I am leaving __cpus_setall and __cpus_clear out of it, as these two look like
they deal with the entire set, and it would be useful for them to operate on
all NR_CPUS bits in case more cpus are added later.

The second patch, which I am not sure we want but which seems useful
until struct cpumask is fully dynamic, converts what look like
whole-set operations (e.g. copies), namely
cpumask_setall, cpumask_clear and cpumask_copy, to always operate on NR_CPUS
bits, to ensure there's no stale garbage left in the mask should the
cpu count increase later.

I checked the code and allocating cpumasks on stack is not all that 
uncommon in the code, so this should be a worthwhile fix.

Please consider.

Oleg Drokin (2):
  cpumask: Properly calculate cpumask values
  cpumask: make whole cpumask operations like copy to work with NR_CPUS
bits

 include/linux/cpumask.h | 37 +
 1 file changed, 21 insertions(+), 16 deletions(-)

-- 
2.1.0


[PATCH 2/2] cpumask: make whole cpumask operations like copy to work with NR_CPUS bits

2015-02-25 Thread green

From: Oleg Drokin 

When we are doing things like cpumask_copy, and CONFIG_CPUMASK_OFFSTACK
is set, we only copy actual number of bits equal to number of CPUs
we have. But underlying allocations got NR_CPUS = 8192, so
if the cpumask is allocated on the stack or has other prefilled values
there's a lot of garbage left that might become exposed as more CPUs are
added into the system.
The patch converts such whole-mask functions:
cpumask_setall, cpumask_clear, cpumask_copy
to operate on the whole NR_CPUS value.

Signed-off-by: Oleg Drokin 
---
 include/linux/cpumask.h | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index f0599e1..28a8bb3 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -324,21 +324,21 @@ static inline int cpumask_test_and_clear_cpu(int cpu, struct cpumask *cpumask)
 }
 
 /**
- * cpumask_setall - set all cpus (< nr_cpu_ids) in a cpumask
+ * cpumask_setall - set all cpus (< NR_CPUS) in a cpumask
  * @dstp: the cpumask pointer
  */
 static inline void cpumask_setall(struct cpumask *dstp)
 {
-   bitmap_fill(cpumask_bits(dstp), nr_cpumask_bits);
+   bitmap_fill(cpumask_bits(dstp), NR_CPUS);
 }
 
 /**
- * cpumask_clear - clear all cpus (< nr_cpu_ids) in a cpumask
+ * cpumask_clear - clear all cpus (< NR_CPUS) in a cpumask
  * @dstp: the cpumask pointer
  */
 static inline void cpumask_clear(struct cpumask *dstp)
 {
-   bitmap_zero(cpumask_bits(dstp), nr_cpumask_bits);
+   bitmap_zero(cpumask_bits(dstp), NR_CPUS);
 }
 
 /**
@@ -470,7 +470,7 @@ static inline bool cpumask_full(const struct cpumask *srcp)
 
 /**
  * cpumask_weight - Count of bits in *srcp
- * @srcp: the cpumask to count bits (< nr_cpu_ids) in.
+ * @srcp: the cpumask to count bits (< NR_CPUS) in.
  */
 static inline unsigned int cpumask_weight(const struct cpumask *srcp)
 {
@@ -511,7 +511,7 @@ static inline void cpumask_shift_left(struct cpumask *dstp,
 static inline void cpumask_copy(struct cpumask *dstp,
const struct cpumask *srcp)
 {
-   bitmap_copy(cpumask_bits(dstp), cpumask_bits(srcp), nr_cpumask_bits);
+   bitmap_copy(cpumask_bits(dstp), cpumask_bits(srcp), NR_CPUS);
 }
 
 /**
-- 
2.1.0


[PATCH] x86, traps: install gates using IST after cpu_init().

2015-02-25 Thread Wang Nan

X86_TRAP_NMI, X86_TRAP_DF and X86_TRAP_MC use their own stack. Those
stacks are invalid until cpu_init() installs TSS.

This patch moves setting of the 3 gates after cpu_init().

Signed-off-by: Wang Nan 
---

If I understand correctly, logically speaking the original code is
incorrect.  However, it has not caused any real bug for several years.
I'm not sure whether this fix is practical or not; it fixes the gates
only for logical correctness.

---
 arch/x86/kernel/traps.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 4281988..cf7898e 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -962,7 +962,6 @@ void __init trap_init(void)
 #endif
 
set_intr_gate(X86_TRAP_DE, divide_error);
-   set_intr_gate_ist(X86_TRAP_NMI, &nmi, NMI_STACK);
/* int4 can be called from all */
set_system_intr_gate(X86_TRAP_OF, &overflow);
set_intr_gate(X86_TRAP_BR, bounds);
@@ -970,8 +969,6 @@ void __init trap_init(void)
set_intr_gate(X86_TRAP_NM, device_not_available);
 #ifdef CONFIG_X86_32
set_task_gate(X86_TRAP_DF, GDT_ENTRY_DOUBLEFAULT_TSS);
-#else
-   set_intr_gate_ist(X86_TRAP_DF, &double_fault, DOUBLEFAULT_STACK);
 #endif
set_intr_gate(X86_TRAP_OLD_MF, coprocessor_segment_overrun);
set_intr_gate(X86_TRAP_TS, invalid_TSS);
@@ -981,9 +978,6 @@ void __init trap_init(void)
set_intr_gate(X86_TRAP_SPURIOUS, spurious_interrupt_bug);
set_intr_gate(X86_TRAP_MF, coprocessor_error);
set_intr_gate(X86_TRAP_AC, alignment_check);
-#ifdef CONFIG_X86_MCE
-   set_intr_gate_ist(X86_TRAP_MC, &machine_check, MCE_STACK);
-#endif
set_intr_gate(X86_TRAP_XF, simd_coprocessor_error);
 
/* Reserve all the builtin and the syscall vector: */
@@ -1013,6 +1007,14 @@ void __init trap_init(void)
 */
cpu_init();
 
+   set_intr_gate_ist(X86_TRAP_NMI, &nmi, NMI_STACK);
+#ifndef CONFIG_X86_32
+   set_intr_gate_ist(X86_TRAP_DF, &double_fault, DOUBLEFAULT_STACK);
+#endif
+#ifdef CONFIG_X86_MCE
+   set_intr_gate_ist(X86_TRAP_MC, &machine_check, MCE_STACK);
+#endif
+
/*
 * X86_TRAP_DB and X86_TRAP_BP have been set
 * in early_trap_init(). However, DEBUG_STACK works only after
-- 
1.8.4


Re: [PATCH 1/7] thinkpad_acpi: Remember adaptive kbd presence

2015-02-25 Thread Darren Hart

On Fri, Feb 20, 2015 at 03:44:10PM +0100, Bastien Nocera wrote:
> Rather than checking on each suspend and resume whether the laptop
> has an adaptive keyboard, check when the driver is initialised.

Bastien, am I awaiting another version of this from you to address comments from
Henrique?

Henrique, when you're satisfied, please provide a Reviewed-by for the series.

Thanks,

-- 
Darren Hart
Intel Open Source Technology Center

[PATCH] mm: completely remove dumping per-cpu lists from show_mem()

2015-02-25 Thread Konstantin Khlebnikov

It seems nobody needs this.

Signed-off-by: Konstantin Khlebnikov 
---
 include/linux/mm.h |1 -
 mm/page_alloc.c|   22 ++
 2 files changed, 2 insertions(+), 21 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 9c21b42..6571dd78 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1126,7 +1126,6 @@ extern void pagefault_out_of_memory(void);
  * various contexts.
  */
 #define SHOW_MEM_FILTER_NODES  (0x0001u)   /* disallowed nodes */
-#define SHOW_MEM_PERCPU_LISTS  (0x0002u)   /* per-zone per-cpu */
 
 extern void show_free_areas(unsigned int flags);
 extern bool skip_free_areas_node(unsigned int flags, int nid);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a120bce..8ddcb0e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3254,7 +3254,6 @@ static void show_migration_types(unsigned char type)
  * Bits in @filter:
  * SHOW_MEM_FILTER_NODES: suppress nodes that are not allowed by current's
  *   cpuset.
- * SHOW_MEM_PERCPU_LISTS: display full per-node per-cpu pcp lists
  */
 void show_free_areas(unsigned int filter)
 {
@@ -3266,25 +3265,8 @@ void show_free_areas(unsigned int filter)
if (skip_free_areas_node(filter, zone_to_nid(zone)))
continue;
 
-   if (filter & SHOW_MEM_PERCPU_LISTS) {
-   show_node(zone);
-   printk("%s per-cpu:\n", zone->name);
-   }
-
-   for_each_online_cpu(cpu) {
-   struct per_cpu_pageset *pageset;
-
-   pageset = per_cpu_ptr(zone->pageset, cpu);
-
-   free_pcp += pageset->pcp.count;
-
-   if (!(filter & SHOW_MEM_PERCPU_LISTS))
-   continue;
-
-   printk("CPU %4d: hi:%5d, btch:%4d usd:%4d\n",
-  cpu, pageset->pcp.high,
-  pageset->pcp.batch, pageset->pcp.count);
-   }
+   for_each_online_cpu(cpu)
+   free_pcp += per_cpu_ptr(zone->pageset, cpu)->pcp.count;
}
 
printk("active_anon:%lu inactive_anon:%lu isolated_anon:%lu\n"


Re: [PATCH v4 11/20] power_supply: Change ownership from driver to core

2015-02-25 Thread Darren Hart

On Thu, Feb 26, 2015 at 01:45:22AM +0100, Sebastian Reichel wrote:
> Hi,
> 
> On Mon, Feb 23, 2015 at 12:47:32PM +0100, Krzysztof Kozlowski wrote:
> > Change the ownership of power_supply structure from each driver
> > implementing the class to the power supply core.
> > 
> > The patch changes power_supply_register() function thus all drivers
> > implementing power supply class are adjusted.
> > 
> > Each driver provides the implementation of power supply. However it
> > should not be the owner of power supply class instance because it is
> > exposed by core to other subsystems with power_supply_get_by_name().
> > These other subsystems have no knowledge when the driver will unregister
> > the power supply. This leads to several issues when driver is unbound -
> > mostly because user of power supply accesses freed memory.
> > 
> > Instead let the core own the instance of struct 'power_supply'.  Other
> > users of this power supply will still access valid memory because it
> > will be freed when device reference count reaches 0. Currently this
> > means "it will leak" but power_supply_put() call in next patches will
> > solve it.
> > 
> > This solves invalid memory references in following race condition
> > scenario:
> > 
> > Thread 1: charger manager
> > Thread 2: power supply driver, used by charger manager
> > 
> > THREAD 1 (charger manager)         THREAD 2 (power supply driver)
> > ==========================         ==============================
> > psy = power_supply_get_by_name()
> >                                    Driver unbind, .remove
> >                                      power_supply_unregister()
> >                                      Device fully removed
> > psy->get_property()
> > 
> > The 'get_property' call is executed in invalid context because the driver was
> > unbound and struct 'power_supply' memory was freed.
> > 
> > This could be observed easily with charger manager driver (here compiled
> > with max17040 fuel gauge):
> > 
> > $ cat /sys/devices/virtual/power_supply/cm-battery/capacity &
> > $ echo "1-0036" > /sys/bus/i2c/drivers/max17040/unbind
> > [   55.725123] Unable to handle kernel NULL pointer dereference at virtual 
> > address 
> > [   55.732584] pgd = d98d4000
> > [   55.734060] [] *pgd=5afa2831, *pte=, *ppte=
> > [   55.740318] Internal error: Oops: 8007 [#1] PREEMPT SMP ARM
> > [   55.746210] Modules linked in:
> > [   55.749259] CPU: 1 PID: 2936 Comm: cat Tainted: GW   
> > 3.19.0-rc1-next-20141226-00048-gf79f475f3c44-dirty #1496
> > [   55.760190] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
> > [   55.766270] task: d9b76f00 ti: daf54000 task.ti: daf54000
> > [   55.771647] PC is at 0x0
> > [   55.774182] LR is at charger_get_property+0x2f4/0x36c
> > [   55.779201] pc : [<>]lr : []psr: 6013
> > [   55.779201] sp : daf55e90  ip : 0003  fp : 
> > [   55.790657] r10:   r9 : c06e2878  r8 : d9b26c68
> > [   55.795865] r7 : dad81610  r6 : daec7410  r5 : daf55ebc  r4 : 
> > [   55.802367] r3 :   r2 : daf55ebc  r1 : 002a  r0 : d9b26c68
> > [   55.808879] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment 
> > user
> > [   55.815994] Control: 10c5387d  Table: 598d406a  DAC: 0015
> > [   55.821723] Process cat (pid: 2936, stack limit = 0xdaf54210)
> > [   55.827451] Stack: (0xdaf55e90 to 0xdaf56000)
> > [   55.831795] 5e80: 6013 c01459c4 
> > 002a c06f8ef8
> > [   55.839956] 5ea0: db651000 c06f8ef8 daebac00 c04cb668 daebac08 c0346864 
> >  c01459c4
> > [   55.848115] 5ec0: d99eaa80 c06f8ef8 0fff 1000 db651000 c027f25c 
> > c027f240 d99eaa80
> > [   55.856274] 5ee0: d9a06c00 c0146218 daf55f18 1000 d99eaa80 db4c18c0 
> > 0001 0001
> > [   55.864468] 5f00: daf55f80 c0144c78 c0144c54 c0107f90 00015000 d99eaab0 
> >  
> > [   55.872603] 5f20: 51c7  db4c18c0 c04a9370 00015000 1000 
> > daf55f80 1000
> > [   55.880763] 5f40: daf54000 00015000  c00e53dc db4c18c0 c00e548c 
> > 000d 8124
> > [   55.888937] 5f60: 0001   db4c18c0 db4c18c0 1000 
> > 00015000 c00e5550
> > [   55.897099] 5f80:   1000 1000 00015000 0003 
> > 0003 c000f364
> > [   55.905239] 5fa0:  c000f1a0 1000 00015000 0003 00015000 
> > 1000 0001333c
> > [   55.913399] 5fc0: 1000 00015000 0003 0003 0002  
> >  
> > [   55.921560] 5fe0: 7fffe000 be999850 a225 b6f3c19c 6010 0003 
> >  
> > [   55.929744] [] (charger_get_property) from [] 
> > (power_supply_show_property+0x48/0x20c)
> > [   55.939286] [] (power_supply_show_property) from [] 
> > (dev_attr_show+0x1c/0x48)
> > [   55.948130] [] (dev_attr_show) from [] 
> > (sysfs_kf_seq_show+0x84/0x104)
> > [   55.956298] [] (sysfs_kf_seq_show) from [] 
> > (kernfs_seq_show+0x2

Re: [PATCH v5 1/6] clk: add of_clk_get_parent_rate function

2015-02-25 Thread Ray Jui

Hi Sascha,

On 2/25/2015 9:54 PM, Sascha Hauer wrote:
> Hi Ray,
> 
> On Wed, Feb 04, 2015 at 04:55:00PM -0800, Ray Jui wrote:
>> Sometimes a clock needs to know the rate of its parent before itself is
>> registered to the framework. An example is that a PLL may need to
>> initialize itself to a specific VCO frequency, before registering to the
>> framework. The parent rate needs to be known, for PLL multipliers and
>> divisors to be configured properly.
>>
>> Introduce helper function of_clk_get_parent_rate, which can be used to
>> obtain the parent rate of a clock, given a device node and index.
> 
> I can't see how this patch helps you. First it's not guaranteed that
> the parent is already registered, what do you do in this case?

In the case when clock parent is not found, as you can see from the
code, it simply returns zero, just like other clk get rate APIs.

I thought the order of clock registration is based on order of the clock
nodes in device tree. It makes sense to me to declare the parent clock
before a child clock, so it's guaranteed that the parent is registered
before the child.

> Then the clock framework doesn't require that you initialize the PLL
> before registering. That can be done in the clk ops later.

Sure it's not mandatory. But what's wrong with me choosing to initialize
the PLL clock to a known frequency before registering it to the framework?

Ray

Re: [PATCH] platform: x86: dell-laptop: Add support for keyboard backlight

2015-02-25 Thread Darren Hart

On Sun, Feb 22, 2015 at 12:04:23PM +0100, Pali Rohár wrote:
> On Thursday 19 February 2015 11:58:29 Gabriele Mazzotta wrote:
> > This patch adds the support for the configuration of the
> > keyboard backlight on supported Dell laptops.
> > 
> > With this patch it is possible to set:
> > * keyboard backlight level
> > * timeout after which the backlight will be automatically turned off
> > * input activity triggers (keyboard, touchpad, mouse) that enable the
> >   backlight
> > * ambient light settings
> > 
> > The settings are exposed via
> > /sys/class/leds/dell::kbd_backlight/
> > 
> > The code is based on the newly released documentation by Dell
> > in the libsmbios project.
> > 
> > Signed-off-by: Pali Rohár 
> > Signed-off-by: Gabriele Mazzotta 
> > Cc: Dan Carpenter 
> > ---
> >  .../ABI/testing/sysfs-platform-dell-laptop |   69 ++
> >  drivers/platform/x86/dell-laptop.c         | 1089 +++-
> >  2 files changed, 1152 insertions(+), 6 deletions(-)
> >  create mode 100644 Documentation/ABI/testing/sysfs-platform-dell-laptop
> > 
> 
> This patch is the same as the combination of the previous ones; it just
> contains all changes in a single patch file. The difference between
> v3.19-rc5 and the current version is just that the ALS code was moved to
> separate functions and the ALS backlight can be enabled via a new sysfs
> attribute (as in the second patch that was sent).
> 
> I tested this patch on top of Linus' tree (commit 79513d0) and the
> keyboard backlight works fine.
> 
> Darren, what do you think, can this patch, which brings keyboard
> backlight support, finally be moved to the mainline kernel?

So sorry for the delay folks, bad week :( This landed too late for 3.20 (er,
4.0), but should make 4.1. Expect to see it in linux-next in the next few days.

-- 
Darren Hart
Intel Open Source Technology Center

[PATCH] thermal: fix the casting issue for long type

2015-02-25 Thread Leo Yan

When thermal is enabled on an arm64 platform, a failure is reported when
the function *thermal_zone_bind_cooling_device()* is called.

The failure is caused by casting. If the dtb specifies the minimum cooling
state and maximum cooling state as THERMAL_NO_LIMIT, then the variables
"lower" and "upper" equal 0x0000_0000_ffff_ffff after the value is cast
from "unsigned int" to "unsigned long".

When the kernel then checks the variables "lower" and "upper", they
never equal (-1UL), i.e. 0xffff_ffff_ffff_ffff.

So change them to the "unsigned int" type, which is compatible with both
32-bit and 64-bit systems.

Signed-off-by: Leo Yan 
---
 drivers/thermal/of-thermal.c   | 4 ++--
 drivers/thermal/thermal_core.c | 6 +++---
 include/linux/thermal.h| 2 +-
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/thermal/of-thermal.c b/drivers/thermal/of-thermal.c
index 668fb1b..b50c4e1 100644
--- a/drivers/thermal/of-thermal.c
+++ b/drivers/thermal/of-thermal.c
@@ -49,8 +49,8 @@ struct __thermal_bind_params {
struct device_node *cooling_device;
unsigned int trip_id;
unsigned int usage;
-   unsigned long min;
-   unsigned long max;
+   unsigned int min;
+   unsigned int max;
 };
 
 /**
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 48491d1..04de575 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -923,7 +923,7 @@ thermal_cooling_device_trip_point_show(struct device *dev,
 int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
 int trip,
 struct thermal_cooling_device *cdev,
-unsigned long upper, unsigned long lower)
+unsigned int upper, unsigned int lower)
 {
struct thermal_instance *dev;
struct thermal_instance *pos;
@@ -952,8 +952,8 @@ int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
return ret;
 
/* lower default 0, upper default max_state */
-   lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
-   upper = upper == THERMAL_NO_LIMIT ? max_state : upper;
+   lower = lower == (unsigned int)THERMAL_NO_LIMIT ? 0 : lower;
+   upper = upper == (unsigned int)THERMAL_NO_LIMIT ? max_state : upper;
 
if (lower > upper || upper > max_state)
return -EINVAL;
diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index fc52e30..a2afe04 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -321,7 +321,7 @@ void thermal_zone_device_unregister(struct thermal_zone_device *);
 
 int thermal_zone_bind_cooling_device(struct thermal_zone_device *, int,
 struct thermal_cooling_device *,
-unsigned long, unsigned long);
+unsigned int, unsigned int);
 int thermal_zone_unbind_cooling_device(struct thermal_zone_device *, int,
   struct thermal_cooling_device *);
 void thermal_zone_device_update(struct thermal_zone_device *);
-- 
1.9.1


Re: [PATCH] ARM: vf610: use SMP_ON_UP for Vybrid SoC

2015-02-25 Thread Shawn Guo

On Wed, Jan 21, 2015 at 12:12:45AM +0100, Stefan Agner wrote:
> The Vybrid SoC has only one Cortex-A5 core and hence should select
> the SMP_ON_UP configuration on a SMP kernel.
> 
> Signed-off-by: Stefan Agner 

Applied, thanks.

> ---
>  arch/arm/mach-imx/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm/mach-imx/Kconfig b/arch/arm/mach-imx/Kconfig
> index e8627e0..b8d3ccd 100644
> --- a/arch/arm/mach-imx/Kconfig
> +++ b/arch/arm/mach-imx/Kconfig
> @@ -634,6 +634,7 @@ config SOC_VF610
>   select ARM_GIC
>   select PINCTRL_VF610
>   select PL310_ERRATA_769419 if CACHE_L2X0
> + select SMP_ON_UP if SMP
>  
>   help
> This enable support for Freescale Vybrid VF610 processor.
> -- 
> 2.2.2
> 

Re: [PATCH v5 1/6] clk: add of_clk_get_parent_rate function

2015-02-25 Thread Sascha Hauer

Hi Ray,

On Wed, Feb 04, 2015 at 04:55:00PM -0800, Ray Jui wrote:
> Sometimes a clock needs to know the rate of its parent before itself is
> registered to the framework. An example is that a PLL may need to
> initialize itself to a specific VCO frequency, before registering to the
> framework. The parent rate needs to be known, for PLL multipliers and
> divisors to be configured properly.
> 
> Introduce helper function of_clk_get_parent_rate, which can be used to
> obtain the parent rate of a clock, given a device node and index.

I can't see how this patch helps you. First it's not guaranteed that
the parent is already registered, what do you do in this case?
Then the clock framework doesn't require that you initialize the PLL
before registering. That can be done in the clk ops later.

Sascha

-- 
Pengutronix e.K.   | |
Industrial Linux Solutions | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0|
Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |

Re: [PATCH v2] coresight-stm: adding driver for CoreSight STM component

2015-02-25 Thread Shawn Guo

On Wed, Feb 25, 2015 at 04:32:32PM -0700, Mathieu Poirier wrote:
> diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-stm b/Documentation/ABI/testing/sysfs-bus-coresight-devices-stm
> new file mode 100644
> index ..3ddb676831ab
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-stm
> @@ -0,0 +1,62 @@
> +What:		/sys/bus/coresight/devices/.stm/enable_source
> +Date:		February 2015
> +KernelVersion:	3.20

A random comment - there will never be a v3.20 kernel.

Shawn

> +Contact: Mathieu Poirier 
> +Description: (RW) Enable/disable tracing on this specific trace macrocell.
> + Enabling the trace macrocell implies it has been configured
> + properly and a sink has been identidifed for it.  The path
> + properly and a sink has been identified for it.  The path
> + configured and managed automatically by the coresight framework.

[PATCH] xen: avoid NULL pointer dereference in dom0 on large machines

2015-02-25 Thread Juergen Gross

Using the pvops kernel a NULL pointer dereference was detected on a
large machine (144 processors) when booting as dom0 in
evtchn_fifo_unmask() during assignment of a pirq.

The event channel in question was the first to need a new entry in
event_array[] in events_fifo.c. Unfortunately xen_irq_info_pirq_setup()
is called with evtchn being 0 for a new pirq and the real event channel
number is assigned to the pirq only during __startup_pirq().

It is mandatory to call xen_evtchn_port_setup() after assigning the
event channel number to the pirq to make sure all memory needed for the
event channel is allocated.

Signed-off-by: Juergen Gross 
---
 drivers/xen/events/events_base.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index b4bca2d..70fba97 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -526,20 +526,26 @@ static unsigned int __startup_pirq(unsigned int irq)
pirq_query_unmask(irq);
 
rc = set_evtchn_to_irq(evtchn, irq);
-   if (rc != 0) {
-   pr_err("irq%d: Failed to set port to irq mapping (%d)\n",
-  irq, rc);
-   xen_evtchn_close(evtchn);
-   return 0;
-   }
+   if (rc)
+   goto err;
+
bind_evtchn_to_cpu(evtchn, 0);
info->evtchn = evtchn;
 
+   rc = xen_evtchn_port_setup(info);
+   if (rc)
+   goto err;
+
 out:
unmask_evtchn(evtchn);
eoi_pirq(irq_get_irq_data(irq));
 
return 0;
+
+err:
+   pr_err("irq%d: Failed to set port to irq mapping (%d)\n", irq, rc);
+   xen_evtchn_close(evtchn);
+   return 0;
 }
 
 static unsigned int startup_pirq(struct irq_data *data)
-- 
2.1.4


Re: [PATCH v3 00/30] Refine PCI scan interfaces and make generic pci host bridge

2015-02-25 Thread Bjorn Helgaas

On Thu, Feb 26, 2015 at 09:29:17AM +0800, Yijing Wang wrote:
> v2->v3:
>   Rebase this series on v4.0-rc1.

Hm, still doesn't apply for me:

  11:48:15 ~/linux (pci/enumeration)$ git show --oneline | head -1
  c517d838eb7d Linux 4.0-rc1
  11:48:36 ~/linux (pci/enumeration)$ stg import -M --sign m/yw
  Checking for changes in the working directory ... done
  Importing patch "pci-rip-out" ... done
  Importing patch "pci-rip-out-0" ... done
  Importing patch "xen-pci-don-t-use-deprecated" ... done
  Importing patch "pci-remove-deprecated" ... done
  Importing patch "pci-rename-pci_scan_bus-to" ... done
  Importing patch "pci-combine-pci-domain-and-bus" ... done
  Importing patch "pci-pass-pci-domain-number" ... done
  Importing patch "pci-introduce" ... done
  Importing patch "pci-separate-pci_host_bridge" ... done
  Importing patch "pci-introduce-0" ... done
  Importing patch "pci-save-sysdata-in" ... error: patch failed: drivers/pci/host-bridge.c:58
  error: drivers/pci/host-bridge.c: patch does not apply
  error: patch failed: drivers/pci/probe.c:1954
  error: drivers/pci/probe.c: patch does not apply
  stg import: Diff does not apply cleanly
  11:48:52 ~/linux (pci/enumeration)$ 

[PATCH v2] x86, traps: Enable DEBUG_STACK after cpu_init() for TRAP_DB/BP.

2015-02-25 Thread Wang Nan

Before this patch early_trap_init() installs DEBUG_STACK for X86_TRAP_BP
and X86_TRAP_DB. However, DEBUG_STACK doesn't work correctly until
cpu_init() <-- trap_init().

This patch passes 0 to set_intr_gate_ist() and
set_system_intr_gate_ist() instead of DEBUG_STACK so that they use the
same stack as the kernel, and installs DEBUG_STACK for them in trap_init().

As the core runs at ring 0 between early_trap_init() and trap_init(),
there is no chance of getting a bad stack before trap_init().

As NMI is also enabled in trap_init(), we don't need to care about
is_debug_stack() and related things used in arch/x86/kernel/nmi.c.

Signed-off-by: Wang Nan 
Reviewed-by: Masami Hiramatsu 
---
v1 -> v2: Correct grammar issues in comments.
---
 arch/x86/kernel/traps.c | 21 +++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 9d2073e..4281988 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -925,9 +925,17 @@ dotraplinkage void do_iret_error(struct pt_regs *regs, long error_code)
 /* Set of traps needed for early debugging. */
 void __init early_trap_init(void)
 {
-   set_intr_gate_ist(X86_TRAP_DB, &debug, DEBUG_STACK);
+   /*
+* Don't set ist to DEBUG_STACK as it doesn't work until TSS is
+* ready in cpu_init() <-- trap_init(). Before trap_init(), CPU
+* runs at ring 0 so it is impossible to hit an invalid stack.
+* Using the original stack works well enough at this early
+* stage. DEBUG_STACK will be equipped after cpu_init() in
+* trap_init().
+*/
+   set_intr_gate_ist(X86_TRAP_DB, &debug, 0);
/* int3 can be called from all */
-   set_system_intr_gate_ist(X86_TRAP_BP, &int3, DEBUG_STACK);
+   set_system_intr_gate_ist(X86_TRAP_BP, &int3, 0);
 #ifdef CONFIG_X86_32
set_intr_gate(X86_TRAP_PF, page_fault);
 #endif
@@ -1005,6 +1013,15 @@ void __init trap_init(void)
 */
cpu_init();
 
+   /*
+* X86_TRAP_DB and X86_TRAP_BP have been set
+* in early_trap_init(). However, DEBUG_STACK works only after
+* cpu_init() loads TSS. See comments in early_trap_init().
+*/
+   set_intr_gate_ist(X86_TRAP_DB, &debug, DEBUG_STACK);
+   /* int3 can be called from all */
+   set_system_intr_gate_ist(X86_TRAP_BP, &int3, DEBUG_STACK);
+
x86_init.irqs.trap_init();
 
 #ifdef CONFIG_X86_64
-- 
1.8.4
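
As a rough illustration of the ordering described in the changelog
above, here is a small user-space C model. It is only a sketch: the
model_* names, the table size and the numeric value chosen for the
debug stack index are made up for this example and are not kernel
interfaces. The point is just that a non-zero IST index is refused
until the (modelled) TSS is ready, so the early path installs the
gates with 0 and the later path re-installs them with the debug stack.

```c
#include <assert.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum { MODEL_TRAP_DB = 1, MODEL_TRAP_BP = 3, MODEL_NR_TRAPS = 32 };
enum { MODEL_IST_NONE = 0, MODEL_DEBUG_STACK = 4 };	/* index is arbitrary */

struct model_gate {
	int installed;
	int ist;	/* 0 means "stay on the current stack" */
};

static struct model_gate model_idt[MODEL_NR_TRAPS];
static int model_tss_ready;

static void model_set_gate(int vector, int ist)
{
	/* A non-zero IST index is only usable once the TSS is loaded. */
	assert(ist == MODEL_IST_NONE || model_tss_ready);
	model_idt[vector].installed = 1;
	model_idt[vector].ist = ist;
}

/* Early phase: the TSS is not set up yet, so use the current stack. */
void model_early_trap_init(void)
{
	model_set_gate(MODEL_TRAP_DB, MODEL_IST_NONE);
	model_set_gate(MODEL_TRAP_BP, MODEL_IST_NONE);
}

/* Late phase: the TSS becomes ready, then the gates are re-installed. */
void model_trap_init(void)
{
	model_tss_ready = 1;	/* stands in for cpu_init() */
	model_set_gate(MODEL_TRAP_DB, MODEL_DEBUG_STACK);
	model_set_gate(MODEL_TRAP_BP, MODEL_DEBUG_STACK);
}

int model_gate_ist(int vector)
{
	return model_idt[vector].ist;
}
```

Calling model_early_trap_init() and then model_trap_init() leaves both
gates pointing at the modelled debug stack; requesting the debug stack
before the late phase would trip the assertion in model_set_gate().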


Re: [PATCH 32/35] clockevents: Fix cpu down race for hrtimer based broadcasting

2015-02-25 Thread Preeti U Murthy

On 02/23/2015 11:03 PM, Nicolas Pitre wrote:
> On Mon, 23 Feb 2015, Nicolas Pitre wrote:
> 
>> On Mon, 23 Feb 2015, Peter Zijlstra wrote:
>>
>>> The reported function that fails: bL_switcher_restore_cpus() is called
>>> in the error paths of the former and the main path in the latter to make
>>> the 'stolen' cpus re-appear.
>>>
>>> The patch in question somehow makes that go boom.
>>>
>>>
>>> Now what all do you need to do to make it go boom? Just enable/disable
>>> the switcher once and it'll explode? Or does it need to do actual
>>> switches while it is enabled?
>>
>> It gets automatically enabled during boot.  Then several switches are 
>> performed while user space is brought up.  If I manually disable it 
>> via /sys then it goes boom.
> 
> OK. Forget the bL switcher.  I configured it out of my kernel and then 
> managed to get the same crash by simply hotplugging out one CPU and 
> plugging it back in.
> 
> $ echo 0 > /sys/devices/system/cpu/cpu2/online
> [CPU2 gone]
> $ echo 1 > /sys/devices/system/cpu/cpu2/online
> [Boom!]
> 
> 
I saw an issue with this patch as well. I tried to do an SMT mode switch
on a Power machine, i.e. varying the number of hyperthreads on an SMT 8
system, and the system hangs. Worse, there are no softlockup
messages/warnings/bug_ons reported. I am digging into this issue.

A couple of points though. Looking at the patch, I see that we are
shutting down the tick device of the hotplugged-out cpu *much before*
migrating the timers and hrtimers from it. Migration of timers is done
in the CPU_DEAD phase, while we shut down tick devices in the CPU_DYING
phase. There is quite a bit of a gap here. Earlier we would do both in a
single notification.

Another point is that the tick devices are shut down before the
hotplugged-out cpu actually dies in __cpu_die(). At first look neither of
these two points should create any issues. But since we are noticing
problems with this patch, I thought it would be best to put them forth.

But why are tick devices being shut down that early? Is there any
specific advantage to this? Taking/handing over tick duties should be
done before __cpu_die(), but shutdown of tick devices should be done
after this phase. This seems more natural, doesn't it?
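
To make the ordering concern above concrete, here is a tiny user-space
C model of the teardown sequence as described in this mail. It is an
assumption-laden sketch: the model_* names and the step labels are
invented for illustration, and the real notifier chain does far more.
It only records that the tick shutdown (CPU_DYING) happens first, the
cpu then dies, and timers are migrated last (CPU_DEAD).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of the hotplug teardown order; not kernel code. */
enum model_phase { MODEL_CPU_DYING, MODEL_CPU_DIE, MODEL_CPU_DEAD };

#define MODEL_MAX_STEPS 8

static const char *model_log[MODEL_MAX_STEPS];
static int model_steps;

static void model_record(const char *what)
{
	assert(model_steps < MODEL_MAX_STEPS);
	model_log[model_steps++] = what;
}

/* Record what the patch under discussion does in each phase. */
static void model_notify(enum model_phase phase)
{
	switch (phase) {
	case MODEL_CPU_DYING:
		model_record("tick_shutdown");
		break;
	case MODEL_CPU_DIE:
		model_record("cpu_die");
		break;
	case MODEL_CPU_DEAD:
		model_record("migrate_timers");
		break;
	}
}

void model_cpu_down(void)
{
	model_steps = 0;
	model_notify(MODEL_CPU_DYING);
	model_notify(MODEL_CPU_DIE);
	model_notify(MODEL_CPU_DEAD);
}

/* Position of a step in the recorded sequence, or -1 if absent. */
int model_step_index(const char *what)
{
	int i;

	for (i = 0; i < model_steps; i++)
		if (strcmp(model_log[i], what) == 0)
			return i;
	return -1;
}
```

After model_cpu_down(), "tick_shutdown" sits before both "cpu_die" and
"migrate_timers", which is the gap being questioned here.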

Regards
Preeti U Murthy

> Nicolas
> 


[LKP] [mm] 8a0516ed8b9: -1.7% netperf.Throughput_Mbps, +2189.6% netperf.time.minor_page_faults, +3987.5% proc-vmstat.numa_pte_updates

2015-02-25 Thread Huang Ying

FYI, we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
commit 8a0516ed8b90c95ffa1363b420caa37418149f21 ("mm: convert p[te|md]_numa users to p[te|md]_protnone_numa")


testbox/testcase/testparams: lkp-sbx04/netperf/performance-900s-200%-TCP_MAERTS

e7bb4b6d1609cce3              8a0516ed8b90c95ffa1363b420
----------------              --------------------------
         %stddev     %change           %stddev
             \          |                  \
    226261 ±  1%    +2189.6%     5180560 ±  0%  netperf.time.minor_page_faults
       721 ±  0%       -1.7%         709 ±  0%  netperf.Throughput_Mbps
     12341 ± 16%     -100.0%           0 ±  0%  proc-vmstat.numa_pages_migrated
    364595 ±  3%     -100.0%           0 ±  0%  proc-vmstat.numa_hint_faults_local
    388922 ±  4%     -100.0%           0 ±  0%  proc-vmstat.numa_hint_faults
    226261 ±  1%    +2189.6%     5180560 ±  0%  time.minor_page_faults
    388831 ±  3%    +3987.5%    15893407 ±  0%  proc-vmstat.numa_pte_updates
     12341 ± 16%     -100.0%           0 ±  0%  proc-vmstat.pgmigrate_success
        47 ± 42%      -60.3%          18 ± 13%  sched_debug.cfs_rq[5]:/.blocked_load_avg
        73 ± 19%      -53.9%          34 ± 18%  sched_debug.cfs_rq[46]:/.load
        32 ± 20%      +75.0%          56 ± 32%  sched_debug.cpu#32.load
        27 ± 37%      +61.1%          43 ± 27%  sched_debug.cfs_rq[15]:/.blocked_load_avg
        54 ± 20%      -43.8%          30 ±  5%  sched_debug.cfs_rq[17]:/.load
        57 ± 30%      -39.8%          34 ± 17%  sched_debug.cfs_rq[53]:/.load
        70 ± 29%      -41.3%          41 ±  8%  sched_debug.cfs_rq[5]:/.tg_load_contrib
        64 ± 20%      -27.9%          46 ± 14%  sched_debug.cpu#26.load
        34 ± 21%      +68.6%          57 ±  1%  sched_debug.cfs_rq[15]:/.load
        60 ± 21%      -28.2%          43 ± 26%  sched_debug.cfs_rq[6]:/.load
        50 ± 18%      +33.2%          67 ± 18%  sched_debug.cfs_rq[15]:/.tg_load_contrib
        62 ± 28%      -40.6%          37 ± 32%  sched_debug.cfs_rq[30]:/.load
        59 ± 18%      -33.5%          39 ± 14%  sched_debug.cfs_rq[62]:/.load
       556 ± 25%      -54.2%         255 ± 36%  sched_debug.cpu#59.sched_goidle
      1.63 ±  2%      -31.2%        1.12 ±  0%  perf-profile.cpu-cycles._raw_spin_lock.free_one_page.__free_pages_ok.free_compound_page.put_compound_page
        50 ± 40%      -35.5%          32 ± 16%  sched_debug.cpu#43.load
        31 ± 18%      +39.7%          44 ± 22%  sched_debug.cpu#53.load
      2.18 ±  3%      -29.0%        1.55 ±  3%  perf-profile.cpu-cycles.free_one_page.__free_pages_ok.free_compound_page.put_compound_page.put_page
        46 ± 13%      -37.6%          29 ± 31%  sched_debug.cfs_rq[16]:/.blocked_load_avg
        51 ± 26%      -36.6%          32 ±  6%  sched_debug.cpu#7.load
        73 ± 13%      -20.8%          58 ±  9%  sched_debug.cfs_rq[51]:/.tg_load_contrib
      1.77 ±  2%      -25.1%        1.33 ±  1%  perf-profile.cpu-cycles._raw_spin_lock_irqsave.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_current.skb_page_frag_refill
        58 ± 23%      -38.4%          35 ± 24%  sched_debug.cfs_rq[2]:/.load
   8833788 ±  8%      +22.5%    10821104 ± 12%  sched_debug.cfs_rq[12]:/.max_vruntime
   8833787 ±  8%      +22.5%    10821104 ± 12%  sched_debug.cfs_rq[12]:/.MIN_vruntime
      1951 ± 12%      +20.1%        2343 ± 12%  sched_debug.cpu#9.curr->pid
    112948 ±  2%      +25.6%      141909 ± 11%  sched_debug.cpu#32.sched_count
      1955 ±  9%      +17.3%        2293 ±  9%  sched_debug.cpu#46.curr->pid
   9533920 ± 16%      +31.8%    12561711 ± 13%  sched_debug.cfs_rq[53]:/.max_vruntime
   9533919 ± 16%      +31.8%    12561711 ± 13%  sched_debug.cfs_rq[53]:/.MIN_vruntime
      0.97 ± 10%      -15.7%        0.82 ±  6%  perf-profile.cpu-cycles.tcp_send_mss.tcp_sendmsg.inet_sendmsg.do_sock_sendmsg.SYSC_sendto
     59313 ± 24%      -21.3%       46703 ±  2%  sched_debug.cpu#25.ttwu_count
      3.92 ±  2%      -17.1%        3.25 ±  0%  perf-profile.cpu-cycles.put_compound_page.put_page.skb_release_data.skb_release_all.__kfree_skb
      3.72 ±  2%      -16.4%        3.11 ±  0%  perf-profile.cpu-cycles.free_compound_page.put_compound_page.put_page.skb_release_data.skb_release_all
      3.65 ±  1%      -16.8%        3.04 ±  0%  perf-profile.cpu-cycles.__free_pages_ok.free_compound_page.put_compound_page.put_page.skb_release_data
      1853 ±  9%      +15.7%        2144 ±  5%  sched_debug.cpu#45.curr->pid
      1769 ±  4%      +19.9%        2121 ±  6%  sched_debug.cpu#61.curr->pid
      5.97 ±  2%      -16.1%        5.01 ±  0%  perf-profile.cpu-cycles.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_local_deliver_finish.ip_local_deliver
      1.59 ±  2%      -14.2%        1.37 ±  2%  perf-profile.cpu-cycles.sk_stream_alloc_skb.tcp_sendmsg.inet_sendmsg.do_sock_sendmsg.SYSC_sendto
      2.65 ±  3%      -17.1%        2.20 ±  1%  perf-profile.cpu-cycles.tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.tcp_rcv_established.tcp_v4_do_rcv
      6.19 ±  1%      -15.3%

mmotm 2015-02-25-21-19 uploaded

2015-02-25 Thread akpm

The mm-of-the-moment snapshot 2015-02-25-21-19 has been uploaded to

   http://www.ozlabs.org/~akpm/mmotm/

mmotm-readme.txt says

README for mm-of-the-moment:

http://www.ozlabs.org/~akpm/mmotm/

This is a snapshot of my -mm patch queue.  Uploaded at random hopefully
more than once a week.

You will need quilt to apply these patches to the latest Linus release (3.x
or 3.x-rcY).  The series file is in broken-out.tar.gz and is duplicated in
http://ozlabs.org/~akpm/mmotm/series

The file broken-out.tar.gz contains two datestamp files: .DATE and
.DATE-yyyy-mm-dd-hh-mm-ss.  Both contain the string yyyy-mm-dd-hh-mm-ss,
followed by the base kernel version against which this patch series is to
be applied.

This tree is partially included in linux-next.  To see which patches are
included in linux-next, consult the `series' file.  Only the patches
within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
linux-next.

A git tree which contains the memory management portion of this tree is
maintained at git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
by Michal Hocko.  It contains the patches which are between the
"#NEXT_PATCHES_START mm" and "#NEXT_PATCHES_END" markers, from the series
file, http://www.ozlabs.org/~akpm/mmotm/series.


A full copy of the full kernel tree with the linux-next and mmotm patches
already applied is available through git within an hour of the mmotm
release.  Individual mmotm releases are tagged.  The master branch always
points to the latest release, so it's constantly rebasing.

http://git.cmpxchg.org/?p=linux-mmotm.git;a=summary

To develop on top of mmotm git:

  $ git remote add mmotm git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
  $ git remote update mmotm
  $ git checkout -b topic mmotm/master
  <make changes, commit>
  $ git send-email mmotm/master.. [...]

To rebase a branch with older patches to a new mmotm release:

  $ git remote update mmotm
  $ git rebase --onto mmotm/master <topic base> topic




The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
contains daily snapshots of the -mm tree.  It is updated more frequently
than mmotm, and is untested.

A git copy of this tree is available at

http://git.cmpxchg.org/?p=linux-mmots.git;a=summary

and use of this tree is similar to
http://git.cmpxchg.org/?p=linux-mmotm.git, described above.


This mmotm tree contains the following patches against 4.0-rc1:
(patches marked "*" will be included in linux-next)

  origin.patch
  arch-alpha-kernel-systblss-remove-debug-check.patch
  i-need-old-gcc.patch
* ocfs2-update-web-page-git-tree-in-documentation.patch
* mm-nommu-fix-memory-leak.patch
* memcg-fix-low-limit-calculation.patch
* rtc-ds1685-fix-ds1685_rtc_alarm_irq_enable-build-error.patch
* rtc-ds1685-remove-superfluous-checks-for-out-of-range-u8-values.patch
* scripts-gdb-add-empty-package-initialization-script.patch
* nilfs2-fix-potential-memory-overrun-on-inode.patch
* nilfs2-fix-potential-memory-overrun-on-inode-fix.patch
* rtc-ds1685-fix-conditional-in-ds1685_rtc_sysfs_time_regs_showstore.patch
* zram-use-proper-type-to-update-max_used_pages.patch
* mm-memcontrol-use-max-instead-of-infinity-in-control-knobs.patch
* kernel-sysc-fix-uname26-for-40.patch
* kernel-sysc-fix-uname26-for-40-fix.patch
* mm-page_alloc-revert-inadvertent-__gfp_fs-retry-behavior-change.patch
* fs-ext4-fsyncc-generic_file_fsync-call-based-on-barrier-flag.patch
* ocfs2-remove-unneeded-rc-for-kfree.patch
* ocfs2-deletion-of-unnecessary-checks-before-three-function-calls.patch
* ocfs2-less-function-calls-in-ocfs2_convert_inline_data_to_extents-after-error-detection.patch
* ocfs2-less-function-calls-in-ocfs2_figure_merge_contig_type-after-error-detection.patch
* ocfs2-one-function-call-less-in-ocfs2_merge_rec_left-after-error-detection.patch
* ocfs2-one-function-call-less-in-ocfs2_merge_rec_right-after-error-detection.patch
* ocfs2-one-function-call-less-in-ocfs2_init_slot_info-after-error-detection.patch
* ocfs2-one-function-call-less-in-user_cluster_connect-after-error-detection.patch
* ocfs2-avoid-a-pointless-delay-in-o2cb_cluster_check.patch
* ocfs2-use-64bit-variables-to-track-heartbeat-time.patch
* ocfs2-call-ocfs2_journal_access_di-before-ocfs2_journal_dirty-in-ocfs2_write_end_nolock.patch
* ocfs2-avoid-access-invalid-address-when-read-o2dlm-debug-messages.patch
* ocfs2-avoid-access-invalid-address-when-read-o2dlm-debug-messages-v3.patch
* block-restore-proc-partitions-to-not-display-non-partitionable-removable-devices.patch
* watchdog-new-definitions-and-variables-initialization.patch
* watchdog-introduce-the-proc_watchdog_update-function.patch
* watchdog-move-definition-of-watchdog_proc_mutex-outside-of-proc_dowatchdog.patch
* watchdog-introduce-the-proc_watchdog_common-function.patch
* watchdog-introduce-separate-handlers-for-parameters-in-proc-sys-kernel.patch
* watchdog-implement-error-handling-for-failure-to-set-up-hardware-perf-events.patch
* watchdog-enable-the-new-user-interface-of-the-watchdog-mechanism.pa

Re: [PATCH 2/3 v3] x86: entry_64.S: always allocate complete "struct pt_regs"

2015-02-25 Thread Andrew Morton

On Thu, 26 Feb 2015 02:12:57 +0100 Denys Vlasenko wrote:

> On Thu, Feb 26, 2015 at 12:34 AM, Sabrina Dubroca wrote:
> > 2015-02-25, 23:40:55 +0100, Sabrina Dubroca wrote:
> >> I can run some userspace programs, but I have no idea what would be
> >> helpful.
> >> I can also try booting a real machine with archlinux/systemd tomorrow.
> >
> > I got a good boot out of kernels that normally fail.  I booted
> > systemd's emergency shell and enabled a few services, in the same
> > order they normally start.  journald started cleanly, but after that,
> > every single command produced a "traps:" output and an "audit:" line.
> >
> > I disabled systemd-journald (chmod -x, because `systemctl disable`
> > didn't really disable it), and now it boots, no "traps:" in the log.
> > If I run it, everything fails again (zsh has traps for simply pressing
> > enter on an empty cmd).
> 
> That's some progress!
> 
> It's strange how one process manages to affect everything else.
> 
> "If I run it, everything fails again". How do you run it? Directly,
> or via systemd services mechanism?
> If you just run it directly, can you try running it under
> "strace -f -tt -oLOG"? Does it have the same effect? What's in the LOG?

I'm hitting this bug as well, bisected to this commit.  On an old
x86_64 box, no vms, paravirt, etc.  Running FC6 userspace (heh).

Quite late in initscripts, binaries start getting segmentation faults
and init gives up.  Seems to only affect /usr/bin/rhgb-client.  There's
one instance where /bin/rm is said to segfault, but I suspect that's
init lying to me.



RE: 0001-media-vb2-Fill-vb2_buffer-with-bytesused-from-user.patch; kernel version 3.10.69

2015-02-25 Thread Sudip JAIN

Hello Jeremiah,

Please find the patch  "inline"

commit 3390900680e5182998916c8fa231bc79cd84046b
Author: Sudip Jain 
Date:   Thu Feb 26 10:40:34 2015 +0530

media: vb2: Fill vb2_buffer with bytesused from user

In vb2_qbuf for the dmabuf memory type, the user-side bytesused is not
copied into the vb2 buffer. This leads to a garbage value being copied
from __qbuf_dmabuf() back to the user in __fill_v4l2_buffer().

By default, the vb2 framework must trust the user-side value, while
still allowing the driver's buffer prepare function to modify or update
it if it prefers.

Applied on kernel version 3.10.69

Change-Id: Ieda389403898935f59c2e2994106f3e5238cfefd
Signed-off-by: Sudip Jain 

diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
index 5e47ba4..54fe9c9 100644
--- a/drivers/media/v4l2-core/videobuf2-core.c
+++ b/drivers/media/v4l2-core/videobuf2-core.c
@@ -919,6 +919,8 @@ static void __fill_vb2_buffer(struct vb2_buffer *vb, const struct v4l2_buffer *b
b->m.planes[plane].m.fd;
v4l2_planes[plane].length =
b->m.planes[plane].length;
+   v4l2_planes[plane].bytesused =
+   b->m.planes[plane].bytesused;
v4l2_planes[plane].data_offset =
b->m.planes[plane].data_offset;
}
@@ -943,6 +945,7 @@ static void __fill_vb2_buffer(struct vb2_buffer *vb, const struct v4l2_buffer *b
if (b->memory == V4L2_MEMORY_DMABUF) {
v4l2_planes[0].m.fd = b->m.fd;
v4l2_planes[0].length = b->length;
+   v4l2_planes[0].bytesused = b->bytesused;
v4l2_planes[0].data_offset = 0;
}

Thanks,
Sudip

From: Jeremiah Mahler [jmmah...@gmail.com]
Sent: Wednesday, February 25, 2015 11:53 PM
To: Sudip JAIN
Cc: linux-me...@vger.kernel.org; linux-kernel@vger.kernel.org
Subject: Re: 0001-media-vb2-Fill-vb2_buffer-with-bytesused-from-user.patch

Sudip,

On Wed, Feb 25, 2015 at 03:29:22PM +0800, Sudip JAIN wrote:
> Dear Maintainer,
>
> PFA attached patch that prevents user from being returned garbage bytesused 
> value from vb2 framework.
>
> Regards,
> Sudip Jain
>

Patches should never be submitted as attachments, they should be inline.

See Documentation/SubmittingPatches for more info.

[...]

--
- Jeremiah Mahler

[LKP] [drm/i915] f9b61ff6bce: +178.7% piglit.time.elapsed_time

2015-02-25 Thread Huang Ying

FYI, we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
commit f9b61ff6bce9a44555324b29e593fdffc9a115bc ("drm/i915: Push vblank enable/disable past encoder->enable/disable")


testbox/testcase/testparams: lkp-t410/piglit/performance-igt-069

676fa5721c2eece4              f9b61ff6bce9a44555324b29e5
----------------              --------------------------
         %stddev     %change           %stddev
             \          |                  \
      6743 ±  0%     +222.7%       21759 ±  6%  piglit.time.voluntary_context_switches
     73.50 ±  0%     +178.7%      204.84 ±  0%  piglit.time.elapsed_time.max
     73.50 ±  0%     +178.7%      204.84 ±  0%  piglit.time.elapsed_time
         5 ±  8%      -61.9%           2 ±  0%  piglit.time.percent_of_cpu_this_job_got
         0 ±  0%       +Inf%           1 ±  0%  vmstat.procs.b
      6743 ±  0%     +222.7%       21759 ±  6%  time.voluntary_context_switches
     73.50 ±  0%     +178.7%      204.84 ±  0%  time.elapsed_time.max
     73.50 ±  0%     +178.7%      204.84 ±  0%  time.elapsed_time
         5 ±  8%      -61.9%           2 ±  0%  time.percent_of_cpu_this_job_got
       229 ±  5%     +101.5%         461 ±  5%  time.involuntary_context_switches
      1.37 ±  4%      +63.3%        2.24 ±  3%  time.system_time
      9518 ±  0%      -46.1%        5130 ±  5%  vmstat.system.in
     18657 ±  0%      -44.9%       10286 ±  5%  vmstat.system.cs
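
For readers checking the %change column, the figures are consistent
with a plain relative difference between the two commits' averages.
The helper below is a minimal sketch under that assumption; the name
pct_change() is made up here and is not part of the LKP tooling.

```c
#include <assert.h>

/*
 * Relative change from the base commit's average to the head commit's
 * average, in percent. Hypothetical helper for illustration only.
 */
double pct_change(double base, double head)
{
	assert(base != 0.0);	/* a zero base gives the "+Inf%" rows */
	return (head - base) / base * 100.0;
}
```

For piglit.time.elapsed_time this is (204.84 - 73.50) / 73.50 * 100,
which rounds to the +178.7% reported in the table.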

lkp-t410: Westmere
Memory: 2G




[Plot: piglit.time.elapsed_time per sample, y-axis 60 to 280.
 bisect-good samples (*) lie mostly near 80; bisect-bad samples (O)
 lie mostly near 200.]

To reproduce:

apt-get install ruby
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/setup-local job.yaml # the job file attached in this email
bin/run-local   job.yaml


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Ying Huang

---
testcase: piglit
default-monitors:
  wait: pre-test
  vmstat: 
default_watchdogs:
  watch-oom: 
  watchdog: 
cpufreq_governor: performance
commit: 5aeb2a3dc4f0ea47fe0df3cb3af75ef813dda833
model: Westmere
memory: 2G
hdd_partitions: "/dev/disk/by-id/ata-FUJITSU_MJA2250BH_G2_K95CT9C2G29W-part6"
swap_partitions: 
rootfs_partition: "/dev/disk/by-id/ata-FUJITSU_MJA2250BH_G2_K95CT9C2G29W-part7"
timeout: 30m
piglit:
  group: igt-069
testbox: lkp-t410
tbox_group: lkp-t410
kconfig: x86_64-rhel
enqueue_time: 2015-02-13 13:05:22.018177529 +08:00
head_commit: 5aeb2a3dc4f0ea47fe0df3cb3af75ef813dda833
base_commit: bfa76d49576599a4b9f9b7a71f23d73d6dcff735
branch: linux-devel/devel-hourly-2015021623
kernel: "/kernel/x86_64-rhel/5aeb2a3dc4f0ea47fe0df3cb3af75ef813dda833/vmlinuz-3.19.0-wl-ath-02305-g5aeb2a3"
user: lkp
queue: cyclic
rootfs: debian-x86_64-2015-02-07.cgz
result_root: "/result/lkp-t410/piglit/performance-igt-069/debian-x86_64-2015-02-07.cgz/x86_64-rhel/5aeb2a3dc4f0ea47fe0df3cb3af75ef813dda833/0"
job_file: "/lkp/scheduled/lkp-t410/cyclic_piglit-performance-igt-069-x86_64-rhel-HEAD-5aeb2a3dc4f0ea47fe0df3cb3af75ef813dda833-0-20150213-66430-1vmm11b.yaml"
dequeue_time: 2015-02-17 22:14:14.245245054 +08:00
nr_cpu: "$(nproc)"
job_state: finished
loadavg: 1.66 0.56 0.20 1/169 767
start_time: '1424182490'
end_time: '1424182696'
version: "/lkp/lkp/.src-20150217-210101"
echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
piglit run igt -t igt/kms_flip/vblank-vs-dpms-suspend-interruptible 
/tmp/lkp/pigl


[LKP] [futex] 76835b0ebf8: -8.1% will-it-scale.per_thread_ops

2015-02-25 Thread Huang Ying

FYI, we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
commit 76835b0ebf8a7fe85beb03c75121419a7dec52f0 ("futex: Ensure get_futex_key_refs() always implies a barrier")


testbox/testcase/testparams: lkp-wsx01/will-it-scale/performance-futex4

0429fbc0bdc297d6              76835b0ebf8a7fe85beb03c751
----------------              --------------------------
         %stddev     %change           %stddev
             \          |                  \
   6314259 ±  0%       -8.1%     5800079 ±  0%  will-it-scale.per_thread_ops
   6274871 ±  0%       -8.1%     5768747 ±  0%  will-it-scale.per_process_ops
      0.64 ±  0%       +4.6%        0.67 ±  1%  will-it-scale.scalability
      0.79 ±  2%     +716.1%        6.48 ±  1%  perf-profile.cpu-cycles.get_futex_key_refs.isra.11.futex_wait_setup.futex_wait.do_futex.sys_futex
         2 ± 44%     +200.0%           6 ± 21%  sched_debug.cpu#79.nr_uninterruptible
      1320 ± 49%      -64.8%         464 ± 15%  sched_debug.cpu#61.ttwu_count
       167 ± 21%      -45.9%          90 ± 49%  sched_debug.cfs_rq[61]:/.blocked_load_avg
         7 ± 18%      +48.6%          10 ± 28%  sched_debug.cfs_rq[25]:/.load
         7 ± 18%      +60.0%          11 ± 34%  sched_debug.cpu#25.load
       175 ± 20%      -44.3%          97 ± 47%  sched_debug.cfs_rq[61]:/.tg_load_contrib
      2406 ± 49%      -58.3%        1003 ± 25%  sched_debug.cpu#61.nr_switches
      2417 ± 49%      -58.1%        1014 ± 25%  sched_debug.cpu#61.sched_count
       613 ± 19%      -34.6%         401 ± 25%  sched_debug.cpu#61.sched_goidle
      4.56 ±  1%      +37.4%        6.26 ±  2%  perf-profile.cpu-cycles.get_futex_key.futex_wait_setup.futex_wait.do_futex.sys_futex
     85583 ±  9%      -14.8%       72913 ±  7%  sched_debug.cpu#0.nr_load_updates
     29.29 ±  0%      +19.2%       34.90 ±  2%  perf-profile.cpu-cycles.futex_wait_setup.futex_wait.do_futex.sys_futex.system_call_fastpath
      1.05 ±  3%      -10.6%        0.94 ±  1%  perf-profile.cpu-cycles.testcase
      2.43 ±  2%      -10.4%        2.18 ±  0%  perf-profile.cpu-cycles.sysret_check.syscall
     84405 ±  7%      -11.0%       75139 ±  7%  sched_debug.cfs_rq[0]:/.exec_clock
      1.07 ±  2%      -14.7%        0.91 ±  2%  perf-profile.cpu-cycles._raw_spin_unlock.futex_wait_setup.futex_wait.do_futex.sys_futex
      5.91 ±  0%      -10.2%        5.31 ±  2%  perf-profile.cpu-cycles._raw_spin_lock.futex_wait_setup.futex_wait.do_futex.sys_futex
     66640 ±  5%       +5.7%       70433 ±  6%  sched_debug.cpu#10.nr_load_updates
      4274 ±  3%      -12.0%        3762 ±  7%  sched_debug.cpu#21.curr->pid

testbox/testcase/testparams: wsm/will-it-scale/performance-futex3

0429fbc0bdc297d6              76835b0ebf8a7fe85beb03c751
----------------              --------------------------
  11676004 ±  0%      -10.3%        1047 ±  0%  will-it-scale.per_thread_ops
  11515138 ±  0%       -8.8%    10501984 ±  0%  will-it-scale.per_process_ops
      0.69 ±  3%       +8.2%        0.75 ±  1%  will-it-scale.scalability
      1.76 ±  4%     +364.0%        8.18 ±  0%  perf-profile.cpu-cycles.get_futex_key_refs.isra.11.futex_wake.do_futex.sys_futex.system_call_fastpath
  76838319 ± 12%      +24.4%    95586476 ±  5%  cpuidle.POLL.time
    163113 ± 44%      +89.7%      309491 ± 14%  sched_debug.cfs_rq[6]:/.spread0
     16.31 ±  1%      +40.2%       22.86 ±  0%  perf-profile.cpu-cycles.futex_wake.do_futex.sys_futex.system_call_fastpath.syscall
        89 ± 17%      -26.8%          65 ± 24%  sched_debug.cfs_rq[2]:/.load
        88 ± 19%      -24.1%          66 ± 23%  sched_debug.cpu#2.load
       100 ± 11%      +20.3%         121 ± 13%  sched_debug.cpu#6.load
        87 ± 10%      -24.6%          66 ± 10%  sched_debug.cfs_rq[1]:/.load
       787 ± 13%      -22.1%         613 ±  9%  sched_debug.cfs_rq[4]:/.blocked_load_avg
      7.05 ±  0%      +12.0%        7.89 ±  1%  perf-profile.cpu-cycles.get_futex_key.futex_wake.do_futex.sys_futex.system_call_fastpath
      2132 ± 11%      +21.8%        2597 ± 12%  cpuidle.C1-NHM.usage
        77 ±  9%      -15.4%          65 ± 10%  sched_debug.cfs_rq[1]:/.runnable_load_avg
       100 ± 13%      +17.4%         118 ± 10%  sched_debug.cpu#6.cpu_load[1]
       101 ± 14%      +17.8%         119 ± 10%  sched_debug.cpu#6.cpu_load[2]
        85 ± 10%      -19.7%          68 ±  8%  sched_debug.cpu#1.load
     38.14 ±  0%      +13.0%       43.08 ±  0%  perf-profile.cpu-cycles.do_futex.sys_futex.system_call_fastpath.syscall
    272.17 ±  0%       -9.3%      246.76 ±  0%  time.user_time
      3.24 ±  4%      -12.3%        2.84 ±  2%  perf-profile.cpu-cycles.testcase
     43.30 ±  0%      +10.3%       47.76 ±  0%  perf-profile.cpu-cycles.sys_futex.system_call_fastpath.syscall
      3152 ±  6%      -12.5%        2758 ±  8%  sched_debug.cpu#2.curr->pid
        74 ±  4%      -13.5%          64 ±  7%  sched_debug.cpu#1.cpu_load[0]
     11.00 ±  2%      -10.8%        9.81 ±  1%  perf-profile.cpu-cycles.system_call_after_swapgs.syscall
     10.10 ±  1%      -14.3%        8.66 ±  1%  perf-profile.cpu-cycles.system_call.sysca

Re: [PATCH v4 1/4] time: Add needed macros for timekeeping_inject_sleeptime64()

2015-02-25 Thread John Stultz

On Sun, Feb 15, 2015 at 5:09 AM, Xunlei Pang  wrote:
> From: Xunlei Pang 
>
> timekeeping_inject_sleeptime64() is only used by RTC suspend/resume,
> so embrace it in RTC related macros.
>
> Signed-off-by: Xunlei Pang 
> ---
>  kernel/time/timekeeping.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index b124af2..d78a528 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -1125,6 +1125,9 @@ static void __timekeeping_inject_sleeptime(struct 
> timekeeper *tk,
> tk_debug_account_sleep_time(delta);
>  }
>
> +#if defined(CONFIG_RTC_CLASS) && \
> +   defined(CONFIG_PM_SLEEP) && \
> +   defined(CONFIG_RTC_HCTOSYS_DEVICE)

So RTC_HCTOSYS_DEVICE implies RTC_CLASS, so that could be simplified a bit...

thanks
-john
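
A minimal sketch of the simplification suggested above, written as a
stand-alone C file so the two guards can be compared. The CONFIG_*
defines below are stand-ins chosen for the example (in a kernel build
they come from Kconfig), and the MODEL_* names are hypothetical. It
assumes, as stated above, that RTC_HCTOSYS_DEVICE can only be set when
RTC_CLASS is set.

```c
#include <assert.h>

/* Stand-ins for Kconfig output; values are assumptions for the example. */
#define CONFIG_RTC_CLASS 1
#define CONFIG_PM_SLEEP 1
#define CONFIG_RTC_HCTOSYS_DEVICE "rtc0"

/* Guard as posted in the patch. */
#if defined(CONFIG_RTC_CLASS) && \
	defined(CONFIG_PM_SLEEP) && \
	defined(CONFIG_RTC_HCTOSYS_DEVICE)
#define MODEL_GUARD_POSTED 1
#else
#define MODEL_GUARD_POSTED 0
#endif

/* Simplified guard: RTC_HCTOSYS_DEVICE already implies RTC_CLASS. */
#if defined(CONFIG_PM_SLEEP) && defined(CONFIG_RTC_HCTOSYS_DEVICE)
#define MODEL_GUARD_SIMPLIFIED 1
#else
#define MODEL_GUARD_SIMPLIFIED 0
#endif

/* Returns 1 when both guards select the same code. */
int model_guards_agree(void)
{
	return MODEL_GUARD_POSTED == MODEL_GUARD_SIMPLIFIED;
}

/* Aborts if the two guards ever disagree for this configuration. */
void model_check_guards(void)
{
	assert(model_guards_agree());
}
```

The two guards can only differ for a configuration that defines
RTC_HCTOSYS_DEVICE without RTC_CLASS, which the dependency rules out.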

Re: [PATCH v4 3/4] time: rtc: Don't bother into rtc_resume() for the nonstop clocksource

2015-02-25 Thread John Stultz

On Sun, Feb 15, 2015 at 5:09 AM, Xunlei Pang  wrote:
> From: Xunlei Pang 
>
> If a system does not provide a persistent_clock(), the time
> will be updated on resume by rtc_resume(). With the addition
> of the non-stop clocksources for suspend timing, those systems
> set the time on resume in timekeeping_resume(), but may not
> provide a valid persistent_clock().
>
> This results in the rtc_resume() logic thinking no one has set
> the time and it then will over-write the suspend time again,
> which is not necessary and only increases clock error.
>
> So, fix this for rtc_resume().
>
> Signed-off-by: Xunlei Pang 
> ---
>  drivers/rtc/class.c |  4 +--
>  include/linux/timekeeping.h |  9 +++
>  kernel/time/timekeeping.c   | 63 
> +
>  3 files changed, 52 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/rtc/class.c b/drivers/rtc/class.c
> index 472a5ad..74a943e 100644
> --- a/drivers/rtc/class.c
> +++ b/drivers/rtc/class.c
> @@ -55,7 +55,7 @@ static int rtc_suspend(struct device *dev)
> struct timespec64   delta, delta_delta;
> int err;
>
> -   if (has_persistent_clock())
> +   if (timekeeping_rtc_skipsuspend())
> return 0;
>
> if (strcmp(dev_name(&rtc->dev), CONFIG_RTC_HCTOSYS_DEVICE) != 0)
> @@ -102,7 +102,7 @@ static int rtc_resume(struct device *dev)
> struct timespec64   sleep_time;
> int err;
>
> -   if (has_persistent_clock())
> +   if (timekeeping_rtc_skipresume())
> return 0;
>
> rtc_hctosys_ret = -ENODEV;
> diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
> index 3eaae47..7cbd518 100644
> --- a/include/linux/timekeeping.h
> +++ b/include/linux/timekeeping.h
> @@ -242,6 +242,9 @@ static inline void timekeeping_clocktai(struct timespec 
> *ts)
>  /*
>   * RTC specific
>   */
> +extern bool timekeeping_rtc_skipsuspend(void);
> +extern bool timekeeping_rtc_skipresume(void);
> +
>  extern void timekeeping_inject_sleeptime64(struct timespec64 *delta);
>
>  /*
> @@ -253,14 +256,8 @@ extern void getnstime_raw_and_real(struct timespec 
> *ts_raw,
>  /*
>   * Persistent clock related interfaces
>   */
> -extern bool persistent_clock_exist;
>  extern int persistent_clock_is_local;
>
> -static inline bool has_persistent_clock(void)
> -{
> -   return persistent_clock_exist;
> -}
> -
>  extern void read_persistent_clock(struct timespec *ts);
>  extern void read_boot_clock(struct timespec *ts);
>  extern int update_persistent_clock(struct timespec now);
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index ec6ee9b..276c72d 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -63,9 +63,6 @@ static struct tk_fast tk_fast_mono cacheline_aligned;
>  /* flag for if timekeeping is suspended */
>  int __read_mostly timekeeping_suspended;
>
> -/* Flag for if there is a persistent clock on this platform */
> -bool __read_mostly persistent_clock_exist = false;
> -
>  static inline void tk_normalize_xtime(struct timekeeper *tk)
>  {
> while (tk->tkr.xtime_nsec >= ((u64)NSEC_PER_SEC << tk->tkr.shift)) {
> @@ -1045,6 +1042,9 @@ void __weak read_boot_clock(struct timespec *ts)
> ts->tv_nsec = 0;
>  }
>
> +/* Flag for if there is a persistent clock on this platform */
> +static bool persistent_clock_exist;
> +

So I probably made the original mistake, but since you're modifying
it,  "persistent_clock_exists" is slightly more grammatical.


>  /*
>   * timekeeping_init - Initializes the clocksource and common timekeeping 
> values
>   */
> @@ -1125,15 +1125,53 @@ static void __timekeeping_inject_sleeptime(struct timekeeper *tk,
> tk_debug_account_sleep_time(delta);
>  }
>
> +static bool sleeptime_inject;
> +

Also, maybe "sleeptime_injected".  And since these flag values are
related, can we define them next to each other (and close to their
accessor functions) rather than scattered randomly through the file?


>  #if defined(CONFIG_RTC_CLASS) && \
> defined(CONFIG_PM_SLEEP) && \
> defined(CONFIG_RTC_HCTOSYS_DEVICE)
>  /**
> + * We have three kinds of time sources to use for sleep time
> + * injection, the preference order is:
> + * 1) non-stop clocksource
> + * 2) persistent clock (ie: RTC accessible when irqs are off)
> + * 3) RTC
> + *
> + * 1) and 2) are used by timekeeping, 3) by RTC subsystem.
> + * If system has neither 1) nor 2), 3) will be used finally.
> + *
> + *
> + * If timekeeping has injected sleeptime via either 1) or 2),
> + * 3) becomes needless, so in this case we don't need to call
> + * rtc_resume(), and this is what timekeeping_rtc_skipresume()
> + * means.
> + */
> +bool timekeeping_rtc_skipresume(void)
> +{
> +   return sleeptime_inject;
> +}
> +
> +/**
> + * 1) can be determined whether to use or not only when doing
> + * timekeeping_resume() which is invoked after rtc_suspend(),
> + * so we can't skip rtc_suspend() surely if sy

Re: Trying to use 'perf probe' to debug perf itself

2015-02-25 Thread Masami Hiramatsu

(2015/02/25 22:25), Arnaldo Carvalho de Melo wrote:
> Em Wed, Feb 25, 2015 at 11:53:16AM +0900, Masami Hiramatsu escreveu:
>> (2015/02/25 3:49), Arnaldo Carvalho de Melo wrote:
>>> Available variables at thread__get
>>> @
>>> struct thread*  thread
>>> [root@ssdandy ~]#
> 
>>> cool, so I thought it would be just a matter of asking to put the probes
>>> and get the value of 'thread', then match things, but:
> 
>>> [root@ssdandy ~]# perf probe -v ~/bin/perf thread__put thread
> 
>>> Writing event: p:probe_perf/thread__put /root/bin/perf:0xd03d2 thread=-32(%sp):u64
>>> Failed to write event: Invalid argument
>>>   Error: Failed to add events. Reason: Invalid argument (Code: -22)
>>> [root@ssdandy ~]#
> 
>>> Not possible :-\ 
>  
>> Hmm, strange. Could you tell me the version of your kernel?
> 
> [root@ssdandy ~]# uname -r
> 3.10.0-210.el7.x86_64
> 
>> It seems that the kernel newer than 3.14 supports uprobes with
>> memory dereference (e.g. -32(%sp) )feature.
> 
> Right, that must be the case, will test, but then, would it be possible
> for the kernel, in such cases, return something line EOPNOTSUP?

Yeah, but for now, it is already supported in the kernel.
Of course, we can also try to test for the feature by adding a temporary
event from perftools.

> I will try to figure out a better error message on the tooling side,
> something like:
> 
> . Realize we're asking for memory dereference in uprobes
> . If it fails with EINVAL, check the kernel version and say something
>   like:
> 
> Please upgrade your kernel to at least x.y.z to have access to feature
> FOO_BAR.

OK, that may be worthwhile for users (I'm not sure whether RHEL can update
their kernel to include that enhancement).
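
The tooling-side check discussed above could be sketched roughly as follows. This is only an illustration, not code from perf: the helper names kernel_release_at_least() and uprobe_deref_hint() are hypothetical, and the 3.14 cut-off is taken from the version mentioned in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Parse the leading "major.minor" of a kernel release string such as
 * "3.10.0-210.el7.x86_64" (the format printed by `uname -r`) and report
 * whether it is at least want_major.want_minor.
 * Returns false if the string cannot be parsed.
 */
bool kernel_release_at_least(const char *release, int want_major, int want_minor)
{
	int major = 0, minor = 0;

	assert(release != NULL);
	if (sscanf(release, "%d.%d", &major, &minor) != 2)
		return false;
	if (major != want_major)
		return major > want_major;
	return minor >= want_minor;
}

/*
 * Hypothetical helper: build the hint to show after a write to
 * uprobe_events failed with EINVAL for a probe that uses a memory
 * dereference argument. Returns the length of the hint (as snprintf
 * does), or 0 when the kernel looks new enough and no hint applies.
 */
int uprobe_deref_hint(const char *release, char *buf, size_t len)
{
	if (kernel_release_at_least(release, 3, 14))
		return 0;
	return snprintf(buf, len,
			"Please upgrade your kernel to at least 3.14 to use "
			"memory dereference arguments in uprobes (running %s).",
			release);
}
```

A version comparison like this is only a heuristic: a distribution kernel may carry the feature as a backport, which is why probing with a temporary event may be the more reliable test.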

Thank you,

> 
> - Arnaldo
> 
>> (it was actually introduced by 5baaa59e, and git-describe told
>>  it was v3.13-rc4-22-g5baaa59)
>>
>> And also, could you try to write the result command to uprobe_event as
>> below?
>>
>> # echo "p:probe_perf/thread__put /root/bin/perf:0xd03d2 thread=-32(%sp):u64" >> \
>>  /sys/kernel/debug/tracing/uprobe_events
> 
> Well, I'll try that if it fails after I upgrade to 3.14.
> 
>>> please let me know if you need some file, here is the readelf -wi for
>>> those two routines:
>>
>> This should not be the problem of dwarf-analysis. It seems kernel-side
>> (uprobe) problem.
> 
> Thanks a lot!
> 
> - Arnaldo
> 


-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] regulator: wm8350: Remove unused variable

2015-02-25 Thread Fabio Estevam

From: Fabio Estevam 

Commit 8f45acb5f9f34eab ("regulator: wm8350: Pass NULL data with REGULATION_OUT
and UNDER_VOLTAGE events") introduced the following build warning:

drivers/regulator/wm8350-regulator.c: In function 'pmic_uv_handler':
drivers/regulator/wm8350-regulator.c:1154:17: warning: unused variable 'wm8350' [-Wunused-variable]

Remove 'wm8350' as it is unused now.

Signed-off-by: Fabio Estevam 
---
 drivers/regulator/wm8350-regulator.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/regulator/wm8350-regulator.c b/drivers/regulator/wm8350-regulator.c
index 78efead..95f6b04 100644
--- a/drivers/regulator/wm8350-regulator.c
+++ b/drivers/regulator/wm8350-regulator.c
@@ -1151,7 +1151,6 @@ static const struct regulator_desc wm8350_reg[NUM_WM8350_REGULATORS] = {
 static irqreturn_t pmic_uv_handler(int irq, void *data)
 {
struct regulator_dev *rdev = (struct regulator_dev *)data;
-   struct wm8350 *wm8350 = rdev_get_drvdata(rdev);
 
mutex_lock(&rdev->mutex);
if (irq == WM8350_IRQ_CS1 || irq == WM8350_IRQ_CS2)
-- 
1.9.1

Re: [PATCH] x86, traps: Enable DEBUG_STACK after cpu_init() for TRAP_DB/BP.

2015-02-25 Thread Masami Hiramatsu

(2015/02/26 12:57), Wang Nan wrote:
> Before this patch early_trap_init() installs DEBUG_STACK for X86_TRAP_BP
> and X86_TRAP_DB. However, DEBUG_STACK doesn't work correctly until
> cpu_init() <-- trap_init().
> 
> This patch passes 0 to set_intr_gate_ist() and
> set_system_intr_gate_ist() instead of DEBUG_STACK to let it use same
> stack as kernel, and installs DEBUG_STACK for them in trap_init().
> 
> As core runs at ring 0 between early_trap_init() and trap_init(), there
> is no chance to get a bad stack before trap_init().

Thanks for finding this problem! :)
Agreed, DEBUG_STACK should not be used until it has been initialized.
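
The two-phase setup described in the patch can be pictured with a small user-space model. This is a sketch under stated assumptions, not kernel code: all toy_* names are hypothetical, TOY_DEBUG_STACK is an arbitrary non-zero index, and the assert enforces a stricter rule than the hardware does (the real problem only shows up when the trap is actually taken before the TSS is loaded).

```c
#include <assert.h>
#include <stdbool.h>

enum { TOY_TRAP_DB = 1, TOY_TRAP_BP = 3, TOY_NR_TRAPS = 4 };
enum { TOY_NO_IST = 0, TOY_DEBUG_STACK = 1 };  /* illustrative values only */

static bool toy_tss_ready;               /* set once the "TSS" is loaded */
static int toy_gate_ist[TOY_NR_TRAPS];   /* IST index recorded per trap gate */

/* Install a gate; a non-zero IST index is only accepted after TSS setup. */
void toy_set_gate_ist(int trap, int ist)
{
	assert(trap >= 0 && trap < TOY_NR_TRAPS);
	assert(ist == TOY_NO_IST || toy_tss_ready);
	toy_gate_ist[trap] = ist;
}

/* Early phase: use the current stack (IST index 0) for #DB and #BP. */
void toy_early_trap_init(void)
{
	toy_set_gate_ist(TOY_TRAP_DB, TOY_NO_IST);
	toy_set_gate_ist(TOY_TRAP_BP, TOY_NO_IST);
}

/* Late phase: after the stand-in for cpu_init(), switch to the debug stack. */
void toy_trap_init(void)
{
	toy_tss_ready = true;   /* models cpu_init() loading the TSS */
	toy_set_gate_ist(TOY_TRAP_DB, TOY_DEBUG_STACK);
	toy_set_gate_ist(TOY_TRAP_BP, TOY_DEBUG_STACK);
}

/* Read back the IST index recorded for a trap gate. */
int toy_gate_ist_of(int trap)
{
	return toy_gate_ist[trap];
}
```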

> 
> As NMI is also enabled in trap_init(), we don't need to care about
> is_debug_stack() and related things used in arch/x86/kernel/nmi.c.

Looks good to me, at least on the code side. Please fix the comment according
to Steven's suggestion.
And feel free to add my Reviewed-by for this.

Reviewed-by: Masami Hiramatsu 

Thank you,

> 
> Signed-off-by: Wang Nan 
> ---
>  arch/x86/kernel/traps.c | 20 ++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index 9d2073e..a9b8640 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -925,9 +925,16 @@ dotraplinkage void do_iret_error(struct pt_regs *regs, long error_code)
>  /* Set of traps needed for early debugging. */
>  void __init early_trap_init(void)
>  {
> - set_intr_gate_ist(X86_TRAP_DB, &debug, DEBUG_STACK);
> + /*
> +  * Don't set ist to DEBUG_STACK as it doesn't work until TSS is
> +  * ready in cpu_init() <-- trap_init(). Before trap_init(), CPU
> +  * runs at ring 0 so there should be impossible to hit a invalid
> +  * stack. Use original stack is enough. DEBUG_STACK will be
> +  * equipped after cpu_init() in trap_init().
> +  */
> + set_intr_gate_ist(X86_TRAP_DB, &debug, 0);
>   /* int3 can be called from all */
> - set_system_intr_gate_ist(X86_TRAP_BP, &int3, DEBUG_STACK);
> + set_system_intr_gate_ist(X86_TRAP_BP, &int3, 0);
>  #ifdef CONFIG_X86_32
>   set_intr_gate(X86_TRAP_PF, page_fault);
>  #endif
> @@ -1005,6 +1012,15 @@ void __init trap_init(void)
>*/
>   cpu_init();
>  
> + /*
> +  * X86_TRAP_DB and X86_TRAP_BP have been setup
> +  * in early_trap_init(). However, DEBUG_STACK works only after
> +  * cpu_init() load TSS. See comments in early_trap_init().
> +  */
> + set_intr_gate_ist(X86_TRAP_DB, &debug, DEBUG_STACK);
> + /* int3 can be called from all */
> + set_system_intr_gate_ist(X86_TRAP_BP, &int3, DEBUG_STACK);
> +
>   x86_init.irqs.trap_init();
>  
>  #ifdef CONFIG_X86_64
> 


-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com


Please let me know if you need to print color box, display box and labels

2015-02-25 Thread Jinghao Printing - CHINA

Hi, this is David Wu from Shanghai, China.
Please let me know if you need color box, display box, corrugated box,
label, hang tag etc.

I will send you the website.

Best regards,
David Wu

[PATCH] sched/deadline: don't need to check throttled status when switched to dl

2015-02-25 Thread Wanpeng Li

After commit 40767b0dc768 ("sched/deadline: Fix deadline parameter
modification handling"), a deadline task's throttled status is cleared
each time it switches from dl, so the throttled status is always unset
when it switches back. There is no need to check the throttled status,
so this patch drops the check.
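
The reasoning can be illustrated with a toy model (a sketch only; the toy_* names are hypothetical and none of the real scheduler logic is reproduced). If every path out of the deadline class clears the flag, the flag is necessarily unset on the way back in:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the deadline scheduling entity; only the flag matters. */
struct toy_dl_entity {
	bool dl_throttled;
};

/* Models throttling while the task runs under SCHED_DEADLINE. */
void toy_throttle(struct toy_dl_entity *dl)
{
	dl->dl_throttled = true;
}

/* Models leaving SCHED_DEADLINE: the throttled status is cleared here. */
void toy_switched_from_dl(struct toy_dl_entity *dl)
{
	dl->dl_throttled = false;
}

/*
 * Models switching (back) to SCHED_DEADLINE. Because the exit path always
 * clears the flag, it must be unset on entry, so no early return is
 * needed. Returns true to indicate that the preemption check proceeds.
 */
bool toy_switched_to_dl(struct toy_dl_entity *dl)
{
	assert(!dl->dl_throttled);
	return true;
}
```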

Signed-off-by: Wanpeng Li 
---
 kernel/sched/deadline.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index ca391c0..cfb8fa7 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1697,14 +1697,6 @@ static void switched_to_dl(struct rq *rq, struct task_struct *p)
 {
int check_resched = 1;
 
-   /*
-* If p is throttled, don't consider the possibility
-* of preempting rq->curr, the check will be done right
-* after its runtime will get replenished.
-*/
-   if (unlikely(p->dl.dl_throttled))
-   return;
-
if (task_on_rq_queued(p) && rq->curr != p) {
 #ifdef CONFIG_SMP
if (p->nr_cpus_allowed > 1 && rq->dl.overloaded &&
-- 
1.9.1

[tip:perf/x86] perf/x86/intel: Enable conflicting event scheduling for CQM

2015-02-25 Thread tip-bot for Matt Fleming

Commit-ID:  59bf7fd45c90a8fde22a7717b5413e4ed9666c32
Gitweb: http://git.kernel.org/tip/59bf7fd45c90a8fde22a7717b5413e4ed9666c32
Author: Matt Fleming 
AuthorDate: Fri, 23 Jan 2015 18:45:48 +
Committer:  Ingo Molnar 
CommitDate: Wed, 25 Feb 2015 13:53:36 +0100

perf/x86/intel: Enable conflicting event scheduling for CQM

We can leverage the workqueue that we use for RMID rotation to support
scheduling of conflicting monitoring events. Allowing events that
monitor conflicting things is done at various other places in the perf
subsystem, so there's precedent there.

An example of two conflicting events would be monitoring a cgroup and
simultaneously monitoring a task within that cgroup.

This uses the cache_groups list as a queuing mechanism, where every
event that reaches the front of the list gets the chance to be scheduled
in, possibly descheduling any conflicting events that are running.
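
As a rough illustration of the queuing idea, here is a simplified user-space sketch. It is not the driver code: the toy_* names and the parent/child conflict rule are assumptions made for the example, the list is a plain array, and descheduled groups are not moved to the back of the queue as the real code does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_group {
	int id;      /* identifies what is being monitored (positive) */
	int parent;  /* id of an enclosing cgroup, or -1 if none */
	int rmid;    /* -1 when the group is not scheduled in */
};

/* Toy conflict rule: a group conflicts with its enclosing cgroup. */
bool toy_conflict(const struct toy_group *a, const struct toy_group *b)
{
	return a->parent == b->id || b->parent == a->id;
}

/*
 * Hand `rmid` to the first waiting group in queue order, first taking the
 * RMID away from every running group that conflicts with it.
 * Returns the index of the group scheduled in, or -1 if nobody is waiting.
 * Released RMIDs are simply dropped here; the real code recycles them.
 */
int toy_sched_in(struct toy_group *q, size_t n, int rmid)
{
	assert(rmid != -1);

	for (size_t i = 0; i < n; i++) {
		if (q[i].rmid != -1)
			continue;               /* already running */
		for (size_t j = 0; j < n; j++) {
			if (j != i && q[j].rmid != -1 &&
			    toy_conflict(&q[j], &q[i]))
				q[j].rmid = -1; /* deschedule the conflict */
		}
		q[i].rmid = rmid;
		return (int)i;
	}
	return -1;
}
```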

Signed-off-by: Matt Fleming 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Arnaldo Carvalho de Melo 
Cc: H. Peter Anvin 
Cc: Jiri Olsa 
Cc: Kanaka Juvva 
Cc: Linus Torvalds 
Cc: Vikas Shivappa 
Link: http://lkml.kernel.org/r/1422038748-21397-10-git-send-email-m...@codeblueprint.co.uk
Signed-off-by: Ingo Molnar 
---
 arch/x86/kernel/cpu/perf_event_intel_cqm.c | 130 +++--
 1 file changed, 84 insertions(+), 46 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_cqm.c b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
index e31f508..9a8ef83 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_cqm.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
@@ -507,7 +507,6 @@ static unsigned int __rmid_queue_time_ms = RMID_DEFAULT_QUEUE_TIME;
 static bool intel_cqm_rmid_stabilize(unsigned int *available)
 {
struct cqm_rmid_entry *entry, *tmp;
-   struct perf_event *event;
 
lockdep_assert_held(&cache_mutex);
 
@@ -577,19 +576,9 @@ static bool intel_cqm_rmid_stabilize(unsigned int *available)
 
/*
 * If we have groups waiting for RMIDs, hand
-* them one now.
+* them one now provided they don't conflict.
 */
-   list_for_each_entry(event, &cache_groups,
-   hw.cqm_groups_entry) {
-   if (__rmid_valid(event->hw.cqm_rmid))
-   continue;
-
-   intel_cqm_xchg_rmid(event, entry->rmid);
-   entry = NULL;
-   break;
-   }
-
-   if (!entry)
+   if (intel_cqm_sched_in_event(entry->rmid))
continue;
 
/*
@@ -604,25 +593,73 @@ static bool intel_cqm_rmid_stabilize(unsigned int *available)
 
 /*
  * Pick a victim group and move it to the tail of the group list.
+ * @next: The first group without an RMID
  */
-static struct perf_event *
-__intel_cqm_pick_and_rotate(void)
+static void __intel_cqm_pick_and_rotate(struct perf_event *next)
 {
struct perf_event *rotor;
+   unsigned int rmid;
 
lockdep_assert_held(&cache_mutex);
-   lockdep_assert_held(&cache_lock);
 
rotor = list_first_entry(&cache_groups, struct perf_event,
 hw.cqm_groups_entry);
+
+   /*
+* The group at the front of the list should always have a valid
+* RMID. If it doesn't then no groups have RMIDs assigned and we
+* don't need to rotate the list.
+*/
+   if (next == rotor)
+   return;
+
+   rmid = intel_cqm_xchg_rmid(rotor, INVALID_RMID);
+   __put_rmid(rmid);
+
list_rotate_left(&cache_groups);
+}
+
+/*
+ * Deallocate the RMIDs from any events that conflict with @event, and
+ * place them on the back of the group list.
+ */
+static void intel_cqm_sched_out_conflicting_events(struct perf_event *event)
+{
+   struct perf_event *group, *g;
+   unsigned int rmid;
+
+   lockdep_assert_held(&cache_mutex);
+
+   list_for_each_entry_safe(group, g, &cache_groups, hw.cqm_groups_entry) {
+   if (group == event)
+   continue;
+
+   rmid = group->hw.cqm_rmid;
+
+   /*
+* Skip events that don't have a valid RMID.
+*/
+   if (!__rmid_valid(rmid))
+   continue;
+
+   /*
+* No conflict? No problem! Leave the event alone.
+*/
+   if (!__conflict_event(group, event))
+   continue;
 
-   return rotor;
+   intel_cqm_xchg_rmid(group, INVALID_RMID);
+   __put_rmid(rmid);
+   }
 }
 
 /*
  * Attempt to rotate the groups and assign new RMIDs.
  *
+ * We rotate for two reasons,
+ *   1. To handle the scheduling of conflicting events
+ *   2. To recycle RMIDs
+ *
  * Rotating RMIDs is complicated because the hardware doesn't give us
  * any clues.
  *
@@ -642,11 +679,10 @@ __intel_cq

[tip:perf/x86] perf/x86/intel: Perform rotation on Intel CQM RMIDs

2015-02-25 Thread tip-bot for Matt Fleming

Commit-ID:  bff671dba7981195a644a5dc210d65de8ae2d251
Gitweb: http://git.kernel.org/tip/bff671dba7981195a644a5dc210d65de8ae2d251
Author: Matt Fleming 
AuthorDate: Fri, 23 Jan 2015 18:45:47 +
Committer:  Ingo Molnar 
CommitDate: Wed, 25 Feb 2015 13:53:35 +0100

perf/x86/intel: Perform rotation on Intel CQM RMIDs

There are many use cases where people will want to monitor more tasks
than there exist RMIDs in the hardware, meaning that we have to perform
some kind of multiplexing.

We do this by "rotating" the RMIDs in a workqueue, and assigning an RMID
to a waiting event when the RMID becomes unused.

This scheme reserves one RMID at all times for rotation. When we need to
schedule a new event we give it the reserved RMID, pick a victim event
from the front of the global CQM list and wait for the victim's RMID to
drop to zero occupancy, before it becomes the new reserved RMID.

We put the victim's RMID onto the limbo list, where it resides for a
"minimum queue time", which is intended to save ourselves an expensive
smp IPI when the RMID is unlikely to have an occupancy value below
__intel_cqm_threshold.

If we fail to recycle an RMID, even after waiting the minimum queue time
then we need to increment __intel_cqm_threshold. There is an upper bound
on this threshold, __intel_cqm_max_threshold, which is programmable from
userland as /sys/devices/intel_cqm/max_recycling_threshold.

The comments above __intel_cqm_rmid_rotate() have more details.
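
The limbo-list policy described above might be modelled like this. It is a sketch under assumptions, not the driver's implementation: the toy_* names are hypothetical, "clean" is taken to mean occupancy at or below the threshold, and the threshold grows by one unit per failed attempt.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_limbo_rmid {
	unsigned int  occupancy;   /* stale occupancy still tagged with the RMID */
	unsigned long queue_time;  /* time the RMID entered the limbo list */
};

struct toy_recycler {
	unsigned int  threshold;      /* role of __intel_cqm_threshold */
	unsigned int  max_threshold;  /* role of __intel_cqm_max_threshold */
	unsigned long min_queue_time; /* minimum time to sit in limbo */
};

/*
 * Try to recycle one limbo RMID at time `now`.
 * Returns true if it may move to the free list. If it has waited long
 * enough but is still too dirty, raise the threshold by one step (bounded
 * by max_threshold) so that a later attempt is more likely to succeed.
 */
bool toy_try_recycle(struct toy_recycler *r, const struct toy_limbo_rmid *e,
		     unsigned long now)
{
	assert(r->threshold <= r->max_threshold);

	if (now - e->queue_time < r->min_queue_time)
		return false;   /* too young: skip the expensive read */
	if (e->occupancy <= r->threshold)
		return true;    /* clean enough to reuse */
	if (r->threshold < r->max_threshold)
		r->threshold++;
	return false;
}
```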

Signed-off-by: Matt Fleming 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Arnaldo Carvalho de Melo 
Cc: H. Peter Anvin 
Cc: Jiri Olsa 
Cc: Kanaka Juvva 
Cc: Linus Torvalds 
Cc: Vikas Shivappa 
Link: http://lkml.kernel.org/r/1422038748-21397-9-git-send-email-m...@codeblueprint.co.uk
Signed-off-by: Ingo Molnar 
---
 arch/x86/kernel/cpu/perf_event_intel_cqm.c | 671 ++---
 1 file changed, 623 insertions(+), 48 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_cqm.c b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
index 8003d87..e31f508 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_cqm.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
@@ -25,9 +25,13 @@ struct intel_cqm_state {
 static DEFINE_PER_CPU(struct intel_cqm_state, cqm_state);
 
 /*
- * Protects cache_cgroups and cqm_rmid_lru.
+ * Protects cache_cgroups and cqm_rmid_free_lru and cqm_rmid_limbo_lru.
+ * Also protects event->hw.cqm_rmid
+ *
+ * Hold either for stability, both for modification of ->hw.cqm_rmid.
  */
 static DEFINE_MUTEX(cache_mutex);
+static DEFINE_RAW_SPINLOCK(cache_lock);
 
 /*
  * Groups of events that have the same target(s), one RMID per group.
@@ -46,7 +50,34 @@ static cpumask_t cqm_cpumask;
 
 #define QOS_EVENT_MASK QOS_L3_OCCUP_EVENT_ID
 
-static u64 __rmid_read(unsigned long rmid)
+/*
+ * This is central to the rotation algorithm in __intel_cqm_rmid_rotate().
+ *
+ * This rmid is always free and is guaranteed to have an associated
+ * near-zero occupancy value, i.e. no cachelines are tagged with this
+ * RMID, once __intel_cqm_rmid_rotate() returns.
+ */
+static unsigned int intel_cqm_rotation_rmid;
+
+#define INVALID_RMID   (-1)
+
+/*
+ * Is @rmid valid for programming the hardware?
+ *
+ * rmid 0 is reserved by the hardware for all non-monitored tasks, which
+ * means that we should never come across an rmid with that value.
+ * Likewise, an rmid value of -1 is used to indicate "no rmid currently
+ * assigned" and is used as part of the rotation code.
+ */
+static inline bool __rmid_valid(unsigned int rmid)
+{
+   if (!rmid || rmid == INVALID_RMID)
+   return false;
+
+   return true;
+}
+
+static u64 __rmid_read(unsigned int rmid)
 {
u64 val;
 
@@ -64,13 +95,21 @@ static u64 __rmid_read(unsigned long rmid)
return val;
 }
 
+enum rmid_recycle_state {
+   RMID_YOUNG = 0,
+   RMID_AVAILABLE,
+   RMID_DIRTY,
+};
+
 struct cqm_rmid_entry {
-   u64 rmid;
+   unsigned int rmid;
+   enum rmid_recycle_state state;
struct list_head list;
+   unsigned long queue_time;
 };
 
 /*
- * A least recently used list of RMIDs.
+ * cqm_rmid_free_lru - A least recently used list of RMIDs.
  *
  * Oldest entry at the head, newest (most recently used) entry at the
  * tail. This list is never traversed, it's only used to keep track of
@@ -81,9 +120,18 @@ struct cqm_rmid_entry {
  * in use. To mark an RMID as in use, remove its entry from the lru
  * list.
  *
- * This list is protected by cache_mutex.
+ *
+ * cqm_rmid_limbo_lru - list of currently unused but (potentially) dirty RMIDs.
+ *
+ * This list is contains RMIDs that no one is currently using but that
+ * may have a non-zero occupancy value associated with them. The
+ * rotation worker moves RMIDs from the limbo list to the free list once
+ * the occupancy value drops below __intel_cqm_threshold.
+ *
+ * Both lists are protected by cache_mutex.
  */
-static LIST_HEAD(cqm_rmid_lru);
+static LIST_HEAD(cqm_rmid_free_lru);
+static LI

[tip:perf/x86] perf/x86/intel: Support task events with Intel CQM

2015-02-25 Thread tip-bot for Matt Fleming

Commit-ID:  bfe1fcd2688f557a6b6a88f59ea7619228728bd7
Gitweb: http://git.kernel.org/tip/bfe1fcd2688f557a6b6a88f59ea7619228728bd7
Author: Matt Fleming 
AuthorDate: Fri, 23 Jan 2015 18:45:46 +
Committer:  Ingo Molnar 
CommitDate: Wed, 25 Feb 2015 13:53:34 +0100

perf/x86/intel: Support task events with Intel CQM

Add support for task events as well as system-wide events. This change
has a big impact on the way that we gather LLC occupancy values in
intel_cqm_event_read().

Currently, for system-wide (per-cpu) events we defer processing to
userspace which knows how to discard all but one cpu result per package.

Things aren't so simple for task events because we need to do the value
aggregation ourselves. To do this, we defer updating the LLC occupancy
value in event->count from intel_cqm_event_read() and do an SMP
cross-call to read values for all packages in intel_cqm_event_count().
We need to ensure that we only do this for one task event per cache
group, otherwise we'll report duplicate values.

If we're a system-wide event we want to fall back to the default
perf_event_count() implementation. Refactor this into a common function
so that we don't duplicate the code.

Also, introduce PERF_TYPE_INTEL_CQM, since we need a way to track an
event's task (if the event isn't per-cpu) inside of the Intel CQM PMU
driver.  This task information is only available in the upper layers of
the perf infrastructure.

Other perf backends stash the target task in event->hw.*target so we
need to do something similar. The task is used to determine whether
events should share a cache group and an RMID.
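
To make the "one task event per cache group" rule concrete, here is a small user-space sketch. It assumes, for illustration only, that a single designated event per group (called the leader here) sums the per-package readings while the other events in the group report zero; the toy_* names are hypothetical, and the SMP cross-call is replaced by a plain array of readings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_event {
	int  group;         /* cache group (shared RMID) of the event */
	bool group_leader;  /* exactly one event per group reports values */
};

/*
 * Sum the per-package occupancy readings for a task event. Only the
 * group leader aggregates; other events in the group report 0 so the
 * group's occupancy is not counted more than once.
 */
uint64_t toy_event_count(const struct toy_event *ev,
			 const uint64_t *per_package, size_t npackages)
{
	uint64_t total = 0;

	assert(per_package != NULL || npackages == 0);
	if (!ev->group_leader)
		return 0;
	for (size_t i = 0; i < npackages; i++)
		total += per_package[i];
	return total;
}
```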

Signed-off-by: Matt Fleming 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Arnaldo Carvalho de Melo 
Cc: Arnaldo Carvalho de Melo 
Cc: H. Peter Anvin 
Cc: Jiri Olsa 
Cc: Kanaka Juvva 
Cc: Linus Torvalds 
Cc: Vikas Shivappa 
Cc: linux-...@vger.kernel.org
Link: http://lkml.kernel.org/r/1422038748-21397-8-git-send-email-m...@codeblueprint.co.uk
Signed-off-by: Ingo Molnar 
---
 arch/x86/kernel/cpu/perf_event_intel_cqm.c | 195 +
 include/linux/perf_event.h |   1 +
 include/uapi/linux/perf_event.h|   1 +
 kernel/events/core.c   |   2 +
 4 files changed, 178 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_cqm.c b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
index b5d9d74..8003d87 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_cqm.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
@@ -182,23 +182,124 @@ fail:
 
 /*
  * Determine if @a and @b measure the same set of tasks.
+ *
+ * If @a and @b measure the same set of tasks then we want to share a
+ * single RMID.
  */
 static bool __match_event(struct perf_event *a, struct perf_event *b)
 {
+   /* Per-cpu and task events don't mix */
if ((a->attach_state & PERF_ATTACH_TASK) !=
(b->attach_state & PERF_ATTACH_TASK))
return false;
 
-   /* not task */
+#ifdef CONFIG_CGROUP_PERF
+   if (a->cgrp != b->cgrp)
+   return false;
+#endif
+
+   /* If not task event, we're machine wide */
+   if (!(b->attach_state & PERF_ATTACH_TASK))
+   return true;
+
+   /*
+* Events that target same task are placed into the same cache group.
+*/
+   if (a->hw.cqm_target == b->hw.cqm_target)
+   return true;
+
+   /*
+* Are we an inherited event?
+*/
+   if (b->parent == a)
+   return true;
+
+   return false;
+}
+
+#ifdef CONFIG_CGROUP_PERF
+static inline struct perf_cgroup *event_to_cgroup(struct perf_event *event)
+{
+   if (event->attach_state & PERF_ATTACH_TASK)
+   return perf_cgroup_from_task(event->hw.cqm_target);
 
-   return true; /* if not task, we're machine wide */
+   return event->cgrp;
 }
+#endif
 
 /*
  * Determine if @a's tasks intersect with @b's tasks
+ *
+ * There are combinations of events that we explicitly prohibit,
+ *
+ *PROHIBITS
+ * system-wide->   cgroup and task
+ * cgroup->system-wide
+ *   ->task in cgroup
+ * task  ->system-wide
+ *   ->task in cgroup
+ *
+ * Call this function before allocating an RMID.
  */
 static bool __conflict_event(struct perf_event *a, struct perf_event *b)
 {
+#ifdef CONFIG_CGROUP_PERF
+   /*
+* We can have any number of cgroups but only one system-wide
+* event at a time.
+*/
+   if (a->cgrp && b->cgrp) {
+   struct perf_cgroup *ac = a->cgrp;
+   struct perf_cgroup *bc = b->cgrp;
+
+   /*
+* This condition should have been caught in
+* __match_event() and we should be sharing an RMID.
+*/
+   WARN_ON_ONCE(ac == bc);
+
+   if (cgroup_is_descendant(ac->css.cgroup, bc->css.cgroup) ||
+

[tip:perf/x86] perf: Move cgroup init before PMU ->event_init()

2015-02-25 Thread tip-bot for Matt Fleming

Commit-ID:  79dff51e900fd26a073be8b23acfbd8c15edb181
Gitweb: http://git.kernel.org/tip/79dff51e900fd26a073be8b23acfbd8c15edb181
Author: Matt Fleming 
AuthorDate: Fri, 23 Jan 2015 18:45:42 +
Committer:  Ingo Molnar 
CommitDate: Wed, 25 Feb 2015 13:53:30 +0100

perf: Move cgroup init before PMU ->event_init()

The Intel QoS PMU needs to know whether an event is part of a cgroup
during ->event_init(), because tasks in the same cgroup share a
monitoring ID.

Move the cgroup initialisation before calling into the PMU driver.

Signed-off-by: Matt Fleming 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Arnaldo Carvalho de Melo 
Cc: Arnaldo Carvalho de Melo 
Cc: H. Peter Anvin 
Cc: Jiri Olsa 
Cc: Kanaka Juvva 
Cc: Linus Torvalds 
Cc: Vikas Shivappa 
Link: http://lkml.kernel.org/r/1422038748-21397-4-git-send-email-m...@codeblueprint.co.uk
Signed-off-by: Ingo Molnar 
---
 kernel/events/core.c | 28 
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4e8dc59..1fc3bae 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7116,7 +7116,7 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 struct perf_event *group_leader,
 struct perf_event *parent_event,
 perf_overflow_handler_t overflow_handler,
-void *context)
+void *context, int cgroup_fd)
 {
struct pmu *pmu;
struct perf_event *event;
@@ -7212,6 +7212,12 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
if (!has_branch_stack(event))
event->attr.branch_sample_type = 0;
 
+   if (cgroup_fd != -1) {
+   err = perf_cgroup_connect(cgroup_fd, event, attr, group_leader);
+   if (err)
+   goto err_ns;
+   }
+
pmu = perf_init_event(event);
if (!pmu)
goto err_ns;
@@ -7235,6 +7241,8 @@ err_pmu:
event->destroy(event);
module_put(pmu->module);
 err_ns:
+   if (is_cgroup_event(event))
+   perf_detach_cgroup(event);
if (event->ns)
put_pid_ns(event->ns);
kfree(event);
@@ -7453,6 +7461,7 @@ SYSCALL_DEFINE5(perf_event_open,
int move_group = 0;
int err;
int f_flags = O_RDWR;
+   int cgroup_fd = -1;
 
/* for future expandability... */
if (flags & ~PERF_FLAG_ALL)
@@ -7518,21 +7527,16 @@ SYSCALL_DEFINE5(perf_event_open,
 
get_online_cpus();
 
+   if (flags & PERF_FLAG_PID_CGROUP)
+   cgroup_fd = pid;
+
event = perf_event_alloc(&attr, cpu, task, group_leader, NULL,
-NULL, NULL);
+NULL, NULL, cgroup_fd);
if (IS_ERR(event)) {
err = PTR_ERR(event);
goto err_cpus;
}
 
-   if (flags & PERF_FLAG_PID_CGROUP) {
-   err = perf_cgroup_connect(pid, event, &attr, group_leader);
-   if (err) {
-   __free_event(event);
-   goto err_cpus;
-   }
-   }
-
if (is_sampling_event(event)) {
if (event->pmu->capabilities & PERF_PMU_CAP_NO_INTERRUPT) {
err = -ENOTSUPP;
@@ -7769,7 +7773,7 @@ perf_event_create_kernel_counter(struct perf_event_attr *attr, int cpu,
 */
 
event = perf_event_alloc(attr, cpu, task, NULL, NULL,
-overflow_handler, context);
+overflow_handler, context, -1);
if (IS_ERR(event)) {
err = PTR_ERR(event);
goto err;
@@ -8130,7 +8134,7 @@ inherit_event(struct perf_event *parent_event,
   parent_event->cpu,
   child,
   group_leader, parent_event,
-  NULL, NULL);
+  NULL, NULL, -1);
if (IS_ERR(child_event))
return child_event;
 
[tip:perf/x86] x86: Add support for Intel Cache QoS Monitoring (CQM) detection

2015-02-25 Thread tip-bot for Peter P Waskiewicz Jr

Commit-ID:  cbc82b17263877ea5d21e84c58ce03f0292458a1
Gitweb: http://git.kernel.org/tip/cbc82b17263877ea5d21e84c58ce03f0292458a1
Author: Peter P Waskiewicz Jr 
AuthorDate: Fri, 23 Jan 2015 18:45:43 +
Committer:  Ingo Molnar 
CommitDate: Wed, 25 Feb 2015 13:53:31 +0100

x86: Add support for Intel Cache QoS Monitoring (CQM) detection

This patch adds support for the new Cache QoS Monitoring (CQM)
feature found in future Intel Xeon processors.  It includes the
new values to track CQM resources to the cpuinfo_x86 structure,
plus the CPUID detection routines for CQM.

CQM allows a process, or set of processes, to be tracked by the CPU
to determine the cache usage of that task group.  Using this data
from the CPU, software can be written to extract this data and
report cache usage and occupancy for a particular process, or
group of processes.

More information about Cache QoS Monitoring can be found in the
Intel (R) x86 Architecture Software Developer Manual, section 17.14.
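
The detection logic can be sketched as a pure decoder over register values that have already been read from CPUID leaf 0xF. This is an illustration, not the kernel routine: the toy_* names are hypothetical, and the sub-leaf 1 layout used here (EBX as the scale factor, ECX as the maximum RMID) follows the Intel SDM description rather than anything visible in the diff as archived here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_cqm_info {
	bool     llc_monitoring;  /* leaf 0xF, sub-leaf 0: EDX bit 1 */
	bool     llc_occupancy;   /* leaf 0xF, sub-leaf 1: EDX bit 0 */
	uint32_t max_rmid;        /* highest usable RMID */
	uint32_t occ_scale;       /* factor converting raw counts to bytes */
};

/*
 * Decode already-fetched CPUID leaf 0xF register values.
 * ebx0/edx0 come from sub-leaf 0; ebx1/ecx1/edx1 come from sub-leaf 1.
 */
struct toy_cqm_info toy_decode_cqm(uint32_t ebx0, uint32_t edx0,
				   uint32_t ebx1, uint32_t ecx1, uint32_t edx1)
{
	struct toy_cqm_info info = { 0 };

	info.llc_monitoring = (edx0 >> 1) & 1;
	if (!info.llc_monitoring)
		return info;

	info.max_rmid = ebx0;   /* overridden if occupancy monitoring exists */
	info.llc_occupancy = edx1 & 1;
	if (info.llc_occupancy) {
		info.max_rmid = ecx1;
		info.occ_scale = ebx1;
	}
	return info;
}

/* Convert a raw occupancy counter value to bytes. */
uint64_t toy_occupancy_bytes(const struct toy_cqm_info *info, uint64_t raw)
{
	assert(info->llc_occupancy);
	return raw * info->occ_scale;
}
```

Keeping the decoding separate from the CPUID instruction itself makes the logic easy to exercise on any machine, including ones without CQM.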

Signed-off-by: Peter P Waskiewicz Jr 
Signed-off-by: Matt Fleming 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Andy Lutomirski 
Cc: Arnaldo Carvalho de Melo 
Cc: Borislav Petkov 
Cc: Chris Webb 
Cc: Dave Hansen 
Cc: Fenghua Yu 
Cc: H. Peter Anvin 
Cc: Igor Mammedov 
Cc: Jacob Shin 
Cc: Jan Beulich 
Cc: Jiri Olsa 
Cc: Kanaka Juvva 
Cc: Linus Torvalds 
Cc: Steven Honeyman 
Cc: Steven Rostedt 
Cc: Vikas Shivappa 
Link: http://lkml.kernel.org/r/1422038748-21397-5-git-send-email-m...@codeblueprint.co.uk
Signed-off-by: Ingo Molnar 
---
 arch/x86/include/asm/cpufeature.h |  9 -
 arch/x86/include/asm/processor.h  |  3 +++
 arch/x86/kernel/cpu/common.c  | 39 +++
 3 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 90a5485..361922d 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -12,7 +12,7 @@
 #include 
 #endif
 
-#define NCAPINTS   11  /* N 32-bit words worth of info */
+#define NCAPINTS   13  /* N 32-bit words worth of info */
 #define NBUGINTS   1   /* N 32-bit bug flags */
 
 /*
@@ -226,6 +226,7 @@
 #define X86_FEATURE_ERMS   ( 9*32+ 9) /* Enhanced REP MOVSB/STOSB */
 #define X86_FEATURE_INVPCID( 9*32+10) /* Invalidate Processor Context ID */
 #define X86_FEATURE_RTM( 9*32+11) /* Restricted Transactional Memory */
+#define X86_FEATURE_CQM( 9*32+12) /* Cache QoS Monitoring */
 #define X86_FEATURE_MPX( 9*32+14) /* Memory Protection Extension */
 #define X86_FEATURE_AVX512F( 9*32+16) /* AVX-512 Foundation */
 #define X86_FEATURE_RDSEED ( 9*32+18) /* The RDSEED instruction */
@@ -242,6 +243,12 @@
 #define X86_FEATURE_XGETBV1(10*32+ 2) /* XGETBV with ECX = 1 */
 #define X86_FEATURE_XSAVES (10*32+ 3) /* XSAVES/XRSTORS */
 
+/* Intel-defined CPU QoS Sub-leaf, CPUID level 0x000F:0 (edx), word 11 */
+#define X86_FEATURE_CQM_LLC(11*32+ 1) /* LLC QoS if 1 */
+
+/* Intel-defined CPU QoS Sub-leaf, CPUID level 0x000F:1 (edx), word 12 */
+#define X86_FEATURE_CQM_OCCUP_LLC (12*32+ 0) /* LLC occupancy monitoring if 1 */
+
 /*
  * BUG word(s)
  */
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index ec1c935..a12d50e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -109,6 +109,9 @@ struct cpuinfo_x86 {
/* in KB - valid for CPUS which support this call: */
int x86_cache_size;
int x86_cache_alignment;/* In bytes */
+   /* Cache QoS architectural values: */
+   int x86_cache_max_rmid; /* max index */
+   int x86_cache_occ_scale;/* scale to bytes */
int x86_power;
unsigned long   loops_per_jiffy;
/* cpuid returned max cores value: */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 07f2fc3..9fa00b2 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -645,6 +645,30 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
c->x86_capability[10] = eax;
}
 
+   /* Additional Intel-defined flags: level 0x000F */
+   if (c->cpuid_level >= 0x000F) {
+   u32 eax, ebx, ecx, edx;
+
+   /* QoS sub-leaf, EAX=0Fh, ECX=0 */
+   cpuid_count(0x000F, 0, &eax, &ebx, &ecx, &edx);
+   c->x86_capability[11] = edx;
+   if (cpu_has(c, X86_FEATURE_CQM_LLC)) {
+   /* will be overridden if occupancy monitoring exists */
+   c->x86_cache_max_rmid = ebx;
+
+   /* QoS sub-leaf, EAX=0Fh, ECX=1 */
+   cpuid_count(0x000F, 1, &eax, &ebx, &ecx, &edx);
+   c->x86_capability[12] = edx;
+   if (cp

[tip:perf/x86] perf/x86/intel: Implement LRU monitoring ID allocation for CQM

2015-02-25 Thread tip-bot for Matt Fleming

Commit-ID:  35298e554c74b7849875e3676ba8eaf833c7b917
Gitweb: http://git.kernel.org/tip/35298e554c74b7849875e3676ba8eaf833c7b917
Author: Matt Fleming 
AuthorDate: Fri, 23 Jan 2015 18:45:45 +
Committer:  Ingo Molnar 
CommitDate: Wed, 25 Feb 2015 13:53:33 +0100

perf/x86/intel: Implement LRU monitoring ID allocation for CQM

It's possible to run into issues with re-using unused monitoring IDs
because there may be stale cachelines associated with that ID from a
previous allocation. This can cause the LLC occupancy values to be
inaccurate.

To attempt to mitigate this problem we place the IDs on a least recently
used list, essentially a FIFO. The basic idea is that the longer the
time period between ID re-use the lower the probability that stale
cachelines exist in the cache.
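
A minimal userspace sketch of the FIFO idea described above, not the kernel
code: free IDs sit in a ring buffer, allocation takes the ID that has been
free the longest, and a released ID goes to the back of the queue. MAX_RMID
and the rmid_fifo_* names are made up for illustration, and ID 0 is kept off
the free list on the assumption that it stays reserved, as the "RMID 0 is
special" comment in the patch suggests.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RMID 3  /* hypothetical; the real limit is reported by CPUID */

/* Ring buffer of free IDs: oldest at 'head', most recently freed last. */
static int free_ids[MAX_RMID + 1];
static size_t head, count;

/* Put every ID except the reserved ID 0 on the free list, in order. */
static void rmid_fifo_init(void)
{
    head = 0;
    count = 0;
    for (int r = 1; r <= MAX_RMID; r++)
        free_ids[count++] = r;
}

/* Hand out the ID that has been free the longest; -1 if none is free. */
static int rmid_fifo_get(void)
{
    int rmid;

    if (count == 0)
        return -1;

    rmid = free_ids[head];
    head = (head + 1) % (MAX_RMID + 1);
    count--;
    return rmid;
}

/* Release an ID: it joins the tail, so it is the last one to be reused. */
static void rmid_fifo_put(int rmid)
{
    assert(rmid > 0 && rmid <= MAX_RMID && count <= MAX_RMID);
    free_ids[(head + count) % (MAX_RMID + 1)] = rmid;
    count++;
}
```

With three usable IDs, taking 1 and 2 and then releasing 1 means the next
allocation returns 3, and 1 only comes back after that. This is the delay
between re-uses that the changelog is relying on.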

Signed-off-by: Matt Fleming 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Arnaldo Carvalho de Melo 
Cc: Arnaldo Carvalho de Melo 
Cc: H. Peter Anvin 
Cc: Jiri Olsa 
Cc: Kanaka Juvva 
Cc: Linus Torvalds 
Cc: Vikas Shivappa 
Link: http://lkml.kernel.org/r/1422038748-21397-7-git-send-email-m...@codeblueprint.co.uk
Signed-off-by: Ingo Molnar 
---
 arch/x86/kernel/cpu/perf_event_intel_cqm.c | 100 ++---
 1 file changed, 92 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_cqm.c b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
index 05b4cd2..b5d9d74 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_cqm.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
@@ -25,7 +25,7 @@ struct intel_cqm_state {
 static DEFINE_PER_CPU(struct intel_cqm_state, cqm_state);
 
 /*
- * Protects cache_cgroups.
+ * Protects cache_cgroups and cqm_rmid_lru.
  */
 static DEFINE_MUTEX(cache_mutex);
 
@@ -64,36 +64,120 @@ static u64 __rmid_read(unsigned long rmid)
return val;
 }
 
-static unsigned long *cqm_rmid_bitmap;
+struct cqm_rmid_entry {
+   u64 rmid;
+   struct list_head list;
+};
+
+/*
+ * A least recently used list of RMIDs.
+ *
+ * Oldest entry at the head, newest (most recently used) entry at the
+ * tail. This list is never traversed, it's only used to keep track of
+ * the lru order. That is, we only pick entries off the head or insert
+ * them on the tail.
+ *
+ * All entries on the list are 'free', and their RMIDs are not currently
+ * in use. To mark an RMID as in use, remove its entry from the lru
+ * list.
+ *
+ * This list is protected by cache_mutex.
+ */
+static LIST_HEAD(cqm_rmid_lru);
+
+/*
+ * We use a simple array of pointers so that we can lookup a struct
+ * cqm_rmid_entry in O(1). This alleviates the callers of __get_rmid()
+ * and __put_rmid() from having to worry about dealing with struct
+ * cqm_rmid_entry - they just deal with rmids, i.e. integers.
+ *
+ * Once this array is initialized it is read-only. No locks are required
+ * to access it.
+ *
+ * All entries for all RMIDs can be looked up in this array at all
+ * times.
+ */
+static struct cqm_rmid_entry **cqm_rmid_ptrs;
+
+static inline struct cqm_rmid_entry *__rmid_entry(int rmid)
+{
+   struct cqm_rmid_entry *entry;
+
+   entry = cqm_rmid_ptrs[rmid];
+   WARN_ON(entry->rmid != rmid);
+
+   return entry;
+}
 
 /*
  * Returns < 0 on fail.
+ *
+ * We expect to be called with cache_mutex held.
  */
 static int __get_rmid(void)
 {
-   return bitmap_find_free_region(cqm_rmid_bitmap, cqm_max_rmid, 0);
+   struct cqm_rmid_entry *entry;
+
+   lockdep_assert_held(&cache_mutex);
+
+   if (list_empty(&cqm_rmid_lru))
+   return -EAGAIN;
+
+   entry = list_first_entry(&cqm_rmid_lru, struct cqm_rmid_entry, list);
+   list_del(&entry->list);
+
+   return entry->rmid;
 }
 
 static void __put_rmid(int rmid)
 {
-   bitmap_release_region(cqm_rmid_bitmap, rmid, 0);
+   struct cqm_rmid_entry *entry;
+
+   lockdep_assert_held(&cache_mutex);
+
+   entry = __rmid_entry(rmid);
+
+   list_add_tail(&entry->list, &cqm_rmid_lru);
 }
 
 static int intel_cqm_setup_rmid_cache(void)
 {
-   cqm_rmid_bitmap = kmalloc(sizeof(long) * BITS_TO_LONGS(cqm_max_rmid), GFP_KERNEL);
-   if (!cqm_rmid_bitmap)
+   struct cqm_rmid_entry *entry;
+   int r;
+
+   cqm_rmid_ptrs = kmalloc(sizeof(struct cqm_rmid_entry *) *
+   (cqm_max_rmid + 1), GFP_KERNEL);
+   if (!cqm_rmid_ptrs)
return -ENOMEM;
 
-   bitmap_zero(cqm_rmid_bitmap, cqm_max_rmid);
+   for (r = 0; r <= cqm_max_rmid; r++) {
+   struct cqm_rmid_entry *entry;
+
+   entry = kmalloc(sizeof(*entry), GFP_KERNEL);
+   if (!entry)
+   goto fail;
+
+   INIT_LIST_HEAD(&entry->list);
+   entry->rmid = r;
+   cqm_rmid_ptrs[r] = entry;
+
+   list_add_tail(&entry->list, &cqm_rmid_lru);
+   }
 
/*
 * RMID 0 is special and is always allocated. It's used for all
 * tasks that are not monitored.
 */
-   bitmap_allocate_r

[tip:perf/x86] perf/x86/intel: Add Intel Cache QoS Monitoring support

2015-02-25 Thread tip-bot for Matt Fleming

Commit-ID:  4afbb24ce5e723c8a093a6674a3c33062175078a
Gitweb: http://git.kernel.org/tip/4afbb24ce5e723c8a093a6674a3c33062175078a
Author: Matt Fleming 
AuthorDate: Fri, 23 Jan 2015 18:45:44 +
Committer:  Ingo Molnar 
CommitDate: Wed, 25 Feb 2015 13:53:32 +0100

perf/x86/intel: Add Intel Cache QoS Monitoring support

Future Intel Xeon processors support a Cache QoS Monitoring feature that
allows tracking of the LLC occupancy for a task or task group, i.e. the
amount of data pulled into the LLC for the task (group).

Currently the PMU only supports per-cpu events. We create an event for
each cpu and read out all the LLC occupancy values.

Because this results in duplicate values being written out to userspace,
we also export a .per-pkg event file so that the perf tools only
accumulate values for one cpu per package.

Signed-off-by: Matt Fleming 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Arnaldo Carvalho de Melo 
Cc: Arnaldo Carvalho de Melo 
Cc: H. Peter Anvin 
Cc: Jiri Olsa 
Cc: Kanaka Juvva 
Cc: Linus Torvalds 
Cc: Vikas Shivappa 
Link: http://lkml.kernel.org/r/1422038748-21397-6-git-send-email-m...@codeblueprint.co.uk
Signed-off-by: Ingo Molnar 
---
 arch/x86/kernel/cpu/perf_event_intel_cqm.c | 530 +
 include/linux/perf_event.h |   7 +
 2 files changed, 537 insertions(+)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_cqm.c b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
new file mode 100644
index 000..05b4cd2
--- /dev/null
+++ b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
@@ -0,0 +1,530 @@
+/*
+ * Intel Cache Quality-of-Service Monitoring (CQM) support.
+ *
+ * Based very, very heavily on work by Peter Zijlstra.
+ */
+
+#include 
+#include 
+#include 
+#include "perf_event.h"
+
+#define MSR_IA32_PQR_ASSOC 0x0c8f
+#define MSR_IA32_QM_CTR    0x0c8e
+#define MSR_IA32_QM_EVTSEL 0x0c8d
+
+static unsigned int cqm_max_rmid = -1;
+static unsigned int cqm_l3_scale; /* supposedly cacheline size */
+
+struct intel_cqm_state {
+   raw_spinlock_t  lock;
+   int rmid;
+   int cnt;
+};
+
+static DEFINE_PER_CPU(struct intel_cqm_state, cqm_state);
+
+/*
+ * Protects cache_cgroups.
+ */
+static DEFINE_MUTEX(cache_mutex);
+
+/*
+ * Groups of events that have the same target(s), one RMID per group.
+ */
+static LIST_HEAD(cache_groups);
+
+/*
+ * Mask of CPUs for reading CQM values. We only need one per-socket.
+ */
+static cpumask_t cqm_cpumask;
+
+#define RMID_VAL_ERROR (1ULL << 63)
+#define RMID_VAL_UNAVAIL   (1ULL << 62)
+
+#define QOS_L3_OCCUP_EVENT_ID  (1 << 0)
+
+#define QOS_EVENT_MASK QOS_L3_OCCUP_EVENT_ID
+
+static u64 __rmid_read(unsigned long rmid)
+{
+   u64 val;
+
+   /*
+* Ignore the SDM, this thing is _NOTHING_ like a regular perfcnt,
+* it just says that to increase confusion.
+*/
+   wrmsr(MSR_IA32_QM_EVTSEL, QOS_L3_OCCUP_EVENT_ID, rmid);
+   rdmsrl(MSR_IA32_QM_CTR, val);
+
+   /*
+* Aside from the ERROR and UNAVAIL bits, assume this thing returns
+* the number of cachelines tagged with @rmid.
+*/
+   return val;
+}
+
+static unsigned long *cqm_rmid_bitmap;
+
+/*
+ * Returns < 0 on fail.
+ */
+static int __get_rmid(void)
+{
+   return bitmap_find_free_region(cqm_rmid_bitmap, cqm_max_rmid, 0);
+}
+
+static void __put_rmid(int rmid)
+{
+   bitmap_release_region(cqm_rmid_bitmap, rmid, 0);
+}
+
+static int intel_cqm_setup_rmid_cache(void)
+{
+   cqm_rmid_bitmap = kmalloc(sizeof(long) * BITS_TO_LONGS(cqm_max_rmid), GFP_KERNEL);
+   if (!cqm_rmid_bitmap)
+   return -ENOMEM;
+
+   bitmap_zero(cqm_rmid_bitmap, cqm_max_rmid);
+
+   /*
+* RMID 0 is special and is always allocated. It's used for all
+* tasks that are not monitored.
+*/
+   bitmap_allocate_region(cqm_rmid_bitmap, 0, 0);
+
+   return 0;
+}
+
+/*
+ * Determine if @a and @b measure the same set of tasks.
+ */
+static bool __match_event(struct perf_event *a, struct perf_event *b)
+{
+   if ((a->attach_state & PERF_ATTACH_TASK) !=
+   (b->attach_state & PERF_ATTACH_TASK))
+   return false;
+
+   /* not task */
+
+   return true; /* if not task, we're machine wide */
+}
+
+/*
+ * Determine if @a's tasks intersect with @b's tasks
+ */
+static bool __conflict_event(struct perf_event *a, struct perf_event *b)
+{
+   /*
+* If one of them is not a task, same story as above with cgroups.
+*/
+   if (!(a->attach_state & PERF_ATTACH_TASK) ||
+   !(b->attach_state & PERF_ATTACH_TASK))
+   return true;
+
+   /*
+* Must be non-overlapping.
+*/
+   return false;
+}
+
+/*
+ * Find a group and setup RMID.
+ *
+ * If we're part of a group, we use the group's RMID.
+ */
+static int intel_cqm_setup_event(struct perf_event *event,
+struct perf_event **group

[tip:perf/x86] perf: Add ->count() function to read per-package counters

2015-02-25 Thread tip-bot for Matt Fleming

Commit-ID:  eacd3ecc34472ce3751eedfc94e44c7cc6eb6305
Gitweb: http://git.kernel.org/tip/eacd3ecc34472ce3751eedfc94e44c7cc6eb6305
Author: Matt Fleming 
AuthorDate: Fri, 23 Jan 2015 18:45:41 +
Committer:  Ingo Molnar 
CommitDate: Wed, 25 Feb 2015 13:53:29 +0100

perf: Add ->count() function to read per-package counters

For PMU drivers that record per-package counters, the ->count variable
cannot be used to record an accurate aggregated value, since it's not
possible to perform SMP cross-calls to cpus on other packages from the
context in which we update ->count.

Introduce a new optional ->count() accessor function that can be used to
customize how values are collected. If a PMU driver doesn't provide a
->count() function, we fall back to the existing code.

There is necessarily a window of staleness with this approach because
the task that generated the counter value may not have been scheduled by
the cpu recently.

An alternative and more complex approach would be to use a hrtimer to
periodically refresh the values from a more permissive scheduling
context. So, we're trading off complexity for accuracy.
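
The optional-accessor pattern described above can be sketched in plain C as
follows. The struct and function names (pmu_sketch, event, event_count and
so on) are simplified stand-ins invented for this example; only the shape of
the check, "use the driver hook if it is set, otherwise add the local and
child counts", is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct event;

/* Cut-down stand-in for struct pmu: one optional callback. */
struct pmu_sketch {
    uint64_t (*count)(struct event *event); /* optional, may be NULL */
};

/* Cut-down stand-in for struct perf_event. */
struct event {
    const struct pmu_sketch *pmu;
    uint64_t count;
    uint64_t child_count;
};

/* Default behaviour: the event's own count plus its children's. */
static uint64_t default_count(struct event *event)
{
    return event->count + event->child_count;
}

/* Prefer the driver's accessor when it provides one. */
static uint64_t event_count(struct event *event)
{
    assert(event != NULL && event->pmu != NULL);

    if (event->pmu->count)
        return event->pmu->count(event);

    return default_count(event);
}

/* Example driver hook that ignores the generic counters entirely. */
static uint64_t fixed_count(struct event *event)
{
    (void)event;
    return 42;
}
```

An event whose PMU leaves the hook NULL reports count + child_count, while
one whose PMU installs fixed_count() reports whatever the hook returns.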

Signed-off-by: Matt Fleming 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Arnaldo Carvalho de Melo 
Cc: Arnaldo Carvalho de Melo 
Cc: H. Peter Anvin 
Cc: Jiri Olsa 
Cc: Kanaka Juvva 
Cc: Linus Torvalds 
Cc: Vikas Shivappa 
Link: http://lkml.kernel.org/r/1422038748-21397-3-git-send-email-m...@codeblueprint.co.uk
Signed-off-by: Ingo Molnar 
---
 include/linux/perf_event.h | 10 ++
 kernel/events/core.c   |  5 -
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index cae4a94..9fc9b0d 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -272,6 +272,11 @@ struct pmu {
 */
size_t  task_ctx_size;
 
+
+   /*
+* Return the count value for a counter.
+*/
+   u64 (*count)(struct perf_event *event); /*optional*/
 };
 
 /**
@@ -770,6 +775,11 @@ static inline void perf_event_task_sched_out(struct task_struct *prev,
__perf_event_task_sched_out(prev, next);
 }
 
+static inline u64 __perf_event_count(struct perf_event *event)
+{
+   return local64_read(&event->count) + atomic64_read(&event->child_count);
+}
+
 extern void perf_event_mmap(struct vm_area_struct *vma);
 extern struct perf_guest_info_callbacks *perf_guest_cbs;
 extern int perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 072de31..4e8dc59 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3194,7 +3194,10 @@ static void __perf_event_read(void *info)
 
 static inline u64 perf_event_count(struct perf_event *event)
 {
-   return local64_read(&event->count) + atomic64_read(&event->child_count);
+   if (event->pmu->count)
+   return event->pmu->count(event);
+
+   return __perf_event_count(event);
 }
 
 static u64 perf_event_read(struct perf_event *event)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[tip:perf/x86] perf: Make perf_cgroup_from_task() global

2015-02-25 Thread tip-bot for Matt Fleming

Commit-ID:  39bed6cbb842d8edf5a26b01122b391d36775b5e
Gitweb: http://git.kernel.org/tip/39bed6cbb842d8edf5a26b01122b391d36775b5e
Author: Matt Fleming 
AuthorDate: Fri, 23 Jan 2015 18:45:40 +
Committer:  Ingo Molnar 
CommitDate: Wed, 25 Feb 2015 13:53:28 +0100

perf: Make perf_cgroup_from_task() global

Move perf_cgroup_from_task() from kernel/events/ to include/linux/ along
with the necessary struct definitions, so that it can be used by the PMU
code.

When the upcoming Intel Cache Monitoring PMU driver assigns monitoring
IDs to perf events, it needs to be able to check whether any two
monitoring events overlap (say, a cgroup and task event), which means we
need to be able to look up the cgroup associated with a task (if any).
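
The helper being moved uses container_of() to get from an embedded member
back to the structure that contains it. A small userspace sketch of that
technique is below. The sketch_container_of macro and the *_sketch structs
are invented for the example, and the embedded member is deliberately not
the first field, so that the offset arithmetic actually matters.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace approximation of the kernel's container_of(). */
#define sketch_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for struct cgroup_subsys_state. */
struct css_sketch {
    int id;
};

/* Stand-in for struct perf_cgroup, with the css embedded inside it. */
struct cgroup_sketch {
    int info;
    struct css_sketch css;
};

/* Recover the enclosing structure from a pointer to its css member. */
static struct cgroup_sketch *cgroup_from_css(struct css_sketch *css)
{
    assert(css != NULL);
    return sketch_container_of(css, struct cgroup_sketch, css);
}
```

Given a struct cgroup_sketch cg, cgroup_from_css(&cg.css) returns &cg, which
is the same step perf_cgroup_from_task() performs on the css it gets from
task_css().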

Signed-off-by: Matt Fleming 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Arnaldo Carvalho de Melo 
Cc: Arnaldo Carvalho de Melo 
Cc: H. Peter Anvin 
Cc: Jiri Olsa 
Cc: Kanaka Juvva 
Cc: Linus Torvalds 
Cc: Paul Mackerras 
Cc: Vikas Shivappa 
Link: http://lkml.kernel.org/r/1422038748-21397-2-git-send-email-m...@codeblueprint.co.uk
Signed-off-by: Ingo Molnar 
---
 include/linux/perf_event.h | 30 ++
 kernel/events/core.c   | 28 +---
 2 files changed, 31 insertions(+), 27 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 724d372..cae4a94 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -53,6 +53,7 @@ struct perf_guest_info_callbacks {
 #include 
 #include 
 #include 
+#include 
 #include 
 
 struct perf_callchain_entry {
@@ -547,6 +548,35 @@ struct perf_output_handle {
int page;
 };
 
+#ifdef CONFIG_CGROUP_PERF
+
+/*
+ * perf_cgroup_info keeps track of time_enabled for a cgroup.
+ * This is a per-cpu dynamically allocated data structure.
+ */
+struct perf_cgroup_info {
+   u64 time;
+   u64 timestamp;
+};
+
+struct perf_cgroup {
+   struct cgroup_subsys_state  css;
+   struct perf_cgroup_info __percpu *info;
+};
+
+/*
+ * Must ensure cgroup is pinned (css_get) before calling
+ * this function. In other words, we cannot call this function
+ * if there is no cgroup event for the current CPU context.
+ */
+static inline struct perf_cgroup *
+perf_cgroup_from_task(struct task_struct *task)
+{
+   return container_of(task_css(task, perf_event_cgrp_id),
+   struct perf_cgroup, css);
+}
+#endif /* CONFIG_CGROUP_PERF */
+
 #ifdef CONFIG_PERF_EVENTS
 
 extern int perf_pmu_register(struct pmu *pmu, const char *name, int type);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 20cece0..072de31 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -34,11 +34,11 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -351,32 +351,6 @@ static void perf_ctx_unlock(struct perf_cpu_context *cpuctx,
 
 #ifdef CONFIG_CGROUP_PERF
 
-/*
- * perf_cgroup_info keeps track of time_enabled for a cgroup.
- * This is a per-cpu dynamically allocated data structure.
- */
-struct perf_cgroup_info {
-   u64 time;
-   u64 timestamp;
-};
-
-struct perf_cgroup {
-   struct cgroup_subsys_state  css;
-   struct perf_cgroup_info __percpu *info;
-};
-
-/*
- * Must ensure cgroup is pinned (css_get) before calling
- * this function. In other words, we cannot call this function
- * if there is no cgroup event for the current CPU context.
- */
-static inline struct perf_cgroup *
-perf_cgroup_from_task(struct task_struct *task)
-{
-   return container_of(task_css(task, perf_event_cgrp_id),
-   struct perf_cgroup, css);
-}
-
 static inline bool
 perf_cgroup_match(struct perf_event *event)
 {

Re: [PATCH] x86, traps: Enable DEBUG_STACK after cpu_init() for TRAP_DB/BP.

2015-02-25 Thread Steven Rostedt

On Thu, 26 Feb 2015 11:57:56 +0800
Wang Nan  wrote:

> Before this patch early_trap_init() installs DEBUG_STACK for X86_TRAP_BP
> and X86_TRAP_DB. However, DEBUG_STACK doesn't work correctly until
> cpu_init() <-- trap_init().
> 
> This patch passes 0 to set_intr_gate_ist() and
> set_system_intr_gate_ist() instead of DEBUG_STACK to let it use same
> stack as kernel, and installs DEBUG_STACK for them in trap_init().
> 
> As core runs at ring 0 between early_trap_init() and trap_init(), there
> is no chance to get a bad stack before trap_init().
> 
> As NMI is also enabled in trap_init(), we don't need to care about
> is_debug_stack() and related things used in arch/x86/kernel/nmi.c.

Looks fine to me, except for some grammar issues in the comments.

> 
> Signed-off-by: Wang Nan 
> ---
>  arch/x86/kernel/traps.c | 20 ++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index 9d2073e..a9b8640 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -925,9 +925,16 @@ dotraplinkage void do_iret_error(struct pt_regs *regs, long error_code)
>  /* Set of traps needed for early debugging. */
>  void __init early_trap_init(void)
>  {
> - set_intr_gate_ist(X86_TRAP_DB, &debug, DEBUG_STACK);
> + /*
> +  * Don't set ist to DEBUG_STACK as it doesn't work until TSS is
> +  * ready in cpu_init() <-- trap_init(). Before trap_init(), CPU
> +  * runs at ring 0 so there should be impossible to hit a invalid

 "so it should be impossible", although I might prefer, "it is
 impossible to", but nothing is ever guaranteed.

 "an invalid"


> +  * stack. Use original stack is enough. DEBUG_STACK will be

 "Using the original stack works well enough at this early stage."

> +  * equipped after cpu_init() in trap_init().
> +  */
> + set_intr_gate_ist(X86_TRAP_DB, &debug, 0);
>   /* int3 can be called from all */
> - set_system_intr_gate_ist(X86_TRAP_BP, &int3, DEBUG_STACK);
> + set_system_intr_gate_ist(X86_TRAP_BP, &int3, 0);
>  #ifdef CONFIG_X86_32
>   set_intr_gate(X86_TRAP_PF, page_fault);
>  #endif
> @@ -1005,6 +1012,15 @@ void __init trap_init(void)
>*/
>   cpu_init();
>  
> + /*
> +  * X86_TRAP_DB and X86_TRAP_BP have been setup
> +  * in early_trap_init(). However, DEBUG_STACK works only after
> +  * cpu_init() load TSS. See comments in early_trap_init().

  "cpu_init() loads TSS"

-- Steve

> +  */
> + set_intr_gate_ist(X86_TRAP_DB, &debug, DEBUG_STACK);
> + /* int3 can be called from all */
> + set_system_intr_gate_ist(X86_TRAP_BP, &int3, DEBUG_STACK);
> +
>   x86_init.irqs.trap_init();
>  
>  #ifdef CONFIG_X86_64


[PATCH] x86, traps: Enable DEBUG_STACK after cpu_init() for TRAP_DB/BP.

2015-02-25 Thread Wang Nan

Before this patch early_trap_init() installs DEBUG_STACK for X86_TRAP_BP
and X86_TRAP_DB. However, DEBUG_STACK doesn't work correctly until
cpu_init() <-- trap_init().

This patch passes 0 to set_intr_gate_ist() and
set_system_intr_gate_ist() instead of DEBUG_STACK to let it use same
stack as kernel, and installs DEBUG_STACK for them in trap_init().

As core runs at ring 0 between early_trap_init() and trap_init(), there
is no chance to get a bad stack before trap_init().

As NMI is also enabled in trap_init(), we don't need to care about
is_debug_stack() and related things used in arch/x86/kernel/nmi.c.

Signed-off-by: Wang Nan 
---
 arch/x86/kernel/traps.c | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 9d2073e..a9b8640 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -925,9 +925,16 @@ dotraplinkage void do_iret_error(struct pt_regs *regs, long error_code)
 /* Set of traps needed for early debugging. */
 void __init early_trap_init(void)
 {
-   set_intr_gate_ist(X86_TRAP_DB, &debug, DEBUG_STACK);
+   /*
+* Don't set ist to DEBUG_STACK as it doesn't work until TSS is
+* ready in cpu_init() <-- trap_init(). Before trap_init(), CPU
+* runs at ring 0 so there should be impossible to hit a invalid
+* stack. Use original stack is enough. DEBUG_STACK will be
+* equipped after cpu_init() in trap_init().
+*/
+   set_intr_gate_ist(X86_TRAP_DB, &debug, 0);
/* int3 can be called from all */
-   set_system_intr_gate_ist(X86_TRAP_BP, &int3, DEBUG_STACK);
+   set_system_intr_gate_ist(X86_TRAP_BP, &int3, 0);
 #ifdef CONFIG_X86_32
set_intr_gate(X86_TRAP_PF, page_fault);
 #endif
@@ -1005,6 +1012,15 @@ void __init trap_init(void)
 */
cpu_init();
 
+   /*
+* X86_TRAP_DB and X86_TRAP_BP have been setup
+* in early_trap_init(). However, DEBUG_STACK works only after
+* cpu_init() load TSS. See comments in early_trap_init().
+*/
+   set_intr_gate_ist(X86_TRAP_DB, &debug, DEBUG_STACK);
+   /* int3 can be called from all */
+   set_system_intr_gate_ist(X86_TRAP_BP, &int3, DEBUG_STACK);
+
x86_init.irqs.trap_init();
 
 #ifdef CONFIG_X86_64
-- 
1.8.4


[PATCH] phy: exynos-mipi-video: Fixup the test for state->regmap

2015-02-25 Thread Axel Lin

syscon_regmap_lookup_by_phandle() returns an ERR_PTR() value on error,
not NULL, so test state->regmap with IS_ERR() instead of a NULL check.
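
To show why a NULL test misses this kind of failure, here is a userspace
sketch of the error-pointer convention. The sketch_* helpers are simplified
re-implementations written for this example (they are not the kernel's
<linux/err.h>), and sketch_lookup() is a hypothetical function standing in
for a lookup that reports failure through an encoded pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer near the top of memory. */
static inline void *sketch_err_ptr(long error)
{
    assert(error < 0 && error >= -SKETCH_MAX_ERRNO);
    return (void *)(intptr_t)error;
}

/* True if the pointer falls in the range reserved for error codes. */
static inline int sketch_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Decode the errno value from an error pointer. */
static inline long sketch_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* Hypothetical lookup: a valid object, or an encoded -ENODEV. */
static void *sketch_lookup(int present)
{
    static int object;

    return present ? (void *)&object : sketch_err_ptr(-ENODEV);
}
```

A failed sketch_lookup(0) returns a pointer that is not NULL, so a plain
"if (ptr)" check would treat it as valid; only the sketch_is_err() test
tells the two cases apart.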

Signed-off-by: Axel Lin 
---
 drivers/phy/phy-exynos-mipi-video.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/phy/phy-exynos-mipi-video.c b/drivers/phy/phy-exynos-mipi-video.c
index f017b2f..d196493 100644
--- a/drivers/phy/phy-exynos-mipi-video.c
+++ b/drivers/phy/phy-exynos-mipi-video.c
@@ -59,7 +59,7 @@ static int __set_phy_state(struct exynos_mipi_video_phy *state,
else
reset = EXYNOS4_MIPI_PHY_SRESETN;
 
-   if (state->regmap) {
+   if (!IS_ERR(state->regmap)) {
mutex_lock(&state->mutex);
regmap_read(state->regmap, offset, &val);
if (on)
-- 
1.9.1




linux-next: Tree for Feb 26

2015-02-25 Thread Stephen Rothwell

Hi all,

Changes since 20150225:

The drm-intel tree gained a conflict against the drm-intel-fixes tree.

The clk tree gained a build failure so I used the version from
next-20150225.

Non-merge commits (relative to Linus' tree): 1558
 1217 files changed, 31765 insertions(+), 30907 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64 and a
multi_v7_defconfig for arm. After the final fixups (if any), it is also
built with powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig and
allyesconfig (this fails its final link) and i386, sparc, sparc64 and arm
defconfig.

Below is a summary of the state of the merge.

I am currently merging 206 trees (counting Linus' and 30 trees of patches
pending for Linus' tree).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

$ git checkout master
$ git reset --hard stable
Merging origin/master (b24e2bdde4af Merge tag 'stable/for-linus-4.0-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip)
Merging fixes/master (b94d525e58dc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net)
Merging kbuild-current/rc-fixes (c517d838eb7d Linux 4.0-rc1)
Merging arc-current/for-curr (2ce7598c9a45 Linux 3.17-rc4)
Merging arm-current/fixes (23be7fdafa50 ARM: 8305/1: DMA: Fix kzalloc flags in __iommu_alloc_buffer())
Merging m68k-current/for-linus (4436820a98cd m68k/defconfig: Enable Ethernet bridging)
Merging metag-fixes/fixes (c2996cb29bfb metag: Fix KSTK_EIP() and KSTK_ESP() macros)
Merging mips-fixes/mips-fixes (1795cd9b3a91 Linux 3.16-rc5)
Merging powerpc-merge/merge (c517d838eb7d Linux 4.0-rc1)
Merging powerpc-merge-mpe/fixes (c59c961ca511 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux)
Merging sparc/master (66d0f7ec9f10 sparc32: destroy_context() and switch_mm() needs to disable interrupts.)
Merging net/master (31639b94cadc MAINTAINERS: update my email address)
Merging ipsec/master (ac37e2515c1a xfrm: release dst_orig in case of error in xfrm_lookup())
Merging sound-current/for-linus (de5d0ad506cb ALSA: hda - Disable runtime PM for Panther Point again)
Merging pci-current/for-linus (4efe874aace5 PCI: Don't read past the end of sysfs "driver_override" buffer)
Merging wireless-drivers/master (aeb2d2a4c0ae rtlwifi: Remove logging statement that is no longer needed)
Merging driver-core.current/driver-core-linus (c517d838eb7d Linux 4.0-rc1)
Merging tty.current/tty-linus (c517d838eb7d Linux 4.0-rc1)
Merging usb.current/usb-linus (b20b1618b8fc cdc-acm: Add support for Denso cradle CU-321)
Merging usb-gadget-fixes/fixes (a0456399fb07 usb: gadget: configfs: don't NUL-terminate (sub)compatible ids)
Merging usb-serial-fixes/usb-linus (a6f0331236fa USB: cp210x: add ID for RUGGEDCOM USB Serial Console)
Merging staging.current/staging-linus (c517d838eb7d Linux 4.0-rc1)
Merging char-misc.current/char-misc-linus (c517d838eb7d Linux 4.0-rc1)
Merging input-current/for-linus (4c971aa78314 Merge branch 'next' into for-linus)
Merging crypto-current/master (96692a7305c4 crypto: tcrypt - do not allocate iv on stack for aead speed tests)
Merging ide/master (f96fe225677b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net)
Merging devicetree-current/devicetree/merge (6b1271de3723 of/unittest: Overlays with sub-devices tests)
Merging rr-fixes/fixes (f47689345931 lguest: update help text.)
Merging vfio-fixes/for-linus (7c2e211f3c95 vfio-pci: Fix the check on pci device type in vfio_pci_probe())
Merging kselftest-fixes/fixes (f5db310d77ef selftests/vm: fix link error for transhuge-stress test)
Merging drm-intel-fixes/for-linux-next-fixes (62e537f8d568 drm/i915: Fix frontbuffer false positve.)
Merging asm-generic/master (643165c8bbc8 Merge tag 'uaccess_for_upstream' of git://git.kernel.org/pub/scm/linux/kernel/gi

Re: [PATCH] test-hexdump: test the return value of the hex_dump_to_buffer

2015-02-25 Thread long.wanglong

On 2015/2/16 17:47, Andy Shevchenko wrote:
> On Sun, 2015-02-15 at 09:50 +, Wang Long wrote:
>> As the function hex_dump_to_buffer returns the amount of bytes placed
>> in the buffer without terminating NUL. the test-hexdump should test
>> the return value of it.
> 
> I don't think it's needed. When the prototype was changed the new test
> case had been introduced to cover the overflow cases, i.e.
> test_hexdump_overflow().
> 

ok, thanks.

>>
>> Signed-off-by: Wang Long 
>> ---
>>  lib/test-hexdump.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/lib/test-hexdump.c b/lib/test-hexdump.c
>> index daf29a39..9243be7 100644
>> --- a/lib/test-hexdump.c
>> +++ b/lib/test-hexdump.c
>> @@ -52,8 +52,9 @@ static void __init test_hexdump(size_t len, int rowsize, int groupsize,
>>  size_t l = len;
>>  int gs = groupsize, rs = rowsize;
>>  unsigned int i;
>> +int r;
>>  
>> -hex_dump_to_buffer(data_b, l, rs, gs, real, sizeof(real), ascii);
>> +r = hex_dump_to_buffer(data_b, l, rs, gs, real, sizeof(real), ascii);
>>  
>>  if (rs != 16 && rs != 32)
>>  rs = 16;
>> @@ -96,7 +97,7 @@ static void __init test_hexdump(size_t len, int rowsize, int groupsize,
>>  
>>  *p = '\0';
>>  
>> -if (strcmp(test, real)) {
>> +if (strcmp(test, real) || r != strlen(real)) {
>>  pr_err("Len: %zu row: %d group: %d\n", len, rowsize, groupsize);
>>  pr_err("Result: '%s'\n", real);
>>  pr_err("Expect: '%s'\n", test);
> 
> 



[LKP] [mm] 4d942466994: +4.8% will-it-scale.per_process_ops

2015-02-25 Thread Huang Ying

FYI, we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
commit 4d9424669946532be754a6e116618dcb58430cb4 ("mm: convert p[te|md]_mknonnuma and remaining page table manipulations")


testbox/testcase/testparams: lkp-g5/will-it-scale/performance-readseek3

842915f56667f9ee  4d9424669946532be754a6e116
----------------  --------------------------
         %stddev    %change          %stddev
             \         |                 \
   1526912 ±  1%      +4.8%     1599490 ±  0%  will-it-scale.per_process_ops
     15364 ±  4%     +10.1%       16920 ±  5%  will-it-scale.time.involuntary_context_switches
    131290 ±  0%      +4.8%      137653 ±  0%  will-it-scale.time.minor_page_faults
       318 ±  1%      -1.6%         313 ±  0%  will-it-scale.time.elapsed_time.max
       318 ±  1%      -1.6%         313 ±  0%  will-it-scale.time.elapsed_time
        54 ± 25%     -83.8%           8 ± 48%  sched_debug.cfs_rq[53]:/.tg_load_contrib
        11 ± 39%    +316.9%          47 ± 22%  sched_debug.cfs_rq[18]:/.tg_load_contrib
         8 ± 24%     -79.0%           1 ± 47%  sched_debug.cfs_rq[11]:/.nr_spread_over
        46 ±  7%    +271.8%         172 ± 45%  numa-vmstat.node2.nr_inactive_anon
       186 ±  7%    +270.8%         691 ± 44%  numa-meminfo.node2.Inactive(anon)
       642 ± 15%    +116.3%        1388 ± 41%  sched_debug.cpu#2.ttwu_count
        37 ± 17%    +170.3%         100 ± 15%  numa-vmstat.node6.nr_page_table_pages
        19 ± 41%     +98.7%          37 ± 38%  sched_debug.cfs_rq[62]:/.blocked_load_avg
       347 ± 19%    +103.9%         708 ± 49%  sched_debug.cpu#6.sched_goidle
       215 ±  5%    +148.8%         535 ± 31%  sched_debug.cpu#101.sched_goidle
       519 ±  7%    +135.5%        1222 ± 30%  sched_debug.cpu#101.nr_switches
        23 ± 35%     +84.3%          43 ± 33%  sched_debug.cfs_rq[62]:/.tg_load_contrib
       833 ± 18%     +89.1%        1576 ± 47%  sched_debug.cpu#6.nr_switches
   1083043 ±  0%     -55.2%      485505 ±  1%  proc-vmstat.numa_pte_updates
       250 ± 28%    +120.1%         551 ± 41%  sched_debug.cpu#74.ttwu_local
       865 ± 18%     +61.1%        1394 ± 29%  sched_debug.cpu#32.ttwu_local
       515 ± 18%     +81.0%         932 ± 41%  sched_debug.cpu#2.sched_goidle
      1242 ± 22%     +76.4%        2191 ± 39%  sched_debug.cpu#2.nr_switches
        23 ± 17%     +63.9%          38 ± 19%  sched_debug.cfs_rq[96]:/.tg_load_contrib
       302 ±  0%     +88.9%         570 ± 31%  sched_debug.cpu#47.ttwu_local
      3126 ±  6%     +77.9%        5562 ± 36%  sched_debug.cpu#15.sched_count
      1717 ± 30%     +30.5%        2240 ± 21%  sched_debug.cpu#32.ttwu_count
       256 ± 14%     +87.1%         479 ± 22%  sched_debug.cpu#7.ttwu_count
      4200 ± 30%     -55.3%        1878 ± 30%  sched_debug.cpu#67.ttwu_count
      1036 ± 10%     -50.7%         511 ± 35%  sched_debug.cpu#79.ttwu_count
       527 ± 18%     +72.5%         909 ± 34%  sched_debug.cpu#15.ttwu_local
       359 ±  7%     +64.1%         589 ± 17%  sched_debug.cpu#50.ttwu_count
      1692 ± 16%     +34.2%        2271 ± 21%  sched_debug.cpu#32.sched_goidle
       653 ± 14%     -51.7%         315 ± 37%  sched_debug.cpu#79.ttwu_local
       137 ±  3%     +99.1%         272 ± 38%  sched_debug.cpu#40.ttwu_local
      6398 ± 24%     +55.9%        9973 ± 16%  numa-meminfo.node3.SReclaimable
      1599 ± 24%     +55.9%        2493 ± 16%  numa-vmstat.node3.nr_slab_reclaimable
      3669 ± 18%     +33.6%        4900 ± 16%  sched_debug.cpu#32.nr_switches
      3681 ± 18%     +33.4%        4910 ± 16%  sched_debug.cpu#32.sched_count
       546 ±  9%     +73.0%         945 ± 24%  sched_debug.cpu#40.nr_switches
       965 ± 26%     -38.8%         590 ±  4%  sched_debug.cpu#64.sched_goidle
       569 ±  8%     +69.2%         963 ± 23%  sched_debug.cpu#40.sched_count
       225 ±  7%     +79.0%         402 ± 25%  sched_debug.cpu#40.sched_goidle
       199 ±  4%     +53.6%         306 ± 17%  sched_debug.cpu#103.sched_goidle
       112 ±  3%     +44.9%         162 ± 20%  sched_debug.cpu#102.ttwu_local
       198 ± 15%     +71.8%         340 ± 39%  sched_debug.cpu#40.ttwu_count
       475 ±  2%     +51.8%         721 ± 17%  sched_debug.cpu#103.nr_switches
       162 ±  9%     +86.6%         302 ± 25%  sched_debug.cpu#7.ttwu_local
       203 ±  1%     +54.7%         314 ± 15%  sched_debug.cpu#102.sched_goidle
       499 ±  7%     +68.9%         843 ± 35%  sched_debug.cpu#47.ttwu_count
       478 ±  2%     +54.4%         739 ± 16%  sched_debug.cpu#102.nr_switches
       767 ± 16%     +66.6%        1278 ± 25%  sched_debug.cpu#7.nr_switches
      4369 ± 15%     -41.3%        2564 ±  4%  sched_debug.cpu#65.ttwu_count
       169 ± 14%     +32.2%         224 ± 21%  sched_debug.cpu#105.ttwu_local
       559 ±  6%     +40.3%         784 ± 17%  sched_debug.cpu#47.sched_goidle
       770 ±  7%     -28.2%         552 ± 21%  sched_debug.cpu#126.ttwu_local
      1195 ± 21%     +35.0%        1614 ± 16%  cpuidle.C1E-NHM.usage
  1450 ± 17%

Re: [PATCH 4/4] powerpc/mpic: remove unused functions

2015-02-25 Thread Scott Wood

On Wed, 2015-02-25 at 20:39 -0600, Jia Hongtao-B38951 wrote:
> Hi Scott,
> 
> I'm really sorry for leaving this patch like a zombie.
> Now I plan to revisit this patch.
> 
> From the previous comments the compile error was fixed.
> But beyond that I have had no plan to update it.
> 
> Could you please comment on why it's still on hold?
> 

Kumar had some comments.

-Scott



[RFC][PATCH v3] sched/rt: Use IPI to trigger RT task push migration instead of pulling

2015-02-25 Thread Steven Rostedt


When debugging the latencies on a 40 core box, where we hit 300 to
500 microsecond latencies, I found there was a huge contention on the
runqueue locks.

Investigating it further, running ftrace, I found that it was due to
the pulling of RT tasks.

The test that was run was the following:

 cyclictest --numa -p95 -m -d0 -i100

This created a thread on each CPU, that would set its wakeup in iterations
of 100 microseconds. The -d0 means that all the threads had the same
interval (100us). Each thread sleeps for 100us and wakes up and measures
its latencies.

What happened was another RT task would be scheduled on one of the CPUs
that was running our test, when the other CPU tests went to sleep and
scheduled idle. This caused the "pull" operation to execute on all
these CPUs. Each one of these saw the RT task that was overloaded on
the CPU of the test that was still running, and each one tried
to grab that task in a thundering herd way.

To grab the task, each thread would do a double rq lock grab, grabbing
its own lock as well as the rq of the overloaded CPU. As the sched
domains on this box were rather flat for its size, I saw up to 12 CPUs
block on this lock at once. This caused a ripple effect with the
rq locks. As these locks were blocked, any wakeups or load balancing
on these CPUs would also block on these locks, and the wait time escalated.

I've tried various methods to lessen the load, but things like an
atomic counter to only let one CPU grab the task won't work, because
the task may have a limited affinity, and we may pick the wrong
CPU to take that lock and do the pull, to only find out that the
CPU we picked isn't in the task's affinity.

Instead of doing the PULL, I now have the CPUs that want the pull to
send over an IPI to the overloaded CPU, and let that CPU pick what
CPU to push the task to. No more need to grab the rq lock, and the
push/pull algorithm still works fine.

With this patch, the latency dropped to just 150us over a 20 hour run.
Without the patch, the huge latencies would trigger in seconds.

I've created a new sched feature called RT_PUSH_IPI, which is enabled
by default.

When RT_PUSH_IPI is not enabled, the old method of grabbing the rq locks
and having the pulling CPU do the work is implemented. When RT_PUSH_IPI
is enabled, the IPI is sent to the overloaded CPU to do a push.

To enable or disable this at run time:

 # mount -t debugfs nodev /sys/kernel/debug
 # echo RT_PUSH_IPI > /sys/kernel/debug/sched_features
or
 # echo NO_RT_PUSH_IPI > /sys/kernel/debug/sched_features

Update: The original patch would send an IPI to all CPUs in the RT overload
list. But that could theoretically cause the reverse issue. That is, there
could be lots of overloaded RT queues and one CPU lowers its priority. It would
then send an IPI to all the overloaded RT queues and they could then all try
to grab the rq lock of the CPU lowering its priority, and then we have the
same problem.

The latest design sends out only one IPI to the first overloaded CPU. It tries
to push any tasks that it can, and then looks for the next overloaded CPU that
can push to the source CPU. The IPIs stop when all overloaded CPUs that have
pushable tasks that have priorities greater than the source CPU are covered.
In case the source CPU lowers its priority again, a flag is set to tell the
IPI traversal to restart with the first RT overloaded CPU after the source CPU.
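
[Editor's illustration, not part of the posted patch.] The traversal
described above can be sketched in plain userspace C. Everything here is
hypothetical: struct cpu_state, next_ipi_target() and count_ipis() are
names invented for this sketch, NR_CPUS is fixed at 8, and a larger
number means a more urgent priority (the kernel's internal numbering is
different). The sketch also leaves out the "restart" flag and all
locking; it only shows which CPUs would be visited.

```c
#include <stdbool.h>

#define NR_CPUS 8	/* assumption: small fixed CPU count for the sketch */

/* Hypothetical per-CPU snapshot; this is not the kernel's struct rq. */
struct cpu_state {
	bool rt_overloaded;	/* more than one runnable RT task queued */
	int  pushable_prio;	/* best pushable task; larger = more urgent */
};

/*
 * Pick the CPU that should receive the next IPI. The walk starts after
 * 'prev', wraps around, and ends when it gets back to 'src'.
 * Returns -1 once every candidate has been covered.
 */
static int next_ipi_target(const struct cpu_state *cpus, int src,
			   int prev, int src_prio)
{
	int cpu = prev;

	for (;;) {
		cpu = (cpu + 1) % NR_CPUS;
		if (cpu == src)
			return -1;
		/* Skip CPUs that have nothing worth pushing to 'src'. */
		if (cpus[cpu].rt_overloaded &&
		    cpus[cpu].pushable_prio > src_prio)
			return cpu;
	}
}

/* Count how many IPIs one full traversal would send on behalf of 'src'. */
static int count_ipis(const struct cpu_state *cpus, int src, int src_prio)
{
	int n = 0;
	int cpu = next_ipi_target(cpus, src, src, src_prio);

	while (cpu >= 0) {
		n++;
		cpu = next_ipi_target(cpus, src, cpu, src_prio);
	}
	return n;
}
```

The point of the sketch is that only one target is active at a time, so
at most one remote CPU is working on behalf of the source CPU at any
moment, instead of every idle CPU contending for the same rq lock.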

Parts-suggested-by: Peter Zijlstra 
Signed-off-by: Steven Rostedt 

---
Changes since V2:

 o Redesigned to mostly eliminate the covering of for each cpu on the rto mask.
   I say mostly, because it skips over any CPU that does not have a task that
   has high enough priority to push to the original CPU.

   The new design finds the next CPU that has an RT overload with a pushable
   task of priority greater than the source CPU and sends an IPI to it. Instead
   of stopping if that CPU succeeds in pushing a task, it instead will continue
   the IPI to the next CPU with a pushable task higher in priority than the
   source CPU tasks. It continues until it covers all the CPUs in the rto_mask
   up to the source CPU.

   There's also a "reset" flag that gets set if the source CPU changes its
   priority before the IPI loop completes. That is, the IPI will start again
   with the first overloaded CPU after the source CPU.

   This version is actually cleaner than the previous version.

Changes since V1:

 o As already mentioned in the "update", a redesign was done to send
   only one IPI, and to the highest rt overloaded queue. If for some
   reason that could not push a task, it would look for the next rq and
   send an ipi (irq work actually) to the next one. Limits are in place
   to prevent any ping pong effect of two rqs sending IPIs back and
   forth.

 o Made this sched feature enabled by default instead of enabling it
   when we have 16 or more CPUs.


 kernel/sched/features.h |   11 +++
 kernel/sched/rt.c   |  176

[PATCH] tick/broadcast-hrtimer : Fix suspicious RCU usage in idle loop

2015-02-25 Thread Preeti U Murthy

The hrtimer mode of broadcast queues hrtimers in the idle entry
path so as to wakeup cpus in deep idle states. hrtimer_{start/cancel}
functions call into tracing which uses RCU. But it is not legal to call
into RCU in cpuidle because it is one of the quiescent states. Hence
protect this region with RCU_NONIDLE which informs RCU that the cpu
is momentarily non-idle.

Signed-off-by: Preeti U Murthy 
Reviewed-by: Paul E. McKenney 
---

 kernel/time/tick-broadcast-hrtimer.c |   11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/kernel/time/tick-broadcast-hrtimer.c b/kernel/time/tick-broadcast-hrtimer.c
index eb682d5..6aac4be 100644
--- a/kernel/time/tick-broadcast-hrtimer.c
+++ b/kernel/time/tick-broadcast-hrtimer.c
@@ -49,6 +49,7 @@ static void bc_set_mode(enum clock_event_mode mode,
  */
 static int bc_set_next(ktime_t expires, struct clock_event_device *bc)
 {
+   int bc_moved;
/*
 * We try to cancel the timer first. If the callback is on
 * flight on some other cpu then we let it handle it. If we
@@ -60,9 +61,15 @@ static int bc_set_next(ktime_t expires, struct clock_event_device *bc)
 * restart the timer because we are in the callback, but we
 * can set the expiry time and let the callback return
 * HRTIMER_RESTART.
+*
+* Since we are in the idle loop at this point and because
+* hrtimer_{start/cancel} functions call into tracing,
+* calls to these functions must be bound within RCU_NONIDLE.
 */
-   if (hrtimer_try_to_cancel(&bctimer) >= 0) {
-   hrtimer_start(&bctimer, expires, HRTIMER_MODE_ABS_PINNED);
+   RCU_NONIDLE(bc_moved = (hrtimer_try_to_cancel(&bctimer) >= 0) ?
+   !hrtimer_start(&bctimer, expires, HRTIMER_MODE_ABS_PINNED) :
+   0);
+   if (bc_moved) {
/* Bind the "device" to the cpu */
bc->bound_on = smp_processor_id();
} else if (bc->bound_on == smp_processor_id()) {


[PATCH 0/2] backlight: pwm: Add backlight-boot-off property

2015-02-25 Thread huang lin


Add the backlight-boot-off property, so we can keep the
backlight disabled at boot until it is enabled implicitly
by a panel driver, or explicitly by userspace.



huang lin (2):
  Documentation: devicetree: add backlight-boot-off property in
pwm-backlight
  backlight: pwm: Add backlight-boot-off property

 Documentation/devicetree/bindings/video/backlight/pwm-backlight.txt | 1 +
 drivers/video/backlight/pwm_bl.c| 4 
 2 files changed, 5 insertions(+)

-- 
1.9.1


[PATCH 1/2] Documentation: devicetree: add backlight-boot-off property in pwm-backlight

2015-02-25 Thread huang lin

Add the backlight-boot-off property, so we can keep the backlight
disabled at boot until it is enabled implicitly by a panel driver,
or explicitly by userspace.

Signed-off-by: huang lin 

---

 Documentation/devicetree/bindings/video/backlight/pwm-backlight.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/video/backlight/pwm-backlight.txt b/Documentation/devicetree/bindings/video/backlight/pwm-backlight.txt
index 764db86..28b0b4d 100644
--- a/Documentation/devicetree/bindings/video/backlight/pwm-backlight.txt
+++ b/Documentation/devicetree/bindings/video/backlight/pwm-backlight.txt
@@ -17,6 +17,7 @@ Optional properties:
"pwms" property (see PWM binding[0])
   - enable-gpios: contains a single GPIO specifier for the GPIO which enables
   and disables the backlight (see GPIO binding[1])
+  - backlight-boot-off: turn off backlight when pwm backlight probe
 
 [0]: Documentation/devicetree/bindings/pwm/pwm.txt
 [1]: Documentation/devicetree/bindings/gpio/gpio.txt
-- 
1.9.1


[PATCH] Documentation: add print bitmap description

2015-02-25 Thread Wang Long

The commit "lib/vsprintf: implement bitmap printing through
'%*pb[l]'" added an easy way to print bitmaps, so printk-formats.txt
should reflect it.

Signed-off-by: Wang Long 
---
 Documentation/printk-formats.txt | 9 +
 1 file changed, 9 insertions(+)

diff --git a/Documentation/printk-formats.txt b/Documentation/printk-formats.txt
index 5a615c1..255a061 100644
--- a/Documentation/printk-formats.txt
+++ b/Documentation/printk-formats.txt
@@ -239,6 +239,15 @@ s64 SHOULD be printed with %lld/%llx:
 
printk("%lld", s64_var);
 
+bitmap and its derivatives such as cpumask and nodemask:
+
+   %*pb    0779
+   %*pbl   0,3-6,8-10
+
+   For printing bitmap and its derivatives such as cpumask and nodemask,
+   %*pb output the bitmap with field width as the number of bits and %*pbl
+   output the bitmap as range list with field width as the number of bits.
+
 If <type> is dependent on a config option for its size (e.g., sector_t,
 blkcnt_t) or is architecture-dependent for its size (e.g., tcflag_t), use a
 format specifier of its largest possible type and explicitly cast to it.
-- 
1.8.3.4
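
[Editor's illustration, not part of the posted patch.] The two sample
outputs in the patch describe the same bitmap: bits 0, 3-6 and 8-10 set
give the hex value 0x779 and the range list "0,3-6,8-10". The userspace
sketch below shows how such a range list can be produced. It is only an
approximation of the idea under stated assumptions: format_range_list()
is a hypothetical helper, it handles a single unsigned long rather than
an arbitrary-length bitmap, and it is not the code in lib/vsprintf.c.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Write the set bits among the lowest 'nbits' bits of 'bits' into 'buf'
 * as a comma-separated range list, e.g. "0,3-6,8-10".
 * 'nbits' must not exceed the width of unsigned long.
 * Returns the string length, or -1 if the output does not fit.
 */
static int format_range_list(unsigned long bits, unsigned int nbits,
			     char *buf, size_t size)
{
	size_t len = 0;
	unsigned int i = 0;

	if (size == 0)
		return -1;
	buf[0] = '\0';

	while (i < nbits) {
		unsigned int start, end;
		int n;

		if (!((bits >> i) & 1UL)) {
			i++;
			continue;
		}

		/* Extend the run while the next bit is also set. */
		start = i;
		while (i + 1 < nbits && ((bits >> (i + 1)) & 1UL))
			i++;
		end = i;
		i++;

		if (start == end)
			n = snprintf(buf + len, size - len, "%s%u",
				     len ? "," : "", start);
		else
			n = snprintf(buf + len, size - len, "%s%u-%u",
				     len ? "," : "", start, end);
		if (n < 0 || (size_t)n >= size - len)
			return -1;	/* output would be truncated */
		len += (size_t)n;
	}
	return (int)len;
}
```

Calling format_range_list(0x779UL, 16, buf, sizeof(buf)) with a 32-byte
buffer fills buf with "0,3-6,8-10".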


Re: [PATCH] kasan, module, vmalloc: rework shadow allocation for modules

2015-02-25 Thread Rusty Russell

Andrey Ryabinin  writes:
> On 02/25/2015 09:25 AM, Rusty Russell wrote:
>> Andrey Ryabinin  writes:
>>> On 02/23/2015 11:26 AM, Rusty Russell wrote:
>>>> Andrey Ryabinin  writes:
>>>>> On 02/20/2015 03:15 AM, Rusty Russell wrote:
>>>>>> Andrey Ryabinin  writes:
>>>>>>> On 02/19/2015 02:10 AM, Rusty Russell wrote:
>>>>>>>> This is not portable.  Other archs don't use vmalloc, or don't use
>>>>>>>> (or define) MODULES_VADDR.  If you really want to hook here, you'd
>>>>>>>> need a new flag (or maybe use PAGE_KERNEL_EXEC after an audit).
>>>>>>>>
>>>>>>>
>>>>>>> Well, instead of explicit (addr >= MODULES_VADDR && addr < MODULES_END)
>>>>>>> I could hide this into arch-specific function:
>>>>>>> 'kasan_need_to_allocate_shadow(const void *addr)'
>>>>>>> or make all those functions weak and allow arch code to redefine them.
>>>>>>
>>>>>> That adds another layer of indirection.  And how would the caller of
>>>>>> plain vmalloc() even know what to return?
>>>>>>
>>>>>
>>>>> I think I don't understand what you mean here. vmalloc() callers
>>>>> shouldn't know anything about kasan/shadow.
>>>>
>>>> How else would kasan_need_to_allocate_shadow(const void *addr) work for
>>>> architectures which don't have a reserved vmalloc region for modules?
>>>>
>>>
>>>
>>> I think I need to clarify what I'm doing.
>>>
>>> Address sanitizer algorithm in short:
>>> -------------------------------------
>>> Every memory access is transformed by the compiler in the following way:
>>>
>>> Before:
>>>         *address = ...;
>>>
>>> After:
>>>
>>>         if (memory_is_poisoned(address)) {
>>>                 report_error(address, access_size);
>>>         }
>>>         *address = ...;
>>>
>>> where memory_is_poisoned():
>>>         bool memory_is_poisoned(unsigned long addr)
>>>         {
>>>                 s8 shadow_value = *(s8 *)kasan_mem_to_shadow((void *)addr);
>>>                 if (unlikely(shadow_value)) {
>>>                         s8 last_accessible_byte = addr & KASAN_SHADOW_MASK;
>>>                         return unlikely(last_accessible_byte >= shadow_value);
>>>                 }
>>>                 return false;
>>>         }
>>> -------------------------------------
>>>
>>> So shadow memory should be present for every accessible address in the
>>> kernel, otherwise reading the shadow value will cause an unhandled page
>>> fault.
>>>
>>> Shadow for vmalloc addresses (on x86_64) is a read-only mapping of one
>>> zero page. A zero byte in shadow means that it's OK to access that address.
>>> Currently we don't catch bugs in vmalloc because most such bugs could be
>>> caught in a simpler way with CONFIG_DEBUG_PAGEALLOC.
>>> That's why we don't need RW shadow for vmalloc; it's just one zero page
>>> that is mapped read-only early on boot for the whole
>>> [kasan_mem_to_shadow(VMALLOC_START), kasan_mem_to_shadow(VMALLOC_END)] range.
>>> So every access to the vmalloc range is assumed to be valid.
>>>
>>> To catch out-of-bounds accesses in global variables we need to fill the
>>> shadow corresponding to the variable's redzone with non-zero (negative)
>>> values. So for the kernel image and modules we need a writable shadow.
>>>
>>> If some arch doesn't have a separate address range for modules and uses
>>> general vmalloc(), the shadow for vmalloc should be writable, which means
>>> that shadow has to be allocated for every vmalloc() call.
>>>
>>> On such an arch, kasan_need_to_allocate_shadow(const void *addr) should
>>> return true for every vmalloc address:
>>>         bool kasan_need_to_allocate_shadow(const void *addr)
>>>         {
>>>                 return addr >= VMALLOC_START && addr < VMALLOC_END;
>>>         }
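
[Editor's illustration, not part of the thread.] The shadow check quoted
above can be tried out in userspace with a small arena and a shadow
array. This is a sketch under assumptions: one shadow byte covers 8
bytes of memory (the scale is not stated in the quoted text), offsets
into a 64-byte arena stand in for kernel addresses, and
offset_is_poisoned() is a hypothetical name mirroring the quoted
memory_is_poisoned().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_SCALE_SHIFT 3	/* assumption: 8 bytes per shadow byte */
#define SHADOW_MASK ((1u << SHADOW_SCALE_SHIFT) - 1)
#define ARENA_SIZE 64		/* hypothetical arena, not kernel memory */

/*
 * Shadow encoding used by this sketch:
 *   0      all 8 bytes of the granule are accessible
 *   1..7   only the first N bytes are accessible
 *   < 0    the whole granule is poisoned (e.g. a redzone)
 */
static int8_t shadow[ARENA_SIZE >> SHADOW_SCALE_SHIFT];

static bool offset_is_poisoned(size_t off)
{
	int8_t shadow_value = shadow[off >> SHADOW_SCALE_SHIFT];

	if (shadow_value) {
		int8_t last_accessible_byte = (int8_t)(off & SHADOW_MASK);

		/* A negative shadow value makes this true for every byte. */
		return last_accessible_byte >= shadow_value;
	}
	return false;
}
```

With shadow[1] = 4, offsets 8 to 11 pass the check and offsets 12 to 15
are reported; with shadow[2] = -1, every offset from 16 to 23 is
reported. A shadow that is a read-only zero page corresponds to every
entry staying 0, which is why nothing in that range is ever reported.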
>> 
>> Thanks for the explanation.
>> 
>>> All of the above means that the current code is not very portable.
>>> And the 'kasan_module_alloc(p, size) after module alloc' approach is not
>>> portable either. It won't work for arches that use [VMALLOC_START,
>>> VMALLOC_END] addresses for modules, because then we need to handle all
>>> vmalloc() calls.
>> 
>> I'm confused.  That's what you do now, and it hasn't been a problem,
>> has it?  The problem is on the freeing from interrupt context...
>> 
>
> It's not a problem now. It's only about portability.

Your first patch in this conversation says "Current approach in handling
shadow memory for modules is broken."

>> #define VM_KASAN 0x0080  /* has shadow kasan map */
>> 
>> Set that in kasan_module_alloc():
>> 
>> if (ret) {
>> struct vm_struct *vma = find_vm_area(addr);
>> 
>> BUG_ON(!vma);
>> /* Set VM_KASAN so vfree() can free up shadow. */
>> vma->flags |= VM_KASAN;
>> }
>> 
>> And check that in __vunmap():
>> 
>> if (area->flags & VM_KASAN)
>> kasan_module_free(addr);
>> 
>> That is portable, and is actually a fairly small patch on what you
>> have at the moment.
>> 
>> What am I missing?
>> 
>
> That is not portable.
> Architectures that don't have separate region for modules should allocate 
> shadow
> for every vmallo

Re: [PATCH 3/3] kernel/module.c: Update debug alignment after symtable generation

2015-02-25 Thread Rusty Russell

Laura Abbott  writes:
> When CONFIG_DEBUG_SET_MODULE_RONX is enabled, the sizes of
> module sections are aligned up so appropriate permissions can
> be applied. Adjusting for the symbol table may cause them to
> become unaligned. Make sure to re-align the sizes afterward.
>
> Signed-off-by: Laura Abbott 

Acked-by: Rusty Russell 

This won't clash with anything I'm planning, so happy for this to go in
through the arm trees.  CC:stable should be fine if you want too.

Thanks,
Rusty.

> ---
>  kernel/module.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/kernel/module.c b/kernel/module.c
> index b34813f..cc93cf6 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -2313,11 +2313,13 @@ static void layout_symtab(struct module *mod, struct load_info *info)
>   info->symoffs = ALIGN(mod->core_size, symsect->sh_addralign ?: 1);
>   info->stroffs = mod->core_size = info->symoffs + ndst * sizeof(Elf_Sym);
>   mod->core_size += strtab_size;
> + mod->core_size = debug_align(mod->core_size);
>  
>   /* Put string table section at end of init part of module. */
>   strsect->sh_flags |= SHF_ALLOC;
>   strsect->sh_entsize = get_offset(mod, &mod->init_size, strsect,
>info->index.str) | INIT_OFFSET_MASK;
> + mod->init_size = debug_align(mod->init_size);
>   pr_debug("\t%s\n", info->secstrings + strsect->sh_name);
>  }
>  
> -- 
> Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
> Foundation Collaborative Project
>
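
[Editor's illustration, not part of the thread.] The effect of the two
added debug_align() calls can be shown with plain arithmetic. This
sketch assumes that debug_align() rounds a size up to the next page
boundary, which is how the patch description reads, and that pages are
4096 bytes; debug_align_sketch() and core_size_after_symtab() are
hypothetical names, and the real layout_symtab() does more than add two
sizes.

```c
#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Round x up to a multiple of a, where a is a power of two. */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((a) - 1))

/* Stand-in for debug_align(): pad a section size to a page boundary. */
static unsigned long debug_align_sketch(unsigned long size)
{
	return ALIGN_UP(size, SKETCH_PAGE_SIZE);
}

/*
 * Appending the symbol and string tables to an already aligned core
 * size can leave it unaligned, so the size is aligned again afterwards.
 */
static unsigned long core_size_after_symtab(unsigned long core_size,
					    unsigned long symtab_bytes,
					    unsigned long strtab_bytes)
{
	core_size += symtab_bytes;
	core_size += strtab_bytes;
	return debug_align_sketch(core_size);
}
```

For example, an aligned core size of 8192 plus 1000 bytes of tables is
9192, which is no longer page aligned; re-aligning brings it to 12288.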

[LKP] [vmstat] ba4877b9ca5: not primary result change, -62.5% will-it-scale.time.involuntary_context_switches

2015-02-25 Thread Huang Ying

FYI, we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
commit ba4877b9ca51f80b5d30f304a46762f0509e1635 ("vmstat: do not use deferrable delayed work for vmstat_update")

testbox/testcase/testparams: wsm/will-it-scale/performance-malloc1

9c0415eb8cbf0c8f  ba4877b9ca51f80b5d30f304a4
----------------  --------------------------
         %stddev    %change          %stddev
             \         |                 \
      1194 ±  0%     -62.5%         447 ±  7%  will-it-scale.time.involuntary_context_switches
       246 ±  0%      +2.3%         252 ±  1%  will-it-scale.time.system_time
  18001.54 ± 22%    -100.0%        0.00 ±  0%  sched_debug.cfs_rq[3]:/.MIN_vruntime
  18001.54 ± 22%    -100.0%        0.00 ±  0%  sched_debug.cfs_rq[3]:/.max_vruntime
   1097152 ±  3%     -82.4%      192865 ±  1%  cpuidle.C6-NHM.usage
     99560 ± 16%     +57.7%      157029 ± 23%  sched_debug.cfs_rq[8]:/.spread0
     27671 ± 23%     -65.9%        9439 ±  8%  sched_debug.cfs_rq[5]:/.exec_clock
      1194 ±  0%     -62.5%         447 ±  7%  time.involuntary_context_switches
    247334 ± 20%     -61.2%       96086 ±  3%  sched_debug.cfs_rq[5]:/.min_vruntime
     20417 ± 35%     -48.7%       10473 ±  8%  sched_debug.cfs_rq[3]:/.exec_clock
    104076 ± 38%     +73.9%      181000 ± 30%  sched_debug.cpu#2.ttwu_local
    180071 ± 29%     -41.3%      105641 ± 10%  sched_debug.cfs_rq[3]:/.min_vruntime
        34 ± 14%     -48.6%          17 ± 10%  sched_debug.cpu#5.cpu_load[4]
     43629 ± 18%     -32.7%       29370 ± 13%  sched_debug.cpu#3.nr_load_updates
     42653 ± 14%     -42.6%       24488 ± 14%  sched_debug.cpu#5.nr_load_updates
     13660 ±  9%     -41.4%        8010 ±  3%  sched_debug.cfs_rq[5]:/.avg->runnable_avg_sum
       296 ±  9%     -41.2%         174 ±  3%  sched_debug.cfs_rq[5]:/.tg_runnable_contrib
    205846 ±  6%     -11.2%      182783 ±  6%  sched_debug.cpu#7.sched_count
        37 ± 10%     -38.4%          23 ±  8%  sched_debug.cpu#5.cpu_load[3]
      1378 ± 12%     -20.6%        1094 ±  4%  sched_debug.cpu#11.ttwu_local
    205691 ±  6%     -11.2%      182623 ±  6%  sched_debug.cpu#7.nr_switches
    102423 ±  6%     -11.2%       90915 ±  6%  sched_debug.cpu#7.sched_goidle
        25 ± 21%     +41.6%          35 ± 17%  sched_debug.cpu#3.cpu_load[0]
        68 ± 16%     -29.3%          48 ±  9%  sched_debug.cpu#8.cpu_load[0]
        32 ± 14%     +54.2%          50 ±  6%  sched_debug.cpu#11.cpu_load[4]
       507 ± 10%     -30.0%         355 ±  3%  sched_debug.cfs_rq[10]:/.blocked_load_avg
     39084 ± 16%     +48.0%       57862 ±  2%  sched_debug.cfs_rq[11]:/.exec_clock
  10022712 ±  9%     -28.8%     7139491 ± 13%  cpuidle.C1-NHM.time
    341246 ± 14%     +47.3%      502560 ±  6%  sched_debug.cfs_rq[11]:/.min_vruntime
       562 ±  9%     -28.8%         400 ±  4%  sched_debug.cfs_rq[10]:/.tg_load_contrib
        66 ±  7%     -20.8%          52 ± 14%  sched_debug.cfs_rq[8]:/.runnable_load_avg
        36 ± 18%     +45.8%          52 ±  6%  sched_debug.cpu#11.cpu_load[3]
     43079 ±  1%      +8.0%       46513 ±  2%  softirqs.RCU
        43 ±  9%     -25.6%          32 ± 10%  sched_debug.cpu#5.cpu_load[2]
   1745173 ±  4%     +43.2%     2499517 ±  3%  cpuidle.C3-NHM.usage
        44 ± 18%     +25.3%          55 ± 10%  sched_debug.cpu#9.cpu_load[2]
     64453 ±  8%     +27.0%       81824 ±  3%  sched_debug.cpu#11.nr_load_updates
     58719 ±  7%     -14.3%       50299 ±  9%  sched_debug.cpu#0.ttwu_count
        40 ± 16%     +24.7%          50 ±  3%  sched_debug.cpu#9.cpu_load[4]
        42 ± 16%     +26.2%          53 ±  5%  sched_debug.cpu#9.cpu_load[3]
     61887 ±  4%     -16.2%       51890 ± 11%  sched_debug.cpu#0.sched_goidle
    125652 ±  4%     -16.1%      105434 ± 10%  sched_debug.cpu#0.nr_switches
    125769 ±  4%     -16.1%      105564 ± 10%  sched_debug.cpu#0.sched_count
     16164 ±  7%     +35.2%       21852 ±  1%  sched_debug.cfs_rq[11]:/.avg->runnable_avg_sum
       352 ±  7%     +34.9%         475 ±  1%  sched_debug.cfs_rq[11]:/.tg_runnable_contrib
      1442 ± 11%     +20.9%        1742 ±  3%  sched_debug.cpu#11.curr->pid
 7.243e+08 ±  1%     +20.0%    8.69e+08 ±  3%  cpuidle.C3-NHM.time
    172138 ±  5%     +11.9%      192649 ±  6%  sched_debug.cpu#9.sched_count
     85576 ±  5%     +12.0%       95879 ±  6%  sched_debug.cpu#9.sched_goidle
     91826 ±  0%     +13.0%      103784 ± 11%  sched_debug.cfs_rq[6]:/.exec_clock
     46977 ± 15%     +21.8%       57227 ±  2%  sched_debug.cfs_rq[9]:/.exec_clock
    115370 ±  1%     +11.5%      128602 ±  8%  sched_debug.cpu#6.nr_load_updates
     67629 ± 10%     +19.7%       80928 ±  0%  sched_debug.cpu#9.nr_load_updates
      0.92 ±  4%      +9.2%        1.00 ±  3%  perf-profile.cpu-cycles.__vma_link_rb.vma_link.mmap_region.do_mmap_pgoff.vm_mmap_pgoff
      0.89 ±  3%      +9.5%        0.98 ±  5%  perf-profile.cpu-cycles._cond_resched.unmap_single_vma.unmap_vmas.unmap_region.do_munmap
     17.84 ±  3%      -7.2%       16.56 ±  1%  turbostat.CP

[LKP] [drm/i915] 3f678c96abb: piglit.igt/kms_cursor_crc/cursor-64-offscreen.fail

2015-02-25 Thread Huang Ying

FYI, we noticed the below changes on

git://anongit.freedesktop.org/drm-intel for-linux-next
commit 3f678c96abb43a977d2ea41aefccdc49e8a3e896 ("drm/i915: Switch planes from transitional helpers to full atomic helpers")


testbox/testcase/testparams: lkp-t410/piglit/performance-igt-035

1ed1f968b6bec3a8  3f678c96abb43a977d2ea41aef
----------------  --------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
            :10          100%          10:10  piglit.igt/kms_cursor_crc/cursor-64-offscreen.fail

testbox/testcase/testparams: lkp-t410/piglit/performance-igt-037

1ed1f968b6bec3a8  3f678c96abb43a977d2ea41aef
----------------  --------------------------
            :10          100%          10:10  piglit.igt/kms_cursor_crc/cursor-128-offscreen.fail

lkp-t410: Westmere
Memory: 2G

To reproduce:

        apt-get install ruby
        git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
        cd lkp-tests
        bin/setup-local job.yaml # the job file attached in this email
        bin/run-local   job.yaml


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Ying Huang

---
testcase: piglit
default-monitors:
  wait: pre-test
  vmstat: 
default_watchdogs:
  watch-oom: 
  watchdog: 
cpufreq_governor: performance
commit: 5aeb2a3dc4f0ea47fe0df3cb3af75ef813dda833
model: Westmere
memory: 2G
hdd_partitions: "/dev/disk/by-id/ata-FUJITSU_MJA2250BH_G2_K95CT9C2G29W-part6"
swap_partitions: 
rootfs_partition: "/dev/disk/by-id/ata-FUJITSU_MJA2250BH_G2_K95CT9C2G29W-part7"
timeout: 30m
piglit:
  group: igt-037
testbox: lkp-t410
tbox_group: lkp-t410
kconfig: x86_64-rhel
enqueue_time: 2015-02-13 13:05:22.018177529 +08:00
head_commit: 5aeb2a3dc4f0ea47fe0df3cb3af75ef813dda833
base_commit: bfa76d49576599a4b9f9b7a71f23d73d6dcff735
branch: linux-devel/devel-hourly-2015021623
kernel: "/kernel/x86_64-rhel/5aeb2a3dc4f0ea47fe0df3cb3af75ef813dda833/vmlinuz-3.19.0-wl-ath-02305-g5aeb2a3"
user: lkp
queue: cyclic
rootfs: debian-x86_64-2015-02-07.cgz
result_root: "/result/lkp-t410/piglit/performance-igt-037/debian-x86_64-2015-02-07.cgz/x86_64-rhel/5aeb2a3dc4f0ea47fe0df3cb3af75ef813dda833/0"
job_file: "/lkp/scheduled/lkp-t410/cyclic_piglit-performance-igt-037-x86_64-rhel-HEAD-5aeb2a3dc4f0ea47fe0df3cb3af75ef813dda833-0-20150213-66430-1r51l8q.yaml"
dequeue_time: 2015-02-17 18:25:48.403142737 +08:00
nr_cpu: "$(nproc)"
job_state: finished
loadavg: 0.94 0.44 0.16 1/120 678
start_time: '1424168785'
end_time: '1424168844'
version: "/lkp/lkp/.src-20150217-174623"
echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
piglit run igt -t igt/kms_cursor_crc/cursor-128-offscreen /tmp/lkp/piglit-results-0
piglit summary console /tmp/lkp/piglit-results-0
___
LKP mailing list
l...@linux.intel.com

Re: [PATCH v6 1/5] mfd: max77843: Add max77843 MFD driver core driver

2015-02-25 Thread Jaewon Kim


Hi Lee Jones,

On 26/02/2015 01:47, Lee Jones wrote:

On Tue, 24 Feb 2015, Jaewon Kim wrote:


This patch adds MAX77843 core/irq driver to support PMIC,
MUIC(Micro USB Interface Controller), Charger, Fuel Gauge,
LED and Haptic device.

Cc: Lee Jones 
Signed-off-by: Jaewon Kim 
Signed-off-by: Beomho Seo 
---
  drivers/mfd/Kconfig  |   14 ++
  drivers/mfd/Makefile |1 +
  drivers/mfd/max77843.c   |  248 +++
  include/linux/mfd/max77843-private.h |  454 ++
  4 files changed, 717 insertions(+)
  create mode 100644 drivers/mfd/max77843.c
  create mode 100644 include/linux/mfd/max77843-private.h

diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
index 38356e3..f2fd5e5 100644
--- a/drivers/mfd/Kconfig
+++ b/drivers/mfd/Kconfig
@@ -455,6 +455,20 @@ config MFD_MAX77693
  additional drivers must be enabled in order to use the functionality
  of the device.
  
+config MFD_MAX77843

+   bool "Maxim Semiconductor MAX77843 PMIC Support"
+   depends on I2C=y
+   select MFD_CORE
+   select REGMAP_I2C
+   select REGMAP_IRQ
+   help
+ Say yes here to add support for Maxim Semiconductor MAX77843.
+ This is companion Power Management IC with LEDs, Haptic, Charger,
+ Fuel Gauge, MUIC(Micro USB Interface Controller) controls on chip.
+ This driver provides common support for accessing the device;
+ additional drivers must be enabled in order to use the functionality
+ of the device.
+
  config MFD_MAX8907
tristate "Maxim Semiconductor MAX8907 PMIC Support"
select MFD_CORE
diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
index 19f3d74..b8ac555 100644
--- a/drivers/mfd/Makefile
+++ b/drivers/mfd/Makefile
@@ -117,6 +117,7 @@ obj-$(CONFIG_MFD_DA9150)+= da9150-core.o
  obj-$(CONFIG_MFD_MAX14577)+= max14577.o
  obj-$(CONFIG_MFD_MAX77686)+= max77686.o
  obj-$(CONFIG_MFD_MAX77693)+= max77693.o
+obj-$(CONFIG_MFD_MAX77843) += max77843.o
  obj-$(CONFIG_MFD_MAX8907) += max8907.o
  max8925-objs  := max8925-core.o max8925-i2c.o
  obj-$(CONFIG_MFD_MAX8925) += max8925.o
diff --git a/drivers/mfd/max77843.c b/drivers/mfd/max77843.c
new file mode 100644
index 000..2d8b3cc
--- /dev/null
+++ b/drivers/mfd/max77843.c
@@ -0,0 +1,248 @@
+/*
+ * max77843.c - MFD core driver for the Maxim MAX77843

Please remove the filename.

Okay, I will remove it.



+ * Copyright (C) 2015 Samsung Electronics
+ * Author: Jaewon Kim 
+ * Author: Beomho Seo 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static const struct mfd_cell max77843_devs[] = {
+   {
+   .name = "max77843-muic",
+   .of_compatible = "maxim,max77843-muic",
+   }, {
+   .name = "max77843-regulator",
+   .of_compatible = "maxim,max77843-regulator",
+   }, {
+   .name = "max77843-charger",
+   .of_compatible = "maxim,max77843-charger"
+   }, {
+   .name = "max77843-fuelgauge",
+   .of_compatible = "maxim,max77843-fuelgauge",
+   }, {
+   .name = "max77843-haptic",
+   .of_compatible = "maxim,max77843-haptic",
+   },
+};
+
+static const struct regmap_config max77843_charger_regmap_config = {
+   .reg_bits   = 8,
+   .val_bits   = 8,
+   .max_register   = MAX77843_CHG_REG_END,
+};
+
+static const struct regmap_config max77843_regmap_config = {
+   .reg_bits   = 8,
+   .val_bits   = 8,
+   .max_register   = MAX77843_SYS_REG_END,
+};
+
+static const struct regmap_irq max77843_irqs[] = {
+   /* TOPSYS interrupts */
+   { .reg_offset = 0, .mask = MAX77843_SYS_IRQ_SYSUVLO_INT, },
+   { .reg_offset = 0, .mask = MAX77843_SYS_IRQ_SYSOVLO_INT, },
+   { .reg_offset = 0, .mask = MAX77843_SYS_IRQ_TSHDN_INT, },
+   { .reg_offset = 0, .mask = MAX77843_SYS_IRQ_TM_INT, },
+};
+
+static const struct regmap_irq_chip max77843_irq_chip = {
+   .name   = "max77843",
+   .status_base= MAX77843_SYS_REG_SYSINTSRC,
+   .mask_base  = MAX77843_SYS_REG_SYSINTMASK,
+   .mask_invert= false,
+   .num_regs   = 1,
+   .irqs   = max77843_irqs,
+   .num_irqs   = ARRAY_SIZE(max77843_irqs),
+};
+
+/* Charger and Charger regulator use same regmap. */
+static int max77843_chg_init(struct max77843 *max77843)
+{
+   int ret;
+
+   max77843->i2c_chg = i2c_new_dummy(max77843->i2c->adapter, I2C_ADDR_CHG);
+   if (!max77843->i2c_chg) {
+   dev_err(&max77843->i2c->dev,
+

RE: [PATCH 4/4] powerpc/mpic: remove unused functions

2015-02-25 Thread Hongtao Jia

Hi Scott,

I'm really sorry for leaving this patch like a zombie.
I now plan to revisit it.

Following the previous comments, the compile error was fixed,
but beyond that I have no further updates planned for it.

Could you please comment on why it's still on hold?

Thanks.


> -Original Message-
> From: Wood Scott-B07421
> Sent: Tuesday, February 24, 2015 5:32 AM
> To: Arseny Solokha
> Cc: Michael Ellerman; Benjamin Herrenschmidt; Paul Mackerras; linuxppc-
> d...@lists.ozlabs.org; linux-kernel@vger.kernel.org; Jia Hongtao-B38951
> Subject: Re: [PATCH 4/4] powerpc/mpic: remove unused functions
> 
> On Thu, 2015-02-19 at 19:26 +0700, Arseny Solokha wrote:
> >   + fsl_mpic_primary_get_version() is just a safe wrapper around
> > fsl_mpic_get_version() for SMP configurations. While the latter is
> > called explicitly for handling PIC initialization and setting up error
> > interrupt vector depending on PIC hardware version, the former isn't
> > used for anything.
> 
> It was meant to be used by http://patchwork.ozlabs.org/patch/233211/
> which never got respun.  Hongtao, do you plan to revisit that patch?
> 
> -Scott
>

Continually increasing inflight IO statistics with md mirror and blk-mq enabled on 3.19 release

2015-02-25 Thread james owens

Please cc me personally on any replies as I am not a subscriber to the 
list.


I currently have a 2 device md raid 1 on my Linux workstation where the 
slave devices are usb 3.0 external drives 4 TB each which I use for 
backup purposes. I also have a 3ware 9750 RAID controller with a HW RAID 
1 SSD mirror and 12 drive RAID 6.


I recently installed the 3.19 release kernel from openSuSE current to 
try out the block-mq optimizations. The new mq subsystem is clearly 
enabled by the presence of the mq subdirectory under /sys/block/foo. The 
performance on the 3ware is great, especially the SSD mirror; however, 
when using the md RAID 1 with external USB 3.0 devices, I noticed the 
inflight IO counters at /sys/block/md127/inflight are continuously 
increasing. This is not normal as these should go to 0 once any pending 
reads/writes are finalized. Is this simply an accounting problem I can 
safely ignore, or is it an unsafe situation that I need to be concerned 
about?
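For anyone trying to reproduce this, the counters can be sampled with something like the sketch below. It assumes the inflight file holds two unsigned counts on one line (reads, then writes); parse_inflight() and read_inflight() are names made up for this example.

```c
#include <stdio.h>

/* Hypothetical helper: parse one line of /sys/block/<dev>/inflight.
 * Assumed layout: "<reads in flight> <writes in flight>".
 * Returns 0 on success, -1 if the line does not hold two numbers. */
int parse_inflight(const char *line, unsigned int *reads,
		   unsigned int *writes)
{
	return sscanf(line, "%u %u", reads, writes) == 2 ? 0 : -1;
}

/* Read the counters from a sysfs path such as
 * /sys/block/md127/inflight. Returns 0 on success, -1 on any error. */
int read_inflight(const char *path, unsigned int *reads,
		  unsigned int *writes)
{
	char buf[64];
	FILE *f = fopen(path, "r");
	int ret = -1;

	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		ret = parse_inflight(buf, reads, writes);
	fclose(f);
	return ret;
}
```

Calling read_inflight() in a loop on an idle array should show whether the values ever drop back to 0.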


Note that this does not occur with the exported disks on the 3ware array 
/dev/sda and /dev/sdb.


I could try and see if this happens with a plain USB drive as I have 
some available.


If you dismount the volume and stop the md array and then reassemble it, 
the statistics reset (as one would expect).


I use a bitmap file to speed up resyncs, and the bitmap file is 
clearing, and there is no filesystem corruption that I can see, so the 
writes are clearly getting to the disks correctly.


I will provide more information as best I can if needed. My guess is 
that this will be pretty easy to reproduce.


Best regards,

Jim Owens
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v6 4/5] Input: add haptic drvier on max77843

2015-02-25 Thread Jaewon Kim


Hi Dmitry,

On 26/02/2015 10:23, Dmitry Torokhov wrote:

Hi Jaewon,

On Tue, Feb 24, 2015 at 10:29:07AM +0900, Jaewon Kim wrote:

+static void max77843_haptic_play_work(struct work_struct *work)
+{
+   struct max77843_haptic *haptic =
+   container_of(work, struct max77843_haptic, work);
+   int error;
+
+   mutex_lock(&haptic->mutex);
+
+   if (haptic->suspended) {
+   goto err_play;
+   }
+

You do not need braces around a single statement. Also, this is not an
error that you are handling, so I'd prefer if we called this label out_unlock.
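A minimal userspace sketch of the shape being asked for, with a pthread mutex standing in for the kernel mutex and a made-up haptic_model struct (nothing here is taken from the driver):

```c
#include <pthread.h>
#include <stdbool.h>

/* Stand-in for the driver state; only the fields the sketch needs. */
struct haptic_model {
	pthread_mutex_t mutex;
	bool suspended;
	int plays;	/* counts how often the "hardware" was touched */
};

void haptic_model_play(struct haptic_model *haptic)
{
	pthread_mutex_lock(&haptic->mutex);

	/* Single statement, so no braces; not an error, just nothing to do. */
	if (haptic->suspended)
		goto out_unlock;

	haptic->plays++;

out_unlock:
	pthread_mutex_unlock(&haptic->mutex);
}
```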

You are right.
I will rename the label and remove the braces.

+   error = max77843_haptic_set_duty_cycle(haptic);
+   if (error) {
+   dev_err(haptic->dev, "failed to set duty cycle: %d\n", error);
+   goto err_play;
+   }

Do you need to configure the duty cycle if you are stopping the playback? Or
maybe disabling pwm is enough?


The duty cycle does not need to be set when disabling the haptic.

I will move this call to just before max77843_haptic_enable().




+
+   if (haptic->magnitude) {
+   error = max77843_haptic_enable(haptic);
+   if (error)
+   dev_err(haptic->dev,
+   "cannot enable haptic: %d\n", error);
+   } else {
+   max77843_haptic_disable(haptic);
+   if (error)
+   dev_err(haptic->dev,
+   "cannot disable haptic: %d\n", error);

What error? You did not assign it...

A detailed error message is printed in the enable/disable() functions.



Thanks.


Thanks for reviewing my patch.

Thanks,
Jaewon Kim



[PATCH RESEND] sched/deadline: fix pull if dl task who's prio changed is not on queue

2015-02-25 Thread Wanpeng Li

A dl task cannot be off the runqueue and be the curr task at the same
time. In addition, pulling because the priority changed for a dl task
that is not on the queue makes no sense.

This patch fixes it by not pulling if the dl task whose prio changed is
not on the queue.

Signed-off-by: Wanpeng Li 
---
 kernel/sched/deadline.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 49f92c8..ca391c0 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1728,7 +1728,10 @@ static void switched_to_dl(struct rq *rq, struct task_struct *p)
 static void prio_changed_dl(struct rq *rq, struct task_struct *p,
int oldprio)
 {
-   if (task_on_rq_queued(p) || rq->curr == p) {
+   if (!task_on_rq_queued(p))
+   return;
+
+   if (rq->curr == p) {
 #ifdef CONFIG_SMP
/*
 * This might be too much, but unfortunately
-- 
1.9.1


Re: [PATCH v2] spi: qup: Add DMA capabilities

2015-02-25 Thread Mark Brown

On Tue, Feb 24, 2015 at 06:08:54PM +0200, Stanimir Varbanov wrote:

> yes, there is a potential race between atomic_inc and dma callback. I
> reordered these calls to save few checks, and now it returns to me.

> I imagine few options here:

>  - reorder the dmaengine calls and atomic operations, i.e.
> call atomic_inc for rx and tx channels before corresponding
> dmaengine_submit and dmaengine_issue_pending.

>  - have two different dma callbacks and two completions and waiting for
> the two.

>  - manage to receive only one dma callback, i.e. the last transfer in
> case of presence of the rx_buf and tx_buf at the same time.

>  - let me see for better solution.

Any solution which doesn't make use of atomics is likely to be better,
as I said they are enormously error prone.  A more common approach is a
single completion triggering on the RX (for RX only or bidirectional
transfers) or TX if that's the only thing active.  For most hardware you
can just use the RX to manage completion since it must of necessity
complete at the same time as or later than the transmit side, transmit
often completes early since the DMA completes when the FIFO is full not
when the data is on the wire.
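As a rough illustration of the single-completion approach, here is a userspace model. A pthread condition variable stands in for a kernel completion, and every name in it (completion_model, xfer_model, dma_done_model and so on) is invented for the sketch, not taken from the qup driver or dmaengine:

```c
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-in for a kernel completion. */
struct completion_model {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

void complete_model(struct completion_model *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

void wait_for_completion_model(struct completion_model *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Hypothetical transfer description: which directions are active. */
struct xfer_model {
	bool has_rx;
	bool has_tx;
	struct completion_model done;
};

/* RX signals completion whenever there is an RX buffer
 * (RX-only or bidirectional transfers). */
bool rx_signals_completion(const struct xfer_model *x)
{
	return x->has_rx;
}

/* TX signals completion only when it is the sole active direction. */
bool tx_signals_completion(const struct xfer_model *x)
{
	return !x->has_rx && x->has_tx;
}

/* The one callback attached to whichever channel was chosen above. */
void dma_done_model(void *data)
{
	struct xfer_model *x = data;

	complete_model(&x->done);
}
```

With this shape exactly one callback fires per transfer, so there is no counter to race on.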

signature.asc
Description: Digital signature

Re: [Patch] ASoC: max98357a: Use standard DAI names

2015-02-25 Thread Mark Brown

On Tue, Feb 24, 2015 at 10:39:04PM -0800, Kenneth Westfield wrote:
> From: Kenneth Westfield 
> 
> Use the standard naming convention for the codec DAI.

Applied, thanks.  Please pay attention to who you're CCing and try to
only include relevant people/lists - mail volumes are often very high
and sending people irrelevant stuff adds to the problem.


Re: [PATCH 01/15] power_supply core: support use of devres to register/unregister a power supply.

2015-02-25 Thread Sebastian Reichel

Hi Neil,

On Tue, Feb 24, 2015 at 03:33:50PM +1100, NeilBrown wrote:
> Using devm_power_supply_register allows the unregister to happen
> automatically on error or final put.
> 
> Signed-off-by: NeilBrown 

Thanks, applied to battery-2.6.git.

-- Sebastian



linux-next: build failure after merge of the clk tree

2015-02-25 Thread Stephen Rothwell

Hi Mike,

After merging the clk tree, today's linux-next build (x86_64 allmodconfig)
failed like this:

drivers/clk/clk.c: In function 'clk_disable_unused_subtree':
drivers/clk/clk.c:514:3: error: label 'out' used but not defined
   goto out;
   ^

Caused by commit a2146f032294 ("clk: Use lockdep asserts to find
missing hold of prepare_lock").  Commit c440525cb967 ("clk: Remove
unneeded NULL checks") removed that label along with the NULL check
that a2146f032294 reintroduces (was this a bad rebase?).  Please do
simple build tests.

I have used the clk tree from next-20150225 for today.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au


pgp3Xjl16oW5x.pgp
Description: OpenPGP digital signature

Re: [PATCH 1/3] e820: Don't let unknown DIMM type come out BUSY

2015-02-25 Thread Dan Williams

On Mon, Feb 23, 2015 at 11:59 PM, Boaz Harrosh  wrote:
> No, this is a complete HACK, since when do we hard code specific (GLOBAL)
> ARCHs strings in common code. Please look at linux/ioport.h see the richness
> of options for all kind of buses and systems. The flag system works perfectly
> and I just continue this here.
>
> And really DAN, you prefer a global string that's dead garbage in 99% of 
> arches
> to a simple bit flag definition that costs nothing? I don't think so.

Glad we're moving ahead with the IORESOURCE_MEM_WARN solution rather
than this or the 64-bit-limited IORESOURCE_WARN approach.

>
>> + add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
>
> NACK!!
>

I disagree.  Ultimately what goes into kernel/resource.c is not up to
me, but firmware/driver combinations that subvert standards should be
flagged by the kernel.  Stepping back from the original motivation, in
the general case, an unknown memory type is indiscernible from a BIOS
bug.

TAINT_FIRMWARE_WORKAROUND is simply a notification that firmware needs
to be updated, and I believe a driver attaching to unknown memory is
such an event.  It does not block a user from using that memory
however he or she sees fit.

Re: [PATCH] spi: spidev_test: Added functionalities

2015-02-25 Thread Mark Brown

On Wed, Feb 25, 2015 at 08:08:44PM +0100, Adrian Remonda wrote:
> This is a patch that add functionalities to the spidev_test tool found
> in Documentation/spi/spidev_test.c.

> - Cleaned hexadecimal dump
> - Added verbose mode to see the transmitting sequence
> - Added input buffer from the terminal. Now it is possible to send
>   string and hexadecimal data as an input parameter:

Since this is doing several different things then it should be a series
of patches rather than a single patch - please see SubmittingPatches for
advice on how to prepare patches for submission.  It is hard to review
changes which do more than one thing.


[GIT PULL] Btrfs

2015-02-25 Thread Chris Mason

Hi Linus,

I'm still testing more fixes, but I wanted to get out the fix for the
btrfs raid5/6 memory corruption I mentioned in my merge window pull.

Please pull my for-linus:

git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git for-linus

Chris Mason (1) commits (+8/-1):
Btrfs: fix allocation size calculations in alloc_btrfs_bio

Total: (1) commits (+8/-1)

 fs/btrfs/volumes.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Re: [PATCH v4 02/20] power_supply: Move run-time configuration to separate structure

2015-02-25 Thread Sebastian Reichel

On Wed, Feb 25, 2015 at 05:10:07PM -0800, Dmitry Torokhov wrote:
> On Thu, Feb 26, 2015 at 01:34:09AM +0100, Sebastian Reichel wrote:
> > Hi,
> > 
> > On Mon, Feb 23, 2015 at 12:47:23PM +0100, Krzysztof Kozlowski wrote:
> > > Add new structure 'power_supply_config' for holding run-time
> > > initialization data like of_node, supplies and private driver data.
> > > 
> > > The power_supply_register() function is changed so all power supply
> > > drivers need updating.
> > > 
> > > When registering the power supply this new 'power_supply_config' should be
> > > used instead of directly initializing 'struct power_supply'. This allows
> > > changing the ownership of power_supply structure from driver to the
> > > power supply core in next patches.
> > > 
> > > When a driver does not use of_node or supplies then it should use NULL
> > > as config. If driver uses of_node or supplies then it should allocate
> > > config on stack and initialize it with proper values.
> > > 
> > > Signed-off-by: Krzysztof Kozlowski 
> > > Reviewed-by: Bartlomiej Zolnierkiewicz 
> > > Acked-by: Pavel Machek 
> > > 
> > > [for the nvec part]
> > > Reviewed-by: Marc Dietrich 
> > > 
> > > [for drivers/platform/x86/compal-laptop.c]
> > > Reviewed-by: Darren Hart 
> > > ---
> > >  drivers/acpi/ac.c |  2 +-
> > >  drivers/acpi/battery.c|  3 ++-
> > >  drivers/acpi/sbs.c|  4 ++--
> > >  drivers/hid/hid-input.c   |  2 +-
> > >  drivers/hid/hid-sony.c|  2 +-
> > >  drivers/hid/hid-wiimote-modules.c |  2 +-
> > >  drivers/hid/wacom_sys.c   |  5 +++--
> > >  drivers/platform/x86/compal-laptop.c  |  2 +-
> > >  drivers/power/[...]   |  lots of changes
> > >  drivers/staging/nvec/nvec_power.c |  7 ---
> > >  include/linux/power_supply.h  | 16 ++--
> > >  67 files changed, 211 insertions(+), 158 deletions(-)
> > 
> > I would like to merge this via the power supply tree.
> > 
> > Rafael, Dmitry, are you ok with this change?
> 
> Hmm, I do not see anything affecting input here, you want Jiri for HID I
> guess...

Uhm, right. Thanks for pointing that out, and sorry for the noise.

-- Sebastian



Re: [PATCH v2 0/4] KVM: APIC improvements (with bonus mixed mode)

2015-02-25 Thread Marcelo Tosatti

Radim,

On Thu, Feb 12, 2015 at 07:41:30PM +0100, Radim Krčmář wrote:
> Each patch has a diff from v1, here is only a prologue on the mythical
> mixed xAPIC and x2APIC mode:
> 
> There is one interesting alias in xAPIC and x2APIC ICR destination, the
> 0xff000000, which is a broadcast in xAPIC and either a physical message
> to high x2APIC ID or a message to an empty set in x2APIC cluster.
> 
> This corner case in mixed mode can be triggered by a weird message
>  1) when we have no x2APIC ID 0xff000000, but send x2APIC message there

10.7 SYSTEM AND APIC BUS ARBITRATION

Note that except for the SIPI IPI (see Section 10.6.1, “Interrupt
Command Register (ICR)”), all bus messages that fail to be delivered
to their specified destination or destinations are automatically
retried. Software should avoid situations in which IPIs are sent to
disabled or nonexistent local APICs, causing the messages to be resent
repeatedly.

> or after something that isn't forbidden in SDM most likely because they
> didn't think that anyone would ever consider it
>  2) we have x2APIC ID 0xff000000 and reset other x2APICs into xAPIC mode

Or just x2APIC initialization in non lockstep:

vcpu0   vcpu1
T0) xapic mode  xapic mode
T1) x2apic enable
T2) broadcast IPI

> Current KVM doesn't need to consider (2), so there only is a slim chance
> that some hobbyist OS pushes the specification to its limits.
> 
> The problem is that SDM for xAPIC doesn't believably specify[1] if all
> messages beginning with 0xff are considered as broadcasts, 10.6.2.1:
>   A broadcast IPI (bits 28-31 of the MDA are 1's)
> 
> No volunteer came to clear this up, so I hacked Linux to have one x2APIC
> core between xAPIC ones.  Physical IPI to 0xff000000 got delivered only
> to CPU0, like most other destinations, Logical IPI to 0xff000000 got
> dropped and only 0xffffffff worked as a broadcast in both modes.
> 
> I think it is because Intel never considered mixed mode to be valid, and
> seen delivery might be an emergent feature of QPI routing.

> Luckily, broadcast from xAPIC isn't delivered to x2APIC.

In real hardware?

> Real exploration would require greater modifications to Linux (ideally
> writing a custom kernel), so this series implements something that makes
> some sense and isn't too far from reality.

Ok, so the problem is that broadcast (ICR.destination == 0xff000000)
from xAPIC CPU is not delivered to x2APIC CPUs?

> Radim Krčmář (4):
>   KVM: x86: use MDA for interrupt matching
>   KVM: x86: fix mixed APIC mode broadcast
>   KVM: x86: avoid logical_map when it is invalid
>   KVM: x86: simplify kvm_apic_map

I can't find any restriction against (cpu0==x2APIC, cpu1==xAPIC) in 
the documentation.

Anyway, emulation should match physical hardware. From your message above,
it is not clear what is the behaviour there:

"No volunteer came to clear this up, so I hacked Linux to have one x2APIC
core between xAPIC ones.  Physical IPI to 0xff000000 got delivered only
to CPU0, like most other destinations, Logical IPI to 0xff000000 got
dropped and only 0xffffffff worked as a broadcast in both modes.

Luckily, broadcast from xAPIC isn't delivered to x2APIC."

From the x2APIC CPU or the xAPIC ones?

It should be easy to write kvm-unit-test testcases that match physical
hardware behaviour (in general, I am having a hard time figuring out
in what way "mixed mode" is supposed to behave, please describe it
clearly).
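One way to pin the question down is to write the per-mode rule as a small predicate and then check hardware against it. A rough sketch, assuming the architectural broadcast IDs (0xff in the top byte of the destination for xAPIC, the full 32-bit 0xffffffff for x2APIC); the enum and function names are invented for illustration and are not KVM's:

```c
#include <stdbool.h>
#include <stdint.h>

enum apic_mode_model { MODEL_XAPIC, MODEL_X2APIC };

/* @icr_dest is the 32-bit destination half of the ICR.
 * xAPIC only looks at the top byte (ICR bits 56-63);
 * x2APIC compares the whole 32-bit ID. */
bool dest_is_broadcast(enum apic_mode_model mode, uint32_t icr_dest)
{
	if (mode == MODEL_X2APIC)
		return icr_dest == 0xffffffffu;

	return (icr_dest >> 24) == 0xffu;
}
```

Under this model 0xff000000 is a broadcast only for xAPIC, while 0xffffffff is a broadcast for both, which is the alias the cover letter is about. What a mixed-mode system should do on top of that is exactly what needs describing.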


[PATCH] Input: mma8450 - convert to using managed resources

2015-02-25 Thread Dmitry Torokhov

This simplifies error handling and device removal code. Also let's
get rid of setting driver's owner since i2c core does it for us.

Signed-off-by: Dmitry Torokhov 
---

Note that the following removal was intentional as
devm_input_allocate_polled_device() does this for us:

-   idev->input->dev.parent = &c->dev;
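For readers unfamiliar with managed resources, a rough userspace model of the idea follows: allocations are recorded against the device and released, newest first, when the device goes away, so probe can simply return on error. All names (dev_model, model_devm_alloc, model_dev_release) are made up; this is not the kernel's devres implementation.

```c
#include <stdlib.h>

/* One tracked allocation. */
struct res_node {
	struct res_node *next;
	void *data;
};

/* Stand-in for struct device: just the list of owned resources. */
struct dev_model {
	struct res_node *resources;	/* most recent allocation first */
};

/* Allocate zeroed memory tied to the lifetime of @dev. */
void *model_devm_alloc(struct dev_model *dev, size_t size)
{
	struct res_node *node = malloc(sizeof(*node));

	if (!node)
		return NULL;

	node->data = calloc(1, size);
	if (!node->data) {
		free(node);
		return NULL;
	}

	node->next = dev->resources;
	dev->resources = node;
	return node->data;
}

/* Run when probe fails or the device is unbound.
 * Frees everything, newest first, and returns the number released. */
int model_dev_release(struct dev_model *dev)
{
	int released = 0;

	while (dev->resources) {
		struct res_node *node = dev->resources;

		dev->resources = node->next;
		free(node->data);
		free(node);
		released++;
	}
	return released;
}
```

Because the release step lives with the device, the driver needs neither an error label nor a remove callback for these allocations.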


 drivers/input/misc/mma8450.c | 35 ++++++++---------------------------
 1 file changed, 8 insertions(+), 27 deletions(-)

diff --git a/drivers/input/misc/mma8450.c b/drivers/input/misc/mma8450.c
index 9822877..19c7357 100644
--- a/drivers/input/misc/mma8450.c
+++ b/drivers/input/misc/mma8450.c
@@ -174,12 +174,13 @@ static int mma8450_probe(struct i2c_client *c,
struct mma8450 *m;
int err;
 
-   m = kzalloc(sizeof(struct mma8450), GFP_KERNEL);
-   idev = input_allocate_polled_device();
-   if (!m || !idev) {
-   err = -ENOMEM;
-   goto err_free_mem;
-   }
+   m = devm_kzalloc(&c->dev, sizeof(*m), GFP_KERNEL);
+   if (!m)
+   return -ENOMEM;
+
+   idev = devm_input_allocate_polled_device(&c->dev);
+   if (!idev)
+   return -ENOMEM;
 
m->client = c;
m->idev = idev;
@@ -187,7 +188,6 @@ static int mma8450_probe(struct i2c_client *c,
idev->private   = m;
idev->input->name   = MMA8450_DRV_NAME;
idev->input->id.bustype = BUS_I2C;
-   idev->input->dev.parent = &c->dev;
idev->poll  = mma8450_poll;
idev->poll_interval = POLL_INTERVAL;
idev->poll_interval_max = POLL_INTERVAL_MAX;
@@ -202,29 +202,12 @@ static int mma8450_probe(struct i2c_client *c,
err = input_register_polled_device(idev);
if (err) {
dev_err(&c->dev, "failed to register polled input device\n");
-   goto err_free_mem;
+   return err;
}
 
i2c_set_clientdata(c, m);
 
return 0;
-
-err_free_mem:
-   input_free_polled_device(idev);
-   kfree(m);
-   return err;
-}
-
-static int mma8450_remove(struct i2c_client *c)
-{
-   struct mma8450 *m = i2c_get_clientdata(c);
-   struct input_polled_dev *idev = m->idev;
-
-   input_unregister_polled_device(idev);
-   input_free_polled_device(idev);
-   kfree(m);
-
-   return 0;
 }
 
 static const struct i2c_device_id mma8450_id[] = {
@@ -242,11 +225,9 @@ MODULE_DEVICE_TABLE(of, mma8450_dt_ids);
 static struct i2c_driver mma8450_driver = {
.driver = {
.name   = MMA8450_DRV_NAME,
-   .owner  = THIS_MODULE,
.of_match_table = mma8450_dt_ids,
},
.probe  = mma8450_probe,
-   .remove = mma8450_remove,
.id_table   = mma8450_id,
 };
 
-- 
2.2.0.rc0.207.ga3a616c


-- 
Dmitry

Re: [PATCH v4 1/1] vfs: Respect MS_RDONLY at bind mount creation

2015-02-25 Thread Mateusz Guzik

On Wed, Nov 05, 2014 at 08:44:03PM -0500, Richard Yao wrote:
> `mount -o bind,ro ...` suffers from a silent failure where the readonly
> flag is ignored. The bind mount will be created rw whenever the target
> is rw. Users typically workaround this by remounting readonly, but that
> does not work when you want to define readonly bind mounts in fstab.
> This is a major annoyance when dealing with recursive bind mounts
> because the userland mount command does not expose the option to
> recursively remount a subtree as readonly.

This is still a problem in 4.0 kernels.
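For reference, the remount workaround mentioned above looks roughly like this when done through mount(2) directly. This is only a sketch; bind_mount_readonly() is a name made up for the example, and it needs the usual mount privileges to succeed.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mount.h>

/* Bind-mount @src on @dst, then remount that bind read-only.
 * Returns 0 on success, -1 with errno set by the failing mount(2). */
int bind_mount_readonly(const char *src, const char *dst)
{
	int saved_errno;

	/* Step 1: plain bind mount. */
	if (mount(src, dst, NULL, MS_BIND, NULL) < 0)
		return -1;

	/* Step 2: flip just this bind mount to read-only. */
	if (mount(NULL, dst, NULL, MS_BIND | MS_REMOUNT | MS_RDONLY, NULL) < 0) {
		saved_errno = errno;
		umount(dst);	/* do not leave a writable bind behind */
		errno = saved_errno;
		return -1;
	}

	return 0;
}
```

The two-step sequence is what cannot be expressed in a single fstab line.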

Can we get this fixed please?

Is there anything wrong with the patch posted here? AFAICT the patch is
still fine (apart from some whitespace issues).

Thanks,
> 
> Signed-off-by: Richard Yao 
> ---
>  fs/namespace.c | 20 ++++++++++++++++----
>  fs/pnode.h     | 17 +++++++++--------
>  2 files changed, 25 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 5b66b2b..6f07336 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -990,6 +990,14 @@ static struct mount *clone_mnt(struct mount *old, struct 
> dentry *root,
>   if (flag & CL_MAKE_SHARED)
>   set_mnt_shared(mnt);
>  
> + /*
> +  * We set the flag directly because the mount point is not yet visible.
> +  * This means there are no writers that require the checks in
> +  * mnt_make_readonly().
> +  */
> + if (flag & CL_MAKE_RDONLY)
> + mnt->mnt.mnt_flags |= MNT_READONLY;
> +
>   /* stick the duplicate mount on the same expiry list
>* as the original if that was on one */
>   if (flag & CL_EXPIRE) {
> @@ -1992,11 +2000,13 @@ static bool has_locked_children(struct mount *mnt, 
> struct dentry *dentry)
>   * do loopback mount.
>   */
>  static int do_loopback(struct path *path, const char *old_name,
> - int recurse)
> + unsigned long flags)
>  {
>   struct path old_path;
>   struct mount *mnt = NULL, *old, *parent;
>   struct mountpoint *mp;
> + int recurse = flags & MS_REC;
> + int clflags = (flags & MS_RDONLY) ? CL_MAKE_RDONLY : 0;
>   int err;
>   if (!old_name || !*old_name)
>   return -EINVAL;
> @@ -2027,9 +2037,10 @@ static int do_loopback(struct path *path, const char 
> *old_name,
>   goto out2;
>  
>   if (recurse)
> - mnt = copy_tree(old, old_path.dentry, CL_COPY_MNT_NS_FILE);
> + mnt = copy_tree(old, old_path.dentry, CL_COPY_MNT_NS_FILE |
> + clflags);
>   else
> - mnt = clone_mnt(old, old_path.dentry, 0);
> + mnt = clone_mnt(old, old_path.dentry, clflags);
>  
>   if (IS_ERR(mnt)) {
>   err = PTR_ERR(mnt);
> @@ -2625,7 +2636,8 @@ long do_mount(const char *dev_name, const char __user 
> *dir_name,
>   retval = do_remount(&path, flags & ~MS_REMOUNT, mnt_flags,
>   data_page);
>   else if (flags & MS_BIND)
> - retval = do_loopback(&path, dev_name, flags & MS_REC);
> + retval = do_loopback(&path, dev_name, flags & (MS_REC |
> +MS_RDONLY));
>   else if (flags & (MS_SHARED | MS_PRIVATE | MS_SLAVE | MS_UNBINDABLE))
>   retval = do_change_type(&path, flags);
>   else if (flags & MS_MOVE)
> diff --git a/fs/pnode.h b/fs/pnode.h
> index 4a24635..326f5be 100644
> --- a/fs/pnode.h
> +++ b/fs/pnode.h
> @@ -20,14 +20,15 @@
>  #define SET_MNT_MARK(m) ((m)->mnt.mnt_flags |= MNT_MARKED)
>  #define CLEAR_MNT_MARK(m) ((m)->mnt.mnt_flags &= ~MNT_MARKED)
>  
> -#define CL_EXPIRE0x01
> -#define CL_SLAVE 0x02
> -#define CL_COPY_UNBINDABLE   0x04
> -#define CL_MAKE_SHARED   0x08
> -#define CL_PRIVATE   0x10
> -#define CL_SHARED_TO_SLAVE   0x20
> -#define CL_UNPRIVILEGED  0x40
> -#define CL_COPY_MNT_NS_FILE  0x80
> +#define CL_EXPIRE0x001
> +#define CL_SLAVE 0x002
> +#define CL_COPY_UNBINDABLE   0x004
> +#define CL_MAKE_SHARED   0x008
> +#define CL_PRIVATE   0x010
> +#define CL_SHARED_TO_SLAVE   0x020
> +#define CL_UNPRIVILEGED  0x040
> +#define CL_COPY_MNT_NS_FILE  0x080
> +#define CL_MAKE_RDONLY   0x100
>  
>  #define CL_COPY_ALL  (CL_COPY_UNBINDABLE | CL_COPY_MNT_NS_FILE)
>  
> -- 
> 2.0.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Mateusz Guzik

[PATCH v3 17/30] PCI/powerpc: Rename pcibios_root_bridge_prepare()

2015-02-25 Thread Yijing Wang

pcibios_root_bridge_prepare() in powerpc is used
to set the root bus speed. Rename it to
pcibios_set_root_bus_speed() for better readability.

Signed-off-by: Yijing Wang 
CC: Benjamin Herrenschmidt 
CC: linuxppc-...@lists.ozlabs.org
---
 arch/powerpc/include/asm/machdep.h   |2 +-
 arch/powerpc/kernel/pci-common.c |6 +++---
 arch/powerpc/platforms/pseries/pci.c |2 +-
 arch/powerpc/platforms/pseries/pseries.h |2 +-
 arch/powerpc/platforms/pseries/setup.c   |2 +-
 5 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index c8175a3..8e7f2a8 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -129,7 +129,7 @@ struct machdep_calls {
void(*pcibios_fixup)(void);
int (*pci_probe_mode)(struct pci_bus *);
void(*pci_irq_fixup)(struct pci_dev *dev);
-   int (*pcibios_root_bridge_prepare)(struct pci_host_bridge
+   int (*pcibios_set_root_bus_speed)(struct pci_host_bridge
*bridge);
 
/* To setup PHBs when using automatic OF platform driver for PCI */
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 78cd41b..4401b6a 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -769,9 +769,9 @@ int pci_proc_domain(struct pci_bus *bus)
 
 int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
 {
-   if (ppc_md.pcibios_root_bridge_prepare)
-   return ppc_md.pcibios_root_bridge_prepare(bridge);
-
+   if (ppc_md.pcibios_set_root_bus_speed)
+   return ppc_md.pcibios_set_root_bus_speed(bridge);
+
return 0;
 }
 
diff --git a/arch/powerpc/platforms/pseries/pci.c b/arch/powerpc/platforms/pseries/pci.c
index fe16a50..af685d6 100644
--- a/arch/powerpc/platforms/pseries/pci.c
+++ b/arch/powerpc/platforms/pseries/pci.c
@@ -110,7 +110,7 @@ static void fixup_winbond_82c105(struct pci_dev* dev)
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_WINBOND, PCI_DEVICE_ID_WINBOND_82C105,
 fixup_winbond_82c105);
 
-int pseries_root_bridge_prepare(struct pci_host_bridge *bridge)
+int pseries_set_root_bus_speed(struct pci_host_bridge *bridge)
 {
struct device_node *dn, *pdn;
struct pci_bus *bus;
diff --git a/arch/powerpc/platforms/pseries/pseries.h b/arch/powerpc/platforms/pseries/pseries.h
index 1796c54..5d0be3a 100644
--- a/arch/powerpc/platforms/pseries/pseries.h
+++ b/arch/powerpc/platforms/pseries/pseries.h
@@ -63,7 +63,7 @@ extern int dlpar_detach_node(struct device_node *);
 
 /* PCI root bridge prepare function override for pseries */
 struct pci_host_bridge;
-int pseries_root_bridge_prepare(struct pci_host_bridge *bridge);
+int pseries_set_root_bus_speed(struct pci_host_bridge *bridge);
 
 unsigned long pseries_memory_block_size(void);
 
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index e445b67..b196c0d 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -496,7 +496,7 @@ static void __init pSeries_setup_arch(void)
ppc_md.enable_pmcs = power4_enable_pmcs;
}
 
-   ppc_md.pcibios_root_bridge_prepare = pseries_root_bridge_prepare;
+   ppc_md.pcibios_set_root_bus_speed = pseries_set_root_bus_speed;
 
if (firmware_has_feature(FW_FEATURE_SET_MODE)) {
long rc;
-- 
1.7.1

