date:20170910

Re: [PATCH] mmc: tegra: Mark 64 bit dma broken on Tegra186

2017-09-10 Thread Adrian Hunter

On 08/09/17 22:48, Krishna Reddy wrote:
> SDHCI controllers on Tegra186 support 40 bit addressing.
> IOVA addresses are 48-bit wide on Tegra186.
> SDHCI host common code sets dma mask as either 32-bit or 64-bit.
> To avoid access issues when SMMU is enabled, disable 64-bit dma.
> 
> Signed-off-by: Krishna Reddy 

Acked-by: Adrian Hunter 

> ---
>  drivers/mmc/host/sdhci-tegra.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mmc/host/sdhci-tegra.c b/drivers/mmc/host/sdhci-tegra.c
> index 0cd6fa80db66..b877c13184c2 100644
> --- a/drivers/mmc/host/sdhci-tegra.c
> +++ b/drivers/mmc/host/sdhci-tegra.c
> @@ -422,7 +422,15 @@ static const struct sdhci_pltfm_data 
> sdhci_tegra186_pdata = {
> SDHCI_QUIRK_NO_HISPD_BIT |
> SDHCI_QUIRK_BROKEN_ADMA_ZEROLEN_DESC |
> SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
> - .quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN,
> + .quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
> +/* SDHCI controllers on Tegra186 support 40-bit addressing.
> + * IOVA addresses are 48-bit wide on Tegra186.
> + * With 64-bit dma mask used for SDHCI, accesses can
> + * be broken. Disable 64-bit dma, which would fall back
> + * to 32-bit dma mask. Ideally 40-bit dma mask would work,
> + * But it is not supported as of now.
> + */
> +SDHCI_QUIRK2_BROKEN_64_BIT_DMA,
>   .ops  = _sdhci_ops,
>  };
>  
>

Re: [PATCH] mmc: tegra: Mark 64 bit dma broken on Tegra186

2017-09-10 Thread Adrian Hunter

On 08/09/17 22:48, Krishna Reddy wrote:
> SDHCI controllers on Tegra186 support 40 bit addressing.
> IOVA addresses are 48-bit wide on Tegra186.
> SDHCI host common code sets dma mask as either 32-bit or 64-bit.
> To avoid access issues when SMMU is enabled, disable 64-bit dma.
> 
> Signed-off-by: Krishna Reddy 

Acked-by: Adrian Hunter 

> ---
>  drivers/mmc/host/sdhci-tegra.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mmc/host/sdhci-tegra.c b/drivers/mmc/host/sdhci-tegra.c
> index 0cd6fa80db66..b877c13184c2 100644
> --- a/drivers/mmc/host/sdhci-tegra.c
> +++ b/drivers/mmc/host/sdhci-tegra.c
> @@ -422,7 +422,15 @@ static const struct sdhci_pltfm_data 
> sdhci_tegra186_pdata = {
> SDHCI_QUIRK_NO_HISPD_BIT |
> SDHCI_QUIRK_BROKEN_ADMA_ZEROLEN_DESC |
> SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
> - .quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN,
> + .quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
> +/* SDHCI controllers on Tegra186 support 40-bit addressing.
> + * IOVA addresses are 48-bit wide on Tegra186.
> + * With 64-bit dma mask used for SDHCI, accesses can
> + * be broken. Disable 64-bit dma, which would fall back
> + * to 32-bit dma mask. Ideally 40-bit dma mask would work,
> + * But it is not supported as of now.
> + */
> +SDHCI_QUIRK2_BROKEN_64_BIT_DMA,
>   .ops  = _sdhci_ops,
>  };
>  
>

Re: [PATCH] bnx2i: Clean up unused pointers in bnx2i_hwi

2017-09-10 Thread Rangankar, Manish


On 10/09/17 5:48 PM, "Christos Gkekas"  wrote:

>Pointers bnx2i_cmd are set but never used, so they can be removed.
>
>Signed-off-by: Christos Gkekas 
>---
> drivers/scsi/bnx2i/bnx2i_hwi.c | 10 --
> 1 file changed, 10 deletions(-)
>
>diff --git a/drivers/scsi/bnx2i/bnx2i_hwi.c
>b/drivers/scsi/bnx2i/bnx2i_hwi.c
>index 42921db..e3f22cb 100644
>--- a/drivers/scsi/bnx2i/bnx2i_hwi.c
>+++ b/drivers/scsi/bnx2i/bnx2i_hwi.c
>@@ -332,12 +332,10 @@ static void
>bnx2i_ring_dbell_update_sq_params(struct bnx2i_conn *bnx2i_conn,
> int bnx2i_send_iscsi_login(struct bnx2i_conn *bnx2i_conn,
>  struct iscsi_task *task)
> {
>-  struct bnx2i_cmd *bnx2i_cmd;
>   struct bnx2i_login_request *login_wqe;
>   struct iscsi_login_req *login_hdr;
>   u32 dword;
> 
>-  bnx2i_cmd = (struct bnx2i_cmd *)task->dd_data;
>   login_hdr = (struct iscsi_login_req *)task->hdr;
>   login_wqe = (struct bnx2i_login_request *)
>   bnx2i_conn->ep->qp.sq_prod_qe;
>@@ -391,12 +389,10 @@ int bnx2i_send_iscsi_tmf(struct bnx2i_conn
>*bnx2i_conn,
>   struct iscsi_tm *tmfabort_hdr;
>   struct scsi_cmnd *ref_sc;
>   struct iscsi_task *ctask;
>-  struct bnx2i_cmd *bnx2i_cmd;
>   struct bnx2i_tmf_request *tmfabort_wqe;
>   u32 dword;
>   u32 scsi_lun[2];
> 
>-  bnx2i_cmd = (struct bnx2i_cmd *)mtask->dd_data;
>   tmfabort_hdr = (struct iscsi_tm *)mtask->hdr;
>   tmfabort_wqe = (struct bnx2i_tmf_request *)
>   bnx2i_conn->ep->qp.sq_prod_qe;
>@@ -463,12 +459,10 @@ int bnx2i_send_iscsi_tmf(struct bnx2i_conn
>*bnx2i_conn,
> int bnx2i_send_iscsi_text(struct bnx2i_conn *bnx2i_conn,
> struct iscsi_task *mtask)
> {
>-  struct bnx2i_cmd *bnx2i_cmd;
>   struct bnx2i_text_request *text_wqe;
>   struct iscsi_text *text_hdr;
>   u32 dword;
> 
>-  bnx2i_cmd = (struct bnx2i_cmd *)mtask->dd_data;
>   text_hdr = (struct iscsi_text *)mtask->hdr;
>   text_wqe = (struct bnx2i_text_request *) bnx2i_conn->ep->qp.sq_prod_qe;
> 
>@@ -541,11 +535,9 @@ int bnx2i_send_iscsi_nopout(struct bnx2i_conn
>*bnx2i_conn,
>   char *datap, int data_len, int unsol)
> {
>   struct bnx2i_endpoint *ep = bnx2i_conn->ep;
>-  struct bnx2i_cmd *bnx2i_cmd;
>   struct bnx2i_nop_out_request *nopout_wqe;
>   struct iscsi_nopout *nopout_hdr;
> 
>-  bnx2i_cmd = (struct bnx2i_cmd *)task->dd_data;
>   nopout_hdr = (struct iscsi_nopout *)task->hdr;
>   nopout_wqe = (struct bnx2i_nop_out_request *)ep->qp.sq_prod_qe;
> 
>@@ -602,11 +594,9 @@ int bnx2i_send_iscsi_nopout(struct bnx2i_conn
>*bnx2i_conn,
> int bnx2i_send_iscsi_logout(struct bnx2i_conn *bnx2i_conn,
>   struct iscsi_task *task)
> {
>-  struct bnx2i_cmd *bnx2i_cmd;
>   struct bnx2i_logout_request *logout_wqe;
>   struct iscsi_logout *logout_hdr;
> 
>-  bnx2i_cmd = (struct bnx2i_cmd *)task->dd_data;
>   logout_hdr = (struct iscsi_logout *)task->hdr;
> 
>   logout_wqe = (struct bnx2i_logout_request *)
>-- 
>2.7.4

Thanks,

Acked-by: Manish Rangankar

Re: [PATCH 2/2] x86/hibernate/64: Mask off CR3's PCID bits in the saved CR3

2017-09-10 Thread Ingo Molnar


* Andy Lutomirski  wrote:

> On Fri, Sep 8, 2017 at 12:59 AM, Jiri Kosina  wrote:
> > On Thu, 7 Sep 2017, Andy Lutomirski wrote:
> >
> >> Jiri reported a resume-from-hibernation failure triggered by PCID.
> >> The root cause appears to be rather odd.  The hibernation asm
> >> restores a CR3 value that comes from the image header.  If the image
> >> kernel has PCID on, it's entirely reasonable for this CR3 value to
> >> have one of the low 12 bits set.  The restore code restores it with
> >> CR4.PCIDE=0, which means that those low 12 bits are accepted by the
> >> CPU but are either ignored or interpreted as a caching mode.  This
> >> is odd, but still works.  We blow up later when the image kernel
> >> restores CR4, though, since changing CR4.PCIDE with CR3[11:0] != 0
> >> is illegal.  Boom!
> >>
> >> FWIW, it's entirely unclear to me what's supposed to happen if a PAE
> >> kernel restores a non-PAE image or vice versa.  Ditto for LA57.
> >
> > I've just performed 15 hibernation cycles with current Linus' tree
> > (5969d1bb3082) with these two patches applied on top of it, and I haven't
> > encountered any issue (and the warning in switch_mm_irqs_off() didn't
> > trigger either).
> >
> >> Reported-by: Jiri Kosina 
> >> Fixes: 660da7c9228f ("x86/mm: Enable CR4.PCIDE on supported systems")
> >> Signed-off-by: Andy Lutomirski 
> >
> > Tested-by: Jiri Kosina 
> >
> 
> Ingo, please do *not* apply this patch yet.  The code is fine, but the
> comment is about to become wrong.  I just found a nasty initialization
> order issue, and I need to rework a bunch of the way we deal with
> PCIDE.

Ok, I'll delay everything PCID delayed - once you've gathered it all together 
please send a full series against Linus's latest collecting all the 
fixes/cleanups.

If you find unexpected complications then there will be a point in time where 
it 
might be better to just disable PCID for this release and re-try in v4.15. As 
the 
number and complexity of fixes increases so does the risk that we'll introduce 
some last-minute regression.

Thanks,

Ingo

Re: [PATCH] bnx2i: Clean up unused pointers in bnx2i_hwi

2017-09-10 Thread Rangankar, Manish


On 10/09/17 5:48 PM, "Christos Gkekas"  wrote:

>Pointers bnx2i_cmd are set but never used, so they can be removed.
>
>Signed-off-by: Christos Gkekas 
>---
> drivers/scsi/bnx2i/bnx2i_hwi.c | 10 --
> 1 file changed, 10 deletions(-)
>
>diff --git a/drivers/scsi/bnx2i/bnx2i_hwi.c
>b/drivers/scsi/bnx2i/bnx2i_hwi.c
>index 42921db..e3f22cb 100644
>--- a/drivers/scsi/bnx2i/bnx2i_hwi.c
>+++ b/drivers/scsi/bnx2i/bnx2i_hwi.c
>@@ -332,12 +332,10 @@ static void
>bnx2i_ring_dbell_update_sq_params(struct bnx2i_conn *bnx2i_conn,
> int bnx2i_send_iscsi_login(struct bnx2i_conn *bnx2i_conn,
>  struct iscsi_task *task)
> {
>-  struct bnx2i_cmd *bnx2i_cmd;
>   struct bnx2i_login_request *login_wqe;
>   struct iscsi_login_req *login_hdr;
>   u32 dword;
> 
>-  bnx2i_cmd = (struct bnx2i_cmd *)task->dd_data;
>   login_hdr = (struct iscsi_login_req *)task->hdr;
>   login_wqe = (struct bnx2i_login_request *)
>   bnx2i_conn->ep->qp.sq_prod_qe;
>@@ -391,12 +389,10 @@ int bnx2i_send_iscsi_tmf(struct bnx2i_conn
>*bnx2i_conn,
>   struct iscsi_tm *tmfabort_hdr;
>   struct scsi_cmnd *ref_sc;
>   struct iscsi_task *ctask;
>-  struct bnx2i_cmd *bnx2i_cmd;
>   struct bnx2i_tmf_request *tmfabort_wqe;
>   u32 dword;
>   u32 scsi_lun[2];
> 
>-  bnx2i_cmd = (struct bnx2i_cmd *)mtask->dd_data;
>   tmfabort_hdr = (struct iscsi_tm *)mtask->hdr;
>   tmfabort_wqe = (struct bnx2i_tmf_request *)
>   bnx2i_conn->ep->qp.sq_prod_qe;
>@@ -463,12 +459,10 @@ int bnx2i_send_iscsi_tmf(struct bnx2i_conn
>*bnx2i_conn,
> int bnx2i_send_iscsi_text(struct bnx2i_conn *bnx2i_conn,
> struct iscsi_task *mtask)
> {
>-  struct bnx2i_cmd *bnx2i_cmd;
>   struct bnx2i_text_request *text_wqe;
>   struct iscsi_text *text_hdr;
>   u32 dword;
> 
>-  bnx2i_cmd = (struct bnx2i_cmd *)mtask->dd_data;
>   text_hdr = (struct iscsi_text *)mtask->hdr;
>   text_wqe = (struct bnx2i_text_request *) bnx2i_conn->ep->qp.sq_prod_qe;
> 
>@@ -541,11 +535,9 @@ int bnx2i_send_iscsi_nopout(struct bnx2i_conn
>*bnx2i_conn,
>   char *datap, int data_len, int unsol)
> {
>   struct bnx2i_endpoint *ep = bnx2i_conn->ep;
>-  struct bnx2i_cmd *bnx2i_cmd;
>   struct bnx2i_nop_out_request *nopout_wqe;
>   struct iscsi_nopout *nopout_hdr;
> 
>-  bnx2i_cmd = (struct bnx2i_cmd *)task->dd_data;
>   nopout_hdr = (struct iscsi_nopout *)task->hdr;
>   nopout_wqe = (struct bnx2i_nop_out_request *)ep->qp.sq_prod_qe;
> 
>@@ -602,11 +594,9 @@ int bnx2i_send_iscsi_nopout(struct bnx2i_conn
>*bnx2i_conn,
> int bnx2i_send_iscsi_logout(struct bnx2i_conn *bnx2i_conn,
>   struct iscsi_task *task)
> {
>-  struct bnx2i_cmd *bnx2i_cmd;
>   struct bnx2i_logout_request *logout_wqe;
>   struct iscsi_logout *logout_hdr;
> 
>-  bnx2i_cmd = (struct bnx2i_cmd *)task->dd_data;
>   logout_hdr = (struct iscsi_logout *)task->hdr;
> 
>   logout_wqe = (struct bnx2i_logout_request *)
>-- 
>2.7.4

Thanks,

Acked-by: Manish Rangankar

Re: [PATCH 2/2] x86/hibernate/64: Mask off CR3's PCID bits in the saved CR3

2017-09-10 Thread Ingo Molnar


* Andy Lutomirski  wrote:

> On Fri, Sep 8, 2017 at 12:59 AM, Jiri Kosina  wrote:
> > On Thu, 7 Sep 2017, Andy Lutomirski wrote:
> >
> >> Jiri reported a resume-from-hibernation failure triggered by PCID.
> >> The root cause appears to be rather odd.  The hibernation asm
> >> restores a CR3 value that comes from the image header.  If the image
> >> kernel has PCID on, it's entirely reasonable for this CR3 value to
> >> have one of the low 12 bits set.  The restore code restores it with
> >> CR4.PCIDE=0, which means that those low 12 bits are accepted by the
> >> CPU but are either ignored or interpreted as a caching mode.  This
> >> is odd, but still works.  We blow up later when the image kernel
> >> restores CR4, though, since changing CR4.PCIDE with CR3[11:0] != 0
> >> is illegal.  Boom!
> >>
> >> FWIW, it's entirely unclear to me what's supposed to happen if a PAE
> >> kernel restores a non-PAE image or vice versa.  Ditto for LA57.
> >
> > I've just performed 15 hibernation cycles with current Linus' tree
> > (5969d1bb3082) with these two patches applied on top of it, and I haven't
> > encountered any issue (and the warning in switch_mm_irqs_off() didn't
> > trigger either).
> >
> >> Reported-by: Jiri Kosina 
> >> Fixes: 660da7c9228f ("x86/mm: Enable CR4.PCIDE on supported systems")
> >> Signed-off-by: Andy Lutomirski 
> >
> > Tested-by: Jiri Kosina 
> >
> 
> Ingo, please do *not* apply this patch yet.  The code is fine, but the
> comment is about to become wrong.  I just found a nasty initialization
> order issue, and I need to rework a bunch of the way we deal with
> PCIDE.

Ok, I'll delay everything PCID delayed - once you've gathered it all together 
please send a full series against Linus's latest collecting all the 
fixes/cleanups.

If you find unexpected complications then there will be a point in time where 
it 
might be better to just disable PCID for this release and re-try in v4.15. As 
the 
number and complexity of fixes increases so does the risk that we'll introduce 
some last-minute regression.

Thanks,

Ingo

Re: [PATCH v2 1/2] completion: Add support for initializing completion with lockdep_map

2017-09-10 Thread Byungchul Park

On Fri, Sep 08, 2017 at 05:46:24PM +0900, Byungchul Park wrote:
> To peter,
> 
> Does it work?
> 
> Chagnes from v1
>  - Add a completion initialization function with a lockdep map
>  - Enhance readability of the workqueue code

Do not you like it becasue it introduces another init_completion_xxx()?

Anyway, I think the redundant acquisitions in the workqueue code should
be removed. Right?

> ->8-
> >From e148617e20ebc9c9eefe7bb222b9bba07cb963bc Mon Sep 17 00:00:00 2001
> From: Byungchul Park 
> Date: Fri, 8 Sep 2017 17:39:48 +0900
> Subject: [PATCH v2 1/2] completion: Add support for initializing completion
>  with lockdep_map
> 
> Sometimes, we want to initialize completions with sparate lockdep maps
> to assign lock classes under control. For example, the workqueue code
> manages lockdep maps, as it can classify lockdep maps properly.
> Provided a function for that purpose.
> 
> Signed-off-by: Byungchul Park 
> ---
>  include/linux/completion.h | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/include/linux/completion.h b/include/linux/completion.h
> index cae5400..182d56e 100644
> --- a/include/linux/completion.h
> +++ b/include/linux/completion.h
> @@ -49,6 +49,13 @@ static inline void complete_release_commit(struct 
> completion *x)
>   lock_commit_crosslock((struct lockdep_map *)>map);
>  }
>  
> +#define init_completion_with_map(x, m)   
> \
> +do { \
> + lockdep_init_map_crosslock((struct lockdep_map *)&(x)->map, \
> + (m)->name, (m)->key, 0);
> \
> + __init_completion(x);   \
> +} while (0)
> +
>  #define init_completion(x)   \
>  do { \
>   static struct lock_class_key __key; \
> @@ -58,6 +65,7 @@ static inline void complete_release_commit(struct 
> completion *x)
>   __init_completion(x);   \
>  } while (0)
>  #else
> +#define init_completion_with_map(x, m) __init_completion(x)
>  #define init_completion(x) __init_completion(x)
>  static inline void complete_acquire(struct completion *x) {}
>  static inline void complete_release(struct completion *x) {}
> -- 
> 1.9.1

Re: [PATCH v2 1/2] completion: Add support for initializing completion with lockdep_map

2017-09-10 Thread Byungchul Park

On Fri, Sep 08, 2017 at 05:46:24PM +0900, Byungchul Park wrote:
> To peter,
> 
> Does it work?
> 
> Chagnes from v1
>  - Add a completion initialization function with a lockdep map
>  - Enhance readability of the workqueue code

Do not you like it becasue it introduces another init_completion_xxx()?

Anyway, I think the redundant acquisitions in the workqueue code should
be removed. Right?

> ->8-
> >From e148617e20ebc9c9eefe7bb222b9bba07cb963bc Mon Sep 17 00:00:00 2001
> From: Byungchul Park 
> Date: Fri, 8 Sep 2017 17:39:48 +0900
> Subject: [PATCH v2 1/2] completion: Add support for initializing completion
>  with lockdep_map
> 
> Sometimes, we want to initialize completions with sparate lockdep maps
> to assign lock classes under control. For example, the workqueue code
> manages lockdep maps, as it can classify lockdep maps properly.
> Provided a function for that purpose.
> 
> Signed-off-by: Byungchul Park 
> ---
>  include/linux/completion.h | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/include/linux/completion.h b/include/linux/completion.h
> index cae5400..182d56e 100644
> --- a/include/linux/completion.h
> +++ b/include/linux/completion.h
> @@ -49,6 +49,13 @@ static inline void complete_release_commit(struct 
> completion *x)
>   lock_commit_crosslock((struct lockdep_map *)>map);
>  }
>  
> +#define init_completion_with_map(x, m)   
> \
> +do { \
> + lockdep_init_map_crosslock((struct lockdep_map *)&(x)->map, \
> + (m)->name, (m)->key, 0);
> \
> + __init_completion(x);   \
> +} while (0)
> +
>  #define init_completion(x)   \
>  do { \
>   static struct lock_class_key __key; \
> @@ -58,6 +65,7 @@ static inline void complete_release_commit(struct 
> completion *x)
>   __init_completion(x);   \
>  } while (0)
>  #else
> +#define init_completion_with_map(x, m) __init_completion(x)
>  #define init_completion(x) __init_completion(x)
>  static inline void complete_acquire(struct completion *x) {}
>  static inline void complete_release(struct completion *x) {}
> -- 
> 1.9.1

Re: [PATCH] sysrq : fix Show Regs call trace on ARM

2017-09-10 Thread Jiri Slaby

On 09/11/2017, 05:11 AM, Jibin Xu wrote:
...
> --- a/drivers/tty/sysrq.c
> +++ b/drivers/tty/sysrq.c
> @@ -245,8 +245,10 @@ static void sysrq_handle_showallcpus(int key)
>* architecture has no support for it:
>*/
>   if (!trigger_all_cpu_backtrace()) {
> - struct pt_regs *regs = get_irq_regs();
> + struct pt_regs *regs = NULL;
>  
> + if (in_irq())
> + regs = get_irq_regs();

Maybe a stupid question: how does get_irq_regs() behave in the softirq
context? I.e. what about s/in_irq/in_interrupt/?

thanks,
-- 
js
suse labs

Re: [PATCH] sysrq : fix Show Regs call trace on ARM

2017-09-10 Thread Jiri Slaby

On 09/11/2017, 05:11 AM, Jibin Xu wrote:
...
> --- a/drivers/tty/sysrq.c
> +++ b/drivers/tty/sysrq.c
> @@ -245,8 +245,10 @@ static void sysrq_handle_showallcpus(int key)
>* architecture has no support for it:
>*/
>   if (!trigger_all_cpu_backtrace()) {
> - struct pt_regs *regs = get_irq_regs();
> + struct pt_regs *regs = NULL;
>  
> + if (in_irq())
> + regs = get_irq_regs();

Maybe a stupid question: how does get_irq_regs() behave in the softirq
context? I.e. what about s/in_irq/in_interrupt/?

thanks,
-- 
js
suse labs

Re: btusb "firmware request while host is not available" at resume

2017-09-10 Thread Marcel Holtmann

Hi Linus,

>> Yes 81f95076281f is to blame.. After reverting it all is fine again.
>> 
>> 15 resume cycles on the one laptop , 10 on the other without to hit the
>> trace.
> 
> Yeah, I think that disable/enable_firmware in the suspend/resume path
> is basically just completely random code. There  is nothing that
> serializes with anything else, and it has no actual basis for saying
> "I am now disabled/enabled" except for some entirely random time of
> whenever the suspend/resume callbacks happen.
> 
> I'm going to revert it.
> 
> I wonder why 4.13 seemed to work for me. Or maybe it didn't, and I
> just didn't notice, because I didn't use bluetooth and I wasn't
> traveling. I test my laptop every release, but I don't necessarily
> always _use_ it.

we have a bug report that might go into the similar direction. So the problem 
is that modern Bluetooth controller require full firmware loading (not just ROM 
patching) and if the controller has the firmware somehow already loaded, but 
then looses power and needs a reload, then it is not cached (since it was never 
requested in the first place).

Of course if the reload triggers during resume when not file system is 
available, it can not request the firmware. That will just fail and thus you 
might see this issue. We have a few RFC patches on the mailing list that I have 
to review. It is however not that easy all the time to find the right firmware 
file (at least not for Intel hardware) since the boot loader provides different 
information than the fully operational firmware. So I need to make sure that 
request the right firmware (if we do not need it initially) so that it gets 
cached.

Regards

Marcel

Re: btusb "firmware request while host is not available" at resume

2017-09-10 Thread Marcel Holtmann

Hi Linus,

>> Yes 81f95076281f is to blame.. After reverting it all is fine again.
>> 
>> 15 resume cycles on the one laptop , 10 on the other without to hit the
>> trace.
> 
> Yeah, I think that disable/enable_firmware in the suspend/resume path
> is basically just completely random code. There  is nothing that
> serializes with anything else, and it has no actual basis for saying
> "I am now disabled/enabled" except for some entirely random time of
> whenever the suspend/resume callbacks happen.
> 
> I'm going to revert it.
> 
> I wonder why 4.13 seemed to work for me. Or maybe it didn't, and I
> just didn't notice, because I didn't use bluetooth and I wasn't
> traveling. I test my laptop every release, but I don't necessarily
> always _use_ it.

we have a bug report that might go into the similar direction. So the problem 
is that modern Bluetooth controller require full firmware loading (not just ROM 
patching) and if the controller has the firmware somehow already loaded, but 
then looses power and needs a reload, then it is not cached (since it was never 
requested in the first place).

Of course if the reload triggers during resume when not file system is 
available, it can not request the firmware. That will just fail and thus you 
might see this issue. We have a few RFC patches on the mailing list that I have 
to review. It is however not that easy all the time to find the right firmware 
file (at least not for Intel hardware) since the boot loader provides different 
information than the fully operational firmware. So I need to make sure that 
request the right firmware (if we do not need it initially) so that it gets 
cached.

Regards

Marcel

Questions about NVMEM

2017-09-10 Thread Masahiro Yamada

Hi Srinivas,


I have 3 questions about the nvmem sybsystem.

Please correct me if something is missing from my thought.




(Q1) How to allocate struct nvmem_config?

I see 3 ways in allocating struct nvmem_config.
What is a good / bad practice?


   (A)  Allocate statically in .data section

bcm-ocotp.c
imx-ocotp.c
lpc18xx_eeprom.c
lpc18xx_otp.c
mxs-octop.c
qfprom.c
rockchip-efuse.c
sunxi_sid
vf610-ocotp.c
meson-efuse.c

   (B) devm_kzalloc()

   imx-iim.c
   mtk-efuse.c
   drivers/misc/eeprom/at24.c

   (C) Stack

   drivers/thunderbolt/switch.c



I think (A) is safe only when we know the system has
just one instance of the device.
(A) should not be used if two or more instances exist.
Is this correct?


I think (B) is wasting memory because nvmem_register()
copies all members of nvmem_config to nvmem_device.
nvmem_config is never dereferenced after nvmem_register() finished.
I do not see much sense to keep it until the driver is detached.



(C) looks reasonable because nvmem_config is pretty small.
(sizeof(struct nvmem_config) = 104 byte on 64bit systems)

Several subsystems receive configuration data from stack,
for example,

   "struct clk_init_data" in clk drivers,
   "struct uart_8250_port" in 8250 serial drivers.

sizeof(struct uart_8250_port) = 528 byte,
but it is still working in stack.





(Q2) Is nvmem_config::read_only necessary?

If .reg_write() callback is set, it is probably writable.
If .reg_write() is missing, it must be read-only.

I have no idea when nvmem_config::read_only is useful...





(Q3) The style of  drivers/nvmem/Makefile

This Makefile looks ugly to me.
All nvmem drivers are just single file modules.
Why are they renamed when modules are created?

For the name-space reason for modules,
prefix "nvmem-" makes sense to me.

It is true that adding "nvmem-" prefix is redundant while
they are located in drivers/nvmem/ directory,
but renaming in the Makefile is even more annoying to me.
Having said that, we may not want to churn this.





-- 
Best Regards
Masahiro Yamada

Questions about NVMEM

2017-09-10 Thread Masahiro Yamada

Hi Srinivas,


I have 3 questions about the nvmem sybsystem.

Please correct me if something is missing from my thought.




(Q1) How to allocate struct nvmem_config?

I see 3 ways in allocating struct nvmem_config.
What is a good / bad practice?


   (A)  Allocate statically in .data section

bcm-ocotp.c
imx-ocotp.c
lpc18xx_eeprom.c
lpc18xx_otp.c
mxs-octop.c
qfprom.c
rockchip-efuse.c
sunxi_sid
vf610-ocotp.c
meson-efuse.c

   (B) devm_kzalloc()

   imx-iim.c
   mtk-efuse.c
   drivers/misc/eeprom/at24.c

   (C) Stack

   drivers/thunderbolt/switch.c



I think (A) is safe only when we know the system has
just one instance of the device.
(A) should not be used if two or more instances exist.
Is this correct?


I think (B) is wasting memory because nvmem_register()
copies all members of nvmem_config to nvmem_device.
nvmem_config is never dereferenced after nvmem_register() finished.
I do not see much sense to keep it until the driver is detached.



(C) looks reasonable because nvmem_config is pretty small.
(sizeof(struct nvmem_config) = 104 byte on 64bit systems)

Several subsystems receive configuration data from stack,
for example,

   "struct clk_init_data" in clk drivers,
   "struct uart_8250_port" in 8250 serial drivers.

sizeof(struct uart_8250_port) = 528 byte,
but it is still working in stack.





(Q2) Is nvmem_config::read_only necessary?

If .reg_write() callback is set, it is probably writable.
If .reg_write() is missing, it must be read-only.

I have no idea when nvmem_config::read_only is useful...





(Q3) The style of  drivers/nvmem/Makefile

This Makefile looks ugly to me.
All nvmem drivers are just single file modules.
Why are they renamed when modules are created?

For the name-space reason for modules,
prefix "nvmem-" makes sense to me.

It is true that adding "nvmem-" prefix is redundant while
they are located in drivers/nvmem/ directory,
but renaming in the Makefile is even more annoying to me.
Having said that, we may not want to churn this.





-- 
Best Regards
Masahiro Yamada

Re: btusb "firmware request while host is not available" at resume

2017-09-10 Thread Linus Torvalds

On Sun, Sep 10, 2017 at 8:49 PM, Gabriel C  wrote:
>
> Yes 81f95076281f is to blame.. After reverting it all is fine again.
>
> 15 resume cycles on the one laptop , 10 on the other without to hit the
> trace.

Yeah, I think that disable/enable_firmware in the suspend/resume path
is basically just completely random code. There  is nothing that
serializes with anything else, and it has no actual basis for saying
"I am now disabled/enabled" except for some entirely random time of
whenever the suspend/resume callbacks happen.

I'm going to revert it.

I wonder why 4.13 seemed to work for me. Or maybe it didn't, and I
just didn't notice, because I didn't use bluetooth and I wasn't
traveling. I test my laptop every release, but I don't necessarily
always _use_ it.

   Linus

Re: btusb "firmware request while host is not available" at resume

2017-09-10 Thread Linus Torvalds

On Sun, Sep 10, 2017 at 8:49 PM, Gabriel C  wrote:
>
> Yes 81f95076281f is to blame.. After reverting it all is fine again.
>
> 15 resume cycles on the one laptop , 10 on the other without to hit the
> trace.

Yeah, I think that disable/enable_firmware in the suspend/resume path
is basically just completely random code. There  is nothing that
serializes with anything else, and it has no actual basis for saying
"I am now disabled/enabled" except for some entirely random time of
whenever the suspend/resume callbacks happen.

I'm going to revert it.

I wonder why 4.13 seemed to work for me. Or maybe it didn't, and I
just didn't notice, because I didn't use bluetooth and I wasn't
traveling. I test my laptop every release, but I don't necessarily
always _use_ it.

   Linus

Re: [RFC v2 PATCH] x86/boot: Add the secdata section to the setup header

2017-09-10 Thread Gary Lin

On Fri, Sep 08, 2017 at 01:59:00PM -0700, h...@zytor.com wrote:
> On September 8, 2017 2:45:10 AM PDT, Gary Lin  wrote:
> >On Thu, Sep 07, 2017 at 02:16:21PM -0700, h...@zytor.com wrote:
> >> On September 7, 2017 2:44:51 AM PDT, Gary Lin  wrote:
> >> >On Thu, Jun 01, 2017 at 08:46:26AM +, Ard Biesheuvel wrote:
> >> >> On 1 June 2017 at 08:11, Gary Lin  wrote:
> >> >> > On Fri, May 12, 2017 at 04:05:34PM +0800, Gary Lin wrote:
> >> >> >> A new section, secdata, in the setup header is introduced to
> >store
> >> >the
> >> >> >> distro-specific security version which is designed to help the
> >> >> >> bootloader to warn the user when loading a less secure or
> >> >vulnerable
> >> >> >> kernel. The secdata section can be presented as the following:
> >> >> >>
> >> >> >> struct sec_hdr {
> >> >> >>   __u16 header_length;
> >> >> >>   __u32 distro_version;
> >> >> >>   __u16 security_version;
> >> >> >> } __attribute__((packed));
> >> >> >> char *signer;
> >> >> >>
> >> >> >> It consists of a fixed size structure and a null-terminated
> >> >string.
> >> >> >> "header_length" is the size of "struct sec_hdr" and can be used
> >as
> >> >the
> >> >> >> offset to "signer". It also can be a kind of the "header
> >version"
> >> >to
> >> >> >> detect if any new member is introduced.
> >> >> >>
> >> >> >> The kernel packager of the distribution can put the distro name
> >in
> >> >> >> "signer" and the distro version in "distro_version". When a
> >severe
> >> >> >> vulnerability is fixed, the packager increases
> >"security_version"
> >> >in
> >> >> >> the kernel build afterward. The bootloader can maintain a list
> >of
> >> >the
> >> >> >> security versions of the current kernels and only allows the
> >> >kernel with
> >> >> >> a higher or equal security version to boot. If the user is
> >going
> >> >to boot
> >> >> >> a kernel with a lower security version, a warning should show
> >to
> >> >prevent
> >> >> >> the user from loading a vulnerable kernel accidentally.
> >> >> >>
> >> >> >> Enabling UEFI Secure Boot is recommended when using the
> >security
> >> >version
> >> >> >> or the attacker may alter the security version stealthily.
> >> >> >>
> >> >> > Any comment?
> >> >> >
> >> >> 
> >> >> This is now entirely x86-specific. My preference would be to have
> >a
> >> >> generic solution instead.
> >> >> 
> >> >After check the headers again, another idea came to my mind: the
> >MS-DOS
> >> >stub. It's designed to show a warning while the image is loaded in
> >> >DOS(*),
> >> >but I wonder if it still matters. In the x86 linux efi header, the
> >stub
> >> >is just a 3-lines message, while arm64 completely ignores the stub.
> >> >
> >> >Since there is a offset to the PE header at 0x3c, we can
> >theoretically
> >> >put any thing between 0x40 and the PE header without affecting the
> >> >current settings.
> >> >
> >> >HPA,
> >> >
> >> >Does the MS-DOS stub raise any concern to you?
> >> >
> >> >Thanks,
> >> >
> >> >Gary Lin
> >> >
> >> >(*)
> >>
> >>https://msdn.microsoft.com/zh-tw/library/windows/desktop/ms680547(v=vs.85).aspx#ms-dos_stub__image_only_
> >> >
> >> >> -- 
> >> >> Ard.
> >> >> 
> >> >> 
> >> >> >> v2:
> >> >> >> - Decrease the size of secdata_offset to 2 bytes since the
> >setup
> >> >header
> >> >> >>   is limited to around 32KB.
> >> >> >> - Restructure the secdata section. The signer is now a
> >> >null-terminated
> >> >> >>   string. The type of distro_version changes to u32 in case the
> >> >distro
> >> >> >>   uses a long version.
> >> >> >> - Modify the Kconfig names and add help.
> >> >> >> - Remove the signer name hack in build.c.
> >> >> >>
> >> >> >> Cc: Ard Biesheuvel 
> >> >> >> Cc: "H. Peter Anvin" 
> >> >> >> Cc: Thomas Gleixner 
> >> >> >> Cc: Ingo Molnar 
> >> >> >> Cc: Joey Lee 
> >> >> >> Signed-off-by: Gary Lin 
> >> >> >> ---
> >[snip]
> >> >> >> --
> >> >> >> 2.12.2
> >> >> >>
> >> >> 
> >> 
> >> I really don't think that is a good idea.  I would much rather keep
> >this in a space we fully own.
> >Fine. I'll find another place for ARM64 (probably append the structure
> >right after the PE-header and denote the 2-byte offset in the reserved
> >fields in the first 64 bytes header).
> >
> >Thanks,
> >
> >Gary Lin
> 
> Another "safe" option would be to put it in a COFF segment; then it would be 
> system-independent.

Creating a new COFF section looks promising. Thanks for pointing the direction.

Gary Lin

Re: [RFC v2 PATCH] x86/boot: Add the secdata section to the setup header

2017-09-10 Thread Gary Lin

On Fri, Sep 08, 2017 at 01:59:00PM -0700, h...@zytor.com wrote:
> On September 8, 2017 2:45:10 AM PDT, Gary Lin  wrote:
> >On Thu, Sep 07, 2017 at 02:16:21PM -0700, h...@zytor.com wrote:
> >> On September 7, 2017 2:44:51 AM PDT, Gary Lin  wrote:
> >> >On Thu, Jun 01, 2017 at 08:46:26AM +, Ard Biesheuvel wrote:
> >> >> On 1 June 2017 at 08:11, Gary Lin  wrote:
> >> >> > On Fri, May 12, 2017 at 04:05:34PM +0800, Gary Lin wrote:
> >> >> >> A new section, secdata, in the setup header is introduced to
> >store
> >> >the
> >> >> >> distro-specific security version which is designed to help the
> >> >> >> bootloader to warn the user when loading a less secure or
> >> >vulnerable
> >> >> >> kernel. The secdata section can be presented as the following:
> >> >> >>
> >> >> >> struct sec_hdr {
> >> >> >>   __u16 header_length;
> >> >> >>   __u32 distro_version;
> >> >> >>   __u16 security_version;
> >> >> >> } __attribute__((packed));
> >> >> >> char *signer;
> >> >> >>
> >> >> >> It consists of a fixed size structure and a null-terminated
> >> >string.
> >> >> >> "header_length" is the size of "struct sec_hdr" and can be used
> >as
> >> >the
> >> >> >> offset to "signer". It also can be a kind of the "header
> >version"
> >> >to
> >> >> >> detect if any new member is introduced.
> >> >> >>
> >> >> >> The kernel packager of the distribution can put the distro name
> >in
> >> >> >> "signer" and the distro version in "distro_version". When a
> >severe
> >> >> >> vulnerability is fixed, the packager increases
> >"security_version"
> >> >in
> >> >> >> the kernel build afterward. The bootloader can maintain a list
> >of
> >> >the
> >> >> >> security versions of the current kernels and only allows the
> >> >kernel with
> >> >> >> a higher or equal security version to boot. If the user is
> >going
> >> >to boot
> >> >> >> a kernel with a lower security version, a warning should show
> >to
> >> >prevent
> >> >> >> the user from loading a vulnerable kernel accidentally.
> >> >> >>
> >> >> >> Enabling UEFI Secure Boot is recommended when using the
> >security
> >> >version
> >> >> >> or the attacker may alter the security version stealthily.
> >> >> >>
> >> >> > Any comment?
> >> >> >
> >> >> 
> >> >> This is now entirely x86-specific. My preference would be to have
> >a
> >> >> generic solution instead.
> >> >> 
> >> >After check the headers again, another idea came to my mind: the
> >MS-DOS
> >> >stub. It's designed to show a warning while the image is loaded in
> >> >DOS(*),
> >> >but I wonder if it still matters. In the x86 linux efi header, the
> >stub
> >> >is just a 3-lines message, while arm64 completely ignores the stub.
> >> >
> >> >Since there is a offset to the PE header at 0x3c, we can
> >theoretically
> >> >put any thing between 0x40 and the PE header without affecting the
> >> >current settings.
> >> >
> >> >HPA,
> >> >
> >> >Does the MS-DOS stub raise any concern to you?
> >> >
> >> >Thanks,
> >> >
> >> >Gary Lin
> >> >
> >> >(*)
> >>
> >>https://msdn.microsoft.com/zh-tw/library/windows/desktop/ms680547(v=vs.85).aspx#ms-dos_stub__image_only_
> >> >
> >> >> -- 
> >> >> Ard.
> >> >> 
> >> >> 
> >> >> >> v2:
> >> >> >> - Decrease the size of secdata_offset to 2 bytes since the
> >setup
> >> >header
> >> >> >>   is limited to around 32KB.
> >> >> >> - Restructure the secdata section. The signer is now a
> >> >null-terminated
> >> >> >>   string. The type of distro_version changes to u32 in case the
> >> >distro
> >> >> >>   uses a long version.
> >> >> >> - Modify the Kconfig names and add help.
> >> >> >> - Remove the signer name hack in build.c.
> >> >> >>
> >> >> >> Cc: Ard Biesheuvel 
> >> >> >> Cc: "H. Peter Anvin" 
> >> >> >> Cc: Thomas Gleixner 
> >> >> >> Cc: Ingo Molnar 
> >> >> >> Cc: Joey Lee 
> >> >> >> Signed-off-by: Gary Lin 
> >> >> >> ---
> >[snip]
> >> >> >> --
> >> >> >> 2.12.2
> >> >> >>
> >> >> 
> >> 
> >> I really don't think that is a good idea.  I would much rather keep
> >this in a space we fully own.
> >Fine. I'll find another place for ARM64 (probably append the structure
> >right after the PE-header and denote the 2-byte offset in the reserved
> >fields in the first 64 bytes header).
> >
> >Thanks,
> >
> >Gary Lin
> 
> Another "safe" option would be to put it in a COFF segment; then it would be 
> system-independent.

Creating a new COFF section looks promising. Thanks for pointing the direction.

Gary Lin

Re: btusb "firmware request while host is not available" at resume

2017-09-10 Thread Gabriel C


On 11.09.2017 05:15, Gabriel C wrote:

On 11.09.2017 03:25, Greg Kroah-Hartman wrote:

On Sun, Sep 10, 2017 at 12:26:02PM -0700, Linus Torvalds wrote:

This seems to be a new problem at resume for the Intel btusb driver,
but I'm not seeing anything in that driver itself that looks like a
likely trigger, so I wonder if it's some driver core change, a generic
resume path issue, or a workqueue change that has made it trigger for
me.

It might also just be a timing difference, maybe it's always been there?

Does anybody have any ideas? It does't happen on every resume, and the
machine works despite this (but no bluetooth - the *next* resume might
bring it back, though).


Ah, it's not just me having this problem.  I don't see it happening in
4.12, and haven't had the time to bisect it.  I seem to be able to
trigger it every suspend/resume cycle, so I don't know if it's a timing
issue.



I see the same problem with QCA hardware.. but a bit different.

On first resume cycle the firmware call is fine but the adapter dies a bit 
later with :


  'Bluetooth: hci0: Failed to send body at 4 of 1857 (-110)'

On second resume cycle I hit the trace too.






  Linus

--

   ACPI: Low-level resume complete
   ACPI: EC: EC started
   PM: Restoring platform NVS memory
   Enabling non-boot CPUs ...
   x86: Booting SMP configuration:
   smpboot: Booting Node 0 Processor 1 APIC 0x2
    cache: parent cpu1 should not be sleeping
   CPU1 is up
   smpboot: Booting Node 0 Processor 2 APIC 0x1
    cache: parent cpu2 should not be sleeping
   CPU2 is up
   smpboot: Booting Node 0 Processor 3 APIC 0x3
    cache: parent cpu3 should not be sleeping
   CPU3 is up
   ACPI: Waking up from system sleep state S3
   ACPI: EC: event unblocked
   usb 1-3: reset full-speed USB device number 2 using xhci_hcd
   usb 1-4: reset full-speed USB device number 3 using xhci_hcd
   usb 1-5: reset high-speed USB device number 4 using xhci_hcd
   usb 1-3:1.0: rebind failed: -517
   usb 1-3:1.1: rebind failed: -517
   Bluetooth: hci0: Bootloader revision 0.0 build 2 week 52 2014
   OOM killer enabled.
   Restarting tasks ...
   Bluetooth: hci0: Device revision is 5
   Bluetooth: hci0: Secure boot is enabled
   Bluetooth: hci0: OTP lock is enabled
   Bluetooth: hci0: API lock is enabled
   Bluetooth: hci0: Debug lock is disabled
   Bluetooth: hci0: Minimum firmware build 1 week 10 2014
   firmware request while host is not available
   [ cut here ]
   WARNING: CPU: 2 PID: 621 at drivers/base/firmware_class.c:1250
_request_firmware+0x460/0x790
   CPU: 2 PID: 621 Comm: kworker/u9:2 Not tainted 4.13.0-10313-ge860d2c904d1 #11
   Hardware name: Dell Inc. XPS 13 9350/09JHRY, BIOS 1.4.17 05/10/2017
   Workqueue: hci0 hci_power_on [bluetooth]
   task: 8d3767895ac0 task.stack: 9d3481efc000
   RIP: 0010:_request_firmware+0x460/0x790
   Call Trace:
    request_firmware+0x37/0x50
    btusb_setup_intel_new+0x227/0x7e0 [btusb]
    hci_dev_do_open+0x3da/0x570 [bluetooth]
    hci_power_on+0x52/0x1f0 [bluetooth]
    process_one_work+0x1db/0x3d0
    worker_thread+0x47/0x3e0
    kthread+0x125/0x140
    ret_from_fork+0x22/0x30
   ---[ end trace 007b222491432927 ]---
   Bluetooth: hci0: Failed to load Intel firmware file (-112)
   [drm] RC6 on
   done.
   thermal thermal_zone11: failed to read out thermal zone (-5)
   PM: suspend exit


Ah, I'll blame Luis for this, I think it might be due to 81f95076281f
("firmware: add sanity check on shutdown/suspend")

Luis, any ideas?  I'll try to revert this and try it out tomorrow when
I get a chance.



I can revert it an fire up some testing..



Yes 81f95076281f is to blame.. After reverting it all is fine again.

15 resume cycles on the one laptop , 10 on the other without to hit the trace.

Regards

Re: btusb "firmware request while host is not available" at resume

2017-09-10 Thread Gabriel C


On 11.09.2017 05:15, Gabriel C wrote:

On 11.09.2017 03:25, Greg Kroah-Hartman wrote:

On Sun, Sep 10, 2017 at 12:26:02PM -0700, Linus Torvalds wrote:

This seems to be a new problem at resume for the Intel btusb driver,
but I'm not seeing anything in that driver itself that looks like a
likely trigger, so I wonder if it's some driver core change, a generic
resume path issue, or a workqueue change that has made it trigger for
me.

It might also just be a timing difference, maybe it's always been there?

Does anybody have any ideas? It does't happen on every resume, and the
machine works despite this (but no bluetooth - the *next* resume might
bring it back, though).


Ah, it's not just me having this problem.  I don't see it happening in
4.12, and haven't had the time to bisect it.  I seem to be able to
trigger it every suspend/resume cycle, so I don't know if it's a timing
issue.



I see the same problem with QCA hardware.. but a bit different.

On first resume cycle the firmware call is fine but the adapter dies a bit 
later with :


  'Bluetooth: hci0: Failed to send body at 4 of 1857 (-110)'

On second resume cycle I hit the trace too.






  Linus

--

   ACPI: Low-level resume complete
   ACPI: EC: EC started
   PM: Restoring platform NVS memory
   Enabling non-boot CPUs ...
   x86: Booting SMP configuration:
   smpboot: Booting Node 0 Processor 1 APIC 0x2
    cache: parent cpu1 should not be sleeping
   CPU1 is up
   smpboot: Booting Node 0 Processor 2 APIC 0x1
    cache: parent cpu2 should not be sleeping
   CPU2 is up
   smpboot: Booting Node 0 Processor 3 APIC 0x3
    cache: parent cpu3 should not be sleeping
   CPU3 is up
   ACPI: Waking up from system sleep state S3
   ACPI: EC: event unblocked
   usb 1-3: reset full-speed USB device number 2 using xhci_hcd
   usb 1-4: reset full-speed USB device number 3 using xhci_hcd
   usb 1-5: reset high-speed USB device number 4 using xhci_hcd
   usb 1-3:1.0: rebind failed: -517
   usb 1-3:1.1: rebind failed: -517
   Bluetooth: hci0: Bootloader revision 0.0 build 2 week 52 2014
   OOM killer enabled.
   Restarting tasks ...
   Bluetooth: hci0: Device revision is 5
   Bluetooth: hci0: Secure boot is enabled
   Bluetooth: hci0: OTP lock is enabled
   Bluetooth: hci0: API lock is enabled
   Bluetooth: hci0: Debug lock is disabled
   Bluetooth: hci0: Minimum firmware build 1 week 10 2014
   firmware request while host is not available
   [ cut here ]
   WARNING: CPU: 2 PID: 621 at drivers/base/firmware_class.c:1250
_request_firmware+0x460/0x790
   CPU: 2 PID: 621 Comm: kworker/u9:2 Not tainted 4.13.0-10313-ge860d2c904d1 #11
   Hardware name: Dell Inc. XPS 13 9350/09JHRY, BIOS 1.4.17 05/10/2017
   Workqueue: hci0 hci_power_on [bluetooth]
   task: 8d3767895ac0 task.stack: 9d3481efc000
   RIP: 0010:_request_firmware+0x460/0x790
   Call Trace:
    request_firmware+0x37/0x50
    btusb_setup_intel_new+0x227/0x7e0 [btusb]
    hci_dev_do_open+0x3da/0x570 [bluetooth]
    hci_power_on+0x52/0x1f0 [bluetooth]
    process_one_work+0x1db/0x3d0
    worker_thread+0x47/0x3e0
    kthread+0x125/0x140
    ret_from_fork+0x22/0x30
   ---[ end trace 007b222491432927 ]---
   Bluetooth: hci0: Failed to load Intel firmware file (-112)
   [drm] RC6 on
   done.
   thermal thermal_zone11: failed to read out thermal zone (-5)
   PM: suspend exit


Ah, I'll blame Luis for this, I think it might be due to 81f95076281f
("firmware: add sanity check on shutdown/suspend")

Luis, any ideas?  I'll try to revert this and try it out tomorrow when
I get a chance.



I can revert it an fire up some testing..



Yes 81f95076281f is to blame.. After reverting it all is fine again.

15 resume cycles on the one laptop , 10 on the other without to hit the trace.

Regards

Re: [PATCH v2 3/5] dmaengine: zynqmp_ps_pcie: Adding PS PCIe DMA driver

2017-09-10 Thread kbuild test robot

Hi Ravi,

[auto build test ERROR on linus/master]
[also build test ERROR on v4.13 next-20170908]
[cannot apply to xlnx/master]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Ravi-Shankar-Jonnalagadda/dmaengine-ZynqMP-PS-PCIe-DMA-driver/20170911-052150
config: arm64-allmodconfig (attached as .config)
compiler: aarch64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=arm64 

Note: the 
linux-review/Ravi-Shankar-Jonnalagadda/dmaengine-ZynqMP-PS-PCIe-DMA-driver/20170911-052150
 HEAD 442a00e11c4fa28ed29541773946aa1cda153e7e builds fine.
  It only hurts bisectibility.

All errors (new ones prefixed by >>):

>> make[4]: *** No rule to make target 'drivers/dma/xilinx/ps_pcie_platform.o', 
>> needed by 'drivers/dma/xilinx/ps_pcie_dma.o'.
   make[4]: Target '__build' not remade because of errors.

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip

Re: [PATCH v2 3/5] dmaengine: zynqmp_ps_pcie: Adding PS PCIe DMA driver

2017-09-10 Thread kbuild test robot

Hi Ravi,

[auto build test ERROR on linus/master]
[also build test ERROR on v4.13 next-20170908]
[cannot apply to xlnx/master]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Ravi-Shankar-Jonnalagadda/dmaengine-ZynqMP-PS-PCIe-DMA-driver/20170911-052150
config: arm64-allmodconfig (attached as .config)
compiler: aarch64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=arm64 

Note: the 
linux-review/Ravi-Shankar-Jonnalagadda/dmaengine-ZynqMP-PS-PCIe-DMA-driver/20170911-052150
 HEAD 442a00e11c4fa28ed29541773946aa1cda153e7e builds fine.
  It only hurts bisectibility.

All errors (new ones prefixed by >>):

>> make[4]: *** No rule to make target 'drivers/dma/xilinx/ps_pcie_platform.o', 
>> needed by 'drivers/dma/xilinx/ps_pcie_dma.o'.
   make[4]: Target '__build' not remade because of errors.

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip

[PATCH] x86/processor: Remove unused declaration of __generic_processor_info()

2017-09-10 Thread Dou Liyang

Commit:

  2b85b3d22920 ("x86/acpi: Restore the order of CPU IDs")

... reverted the __generic_processor_info() to generic_processor_info(),
but forgot to remove its declaration in mpspec.h together.

Remove the declaration and make the comments consistent.

Signed-off-by: Dou Liyang 
---
 arch/x86/include/asm/mpspec.h | 1 -
 arch/x86/kernel/apic/apic.c   | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/mpspec.h b/arch/x86/include/asm/mpspec.h
index 831eb78..c471ca1 100644
--- a/arch/x86/include/asm/mpspec.h
+++ b/arch/x86/include/asm/mpspec.h
@@ -86,7 +86,6 @@ static inline void 
e820__memblock_alloc_reserved_mpc_new(void) { }
 #endif
 
 int generic_processor_info(int apicid, int version);
-int __generic_processor_info(int apicid, int version, bool enabled);
 
 #define PHYSID_ARRAY_SIZE  BITS_TO_LONGS(MAX_LOCAL_APIC)
 
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 8315e2f..d705c76 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2130,7 +2130,7 @@ int generic_processor_info(int apicid, int version)
 * Since fixing handling of boot_cpu_physical_apicid requires
 * another discussion and tests on each platform, we leave it
 * for now and here we use read_apic_id() directly in this
-* function, __generic_processor_info().
+* function, generic_processor_info().
 */
if (disabled_cpu_apicid != BAD_APICID &&
disabled_cpu_apicid != read_apic_id() &&
-- 
2.5.5

[PATCH] x86/processor: Remove unused declaration of __generic_processor_info()

2017-09-10 Thread Dou Liyang

Commit:

  2b85b3d22920 ("x86/acpi: Restore the order of CPU IDs")

... reverted the __generic_processor_info() to generic_processor_info(),
but forgot to remove its declaration in mpspec.h together.

Remove the declaration and make the comments consistent.

Signed-off-by: Dou Liyang 
---
 arch/x86/include/asm/mpspec.h | 1 -
 arch/x86/kernel/apic/apic.c   | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/mpspec.h b/arch/x86/include/asm/mpspec.h
index 831eb78..c471ca1 100644
--- a/arch/x86/include/asm/mpspec.h
+++ b/arch/x86/include/asm/mpspec.h
@@ -86,7 +86,6 @@ static inline void 
e820__memblock_alloc_reserved_mpc_new(void) { }
 #endif
 
 int generic_processor_info(int apicid, int version);
-int __generic_processor_info(int apicid, int version, bool enabled);
 
 #define PHYSID_ARRAY_SIZE  BITS_TO_LONGS(MAX_LOCAL_APIC)
 
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 8315e2f..d705c76 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2130,7 +2130,7 @@ int generic_processor_info(int apicid, int version)
 * Since fixing handling of boot_cpu_physical_apicid requires
 * another discussion and tests on each platform, we leave it
 * for now and here we use read_apic_id() directly in this
-* function, __generic_processor_info().
+* function, generic_processor_info().
 */
if (disabled_cpu_apicid != BAD_APICID &&
disabled_cpu_apicid != read_apic_id() &&
-- 
2.5.5

[PATCH] f2fs: fix double count on issued discard commands

2017-09-10 Thread Jaegeuk Kim

If issue_cond is true, it does double count for # of issued commands.

Signed-off-by: Jaegeuk Kim 
---
 fs/f2fs/segment.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 7fd742f747ce..25196ff5d587 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1087,7 +1087,7 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi, 
bool issue_cond)
issued++;
__submit_discard_cmd(sbi, dc);
}
-   if (issue_cond && iter++ > DISCARD_ISSUE_RATE)
+   if (issue_cond && iter > DISCARD_ISSUE_RATE)
goto out;
}
if (list_empty(pend_list) && dcc->pend_list_tag[i] & P_TRIM)
-- 
2.14.0.rc1.383.gd1ce394fe2-goog

[PATCH] f2fs: fix double count on issued discard commands

2017-09-10 Thread Jaegeuk Kim

If issue_cond is true, it does double count for # of issued commands.

Signed-off-by: Jaegeuk Kim 
---
 fs/f2fs/segment.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 7fd742f747ce..25196ff5d587 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1087,7 +1087,7 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi, 
bool issue_cond)
issued++;
__submit_discard_cmd(sbi, dc);
}
-   if (issue_cond && iter++ > DISCARD_ISSUE_RATE)
+   if (issue_cond && iter > DISCARD_ISSUE_RATE)
goto out;
}
if (list_empty(pend_list) && dcc->pend_list_tag[i] & P_TRIM)
-- 
2.14.0.rc1.383.gd1ce394fe2-goog

Re: WARNING: CPU: 2 PID: 4277 at lib/refcount.c:186

2017-09-10 Thread Eric Dumazet

On Fri, 2017-09-08 at 11:46 -0700, Eric Dumazet wrote:
> On Fri, 2017-09-08 at 10:21 -0700, Cong Wang wrote:
> > (Cc'ing netdev)
> > 
> > On Fri, Sep 8, 2017 at 5:59 AM, Shankara Pailoor  
> > wrote:
> > > Hi,
> > >
> > > I found a warning while fuzzing with Syzkaller on linux 4.13-rc7 on
> > > x86_64. The full stack trace is below:
> > >
> > > WARNING: CPU: 2 PID: 4277 at lib/refcount.c:186
> > > refcount_sub_and_test+0x167/0x1b0 lib/refcount.c:186
> > > Kernel panic - not syncing: panic_on_warn set ...
> > >
> > > CPU: 2 PID: 4277 Comm: syz-executor0 Not tainted 4.13.0-rc7 #3
> > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> > > Ubuntu-1.8.2-1ubuntu1 04/01/2014
> > > Call Trace:
> > >  
> > >  __dump_stack lib/dump_stack.c:16 [inline]
> > >  dump_stack+0xf7/0x1aa lib/dump_stack.c:52
> > >  panic+0x1ae/0x3a7 kernel/panic.c:180
> > >  __warn+0x1c4/0x1d9 kernel/panic.c:541
> > >  report_bug+0x211/0x2d0 lib/bug.c:183
> > >  fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:190
> > >  do_trap_no_signal arch/x86/kernel/traps.c:224 [inline]
> > >  do_trap+0x260/0x390 arch/x86/kernel/traps.c:273
> > >  do_error_trap+0x118/0x340 arch/x86/kernel/traps.c:310
> > >  do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:323
> > >  invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:846
> > > RIP: 0010:refcount_sub_and_test+0x167/0x1b0 lib/refcount.c:186
> > > RSP: 0018:88006e006b60 EFLAGS: 00010286
> > > RAX: 0026 RBX:  RCX: 
> > > RDX: 0026 RSI: 11000dc00d2c RDI: ed000dc00d60
> > > RBP: 88006e006bf0 R08: 0001 R09: 
> > > R10:  R11:  R12: 11000dc00d6d
> > > R13:  R14: 0001 R15: 88006ce9d340
> > >  refcount_dec_and_test+0x1a/0x20 lib/refcount.c:211
> > >  reqsk_put+0x71/0x2b0 include/net/request_sock.h:123
> > >  tcp_v4_rcv+0x259e/0x2e20 net/ipv4/tcp_ipv4.c:1729
> > >  ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216
> > >  NF_HOOK include/linux/netfilter.h:248 [inline]
> > >  ip_local_deliver+0x1ce/0x6d0 net/ipv4/ip_input.c:257
> > >  dst_input include/net/dst.h:477 [inline]
> > >  ip_rcv_finish+0x8db/0x19c0 net/ipv4/ip_input.c:397
> > >  NF_HOOK include/linux/netfilter.h:248 [inline]
> > >  ip_rcv+0xc3f/0x17d0 net/ipv4/ip_input.c:488
> > >  __netif_receive_skb_core+0x1fb7/0x31f0 net/core/dev.c:4298
> > >  __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4336
> > >  process_backlog+0x1c5/0x6d0 net/core/dev.c:5102
> > >  napi_poll net/core/dev.c:5499 [inline]
> > >  net_rx_action+0x6d3/0x14a0 net/core/dev.c:5565
> > >  __do_softirq+0x2cb/0xb2d kernel/softirq.c:284
> > >  do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:898
> > >  
> > >  do_softirq.part.16+0x63/0x80 kernel/softirq.c:328
> > >  do_softirq kernel/softirq.c:176 [inline]
> > >  __local_bh_enable_ip+0x84/0x90 kernel/softirq.c:181
> > >  local_bh_enable include/linux/bottom_half.h:31 [inline]
> > >  rcu_read_unlock_bh include/linux/rcupdate.h:705 [inline]
> > >  ip_finish_output2+0x8ad/0x1360 net/ipv4/ip_output.c:231
> > >  ip_finish_output+0x74e/0xb80 net/ipv4/ip_output.c:317
> > >  NF_HOOK_COND include/linux/netfilter.h:237 [inline]
> > >  ip_output+0x1cc/0x850 net/ipv4/ip_output.c:405
> > >  dst_output include/net/dst.h:471 [inline]
> > >  ip_local_out+0x95/0x160 net/ipv4/ip_output.c:124
> > >  ip_queue_xmit+0x8c6/0x1810 net/ipv4/ip_output.c:504
> > >  tcp_transmit_skb+0x1963/0x3320 net/ipv4/tcp_output.c:1123
> > >  tcp_send_ack.part.35+0x38c/0x620 net/ipv4/tcp_output.c:3575
> > >  tcp_send_ack+0x49/0x60 net/ipv4/tcp_output.c:3545
> > >  tcp_rcv_synsent_state_process net/ipv4/tcp_input.c:5795 [inline]
> > >  tcp_rcv_state_process+0x4876/0x4b60 net/ipv4/tcp_input.c:5930
> > >  tcp_v4_do_rcv+0x58a/0x820 net/ipv4/tcp_ipv4.c:1483
> > >  sk_backlog_rcv include/net/sock.h:907 [inline]
> > >  __release_sock+0x124/0x360 net/core/sock.c:2223
> > >  release_sock+0xa4/0x2a0 net/core/sock.c:2715
> > >  inet_wait_for_connect net/ipv4/af_inet.c:557 [inline]
> > >  __inet_stream_connect+0x671/0xf00 net/ipv4/af_inet.c:643
> > >  inet_stream_connect+0x58/0xa0 net/ipv4/af_inet.c:682
> > >  SYSC_connect+0x204/0x470 net/socket.c:1628
> > >  SyS_connect+0x24/0x30 net/socket.c:1609
> > >  entry_SYSCALL_64_fastpath+0x18/0xad
> > > RIP: 0033:0x451e59
> > > RSP: 002b:7f474843fc08 EFLAGS: 0216 ORIG_RAX: 002a
> > > RAX: ffda RBX: 00718000 RCX: 00451e59
> > > RDX: 0010 RSI: 20002000 RDI: 0007
> > > RBP: 0046 R08:  R09: 
> > > R10:  R11: 0216 R12: 
> > > R13: 7ffc040a0f8f R14: 7f47484409c0 R15: 
> > >
> > >
> > >
> > >
> > > I found that the following program is able to reproduce the warning:
> > >
> > >
> > > Pastebin: https://pastebin.com/B75BdYKz
> > >
> > > Here are my configs:

Re: WARNING: CPU: 2 PID: 4277 at lib/refcount.c:186

2017-09-10 Thread Eric Dumazet

On Fri, 2017-09-08 at 11:46 -0700, Eric Dumazet wrote:
> On Fri, 2017-09-08 at 10:21 -0700, Cong Wang wrote:
> > (Cc'ing netdev)
> > 
> > On Fri, Sep 8, 2017 at 5:59 AM, Shankara Pailoor  
> > wrote:
> > > Hi,
> > >
> > > I found a warning while fuzzing with Syzkaller on linux 4.13-rc7 on
> > > x86_64. The full stack trace is below:
> > >
> > > WARNING: CPU: 2 PID: 4277 at lib/refcount.c:186
> > > refcount_sub_and_test+0x167/0x1b0 lib/refcount.c:186
> > > Kernel panic - not syncing: panic_on_warn set ...
> > >
> > > CPU: 2 PID: 4277 Comm: syz-executor0 Not tainted 4.13.0-rc7 #3
> > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> > > Ubuntu-1.8.2-1ubuntu1 04/01/2014
> > > Call Trace:
> > >  
> > >  __dump_stack lib/dump_stack.c:16 [inline]
> > >  dump_stack+0xf7/0x1aa lib/dump_stack.c:52
> > >  panic+0x1ae/0x3a7 kernel/panic.c:180
> > >  __warn+0x1c4/0x1d9 kernel/panic.c:541
> > >  report_bug+0x211/0x2d0 lib/bug.c:183
> > >  fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:190
> > >  do_trap_no_signal arch/x86/kernel/traps.c:224 [inline]
> > >  do_trap+0x260/0x390 arch/x86/kernel/traps.c:273
> > >  do_error_trap+0x118/0x340 arch/x86/kernel/traps.c:310
> > >  do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:323
> > >  invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:846
> > > RIP: 0010:refcount_sub_and_test+0x167/0x1b0 lib/refcount.c:186
> > > RSP: 0018:88006e006b60 EFLAGS: 00010286
> > > RAX: 0026 RBX:  RCX: 
> > > RDX: 0026 RSI: 11000dc00d2c RDI: ed000dc00d60
> > > RBP: 88006e006bf0 R08: 0001 R09: 
> > > R10:  R11:  R12: 11000dc00d6d
> > > R13:  R14: 0001 R15: 88006ce9d340
> > >  refcount_dec_and_test+0x1a/0x20 lib/refcount.c:211
> > >  reqsk_put+0x71/0x2b0 include/net/request_sock.h:123
> > >  tcp_v4_rcv+0x259e/0x2e20 net/ipv4/tcp_ipv4.c:1729
> > >  ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216
> > >  NF_HOOK include/linux/netfilter.h:248 [inline]
> > >  ip_local_deliver+0x1ce/0x6d0 net/ipv4/ip_input.c:257
> > >  dst_input include/net/dst.h:477 [inline]
> > >  ip_rcv_finish+0x8db/0x19c0 net/ipv4/ip_input.c:397
> > >  NF_HOOK include/linux/netfilter.h:248 [inline]
> > >  ip_rcv+0xc3f/0x17d0 net/ipv4/ip_input.c:488
> > >  __netif_receive_skb_core+0x1fb7/0x31f0 net/core/dev.c:4298
> > >  __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4336
> > >  process_backlog+0x1c5/0x6d0 net/core/dev.c:5102
> > >  napi_poll net/core/dev.c:5499 [inline]
> > >  net_rx_action+0x6d3/0x14a0 net/core/dev.c:5565
> > >  __do_softirq+0x2cb/0xb2d kernel/softirq.c:284
> > >  do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:898
> > >  
> > >  do_softirq.part.16+0x63/0x80 kernel/softirq.c:328
> > >  do_softirq kernel/softirq.c:176 [inline]
> > >  __local_bh_enable_ip+0x84/0x90 kernel/softirq.c:181
> > >  local_bh_enable include/linux/bottom_half.h:31 [inline]
> > >  rcu_read_unlock_bh include/linux/rcupdate.h:705 [inline]
> > >  ip_finish_output2+0x8ad/0x1360 net/ipv4/ip_output.c:231
> > >  ip_finish_output+0x74e/0xb80 net/ipv4/ip_output.c:317
> > >  NF_HOOK_COND include/linux/netfilter.h:237 [inline]
> > >  ip_output+0x1cc/0x850 net/ipv4/ip_output.c:405
> > >  dst_output include/net/dst.h:471 [inline]
> > >  ip_local_out+0x95/0x160 net/ipv4/ip_output.c:124
> > >  ip_queue_xmit+0x8c6/0x1810 net/ipv4/ip_output.c:504
> > >  tcp_transmit_skb+0x1963/0x3320 net/ipv4/tcp_output.c:1123
> > >  tcp_send_ack.part.35+0x38c/0x620 net/ipv4/tcp_output.c:3575
> > >  tcp_send_ack+0x49/0x60 net/ipv4/tcp_output.c:3545
> > >  tcp_rcv_synsent_state_process net/ipv4/tcp_input.c:5795 [inline]
> > >  tcp_rcv_state_process+0x4876/0x4b60 net/ipv4/tcp_input.c:5930
> > >  tcp_v4_do_rcv+0x58a/0x820 net/ipv4/tcp_ipv4.c:1483
> > >  sk_backlog_rcv include/net/sock.h:907 [inline]
> > >  __release_sock+0x124/0x360 net/core/sock.c:2223
> > >  release_sock+0xa4/0x2a0 net/core/sock.c:2715
> > >  inet_wait_for_connect net/ipv4/af_inet.c:557 [inline]
> > >  __inet_stream_connect+0x671/0xf00 net/ipv4/af_inet.c:643
> > >  inet_stream_connect+0x58/0xa0 net/ipv4/af_inet.c:682
> > >  SYSC_connect+0x204/0x470 net/socket.c:1628
> > >  SyS_connect+0x24/0x30 net/socket.c:1609
> > >  entry_SYSCALL_64_fastpath+0x18/0xad
> > > RIP: 0033:0x451e59
> > > RSP: 002b:7f474843fc08 EFLAGS: 0216 ORIG_RAX: 002a
> > > RAX: ffda RBX: 00718000 RCX: 00451e59
> > > RDX: 0010 RSI: 20002000 RDI: 0007
> > > RBP: 0046 R08:  R09: 
> > > R10:  R11: 0216 R12: 
> > > R13: 7ffc040a0f8f R14: 7f47484409c0 R15: 
> > >
> > >
> > >
> > >
> > > I found that the following program is able to reproduce the warning:
> > >
> > >
> > > Pastebin: https://pastebin.com/B75BdYKz
> > >
> > > Here are my configs: https://pastebin.com/zRYCXbak

Re: iov_iter_pipe warning.

2017-09-10 Thread Al Viro

On Mon, Sep 11, 2017 at 10:31:13AM +1000, Dave Chinner wrote:

> splice does not go down the direct IO path, so iomap_dio_actor()
> should never be handled a pipe as the destination for the IO data.
> Indeed, splice read has to supply the pages to be put into the pipe,
> which the DIO path does not do - it requires pages be supplied to
> it. So I'm not sure why we'd care about pipe destination limitations
> in the DIO path?

splice doesn't give a rat's arse for direct IO; it's up to filesystem.
generic_file_splice_read() simply sets up a pipe-backed iov_iter and
calls ->read_iter(), period.

iov_iter_get_pages() for pipe-backed destination does page allocation
and inserts freshly allocated pages into pipe.  copy_to_iter() does
the same + copies data; copy_page_to_iter() grabs an extra reference
to page and inserts it into pipe, not that O_DIRECT ->read_iter()
had been likely to use the last one.

Normally O_DIRECT would work just fine - pages get allocated, references
to them put into pipe cyclic buffer *and* into a number of bio, bio
would get submitted and once the IO is completed we unlock the pipe,
making those pages available for readers.

With minimal care it works just fine - all you really need is
* cope with failing copy_to_... / iov_iter_get_pages().
Short read if we'd already gotten something, -EFAULT otherwise.
That goes for pipe-backed same as for iovec-backed - any ->read_iter()
that fails to handle that is already in trouble.
* make sure that iov_iter_get_pages()/iov_iter_get_pages_alloc()
is followed by iov_iter_advance() for the amount you've actually filled,
before any subsequent copy_to_iter()/copy_page_to_iter() or return
from ->read_iter(), whichever comes first.  That includes the situation
when you actually hadn't filled anything at all - just remember to
do iov_iter_advance(to, 0) in that case.  That's about the only
extra requirement imposed by pipes and it's not hard to satisfy.
Combination of iov_iter_advance() with iov_iter_revert() works as
usual.

Normally a filesystem doesn't need to care about splice at all -
just use generic_file_splice_read() and be done with that.
It will use the normal ->read_iter(), with whatever locking, etc.,
your filesystem would do on a normal read.

Re: iov_iter_pipe warning.

2017-09-10 Thread Al Viro

On Mon, Sep 11, 2017 at 10:31:13AM +1000, Dave Chinner wrote:

> splice does not go down the direct IO path, so iomap_dio_actor()
> should never be handled a pipe as the destination for the IO data.
> Indeed, splice read has to supply the pages to be put into the pipe,
> which the DIO path does not do - it requires pages be supplied to
> it. So I'm not sure why we'd care about pipe destination limitations
> in the DIO path?

splice doesn't give a rat's arse for direct IO; it's up to filesystem.
generic_file_splice_read() simply sets up a pipe-backed iov_iter and
calls ->read_iter(), period.

iov_iter_get_pages() for pipe-backed destination does page allocation
and inserts freshly allocated pages into pipe.  copy_to_iter() does
the same + copies data; copy_page_to_iter() grabs an extra reference
to page and inserts it into pipe, not that O_DIRECT ->read_iter()
had been likely to use the last one.

Normally O_DIRECT would work just fine - pages get allocated, references
to them put into pipe cyclic buffer *and* into a number of bio, bio
would get submitted and once the IO is completed we unlock the pipe,
making those pages available for readers.

With minimal care it works just fine - all you really need is
* cope with failing copy_to_... / iov_iter_get_pages().
Short read if we'd already gotten something, -EFAULT otherwise.
That goes for pipe-backed same as for iovec-backed - any ->read_iter()
that fails to handle that is already in trouble.
* make sure that iov_iter_get_pages()/iov_iter_get_pages_alloc()
is followed by iov_iter_advance() for the amount you've actually filled,
before any subsequent copy_to_iter()/copy_page_to_iter() or return
from ->read_iter(), whichever comes first.  That includes the situation
when you actually hadn't filled anything at all - just remember to
do iov_iter_advance(to, 0) in that case.  That's about the only
extra requirement imposed by pipes and it's not hard to satisfy.
Combination of iov_iter_advance() with iov_iter_revert() works as
usual.

Normally a filesystem doesn't need to care about splice at all -
just use generic_file_splice_read() and be done with that.
It will use the normal ->read_iter(), with whatever locking, etc.,
your filesystem would do on a normal read.

Re: [RFC PATCH v5 0/5] vfio-pci: Add support for mmapping MSI-X table

2017-09-10 Thread Alexey Kardashevskiy

On 29/08/17 12:58, Alexey Kardashevskiy wrote:
> On 21/08/17 12:47, Alexey Kardashevskiy wrote:
>> Folks,
>>
>> Ok, people did talk, exchanged ideas, lovely :) What happens now? Do I
>> repost this or go back to PCI bus flags or something else? Thanks.
> 
> 
> Anyone, any help? How do we proceed with this? Thanks.

Ping, anyone?

Or everybody is just waiting for the kvm forum to discuss it in person? Thanks.



> 
> 
> 
>>
>>
>>
>> On 14/08/17 19:45, Alexey Kardashevskiy wrote:
>>> Folks,
>>>
>>> Is there anything to change besides those compiler errors and David's
>>> comment in 5/5? Or the while patchset is too bad? Thanks.
>>>
>>>
>>>
>>> On 07/08/17 17:25, Alexey Kardashevskiy wrote:
 This is a followup for "[PATCH kernel v4 0/6] vfio-pci: Add support for 
 mmapping MSI-X table"
 http://www.spinics.net/lists/kvm/msg152232.html

 This time it is using "caps" in IOMMU groups. The main question is if PCI
 bus flags or IOMMU domains are still better (and which one).
>>>



 Here is some background:

 Current vfio-pci implementation disallows to mmap the page
 containing MSI-X table in case that users can write directly
 to MSI-X table and generate an incorrect MSIs.

 However, this will cause some performance issue when there
 are some critical device registers in the same page as the
 MSI-X table. We have to handle the mmio access to these
 registers in QEMU emulation rather than in guest.

 To solve this issue, this series allows to expose MSI-X table
 to userspace when hardware enables the capability of interrupt
 remapping which can ensure that a given PCI device can only
 shoot the MSIs assigned for it. And we introduce a new bus_flags
 PCI_BUS_FLAGS_MSI_REMAP to test this capability on PCI side
 for different archs.


 This is based on sha1
 26c5cebfdb6c "Merge branch 'parisc-4.13-4' of 
 git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux"

 Please comment. Thanks.

 Changelog:

 v5:
 * redid the whole thing via so-called IOMMU group capabilities

 v4:
 * rebased on recent upstream
 * got all 6 patches from v2 (v3 was missing some)




 Alexey Kardashevskiy (5):
   iommu: Add capabilities to a group
   iommu: Set IOMMU_GROUP_CAP_ISOLATE_MSIX if MSI controller enables IRQ
 remapping
   iommu/intel/amd: Set IOMMU_GROUP_CAP_ISOLATE_MSIX if IRQ remapping is
 enabled
   powerpc/iommu: Set IOMMU_GROUP_CAP_ISOLATE_MSIX
   vfio-pci: Allow to expose MSI-X table to userspace when safe

  include/linux/iommu.h| 20 
  include/linux/vfio.h |  1 +
  arch/powerpc/kernel/iommu.c  |  1 +
  drivers/iommu/amd_iommu.c|  3 +++
  drivers/iommu/intel-iommu.c  |  3 +++
  drivers/iommu/iommu.c| 35 +++
  drivers/vfio/pci/vfio_pci.c  | 20 +---
  drivers/vfio/pci/vfio_pci_rdwr.c |  5 -
  drivers/vfio/vfio.c  | 15 +++
  9 files changed, 99 insertions(+), 4 deletions(-)

>>>
>>>
>>
>>
> 
> 


-- 
Alexey

Re: [RFC PATCH v5 0/5] vfio-pci: Add support for mmapping MSI-X table

2017-09-10 Thread Alexey Kardashevskiy

On 29/08/17 12:58, Alexey Kardashevskiy wrote:
> On 21/08/17 12:47, Alexey Kardashevskiy wrote:
>> Folks,
>>
>> Ok, people did talk, exchanged ideas, lovely :) What happens now? Do I
>> repost this or go back to PCI bus flags or something else? Thanks.
> 
> 
> Anyone, any help? How do we proceed with this? Thanks.

Ping, anyone?

Or everybody is just waiting for the kvm forum to discuss it in person? Thanks.



> 
> 
> 
>>
>>
>>
>> On 14/08/17 19:45, Alexey Kardashevskiy wrote:
>>> Folks,
>>>
>>> Is there anything to change besides those compiler errors and David's
>>> comment in 5/5? Or the while patchset is too bad? Thanks.
>>>
>>>
>>>
>>> On 07/08/17 17:25, Alexey Kardashevskiy wrote:
 This is a followup for "[PATCH kernel v4 0/6] vfio-pci: Add support for 
 mmapping MSI-X table"
 http://www.spinics.net/lists/kvm/msg152232.html

 This time it is using "caps" in IOMMU groups. The main question is if PCI
 bus flags or IOMMU domains are still better (and which one).
>>>



 Here is some background:

 Current vfio-pci implementation disallows to mmap the page
 containing MSI-X table in case that users can write directly
 to MSI-X table and generate an incorrect MSIs.

 However, this will cause some performance issue when there
 are some critical device registers in the same page as the
 MSI-X table. We have to handle the mmio access to these
 registers in QEMU emulation rather than in guest.

 To solve this issue, this series allows to expose MSI-X table
 to userspace when hardware enables the capability of interrupt
 remapping which can ensure that a given PCI device can only
 shoot the MSIs assigned for it. And we introduce a new bus_flags
 PCI_BUS_FLAGS_MSI_REMAP to test this capability on PCI side
 for different archs.


 This is based on sha1
 26c5cebfdb6c "Merge branch 'parisc-4.13-4' of 
 git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux"

 Please comment. Thanks.

 Changelog:

 v5:
 * redid the whole thing via so-called IOMMU group capabilities

 v4:
 * rebased on recent upstream
 * got all 6 patches from v2 (v3 was missing some)




 Alexey Kardashevskiy (5):
   iommu: Add capabilities to a group
   iommu: Set IOMMU_GROUP_CAP_ISOLATE_MSIX if MSI controller enables IRQ
 remapping
   iommu/intel/amd: Set IOMMU_GROUP_CAP_ISOLATE_MSIX if IRQ remapping is
 enabled
   powerpc/iommu: Set IOMMU_GROUP_CAP_ISOLATE_MSIX
   vfio-pci: Allow to expose MSI-X table to userspace when safe

  include/linux/iommu.h| 20 
  include/linux/vfio.h |  1 +
  arch/powerpc/kernel/iommu.c  |  1 +
  drivers/iommu/amd_iommu.c|  3 +++
  drivers/iommu/intel-iommu.c  |  3 +++
  drivers/iommu/iommu.c| 35 +++
  drivers/vfio/pci/vfio_pci.c  | 20 +---
  drivers/vfio/pci/vfio_pci_rdwr.c |  5 -
  drivers/vfio/vfio.c  | 15 +++
  9 files changed, 99 insertions(+), 4 deletions(-)

>>>
>>>
>>
>>
> 
> 


-- 
Alexey

[PATCH v3] dt-bindings: add nutsboard vendor prefix

2017-09-10 Thread Wig Cheng

From: YuanCheng Cheng 

NutsBoard (http://nutsboard.org/) is an organization and focus
on ARM based embedded boards with open source software.

Signed-off-by: YuanCheng Cheng 
---
 Documentation/devicetree/bindings/vendor-prefixes.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt 
b/Documentation/devicetree/bindings/vendor-prefixes.txt
index 401ed98..3a16be8 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -234,6 +234,7 @@ nintendoNintendo
 nltNLT Technologies, Ltd.
 nokia  Nokia
 nordic Nordic Semiconductor
+nutsboard  NutsBoard
 nuvotonNuvoton Technology Corporation
 nvdNew Vision Display
 nvidia NVIDIA
-- 
2.7.4

[PATCH v3] dt-bindings: add nutsboard vendor prefix

2017-09-10 Thread Wig Cheng

From: YuanCheng Cheng 

NutsBoard (http://nutsboard.org/) is an organization and focus
on ARM based embedded boards with open source software.

Signed-off-by: YuanCheng Cheng 
---
 Documentation/devicetree/bindings/vendor-prefixes.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt 
b/Documentation/devicetree/bindings/vendor-prefixes.txt
index 401ed98..3a16be8 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -234,6 +234,7 @@ nintendoNintendo
 nltNLT Technologies, Ltd.
 nokia  Nokia
 nordic Nordic Semiconductor
+nutsboard  NutsBoard
 nuvotonNuvoton Technology Corporation
 nvdNew Vision Display
 nvidia NVIDIA
-- 
2.7.4

Re: btusb "firmware request while host is not available" at resume

2017-09-10 Thread Gabriel C


On 11.09.2017 03:25, Greg Kroah-Hartman wrote:

On Sun, Sep 10, 2017 at 12:26:02PM -0700, Linus Torvalds wrote:

This seems to be a new problem at resume for the Intel btusb driver,
but I'm not seeing anything in that driver itself that looks like a
likely trigger, so I wonder if it's some driver core change, a generic
resume path issue, or a workqueue change that has made it trigger for
me.

It might also just be a timing difference, maybe it's always been there?

Does anybody have any ideas? It does't happen on every resume, and the
machine works despite this (but no bluetooth - the *next* resume might
bring it back, though).


Ah, it's not just me having this problem.  I don't see it happening in
4.12, and haven't had the time to bisect it.  I seem to be able to
trigger it every suspend/resume cycle, so I don't know if it's a timing
issue.



I see the same problem with QCA hardware.. but a bit different.

On first resume cycle the firmware call is fine but the adapter dies a bit 
later with :


 'Bluetooth: hci0: Failed to send body at 4 of 1857 (-110)'

On second resume cycle I hit the trace too.






  Linus

--

   ACPI: Low-level resume complete
   ACPI: EC: EC started
   PM: Restoring platform NVS memory
   Enabling non-boot CPUs ...
   x86: Booting SMP configuration:
   smpboot: Booting Node 0 Processor 1 APIC 0x2
cache: parent cpu1 should not be sleeping
   CPU1 is up
   smpboot: Booting Node 0 Processor 2 APIC 0x1
cache: parent cpu2 should not be sleeping
   CPU2 is up
   smpboot: Booting Node 0 Processor 3 APIC 0x3
cache: parent cpu3 should not be sleeping
   CPU3 is up
   ACPI: Waking up from system sleep state S3
   ACPI: EC: event unblocked
   usb 1-3: reset full-speed USB device number 2 using xhci_hcd
   usb 1-4: reset full-speed USB device number 3 using xhci_hcd
   usb 1-5: reset high-speed USB device number 4 using xhci_hcd
   usb 1-3:1.0: rebind failed: -517
   usb 1-3:1.1: rebind failed: -517
   Bluetooth: hci0: Bootloader revision 0.0 build 2 week 52 2014
   OOM killer enabled.
   Restarting tasks ...
   Bluetooth: hci0: Device revision is 5
   Bluetooth: hci0: Secure boot is enabled
   Bluetooth: hci0: OTP lock is enabled
   Bluetooth: hci0: API lock is enabled
   Bluetooth: hci0: Debug lock is disabled
   Bluetooth: hci0: Minimum firmware build 1 week 10 2014
   firmware request while host is not available
   [ cut here ]
   WARNING: CPU: 2 PID: 621 at drivers/base/firmware_class.c:1250
_request_firmware+0x460/0x790
   CPU: 2 PID: 621 Comm: kworker/u9:2 Not tainted 4.13.0-10313-ge860d2c904d1 #11
   Hardware name: Dell Inc. XPS 13 9350/09JHRY, BIOS 1.4.17 05/10/2017
   Workqueue: hci0 hci_power_on [bluetooth]
   task: 8d3767895ac0 task.stack: 9d3481efc000
   RIP: 0010:_request_firmware+0x460/0x790
   Call Trace:
request_firmware+0x37/0x50
btusb_setup_intel_new+0x227/0x7e0 [btusb]
hci_dev_do_open+0x3da/0x570 [bluetooth]
hci_power_on+0x52/0x1f0 [bluetooth]
process_one_work+0x1db/0x3d0
worker_thread+0x47/0x3e0
kthread+0x125/0x140
ret_from_fork+0x22/0x30
   ---[ end trace 007b222491432927 ]---
   Bluetooth: hci0: Failed to load Intel firmware file (-112)
   [drm] RC6 on
   done.
   thermal thermal_zone11: failed to read out thermal zone (-5)
   PM: suspend exit


Ah, I'll blame Luis for this, I think it might be due to 81f95076281f
("firmware: add sanity check on shutdown/suspend")

Luis, any ideas?  I'll try to revert this and try it out tomorrow when
I get a chance.



I can revert it an fire up some testing..

This time with 4.13.x I hit all kind bugs on this box anyway :)

Re: btusb "firmware request while host is not available" at resume

2017-09-10 Thread Gabriel C


On 11.09.2017 03:25, Greg Kroah-Hartman wrote:

On Sun, Sep 10, 2017 at 12:26:02PM -0700, Linus Torvalds wrote:

This seems to be a new problem at resume for the Intel btusb driver,
but I'm not seeing anything in that driver itself that looks like a
likely trigger, so I wonder if it's some driver core change, a generic
resume path issue, or a workqueue change that has made it trigger for
me.

It might also just be a timing difference, maybe it's always been there?

Does anybody have any ideas? It does't happen on every resume, and the
machine works despite this (but no bluetooth - the *next* resume might
bring it back, though).


Ah, it's not just me having this problem.  I don't see it happening in
4.12, and haven't had the time to bisect it.  I seem to be able to
trigger it every suspend/resume cycle, so I don't know if it's a timing
issue.



I see the same problem with QCA hardware.. but a bit different.

On first resume cycle the firmware call is fine but the adapter dies a bit 
later with :


 'Bluetooth: hci0: Failed to send body at 4 of 1857 (-110)'

On second resume cycle I hit the trace too.






  Linus

--

   ACPI: Low-level resume complete
   ACPI: EC: EC started
   PM: Restoring platform NVS memory
   Enabling non-boot CPUs ...
   x86: Booting SMP configuration:
   smpboot: Booting Node 0 Processor 1 APIC 0x2
cache: parent cpu1 should not be sleeping
   CPU1 is up
   smpboot: Booting Node 0 Processor 2 APIC 0x1
cache: parent cpu2 should not be sleeping
   CPU2 is up
   smpboot: Booting Node 0 Processor 3 APIC 0x3
cache: parent cpu3 should not be sleeping
   CPU3 is up
   ACPI: Waking up from system sleep state S3
   ACPI: EC: event unblocked
   usb 1-3: reset full-speed USB device number 2 using xhci_hcd
   usb 1-4: reset full-speed USB device number 3 using xhci_hcd
   usb 1-5: reset high-speed USB device number 4 using xhci_hcd
   usb 1-3:1.0: rebind failed: -517
   usb 1-3:1.1: rebind failed: -517
   Bluetooth: hci0: Bootloader revision 0.0 build 2 week 52 2014
   OOM killer enabled.
   Restarting tasks ...
   Bluetooth: hci0: Device revision is 5
   Bluetooth: hci0: Secure boot is enabled
   Bluetooth: hci0: OTP lock is enabled
   Bluetooth: hci0: API lock is enabled
   Bluetooth: hci0: Debug lock is disabled
   Bluetooth: hci0: Minimum firmware build 1 week 10 2014
   firmware request while host is not available
   [ cut here ]
   WARNING: CPU: 2 PID: 621 at drivers/base/firmware_class.c:1250
_request_firmware+0x460/0x790
   CPU: 2 PID: 621 Comm: kworker/u9:2 Not tainted 4.13.0-10313-ge860d2c904d1 #11
   Hardware name: Dell Inc. XPS 13 9350/09JHRY, BIOS 1.4.17 05/10/2017
   Workqueue: hci0 hci_power_on [bluetooth]
   task: 8d3767895ac0 task.stack: 9d3481efc000
   RIP: 0010:_request_firmware+0x460/0x790
   Call Trace:
request_firmware+0x37/0x50
btusb_setup_intel_new+0x227/0x7e0 [btusb]
hci_dev_do_open+0x3da/0x570 [bluetooth]
hci_power_on+0x52/0x1f0 [bluetooth]
process_one_work+0x1db/0x3d0
worker_thread+0x47/0x3e0
kthread+0x125/0x140
ret_from_fork+0x22/0x30
   ---[ end trace 007b222491432927 ]---
   Bluetooth: hci0: Failed to load Intel firmware file (-112)
   [drm] RC6 on
   done.
   thermal thermal_zone11: failed to read out thermal zone (-5)
   PM: suspend exit


Ah, I'll blame Luis for this, I think it might be due to 81f95076281f
("firmware: add sanity check on shutdown/suspend")

Luis, any ideas?  I'll try to revert this and try it out tomorrow when
I get a chance.



I can revert it an fire up some testing..

This time with 4.13.x I hit all kind bugs on this box anyway :)

[PATCH] sysrq : fix Show Regs call trace on ARM

2017-09-10 Thread Jibin Xu

When kernel configuration SMP,PREEMPT and DEBUG_PREEMPT are enabled,
echo 1 >/proc/sys/kernel/sysrq
echo p >/proc/sysrq-trigger
kernel will print call trace as below:

sysrq: SysRq : Show Regs
BUG: using __this_cpu_read() in preemptible [] code: sh/435
caller is __this_cpu_preempt_check+0x18/0x20
Call trace:
[] dump_backtrace+0x0/0x1d0
[] show_stack+0x24/0x30
[] dump_stack+0x90/0xb0
[] check_preemption_disabled+0x100/0x108
[] __this_cpu_preempt_check+0x18/0x20
[] sysrq_handle_showregs+0x1c/0x40
[] __handle_sysrq+0x12c/0x1a0
[] write_sysrq_trigger+0x60/0x70
[] proc_reg_write+0x90/0xd0
[] __vfs_write+0x48/0x90
[] vfs_write+0xa4/0x190
[] SyS_write+0x54/0xb0
[] el0_svc_naked+0x24/0x28

This can be seen on a common board like an r-pi3.
This happens because when echo p >/proc/sysrq-trigger,
get_irq_regs() is called outside of IRQ context,
if preemption is enabled in this situation,kernel will
print the call trace. Since many prior discussions on
the mailing lists have made it clear that get_irq_regs
either just returns NULL or stale data when used outside
of IRQ context,we simply avoid calling it outside of
IRQ context.

Signed-off-by: Jibin Xu 
---
 drivers/tty/sysrq.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
index 3ffc1ce29023..6ed8c47312a8 100644
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -245,8 +245,10 @@ static void sysrq_handle_showallcpus(int key)
 * architecture has no support for it:
 */
if (!trigger_all_cpu_backtrace()) {
-   struct pt_regs *regs = get_irq_regs();
+   struct pt_regs *regs = NULL;
 
+   if (in_irq())
+   regs = get_irq_regs();
if (regs) {
pr_info("CPU%d:\n", smp_processor_id());
show_regs(regs);
@@ -265,7 +267,10 @@ static struct sysrq_key_op sysrq_showallcpus_op = {
 
 static void sysrq_handle_showregs(int key)
 {
-   struct pt_regs *regs = get_irq_regs();
+   struct pt_regs *regs = NULL;
+
+   if (in_irq())
+   regs = get_irq_regs();
if (regs)
show_regs(regs);
perf_event_print_debug();
-- 
2.13.0

[PATCH] sysrq : fix Show Regs call trace on ARM

2017-09-10 Thread Jibin Xu

When kernel configuration SMP,PREEMPT and DEBUG_PREEMPT are enabled,
echo 1 >/proc/sys/kernel/sysrq
echo p >/proc/sysrq-trigger
kernel will print call trace as below:

sysrq: SysRq : Show Regs
BUG: using __this_cpu_read() in preemptible [] code: sh/435
caller is __this_cpu_preempt_check+0x18/0x20
Call trace:
[] dump_backtrace+0x0/0x1d0
[] show_stack+0x24/0x30
[] dump_stack+0x90/0xb0
[] check_preemption_disabled+0x100/0x108
[] __this_cpu_preempt_check+0x18/0x20
[] sysrq_handle_showregs+0x1c/0x40
[] __handle_sysrq+0x12c/0x1a0
[] write_sysrq_trigger+0x60/0x70
[] proc_reg_write+0x90/0xd0
[] __vfs_write+0x48/0x90
[] vfs_write+0xa4/0x190
[] SyS_write+0x54/0xb0
[] el0_svc_naked+0x24/0x28

This can be seen on a common board like an r-pi3.
This happens because when echo p >/proc/sysrq-trigger,
get_irq_regs() is called outside of IRQ context,
if preemption is enabled in this situation,kernel will
print the call trace. Since many prior discussions on
the mailing lists have made it clear that get_irq_regs
either just returns NULL or stale data when used outside
of IRQ context,we simply avoid calling it outside of
IRQ context.

Signed-off-by: Jibin Xu 
---
 drivers/tty/sysrq.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
index 3ffc1ce29023..6ed8c47312a8 100644
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -245,8 +245,10 @@ static void sysrq_handle_showallcpus(int key)
 * architecture has no support for it:
 */
if (!trigger_all_cpu_backtrace()) {
-   struct pt_regs *regs = get_irq_regs();
+   struct pt_regs *regs = NULL;
 
+   if (in_irq())
+   regs = get_irq_regs();
if (regs) {
pr_info("CPU%d:\n", smp_processor_id());
show_regs(regs);
@@ -265,7 +267,10 @@ static struct sysrq_key_op sysrq_showallcpus_op = {
 
 static void sysrq_handle_showregs(int key)
 {
-   struct pt_regs *regs = get_irq_regs();
+   struct pt_regs *regs = NULL;
+
+   if (in_irq())
+   regs = get_irq_regs();
if (regs)
show_regs(regs);
perf_event_print_debug();
-- 
2.13.0

linux-next: Tree for Sep 11

2017-09-10 Thread Stephen Rothwell

Hi all,

Please do not add any v4.15 related material to your linux-next
included branches until after v4.14-rc1 has been released.

Changes since 20170908:

The watchdog tree gained a conflict against the arm-soc tree.

The aio tree gained a conflict against the vfs tree.

The akpm-current tree still had its build failure for which I applied
a fix patch.

Non-merge commits (relative to Linus' tree): 1830
 2014 files changed, 77242 insertions(+), 18986 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc and an allmodconfig (with
CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a
native build of tools/perf. After the final fixups (if any), I do an
x86_64 modules_install followed by builds for x86_64 allnoconfig,
powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig
and pseries_le_defconfig and i386, sparc and sparc64 defconfig. And
finally, a simple boot test of the powerpc pseries_le_defconfig kernel
in qemu.

Below is a summary of the state of the merge.

I am currently merging 267 trees (counting Linus' and 41 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (d719518d9ce9 Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next)
Merging fixes/master (b4b8cbf679c4 Cavium CNN55XX: fix broken default Kconfig 
entry)
Merging kbuild-current/fixes (64236e315955 kbuild: update comments of 
Makefile.asm-generic)
Merging arc-current/for-curr (1ee55a8f7f6b ARC: Re-enable MMU upon Machine 
Check exception)
Merging arm-current/fixes (746a272e4414 ARM: 8692/1: mm: abort uaccess retries 
upon fatal signal)
Merging m68k-current/for-linus (558d5ad276c9 m68k/mac: Avoid soft-lockup 
warning after mach_power_off)
Merging metag-fixes/fixes (b884a190afce metag/usercopy: Add missing fixups)
Merging powerpc-fixes/fixes (1a92a80ad386 powerpc/mm: Ensure cpumask update is 
ordered)
Merging sparc/master (6470812e2226 Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc)
Merging fscrypt-current/for-stable (42d97eb0ade3 fscrypt: fix renaming and 
linking special files)
Merging net/master (1f4f554a72be net: qualcomm: rmnet: Fix a double free)
Merging ipsec/master (f581a0dd744f wl1251: add a missing spin_lock_init())
Merging netfilter/master (90c4ae4e2c1d netfilter: xt_hashlimit: fix build error 
caused by 64bit division)
Merging ipvs/master (f7fb77fc1235 netfilter: nft_compat: check extension hook 
mask only if set)
Merging wireless-drivers/master (f957dd3c8db2 brcmfmac: feature check for 
multi-scheduled scan fails on bcm4345 devices)
Merging mac80211/master (126f760ca94d rds: Fix incorrect statistics counting)
Merging sound-current/for-linus (2a0d85d9add7 ALSA: asihpi: Kill BUG_ON() 
usages)
Merging pci-current/for-linus (8e1101d25164 PCI/MSI: Don't warn when 
irq_create_affinity_masks() returns NULL)
Merging driver-core.current/driver-core-linus (d34fc1adf01f Merge branch 'akpm' 
(patches from Andrew))
Merging tty.current/tty-linus (d34fc1adf01f Merge branch 'akpm' (patches from 
Andrew))
Merging usb.current/usb-linus (d34fc1adf01f Merge branch 'akpm' (patches from 
Andrew))
Merging usb-gadget-fixes/fixes (b7d44c36a6f6 usb: renesas_usbhs: gadget: fix 
unused-but-set-variable warning)
Merging usb-serial-fixes/usb-linus (fd1b8668af59 USB: serial: option: add 
D-Link DWM-222 device ID)
Merging usb-chipidea-fixes/ci-for-usb-stable (cbb22ebcfb99 usb: chipidea: core: 
check before accessing ci_role in ci_role_show)
Merging phy/fixes (5771a8c08880 Linux v4.13-rc1)
Merging staging.current/staging-linus (d34fc1adf01f Merge branch 'akpm' 
(patches from Andrew))
Merging char-misc.current/char-misc-linus (d34fc1adf01f Merge branch 'akpm' 
(patches from Andrew))
Merging input-current/for-linus (a6cbfa1e6d38 Merge branch 'next' into 
for-linus)
Merging crypto-current/master (2d45a7e89833 crypto:

linux-next: Tree for Sep 11

2017-09-10 Thread Stephen Rothwell

Hi all,

Please do not add any v4.15 related material to your linux-next
included branches until after v4.14-rc1 has been released.

Changes since 20170908:

The watchdog tree gained a conflict against the arm-soc tree.

The aio tree gained a conflict against the vfs tree.

The akpm-current tree still had its build failure for which I applied
a fix patch.

Non-merge commits (relative to Linus' tree): 1830
 2014 files changed, 77242 insertions(+), 18986 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc and an allmodconfig (with
CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a
native build of tools/perf. After the final fixups (if any), I do an
x86_64 modules_install followed by builds for x86_64 allnoconfig,
powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig
and pseries_le_defconfig and i386, sparc and sparc64 defconfig. And
finally, a simple boot test of the powerpc pseries_le_defconfig kernel
in qemu.

Below is a summary of the state of the merge.

I am currently merging 267 trees (counting Linus' and 41 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (d719518d9ce9 Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next)
Merging fixes/master (b4b8cbf679c4 Cavium CNN55XX: fix broken default Kconfig 
entry)
Merging kbuild-current/fixes (64236e315955 kbuild: update comments of 
Makefile.asm-generic)
Merging arc-current/for-curr (1ee55a8f7f6b ARC: Re-enable MMU upon Machine 
Check exception)
Merging arm-current/fixes (746a272e4414 ARM: 8692/1: mm: abort uaccess retries 
upon fatal signal)
Merging m68k-current/for-linus (558d5ad276c9 m68k/mac: Avoid soft-lockup 
warning after mach_power_off)
Merging metag-fixes/fixes (b884a190afce metag/usercopy: Add missing fixups)
Merging powerpc-fixes/fixes (1a92a80ad386 powerpc/mm: Ensure cpumask update is 
ordered)
Merging sparc/master (6470812e2226 Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc)
Merging fscrypt-current/for-stable (42d97eb0ade3 fscrypt: fix renaming and 
linking special files)
Merging net/master (1f4f554a72be net: qualcomm: rmnet: Fix a double free)
Merging ipsec/master (f581a0dd744f wl1251: add a missing spin_lock_init())
Merging netfilter/master (90c4ae4e2c1d netfilter: xt_hashlimit: fix build error 
caused by 64bit division)
Merging ipvs/master (f7fb77fc1235 netfilter: nft_compat: check extension hook 
mask only if set)
Merging wireless-drivers/master (f957dd3c8db2 brcmfmac: feature check for 
multi-scheduled scan fails on bcm4345 devices)
Merging mac80211/master (126f760ca94d rds: Fix incorrect statistics counting)
Merging sound-current/for-linus (2a0d85d9add7 ALSA: asihpi: Kill BUG_ON() 
usages)
Merging pci-current/for-linus (8e1101d25164 PCI/MSI: Don't warn when 
irq_create_affinity_masks() returns NULL)
Merging driver-core.current/driver-core-linus (d34fc1adf01f Merge branch 'akpm' 
(patches from Andrew))
Merging tty.current/tty-linus (d34fc1adf01f Merge branch 'akpm' (patches from 
Andrew))
Merging usb.current/usb-linus (d34fc1adf01f Merge branch 'akpm' (patches from 
Andrew))
Merging usb-gadget-fixes/fixes (b7d44c36a6f6 usb: renesas_usbhs: gadget: fix 
unused-but-set-variable warning)
Merging usb-serial-fixes/usb-linus (fd1b8668af59 USB: serial: option: add 
D-Link DWM-222 device ID)
Merging usb-chipidea-fixes/ci-for-usb-stable (cbb22ebcfb99 usb: chipidea: core: 
check before accessing ci_role in ci_role_show)
Merging phy/fixes (5771a8c08880 Linux v4.13-rc1)
Merging staging.current/staging-linus (d34fc1adf01f Merge branch 'akpm' 
(patches from Andrew))
Merging char-misc.current/char-misc-linus (d34fc1adf01f Merge branch 'akpm' 
(patches from Andrew))
Merging input-current/for-linus (a6cbfa1e6d38 Merge branch 'next' into 
for-linus)
Merging crypto-current/master (2d45a7e89833 crypto:

Re: [lkp-robot] [sched/fair] 6d46bd3d97: netperf.Throughput_tps -11.3% regression

2017-09-10 Thread Mike Galbraith

On Sun, 2017-09-10 at 09:53 -0700, Joel Fernandes wrote:
> 
> Anyone know what in the netperf test triggers use of the sync flag?

homer:..kernel/linux-master # git grep wake_up_interruptible_sync_poll net
net/core/sock.c:wake_up_interruptible_sync_poll(>wait, 
POLLIN | POLLPRI |
net/core/sock.c:
wake_up_interruptible_sync_poll(>wait, POLLOUT |
net/sctp/socket.c:  wake_up_interruptible_sync_poll(>wait, 
POLLIN |
net/smc/smc_rx.c:   wake_up_interruptible_sync_poll(>wait, 
POLLIN | POLLPRI |
net/tipc/socket.c:  wake_up_interruptible_sync_poll(>wait, 
POLLOUT |
net/tipc/socket.c:  wake_up_interruptible_sync_poll(>wait, 
POLLIN |
net/unix/af_unix.c: 
wake_up_interruptible_sync_poll(>wait,
net/unix/af_unix.c: wake_up_interruptible_sync_poll(>peer_wait,

The same as metric tons of other stuff.

Once upon a time, we had avg_overlap to help decide whether to wake
core affine or not, on top of the wake_affine() imbalance constraint,
but instrumentation showed it to be too error prone, so it had to die.
 These days, an affine wakeup generally means cache affine, and the
sync hint gives you a wee bit more chance of migration near to tasty
hot data being approved.

The sync hint was born back in the bad old days, when communicating
tasks not sharing L2 may as well have been talking over two tin cans
and a limp string.  These days, things are oodles better, but truly
synchronous stuff could still benefit from core affinity (up to hugely
for very fast/light stuff) if it weren't for all the caveats that can
lead to tossing concurrency opportunities out the window.

-Mike

Re: [lkp-robot] [sched/fair] 6d46bd3d97: netperf.Throughput_tps -11.3% regression

2017-09-10 Thread Mike Galbraith

On Sun, 2017-09-10 at 09:53 -0700, Joel Fernandes wrote:
> 
> Anyone know what in the netperf test triggers use of the sync flag?

homer:..kernel/linux-master # git grep wake_up_interruptible_sync_poll net
net/core/sock.c:wake_up_interruptible_sync_poll(>wait, 
POLLIN | POLLPRI |
net/core/sock.c:
wake_up_interruptible_sync_poll(>wait, POLLOUT |
net/sctp/socket.c:  wake_up_interruptible_sync_poll(>wait, 
POLLIN |
net/smc/smc_rx.c:   wake_up_interruptible_sync_poll(>wait, 
POLLIN | POLLPRI |
net/tipc/socket.c:  wake_up_interruptible_sync_poll(>wait, 
POLLOUT |
net/tipc/socket.c:  wake_up_interruptible_sync_poll(>wait, 
POLLIN |
net/unix/af_unix.c: 
wake_up_interruptible_sync_poll(>wait,
net/unix/af_unix.c: wake_up_interruptible_sync_poll(>peer_wait,

The same as metric tons of other stuff.

Once upon a time, we had avg_overlap to help decide whether to wake
core affine or not, on top of the wake_affine() imbalance constraint,
but instrumentation showed it to be too error prone, so it had to die.
 These days, an affine wakeup generally means cache affine, and the
sync hint gives you a wee bit more chance of migration near to tasty
hot data being approved.

The sync hint was born back in the bad old days, when communicating
tasks not sharing L2 may as well have been talking over two tin cans
and a limp string.  These days, things are oodles better, but truly
synchronous stuff could still benefit from core affinity (up to hugely
for very fast/light stuff) if it weren't for all the caveats that can
lead to tossing concurrency opportunities out the window.

-Mike

Re: [PATCH 01/10] spi-nor: intel-spi: Fix number of protected range registers for BYT/LPT

2017-09-10 Thread Bin Meng

Hi Marek,

On Fri, Sep 1, 2017 at 5:52 PM, Marek Vasut  wrote:
> On 09/01/2017 10:00 AM, Bin Meng wrote:
>> The number of protected range registers is not the same on BYT/LPT/
>> BXT. GPR0 only exists on Apollo Lake and its offset is reserved on
>> other platforms.
>>
>> Signed-off-by: Bin Meng 
>
> Can this use regmap ? What you're implementing here seems like regmap to me.

No. regmap does not apply here.

Regards,
Bin

Re: [PATCH 01/10] spi-nor: intel-spi: Fix number of protected range registers for BYT/LPT

2017-09-10 Thread Bin Meng

Hi Marek,

On Fri, Sep 1, 2017 at 5:52 PM, Marek Vasut  wrote:
> On 09/01/2017 10:00 AM, Bin Meng wrote:
>> The number of protected range registers is not the same on BYT/LPT/
>> BXT. GPR0 only exists on Apollo Lake and its offset is reserved on
>> other platforms.
>>
>> Signed-off-by: Bin Meng 
>
> Can this use regmap ? What you're implementing here seems like regmap to me.

No. regmap does not apply here.

Regards,
Bin

Re: [PATCH] scsi: ufs: Make use of UFS_BIT macro wherever possible

2017-09-10 Thread Alim Akhtar

Hi,

Ping!!!

Should I drop this patch and send another one which removes UFS_BIT() macro?

On Tue, Aug 29, 2017 at 4:35 PM, Alim Akhtar  wrote:
> Hi Bart,
> Thanks for your review.
>
> On 08/28/2017 09:15 PM, Bart Van Assche wrote:
>> On Mon, 2017-08-28 at 17:49 +0530, Alim Akhtar wrote:
>>> This entire file uses UFS_BIT macro for bits definition, expect for few
>>> places. This patch convert those defines to use UFS_BIT macro to be aligned
>>> with reset of the file.
>>
>> This is the definition of the UFS_BIT() macro I found in
>> drivers/scsi/ufs/ufshci.h:
>>
>> #define UFS_BIT(x)(1L << (x))
>>
>> Using this macro makes code longer instead of shorter and does not improve
>> code readability. Is this macro really useful? Wouldn't it be better to
>> remove the UFS_BIT() macro instead of introducing more uses of it?
>>
> Well, the intension of this patch is to make use of already existing
> UFS_BIT() macro.
>
> I am not aware of the history why this macro was created at first place.
>
> Well, it does improve code readability, for me at least, no need for one
> to do a calculation to see which bit it is, as we pass _bit_ number to
> UFS_BIT.
>
> I am totally okay, if you or other reviewers suggests me to change
> UFS_BIT to actual bit position, something like the original case, which
> this patch is trying to change.
>
>> Thanks,
>>
>> Bart.
>>
> Thanks!
> Alim



-- 
Regards,
Alim

Re: [PATCH] scsi: ufs: Make use of UFS_BIT macro wherever possible

2017-09-10 Thread Alim Akhtar

Hi,

Ping!!!

Should I drop this patch and send another one which removes UFS_BIT() macro?

On Tue, Aug 29, 2017 at 4:35 PM, Alim Akhtar  wrote:
> Hi Bart,
> Thanks for your review.
>
> On 08/28/2017 09:15 PM, Bart Van Assche wrote:
>> On Mon, 2017-08-28 at 17:49 +0530, Alim Akhtar wrote:
>>> This entire file uses UFS_BIT macro for bits definition, expect for few
>>> places. This patch convert those defines to use UFS_BIT macro to be aligned
>>> with reset of the file.
>>
>> This is the definition of the UFS_BIT() macro I found in
>> drivers/scsi/ufs/ufshci.h:
>>
>> #define UFS_BIT(x)(1L << (x))
>>
>> Using this macro makes code longer instead of shorter and does not improve
>> code readability. Is this macro really useful? Wouldn't it be better to
>> remove the UFS_BIT() macro instead of introducing more uses of it?
>>
> Well, the intension of this patch is to make use of already existing
> UFS_BIT() macro.
>
> I am not aware of the history why this macro was created at first place.
>
> Well, it does improve code readability, for me at least, no need for one
> to do a calculation to see which bit it is, as we pass _bit_ number to
> UFS_BIT.
>
> I am totally okay, if you or other reviewers suggests me to change
> UFS_BIT to actual bit position, something like the original case, which
> this patch is trying to change.
>
>> Thanks,
>>
>> Bart.
>>
> Thanks!
> Alim



-- 
Regards,
Alim

[PATCH RFC V2 05/10] perf tools: lock to protect thread list

2017-09-10 Thread kan . liang

From: Kan Liang 

Add two locks to protect namespaces_list and comm_list.

The comm which is used in db-export are not protected. Because the
multithread code will not touch it. It can be added later if required.

Signed-off-by: Kan Liang 
---
 tools/perf/ui/browsers/hists.c |  2 +-
 tools/perf/util/thread.c   | 60 +-
 tools/perf/util/thread.h   |  6 +++--
 3 files changed, 52 insertions(+), 16 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 13dfb0a..429e1dd 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -2370,7 +2370,7 @@ static int perf_evsel_browser_title(struct hist_browser 
*browser,
char unit;
int printed;
const struct dso *dso = hists->dso_filter;
-   const struct thread *thread = hists->thread_filter;
+   struct thread *thread = hists->thread_filter;
int socket_id = hists->socket_filter;
unsigned long nr_samples = hists->stats.nr_events[PERF_RECORD_SAMPLE];
u64 nr_events = hists->stats.total_period;
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index b7957da..25830b2 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -45,6 +45,8 @@ struct thread *thread__new(pid_t pid, pid_t tid)
thread->cpu = -1;
INIT_LIST_HEAD(>namespaces_list);
INIT_LIST_HEAD(>comm_list);
+   pthread_mutex_init(>namespaces_lock, NULL);
+   pthread_mutex_init(>comm_lock, NULL);
 
comm_str = malloc(32);
if (!comm_str)
@@ -83,15 +85,21 @@ void thread__delete(struct thread *thread)
map_groups__put(thread->mg);
thread->mg = NULL;
}
+   pthread_mutex_lock(>namespaces_lock);
list_for_each_entry_safe(namespaces, tmp_namespaces,
 >namespaces_list, list) {
list_del(>list);
namespaces__free(namespaces);
}
+   pthread_mutex_unlock(>namespaces_lock);
+
+   pthread_mutex_lock(>comm_lock);
list_for_each_entry_safe(comm, tmp_comm, >comm_list, list) {
list_del(>list);
comm__free(comm);
}
+   pthread_mutex_unlock(>comm_lock);
+
unwind__finish_access(thread);
nsinfo__zput(thread->nsinfo);
 
@@ -128,11 +136,17 @@ struct namespaces *thread__namespaces(const struct thread 
*thread)
 int thread__set_namespaces(struct thread *thread, u64 timestamp,
   struct namespaces_event *event)
 {
-   struct namespaces *new, *curr = thread__namespaces(thread);
+   struct namespaces *new, *curr;
+
+   pthread_mutex_lock(>namespaces_lock);
+
+   curr = thread__namespaces(thread);
 
new = namespaces__new(event);
-   if (!new)
+   if (!new) {
+   pthread_mutex_unlock(>namespaces_lock);
return -ENOMEM;
+   }
 
list_add(>list, >namespaces_list);
 
@@ -146,17 +160,21 @@ int thread__set_namespaces(struct thread *thread, u64 
timestamp,
curr->end_time = timestamp;
}
 
+   pthread_mutex_unlock(>namespaces_lock);
+
return 0;
 }
 
-void thread__namespaces_id(const struct thread *thread,
+void thread__namespaces_id(struct thread *thread,
   u64 *dev, u64 *ino)
 {
struct namespaces *ns;
 
+   pthread_mutex_lock(>namespaces_lock);
ns = thread__namespaces(thread);
*dev = ns ? ns->link_info[CGROUP_NS_INDEX].dev : 0;
*ino = ns ? ns->link_info[CGROUP_NS_INDEX].ino : 0;
+   pthread_mutex_unlock(>namespaces_lock);
 }
 
 struct comm *thread__comm(const struct thread *thread)
@@ -183,17 +201,23 @@ struct comm *thread__exec_comm(const struct thread 
*thread)
 int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
   bool exec)
 {
-   struct comm *new, *curr = thread__comm(thread);
+   struct comm *new, *curr;
+   int err = 0;
+
+   pthread_mutex_lock(>comm_lock);
+   curr = thread__comm(thread);
 
/* Override the default :tid entry */
if (!thread->comm_set) {
-   int err = comm__override(curr, str, timestamp, exec);
+   err = comm__override(curr, str, timestamp, exec);
if (err)
-   return err;
+   goto unlock;
} else {
new = comm__new(str, timestamp, exec);
-   if (!new)
-   return -ENOMEM;
+   if (!new) {
+   err = -ENOMEM;
+   goto unlock;
+   }
list_add(>list, >comm_list);
 
if (exec)
@@ -202,7 +226,9 @@ int __thread__set_comm(struct thread *thread, const char 
*str, u64 timestamp,
 
thread->comm_set = true;
 
-

[PATCH RFC V2 05/10] perf tools: lock to protect thread list

2017-09-10 Thread kan . liang

From: Kan Liang 

Add two locks to protect namespaces_list and comm_list.

The comm which is used in db-export are not protected. Because the
multithread code will not touch it. It can be added later if required.

Signed-off-by: Kan Liang 
---
 tools/perf/ui/browsers/hists.c |  2 +-
 tools/perf/util/thread.c   | 60 +-
 tools/perf/util/thread.h   |  6 +++--
 3 files changed, 52 insertions(+), 16 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 13dfb0a..429e1dd 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -2370,7 +2370,7 @@ static int perf_evsel_browser_title(struct hist_browser 
*browser,
char unit;
int printed;
const struct dso *dso = hists->dso_filter;
-   const struct thread *thread = hists->thread_filter;
+   struct thread *thread = hists->thread_filter;
int socket_id = hists->socket_filter;
unsigned long nr_samples = hists->stats.nr_events[PERF_RECORD_SAMPLE];
u64 nr_events = hists->stats.total_period;
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index b7957da..25830b2 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -45,6 +45,8 @@ struct thread *thread__new(pid_t pid, pid_t tid)
thread->cpu = -1;
INIT_LIST_HEAD(>namespaces_list);
INIT_LIST_HEAD(>comm_list);
+   pthread_mutex_init(>namespaces_lock, NULL);
+   pthread_mutex_init(>comm_lock, NULL);
 
comm_str = malloc(32);
if (!comm_str)
@@ -83,15 +85,21 @@ void thread__delete(struct thread *thread)
map_groups__put(thread->mg);
thread->mg = NULL;
}
+   pthread_mutex_lock(>namespaces_lock);
list_for_each_entry_safe(namespaces, tmp_namespaces,
 >namespaces_list, list) {
list_del(>list);
namespaces__free(namespaces);
}
+   pthread_mutex_unlock(>namespaces_lock);
+
+   pthread_mutex_lock(>comm_lock);
list_for_each_entry_safe(comm, tmp_comm, >comm_list, list) {
list_del(>list);
comm__free(comm);
}
+   pthread_mutex_unlock(>comm_lock);
+
unwind__finish_access(thread);
nsinfo__zput(thread->nsinfo);
 
@@ -128,11 +136,17 @@ struct namespaces *thread__namespaces(const struct thread 
*thread)
 int thread__set_namespaces(struct thread *thread, u64 timestamp,
   struct namespaces_event *event)
 {
-   struct namespaces *new, *curr = thread__namespaces(thread);
+   struct namespaces *new, *curr;
+
+   pthread_mutex_lock(>namespaces_lock);
+
+   curr = thread__namespaces(thread);
 
new = namespaces__new(event);
-   if (!new)
+   if (!new) {
+   pthread_mutex_unlock(>namespaces_lock);
return -ENOMEM;
+   }
 
list_add(>list, >namespaces_list);
 
@@ -146,17 +160,21 @@ int thread__set_namespaces(struct thread *thread, u64 
timestamp,
curr->end_time = timestamp;
}
 
+   pthread_mutex_unlock(>namespaces_lock);
+
return 0;
 }
 
-void thread__namespaces_id(const struct thread *thread,
+void thread__namespaces_id(struct thread *thread,
   u64 *dev, u64 *ino)
 {
struct namespaces *ns;
 
+   pthread_mutex_lock(>namespaces_lock);
ns = thread__namespaces(thread);
*dev = ns ? ns->link_info[CGROUP_NS_INDEX].dev : 0;
*ino = ns ? ns->link_info[CGROUP_NS_INDEX].ino : 0;
+   pthread_mutex_unlock(>namespaces_lock);
 }
 
 struct comm *thread__comm(const struct thread *thread)
@@ -183,17 +201,23 @@ struct comm *thread__exec_comm(const struct thread 
*thread)
 int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
   bool exec)
 {
-   struct comm *new, *curr = thread__comm(thread);
+   struct comm *new, *curr;
+   int err = 0;
+
+   pthread_mutex_lock(>comm_lock);
+   curr = thread__comm(thread);
 
/* Override the default :tid entry */
if (!thread->comm_set) {
-   int err = comm__override(curr, str, timestamp, exec);
+   err = comm__override(curr, str, timestamp, exec);
if (err)
-   return err;
+   goto unlock;
} else {
new = comm__new(str, timestamp, exec);
-   if (!new)
-   return -ENOMEM;
+   if (!new) {
+   err = -ENOMEM;
+   goto unlock;
+   }
list_add(>list, >comm_list);
 
if (exec)
@@ -202,7 +226,9 @@ int __thread__set_comm(struct thread *thread, const char 
*str, u64 timestamp,
 
thread->comm_set = true;
 
-   return 0;
+unlock:
+

[PATCH RFC V2 04/10] petf tools: introduce a new function to set namespaces id

2017-09-10 Thread kan . liang

From: Kan Liang 

Finish the namespaces id setting job in thread.c.

Signed-off-by: Kan Liang 
---
 tools/perf/util/hist.c   |  7 ++-
 tools/perf/util/thread.c | 10 ++
 tools/perf/util/thread.h |  2 ++
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 0f00dd9..e332c4f 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -584,14 +584,9 @@ __hists__add_entry(struct hists *hists,
   bool sample_self,
   struct hist_entry_ops *ops)
 {
-   struct namespaces *ns = thread__namespaces(al->thread);
struct hist_entry entry = {
.thread = al->thread,
.comm_str = thread__comm_str(al->thread),
-   .cgroup_id = {
-   .dev = ns ? ns->link_info[CGROUP_NS_INDEX].dev : 0,
-   .ino = ns ? ns->link_info[CGROUP_NS_INDEX].ino : 0,
-   },
.ms = {
.map= al->map,
.sym= al->sym,
@@ -617,6 +612,8 @@ __hists__add_entry(struct hists *hists,
.ops = ops,
};
 
+   thread__namespaces_id(al->thread, _id.dev,
+ _id.ino);
return hists__findnew_entry(hists, , al, sample_self);
 }
 
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index aee9a42..b7957da 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -149,6 +149,16 @@ int thread__set_namespaces(struct thread *thread, u64 
timestamp,
return 0;
 }
 
+void thread__namespaces_id(const struct thread *thread,
+  u64 *dev, u64 *ino)
+{
+   struct namespaces *ns;
+
+   ns = thread__namespaces(thread);
+   *dev = ns ? ns->link_info[CGROUP_NS_INDEX].dev : 0;
+   *ino = ns ? ns->link_info[CGROUP_NS_INDEX].ino : 0;
+}
+
 struct comm *thread__comm(const struct thread *thread)
 {
if (list_empty(>comm_list))
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index cb1a5dd..fd7bd41 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -68,6 +68,8 @@ static inline void thread__exited(struct thread *thread)
 struct namespaces *thread__namespaces(const struct thread *thread);
 int thread__set_namespaces(struct thread *thread, u64 timestamp,
   struct namespaces_event *event);
+void thread__namespaces_id(const struct thread *thread,
+  u64 *dev, u64 *ino);
 
 int __thread__set_comm(struct thread *thread, const char *comm, u64 timestamp,
   bool exec);
-- 
2.5.5

[PATCH RFC V2 04/10] petf tools: introduce a new function to set namespaces id

2017-09-10 Thread kan . liang

From: Kan Liang 

Finish the namespaces id setting job in thread.c.

Signed-off-by: Kan Liang 
---
 tools/perf/util/hist.c   |  7 ++-
 tools/perf/util/thread.c | 10 ++
 tools/perf/util/thread.h |  2 ++
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 0f00dd9..e332c4f 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -584,14 +584,9 @@ __hists__add_entry(struct hists *hists,
   bool sample_self,
   struct hist_entry_ops *ops)
 {
-   struct namespaces *ns = thread__namespaces(al->thread);
struct hist_entry entry = {
.thread = al->thread,
.comm_str = thread__comm_str(al->thread),
-   .cgroup_id = {
-   .dev = ns ? ns->link_info[CGROUP_NS_INDEX].dev : 0,
-   .ino = ns ? ns->link_info[CGROUP_NS_INDEX].ino : 0,
-   },
.ms = {
.map= al->map,
.sym= al->sym,
@@ -617,6 +612,8 @@ __hists__add_entry(struct hists *hists,
.ops = ops,
};
 
+   thread__namespaces_id(al->thread, _id.dev,
+ _id.ino);
return hists__findnew_entry(hists, , al, sample_self);
 }
 
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index aee9a42..b7957da 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -149,6 +149,16 @@ int thread__set_namespaces(struct thread *thread, u64 
timestamp,
return 0;
 }
 
+void thread__namespaces_id(const struct thread *thread,
+  u64 *dev, u64 *ino)
+{
+   struct namespaces *ns;
+
+   ns = thread__namespaces(thread);
+   *dev = ns ? ns->link_info[CGROUP_NS_INDEX].dev : 0;
+   *ino = ns ? ns->link_info[CGROUP_NS_INDEX].ino : 0;
+}
+
 struct comm *thread__comm(const struct thread *thread)
 {
if (list_empty(>comm_list))
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index cb1a5dd..fd7bd41 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -68,6 +68,8 @@ static inline void thread__exited(struct thread *thread)
 struct namespaces *thread__namespaces(const struct thread *thread);
 int thread__set_namespaces(struct thread *thread, u64 timestamp,
   struct namespaces_event *event);
+void thread__namespaces_id(const struct thread *thread,
+  u64 *dev, u64 *ino);
 
 int __thread__set_comm(struct thread *thread, const char *comm, u64 timestamp,
   bool exec);
-- 
2.5.5

[PATCH RFC V2 00/10] perf top optimization

2017-09-10 Thread kan . liang

From: Kan Liang 

The patch series intends to fix the severe performance issue in
Knights Landing/Mill, when monitoring in heavy load system.
perf top costs a few minutes to show the result, which is
unacceptable.
With the patch series applied, the latency will reduces to
several seconds.

machine__synthesize_threads and perf_top__mmap_read costs most of
the perf top time (> 99%).
Patch 1-9 do the optimization for machine__synthesize_threads.
Patch 10 does the optimization for perf_top__mmap_read.

Optimization for machine__synthesize_threads
  - Multithreading the whole process.
  - The threads number is set to the max online CPU# by default.
User can change the threads number through the new option.
  - Introduces hashtable for machine threads to reduce the lock
contention.
  - The optimization can also benefit other platforms and other
perf tools, like perf record. But this patch series doesn't
do the optimization for other tools. It can be done later
separately.
  - With this optimization applied, there is a 1.56x speedup in
Knights Mill with heavy workload.

Optimization for perf_top__mmap_read
  - switch back to overwrite mode
For non overwrite mode, it tries to read everything in the ring buffer
and does not check the messup. Once there are lots of samples delivered
shortly, the processing time could be very long.
Considering the real time requirement for perf top, it should switch
back to overwrite mode.
  - With this optimization applied, there is a huge 53.8x speedup in
Knights Mill with heavy workload.
  - With this optimization applied, the latency of perf_top__mmap_read is
less than the default perf top fresh time (2s) in Knights Mill with
heavy workload.

Here are perf top latency test result on Knights Mill and Skylake server

The heavy workload is to compile Linux kernel as below
"sudo nice make -j$(grep -c '^processor' /proc/cpuinfo)"
Then, "sudo perf top"

The latency period is the time between perf top launched and the first
profiling result shown.

- Latency on Knights Mill (272 CPUs)

Original(s) With patch(s)   Speedup
272.68  16.48   16.54x

- Latency on Skylake server (192 CPUs)

Original(s) With patch(s)   Speedup
12.28   2.964.15x

Changes since V1:
 - Patch 1: machine threads and hashtable related renaming (Arnaldo)
 - Patch 6: use a smaller locked section for comm_str__put
   add a locked wrapper for comm_str__findnew  (Arnaldo)

Kan Liang (10):
  perf tools: hashtable for machine threads
  perf tools: using scandir to replace readdir
  petf tools: using comm_str to replace comm in hist_entry
  petf tools: introduce a new function to set namespaces id
  perf tools: lock to protect thread list
  perf tools: lock to protect comm_str rb tree
  perf tools: change machine comm_exec type to atomic
  perf top: implement multithreading for perf_event__synthesize_threads
  perf top: add option to set the number of thread for event synthesize
  perf top: switch back to overwrite mode

 tools/perf/builtin-kvm.c  |   3 +-
 tools/perf/builtin-record.c   |   2 +-
 tools/perf/builtin-top.c  |   9 +-
 tools/perf/builtin-trace.c|  21 +++--
 tools/perf/tests/mmap-thread-lookup.c |   2 +-
 tools/perf/ui/browsers/hists.c|   2 +-
 tools/perf/util/comm.c|  18 +++-
 tools/perf/util/event.c   | 149 +---
 tools/perf/util/event.h   |  14 ++-
 tools/perf/util/evlist.c  |   5 +-
 tools/perf/util/hist.c|  11 +--
 tools/perf/util/machine.c | 158 +-
 tools/perf/util/machine.h |  34 ++--
 tools/perf/util/rb_resort.h   |   5 +-
 tools/perf/util/sort.c|   8 +-
 tools/perf/util/sort.h|   2 +-
 tools/perf/util/thread.c  |  68 ---
 tools/perf/util/thread.h  |   6 +-
 tools/perf/util/top.h |   1 +
 19 files changed, 376 insertions(+), 142 deletions(-)

-- 
2.5.5

[PATCH RFC V2 00/10] perf top optimization

2017-09-10 Thread kan . liang

From: Kan Liang 

The patch series intends to fix the severe performance issue in
Knights Landing/Mill, when monitoring in heavy load system.
perf top costs a few minutes to show the result, which is
unacceptable.
With the patch series applied, the latency will reduces to
several seconds.

machine__synthesize_threads and perf_top__mmap_read costs most of
the perf top time (> 99%).
Patch 1-9 do the optimization for machine__synthesize_threads.
Patch 10 does the optimization for perf_top__mmap_read.

Optimization for machine__synthesize_threads
  - Multithreading the whole process.
  - The threads number is set to the max online CPU# by default.
User can change the threads number through the new option.
  - Introduces hashtable for machine threads to reduce the lock
contention.
  - The optimization can also benefit other platforms and other
perf tools, like perf record. But this patch series doesn't
do the optimization for other tools. It can be done later
separately.
  - With this optimization applied, there is a 1.56x speedup in
Knights Mill with heavy workload.

Optimization for perf_top__mmap_read
  - switch back to overwrite mode
For non overwrite mode, it tries to read everything in the ring buffer
and does not check the messup. Once there are lots of samples delivered
shortly, the processing time could be very long.
Considering the real time requirement for perf top, it should switch
back to overwrite mode.
  - With this optimization applied, there is a huge 53.8x speedup in
Knights Mill with heavy workload.
  - With this optimization applied, the latency of perf_top__mmap_read is
less than the default perf top fresh time (2s) in Knights Mill with
heavy workload.

Here are perf top latency test result on Knights Mill and Skylake server

The heavy workload is to compile Linux kernel as below
"sudo nice make -j$(grep -c '^processor' /proc/cpuinfo)"
Then, "sudo perf top"

The latency period is the time between perf top launched and the first
profiling result shown.

- Latency on Knights Mill (272 CPUs)

Original(s) With patch(s)   Speedup
272.68  16.48   16.54x

- Latency on Skylake server (192 CPUs)

Original(s) With patch(s)   Speedup
12.28   2.964.15x

Changes since V1:
 - Patch 1: machine threads and hashtable related renaming (Arnaldo)
 - Patch 6: use a smaller locked section for comm_str__put
   add a locked wrapper for comm_str__findnew  (Arnaldo)

Kan Liang (10):
  perf tools: hashtable for machine threads
  perf tools: using scandir to replace readdir
  petf tools: using comm_str to replace comm in hist_entry
  petf tools: introduce a new function to set namespaces id
  perf tools: lock to protect thread list
  perf tools: lock to protect comm_str rb tree
  perf tools: change machine comm_exec type to atomic
  perf top: implement multithreading for perf_event__synthesize_threads
  perf top: add option to set the number of thread for event synthesize
  perf top: switch back to overwrite mode

 tools/perf/builtin-kvm.c  |   3 +-
 tools/perf/builtin-record.c   |   2 +-
 tools/perf/builtin-top.c  |   9 +-
 tools/perf/builtin-trace.c|  21 +++--
 tools/perf/tests/mmap-thread-lookup.c |   2 +-
 tools/perf/ui/browsers/hists.c|   2 +-
 tools/perf/util/comm.c|  18 +++-
 tools/perf/util/event.c   | 149 +---
 tools/perf/util/event.h   |  14 ++-
 tools/perf/util/evlist.c  |   5 +-
 tools/perf/util/hist.c|  11 +--
 tools/perf/util/machine.c | 158 +-
 tools/perf/util/machine.h |  34 ++--
 tools/perf/util/rb_resort.h   |   5 +-
 tools/perf/util/sort.c|   8 +-
 tools/perf/util/sort.h|   2 +-
 tools/perf/util/thread.c  |  68 ---
 tools/perf/util/thread.h  |   6 +-
 tools/perf/util/top.h |   1 +
 19 files changed, 376 insertions(+), 142 deletions(-)

-- 
2.5.5

[PATCH RFC V2 03/10] petf tools: using comm_str to replace comm in hist_entry

2017-09-10 Thread kan . liang

From: Kan Liang 

For hist_entry, it only needs comm_str for cmp.

Signed-off-by: Kan Liang 
---
 tools/perf/util/hist.c | 4 ++--
 tools/perf/util/sort.c | 8 
 tools/perf/util/sort.h | 2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index e60d8d8..0f00dd9 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -587,7 +587,7 @@ __hists__add_entry(struct hists *hists,
struct namespaces *ns = thread__namespaces(al->thread);
struct hist_entry entry = {
.thread = al->thread,
-   .comm = thread__comm(al->thread),
+   .comm_str = thread__comm_str(al->thread),
.cgroup_id = {
.dev = ns ? ns->link_info[CGROUP_NS_INDEX].dev : 0,
.ino = ns ? ns->link_info[CGROUP_NS_INDEX].ino : 0,
@@ -944,7 +944,7 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
.hists = evsel__hists(evsel),
.cpu = al->cpu,
.thread = al->thread,
-   .comm = thread__comm(al->thread),
+   .comm_str = thread__comm_str(al->thread),
.ip = al->addr,
.ms = {
.map = al->map,
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index eb3ab90..99ab411 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -114,26 +114,26 @@ static int64_t
 sort__comm_cmp(struct hist_entry *left, struct hist_entry *right)
 {
/* Compare the addr that should be unique among comm */
-   return strcmp(comm__str(right->comm), comm__str(left->comm));
+   return strcmp(right->comm_str, left->comm_str);
 }
 
 static int64_t
 sort__comm_collapse(struct hist_entry *left, struct hist_entry *right)
 {
/* Compare the addr that should be unique among comm */
-   return strcmp(comm__str(right->comm), comm__str(left->comm));
+   return strcmp(right->comm_str, left->comm_str);
 }
 
 static int64_t
 sort__comm_sort(struct hist_entry *left, struct hist_entry *right)
 {
-   return strcmp(comm__str(right->comm), comm__str(left->comm));
+   return strcmp(right->comm_str, left->comm_str);
 }
 
 static int hist_entry__comm_snprintf(struct hist_entry *he, char *bf,
 size_t size, unsigned int width)
 {
-   return repsep_snprintf(bf, size, "%-*.*s", width, width, 
comm__str(he->comm));
+   return repsep_snprintf(bf, size, "%-*.*s", width, width, he->comm_str);
 }
 
 struct sort_entry sort_comm = {
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index f36dc49..861cbe7 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -96,7 +96,7 @@ struct hist_entry {
struct he_stat  *stat_acc;
struct map_symbol   ms;
struct thread   *thread;
-   struct comm *comm;
+   const char  *comm_str;
struct namespace_id cgroup_id;
u64 ip;
u64 transaction;
-- 
2.5.5

[PATCH RFC V2 03/10] petf tools: using comm_str to replace comm in hist_entry

2017-09-10 Thread kan . liang

From: Kan Liang 

For hist_entry, it only needs comm_str for cmp.

Signed-off-by: Kan Liang 
---
 tools/perf/util/hist.c | 4 ++--
 tools/perf/util/sort.c | 8 
 tools/perf/util/sort.h | 2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index e60d8d8..0f00dd9 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -587,7 +587,7 @@ __hists__add_entry(struct hists *hists,
struct namespaces *ns = thread__namespaces(al->thread);
struct hist_entry entry = {
.thread = al->thread,
-   .comm = thread__comm(al->thread),
+   .comm_str = thread__comm_str(al->thread),
.cgroup_id = {
.dev = ns ? ns->link_info[CGROUP_NS_INDEX].dev : 0,
.ino = ns ? ns->link_info[CGROUP_NS_INDEX].ino : 0,
@@ -944,7 +944,7 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
.hists = evsel__hists(evsel),
.cpu = al->cpu,
.thread = al->thread,
-   .comm = thread__comm(al->thread),
+   .comm_str = thread__comm_str(al->thread),
.ip = al->addr,
.ms = {
.map = al->map,
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index eb3ab90..99ab411 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -114,26 +114,26 @@ static int64_t
 sort__comm_cmp(struct hist_entry *left, struct hist_entry *right)
 {
/* Compare the addr that should be unique among comm */
-   return strcmp(comm__str(right->comm), comm__str(left->comm));
+   return strcmp(right->comm_str, left->comm_str);
 }
 
 static int64_t
 sort__comm_collapse(struct hist_entry *left, struct hist_entry *right)
 {
/* Compare the addr that should be unique among comm */
-   return strcmp(comm__str(right->comm), comm__str(left->comm));
+   return strcmp(right->comm_str, left->comm_str);
 }
 
 static int64_t
 sort__comm_sort(struct hist_entry *left, struct hist_entry *right)
 {
-   return strcmp(comm__str(right->comm), comm__str(left->comm));
+   return strcmp(right->comm_str, left->comm_str);
 }
 
 static int hist_entry__comm_snprintf(struct hist_entry *he, char *bf,
 size_t size, unsigned int width)
 {
-   return repsep_snprintf(bf, size, "%-*.*s", width, width, 
comm__str(he->comm));
+   return repsep_snprintf(bf, size, "%-*.*s", width, width, he->comm_str);
 }
 
 struct sort_entry sort_comm = {
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index f36dc49..861cbe7 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -96,7 +96,7 @@ struct hist_entry {
struct he_stat  *stat_acc;
struct map_symbol   ms;
struct thread   *thread;
-   struct comm *comm;
+   const char  *comm_str;
struct namespace_id cgroup_id;
u64 ip;
u64 transaction;
-- 
2.5.5

[PATCH RFC V2 08/10] perf top: implement multithreading for perf_event__synthesize_threads

2017-09-10 Thread kan . liang

From: Kan Liang 

The proc files which is sorted with alphabetical order are evenly
assigned to several synthesize threads to be processed in parallel.

For perf top, the threads number hard code to online CPU number. The
following patch will introduce an option to set it.
For other perf tools, the thread number is 1. Because the process
function is not ready for multithreading, e.g.
process_synthesized_event.
This patch series only support event synthesize multithreading for perf
top. For other tools, it can be done separately later.

With multithread applied, the total processing time can get up to 1.56x
speedup on Knights Mill for perf top.

For specific single event processing, the processing time could increase
because of the lock contention. So proc_map_timeout may need to be
increased. Otherwise some proc maps will be truncated.
Based on my test, increasing the proc_map_timeout has small impact
on the total processing time. The total processing time still get 1.49x
speedup on Knights Mill after increasing the proc_map_timeout.
The patch itself doesn't increase the proc_map_timeout.

Doesn't need to implement multithreading for
perf_event__synthesize_thread_map. It doesn't have performance issue.

Signed-off-by: Kan Liang 
---
 tools/perf/builtin-kvm.c  |   3 +-
 tools/perf/builtin-record.c   |   2 +-
 tools/perf/builtin-top.c  |   4 +-
 tools/perf/builtin-trace.c|   2 +-
 tools/perf/tests/mmap-thread-lookup.c |   2 +-
 tools/perf/util/event.c   | 146 ++
 tools/perf/util/event.h   |  14 +++-
 tools/perf/util/machine.c |   8 +-
 tools/perf/util/machine.h |   9 ++-
 9 files changed, 146 insertions(+), 44 deletions(-)

diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index f309c37..edd14cb 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -1442,7 +1442,8 @@ static int kvm_events_live(struct perf_kvm_stat *kvm,
perf_session__set_id_hdr_size(kvm->session);
ordered_events__set_copy_on_queue(>session->ordered_events, true);
machine__synthesize_threads(>session->machines.host, 
>opts.target,
-   kvm->evlist->threads, false, 
kvm->opts.proc_map_timeout);
+   kvm->evlist->threads, false,
+   kvm->opts.proc_map_timeout, 1);
err = kvm_live_open_events(kvm);
if (err)
goto out;
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 56f8142..f8a6c89 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -863,7 +863,7 @@ static int record__synthesize(struct record *rec, bool tail)
 
err = __machine__synthesize_threads(machine, tool, >target, 
rec->evlist->threads,
process_synthesized_event, 
opts->sample_address,
-   opts->proc_map_timeout);
+   opts->proc_map_timeout, 1);
 out:
return err;
 }
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index ee954bd..4b8fdc1 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -959,7 +959,9 @@ static int __cmd_top(struct perf_top *top)
goto out_delete;
 
machine__synthesize_threads(>session->machines.host, >target,
-   top->evlist->threads, false, 
opts->proc_map_timeout);
+   top->evlist->threads, false,
+   opts->proc_map_timeout,
+   (unsigned 
int)sysconf(_SC_NPROCESSORS_ONLN));
 
if (perf_hpp_list.socket) {
ret = perf_env__read_cpu_topology_map(_env);
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index ee8c6e8..c6acd91 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1131,7 +1131,7 @@ static int trace__symbols_init(struct trace *trace, 
struct perf_evlist *evlist)
 
err = __machine__synthesize_threads(trace->host, >tool, 
>opts.target,
evlist->threads, 
trace__tool_process, false,
-   trace->opts.proc_map_timeout);
+   trace->opts.proc_map_timeout, 1);
if (err)
symbol__exit();
 
diff --git a/tools/perf/tests/mmap-thread-lookup.c 
b/tools/perf/tests/mmap-thread-lookup.c
index f94a419..2a0068a 100644
--- a/tools/perf/tests/mmap-thread-lookup.c
+++ b/tools/perf/tests/mmap-thread-lookup.c
@@ -131,7 +131,7 @@ static int synth_all(struct machine *machine)
 {
return perf_event__synthesize_threads(NULL,
  perf_event__process,
- machine, 0,

[PATCH RFC V2 08/10] perf top: implement multithreading for perf_event__synthesize_threads

2017-09-10 Thread kan . liang

From: Kan Liang 

The proc files which is sorted with alphabetical order are evenly
assigned to several synthesize threads to be processed in parallel.

For perf top, the threads number hard code to online CPU number. The
following patch will introduce an option to set it.
For other perf tools, the thread number is 1. Because the process
function is not ready for multithreading, e.g.
process_synthesized_event.
This patch series only support event synthesize multithreading for perf
top. For other tools, it can be done separately later.

With multithread applied, the total processing time can get up to 1.56x
speedup on Knights Mill for perf top.

For specific single event processing, the processing time could increase
because of the lock contention. So proc_map_timeout may need to be
increased. Otherwise some proc maps will be truncated.
Based on my test, increasing the proc_map_timeout has small impact
on the total processing time. The total processing time still get 1.49x
speedup on Knights Mill after increasing the proc_map_timeout.
The patch itself doesn't increase the proc_map_timeout.

Doesn't need to implement multithreading for
perf_event__synthesize_thread_map. It doesn't have performance issue.

Signed-off-by: Kan Liang 
---
 tools/perf/builtin-kvm.c  |   3 +-
 tools/perf/builtin-record.c   |   2 +-
 tools/perf/builtin-top.c  |   4 +-
 tools/perf/builtin-trace.c|   2 +-
 tools/perf/tests/mmap-thread-lookup.c |   2 +-
 tools/perf/util/event.c   | 146 ++
 tools/perf/util/event.h   |  14 +++-
 tools/perf/util/machine.c |   8 +-
 tools/perf/util/machine.h |   9 ++-
 9 files changed, 146 insertions(+), 44 deletions(-)

diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index f309c37..edd14cb 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -1442,7 +1442,8 @@ static int kvm_events_live(struct perf_kvm_stat *kvm,
perf_session__set_id_hdr_size(kvm->session);
ordered_events__set_copy_on_queue(>session->ordered_events, true);
machine__synthesize_threads(>session->machines.host, 
>opts.target,
-   kvm->evlist->threads, false, 
kvm->opts.proc_map_timeout);
+   kvm->evlist->threads, false,
+   kvm->opts.proc_map_timeout, 1);
err = kvm_live_open_events(kvm);
if (err)
goto out;
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 56f8142..f8a6c89 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -863,7 +863,7 @@ static int record__synthesize(struct record *rec, bool tail)
 
err = __machine__synthesize_threads(machine, tool, >target, 
rec->evlist->threads,
process_synthesized_event, 
opts->sample_address,
-   opts->proc_map_timeout);
+   opts->proc_map_timeout, 1);
 out:
return err;
 }
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index ee954bd..4b8fdc1 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -959,7 +959,9 @@ static int __cmd_top(struct perf_top *top)
goto out_delete;
 
machine__synthesize_threads(>session->machines.host, >target,
-   top->evlist->threads, false, 
opts->proc_map_timeout);
+   top->evlist->threads, false,
+   opts->proc_map_timeout,
+   (unsigned 
int)sysconf(_SC_NPROCESSORS_ONLN));
 
if (perf_hpp_list.socket) {
ret = perf_env__read_cpu_topology_map(_env);
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index ee8c6e8..c6acd91 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1131,7 +1131,7 @@ static int trace__symbols_init(struct trace *trace, 
struct perf_evlist *evlist)
 
err = __machine__synthesize_threads(trace->host, >tool, 
>opts.target,
evlist->threads, 
trace__tool_process, false,
-   trace->opts.proc_map_timeout);
+   trace->opts.proc_map_timeout, 1);
if (err)
symbol__exit();
 
diff --git a/tools/perf/tests/mmap-thread-lookup.c 
b/tools/perf/tests/mmap-thread-lookup.c
index f94a419..2a0068a 100644
--- a/tools/perf/tests/mmap-thread-lookup.c
+++ b/tools/perf/tests/mmap-thread-lookup.c
@@ -131,7 +131,7 @@ static int synth_all(struct machine *machine)
 {
return perf_event__synthesize_threads(NULL,
  perf_event__process,
- machine, 0, 500);
+

[PATCH RFC V2 06/10] perf tools: lock to protect comm_str rb tree

2017-09-10 Thread kan . liang

From: Kan Liang 

Add comm_str_lock to protect comm_str rb tree.

Signed-off-by: Kan Liang 
---
 tools/perf/util/comm.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/comm.c b/tools/perf/util/comm.c
index 7bc981b..cb7f98c 100644
--- a/tools/perf/util/comm.c
+++ b/tools/perf/util/comm.c
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct comm_str {
char *str;
@@ -14,6 +15,7 @@ struct comm_str {
 
 /* Should perhaps be moved to struct machine */
 static struct rb_root comm_str_root;
+static pthread_mutex_t comm_str_lock = PTHREAD_MUTEX_INITIALIZER;
 
 static struct comm_str *comm_str__get(struct comm_str *cs)
 {
@@ -25,7 +27,9 @@ static struct comm_str *comm_str__get(struct comm_str *cs)
 static void comm_str__put(struct comm_str *cs)
 {
if (cs && refcount_dec_and_test(>refcnt)) {
+   pthread_mutex_lock(_str_lock);
rb_erase(>rb_node, _str_root);
+   pthread_mutex_unlock(_str_lock);
zfree(>str);
free(cs);
}
@@ -50,7 +54,8 @@ static struct comm_str *comm_str__alloc(const char *str)
return cs;
 }
 
-static struct comm_str *comm_str__findnew(const char *str, struct rb_root 
*root)
+static
+struct comm_str *__comm_str__findnew(const char *str, struct rb_root *root)
 {
struct rb_node **p = >rb_node;
struct rb_node *parent = NULL;
@@ -81,6 +86,17 @@ static struct comm_str *comm_str__findnew(const char *str, 
struct rb_root *root)
return new;
 }
 
+static struct comm_str *comm_str__findnew(const char *str, struct rb_root 
*root)
+{
+   struct comm_str *cs;
+
+   pthread_mutex_lock(_str_lock);
+   cs = __comm_str__findnew(str, root);
+   pthread_mutex_unlock(_str_lock);
+
+   return cs;
+}
+
 struct comm *comm__new(const char *str, u64 timestamp, bool exec)
 {
struct comm *comm = zalloc(sizeof(*comm));
-- 
2.5.5

[PATCH RFC V2 10/10] perf top: switch back to overwrite mode

2017-09-10 Thread kan . liang

From: Kan Liang 

perf_top__mmap_read has severe performance issue in
Knights Landing/Mill, when monitoring in heavy load system. It costs
several minutes to finish, which is unacceptable.

perf top was overwrite mode. But it is changed to non overwrite mode
since commit 93fc64f14472 ("perf top: Switch to non overwrite mode").
For non overwrite mode, it tries to read everything in the ring buffer
and does not check the messup. Once there are lots of samples delivered
shortly, the processing time could be very long.
Knights Landing/Mill as a manycore processor contains a large number of
small cores. Because of the huge core number, it will generated lots of
samples in a heavy load system. Also, since the huge sample#, the mmap
writer probably bite the tail and mess up the samples.

Switching to overwrite mode, which dropping the unsure mmap entries,
significantly speeds up the whole progress.
Considering the real time requirement for perf top, it should switch
back to overwrite mode.

Only warning once if the messup is detected.
Providing some hints to users.

Signed-off-by: Kan Liang 
---
 tools/perf/builtin-top.c | 2 +-
 tools/perf/util/evlist.c | 5 -
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 5f59aa7..8124b8f 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -902,7 +902,7 @@ static int perf_top__start_counters(struct perf_top *top)
}
}
 
-   if (perf_evlist__mmap(evlist, opts->mmap_pages, false) < 0) {
+   if (perf_evlist__mmap(evlist, opts->mmap_pages, true) < 0) {
ui__error("Failed to mmap with %d (%s)\n",
errno, str_error_r(errno, msg, sizeof(msg)));
goto out_err;
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 6a0d7ff..ef0b6b1 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -723,7 +723,10 @@ perf_mmap__read(struct perf_mmap *md, bool check_messup, 
u64 start,
 * In either case, truncate and restart at 'end'.
 */
if (diff > md->mask / 2 || diff < 0) {
-   fprintf(stderr, "WARNING: failed to keep up with mmap 
data.\n");
+   WARN_ONCE(1, "WARNING: failed to keep up with mmap 
data.\n"
+"Please try increasing the period (-c) 
or\n"
+"decreasing the freq (-F) or\n"
+"limiting the number of CPUs");
 
/*
 * 'end' points to a known good entry, start there.
-- 
2.5.5

[PATCH RFC V2 06/10] perf tools: lock to protect comm_str rb tree

2017-09-10 Thread kan . liang

From: Kan Liang 

Add comm_str_lock to protect comm_str rb tree.

Signed-off-by: Kan Liang 
---
 tools/perf/util/comm.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/comm.c b/tools/perf/util/comm.c
index 7bc981b..cb7f98c 100644
--- a/tools/perf/util/comm.c
+++ b/tools/perf/util/comm.c
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct comm_str {
char *str;
@@ -14,6 +15,7 @@ struct comm_str {
 
 /* Should perhaps be moved to struct machine */
 static struct rb_root comm_str_root;
+static pthread_mutex_t comm_str_lock = PTHREAD_MUTEX_INITIALIZER;
 
 static struct comm_str *comm_str__get(struct comm_str *cs)
 {
@@ -25,7 +27,9 @@ static struct comm_str *comm_str__get(struct comm_str *cs)
 static void comm_str__put(struct comm_str *cs)
 {
if (cs && refcount_dec_and_test(>refcnt)) {
+   pthread_mutex_lock(_str_lock);
rb_erase(>rb_node, _str_root);
+   pthread_mutex_unlock(_str_lock);
zfree(>str);
free(cs);
}
@@ -50,7 +54,8 @@ static struct comm_str *comm_str__alloc(const char *str)
return cs;
 }
 
-static struct comm_str *comm_str__findnew(const char *str, struct rb_root 
*root)
+static
+struct comm_str *__comm_str__findnew(const char *str, struct rb_root *root)
 {
struct rb_node **p = >rb_node;
struct rb_node *parent = NULL;
@@ -81,6 +86,17 @@ static struct comm_str *comm_str__findnew(const char *str, 
struct rb_root *root)
return new;
 }
 
+static struct comm_str *comm_str__findnew(const char *str, struct rb_root 
*root)
+{
+   struct comm_str *cs;
+
+   pthread_mutex_lock(_str_lock);
+   cs = __comm_str__findnew(str, root);
+   pthread_mutex_unlock(_str_lock);
+
+   return cs;
+}
+
 struct comm *comm__new(const char *str, u64 timestamp, bool exec)
 {
struct comm *comm = zalloc(sizeof(*comm));
-- 
2.5.5

[PATCH RFC V2 10/10] perf top: switch back to overwrite mode

2017-09-10 Thread kan . liang

From: Kan Liang 

perf_top__mmap_read has severe performance issue in
Knights Landing/Mill, when monitoring in heavy load system. It costs
several minutes to finish, which is unacceptable.

perf top was overwrite mode. But it is changed to non overwrite mode
since commit 93fc64f14472 ("perf top: Switch to non overwrite mode").
For non overwrite mode, it tries to read everything in the ring buffer
and does not check the messup. Once there are lots of samples delivered
shortly, the processing time could be very long.
Knights Landing/Mill as a manycore processor contains a large number of
small cores. Because of the huge core number, it will generated lots of
samples in a heavy load system. Also, since the huge sample#, the mmap
writer probably bite the tail and mess up the samples.

Switching to overwrite mode, which dropping the unsure mmap entries,
significantly speeds up the whole progress.
Considering the real time requirement for perf top, it should switch
back to overwrite mode.

Only warning once if the messup is detected.
Providing some hints to users.

Signed-off-by: Kan Liang 
---
 tools/perf/builtin-top.c | 2 +-
 tools/perf/util/evlist.c | 5 -
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 5f59aa7..8124b8f 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -902,7 +902,7 @@ static int perf_top__start_counters(struct perf_top *top)
}
}
 
-   if (perf_evlist__mmap(evlist, opts->mmap_pages, false) < 0) {
+   if (perf_evlist__mmap(evlist, opts->mmap_pages, true) < 0) {
ui__error("Failed to mmap with %d (%s)\n",
errno, str_error_r(errno, msg, sizeof(msg)));
goto out_err;
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 6a0d7ff..ef0b6b1 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -723,7 +723,10 @@ perf_mmap__read(struct perf_mmap *md, bool check_messup, 
u64 start,
 * In either case, truncate and restart at 'end'.
 */
if (diff > md->mask / 2 || diff < 0) {
-   fprintf(stderr, "WARNING: failed to keep up with mmap 
data.\n");
+   WARN_ONCE(1, "WARNING: failed to keep up with mmap 
data.\n"
+"Please try increasing the period (-c) 
or\n"
+"decreasing the freq (-F) or\n"
+"limiting the number of CPUs");
 
/*
 * 'end' points to a known good entry, start there.
-- 
2.5.5

[PATCH RFC V2 01/10] perf tools: hashtable for machine threads

2017-09-10 Thread kan . liang

From: Kan Liang 

To process any events, it needs to find the thread in the machine first.
The machine maintains a rb tree to store all threads. The rb tree is
protected by a rw lock.
It is not a problem for current perf which serially processing events.
However, it will have scalability performance issue to process events in
parallel, especially on a heavy load system which have many threads.

Introduce a hashtable to divide the big rb tree into many samll rb tree
for threads. The index is thread id % hashtable size. It can reduce the
lock contention.

Signed-off-by: Kan Liang 
---
 tools/perf/builtin-trace.c  |  19 +++---
 tools/perf/util/machine.c   | 139 
 tools/perf/util/machine.h   |  23 ++--
 tools/perf/util/rb_resort.h |   5 +-
 4 files changed, 120 insertions(+), 66 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 771ddab..ee8c6e8 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2730,20 +2730,23 @@ DEFINE_RESORT_RB(threads, 
(thread__nr_events(a->thread->priv) < thread__nr_event
 
 static size_t trace__fprintf_thread_summary(struct trace *trace, FILE *fp)
 {
-   DECLARE_RESORT_RB_MACHINE_THREADS(threads, trace->host);
size_t printed = trace__fprintf_threads_header(fp);
struct rb_node *nd;
+   int i;
 
-   if (threads == NULL) {
-   fprintf(fp, "%s", "Error sorting output by nr_events!\n");
-   return 0;
-   }
+   for (i = 0; i < THREADS__TABLE_SIZE; i++) {
+   DECLARE_RESORT_RB_MACHINE_THREADS(threads, trace->host, i);
 
-   resort_rb__for_each_entry(nd, threads)
-   printed += trace__fprintf_thread(fp, threads_entry->thread, 
trace);
+   if (threads == NULL) {
+   fprintf(fp, "%s", "Error sorting output by 
nr_events!\n");
+   return 0;
+   }
 
-   resort_rb__delete(threads);
+   resort_rb__for_each_entry(nd, threads)
+   printed += trace__fprintf_thread(fp, 
threads_entry->thread, trace);
 
+   resort_rb__delete(threads);
+   }
return printed;
 }
 
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index df70936..2f1ad29 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -33,6 +33,21 @@ static void dsos__init(struct dsos *dsos)
pthread_rwlock_init(>lock, NULL);
 }
 
+static void machine__thread_init(struct machine *machine)
+{
+   struct threads *thread;
+   int i;
+
+   for (i = 0; i < THREADS__TABLE_SIZE; i++) {
+   thread = >threads[i];
+   thread->entries = RB_ROOT;
+   pthread_rwlock_init(>lock, NULL);
+   thread->nr = 0;
+   INIT_LIST_HEAD(>dead);
+   thread->last_match = NULL;
+   }
+}
+
 int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
 {
memset(machine, 0, sizeof(*machine));
@@ -40,11 +55,7 @@ int machine__init(struct machine *machine, const char 
*root_dir, pid_t pid)
RB_CLEAR_NODE(>rb_node);
dsos__init(>dsos);
 
-   machine->threads = RB_ROOT;
-   pthread_rwlock_init(>threads_lock, NULL);
-   machine->nr_threads = 0;
-   INIT_LIST_HEAD(>dead_threads);
-   machine->last_match = NULL;
+   machine__thread_init(machine);
 
machine->vdso_info = NULL;
machine->env = NULL;
@@ -140,28 +151,39 @@ static void dsos__exit(struct dsos *dsos)
 
 void machine__delete_threads(struct machine *machine)
 {
+   struct threads *thread;
struct rb_node *nd;
+   int i;
 
-   pthread_rwlock_wrlock(>threads_lock);
-   nd = rb_first(>threads);
-   while (nd) {
-   struct thread *t = rb_entry(nd, struct thread, rb_node);
+   for (i = 0; i < THREADS__TABLE_SIZE; i++) {
+   thread = >threads[i];
+   pthread_rwlock_wrlock(>lock);
+   nd = rb_first(>entries);
+   while (nd) {
+   struct thread *t = rb_entry(nd, struct thread, rb_node);
 
-   nd = rb_next(nd);
-   __machine__remove_thread(machine, t, false);
+   nd = rb_next(nd);
+   __machine__remove_thread(machine, t, false);
+   }
+   pthread_rwlock_unlock(>lock);
}
-   pthread_rwlock_unlock(>threads_lock);
 }
 
 void machine__exit(struct machine *machine)
 {
+   struct threads *thread;
+   int i;
+
machine__destroy_kernel_maps(machine);
map_groups__exit(>kmaps);
dsos__exit(>dsos);
machine__exit_vdso(machine);
zfree(>root_dir);
zfree(>current_tid);
-   pthread_rwlock_destroy(>threads_lock);
+   for (i = 0; i < THREADS__TABLE_SIZE; i++) {
+   thread = >threads[i];
+

[PATCH RFC V2 01/10] perf tools: hashtable for machine threads

2017-09-10 Thread kan . liang

From: Kan Liang 

To process any events, it needs to find the thread in the machine first.
The machine maintains a rb tree to store all threads. The rb tree is
protected by a rw lock.
It is not a problem for current perf which serially processing events.
However, it will have scalability performance issue to process events in
parallel, especially on a heavy load system which have many threads.

Introduce a hashtable to divide the big rb tree into many samll rb tree
for threads. The index is thread id % hashtable size. It can reduce the
lock contention.

Signed-off-by: Kan Liang 
---
 tools/perf/builtin-trace.c  |  19 +++---
 tools/perf/util/machine.c   | 139 
 tools/perf/util/machine.h   |  23 ++--
 tools/perf/util/rb_resort.h |   5 +-
 4 files changed, 120 insertions(+), 66 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 771ddab..ee8c6e8 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2730,20 +2730,23 @@ DEFINE_RESORT_RB(threads, 
(thread__nr_events(a->thread->priv) < thread__nr_event
 
 static size_t trace__fprintf_thread_summary(struct trace *trace, FILE *fp)
 {
-   DECLARE_RESORT_RB_MACHINE_THREADS(threads, trace->host);
size_t printed = trace__fprintf_threads_header(fp);
struct rb_node *nd;
+   int i;
 
-   if (threads == NULL) {
-   fprintf(fp, "%s", "Error sorting output by nr_events!\n");
-   return 0;
-   }
+   for (i = 0; i < THREADS__TABLE_SIZE; i++) {
+   DECLARE_RESORT_RB_MACHINE_THREADS(threads, trace->host, i);
 
-   resort_rb__for_each_entry(nd, threads)
-   printed += trace__fprintf_thread(fp, threads_entry->thread, 
trace);
+   if (threads == NULL) {
+   fprintf(fp, "%s", "Error sorting output by 
nr_events!\n");
+   return 0;
+   }
 
-   resort_rb__delete(threads);
+   resort_rb__for_each_entry(nd, threads)
+   printed += trace__fprintf_thread(fp, 
threads_entry->thread, trace);
 
+   resort_rb__delete(threads);
+   }
return printed;
 }
 
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index df70936..2f1ad29 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -33,6 +33,21 @@ static void dsos__init(struct dsos *dsos)
pthread_rwlock_init(>lock, NULL);
 }
 
+static void machine__thread_init(struct machine *machine)
+{
+   struct threads *thread;
+   int i;
+
+   for (i = 0; i < THREADS__TABLE_SIZE; i++) {
+   thread = >threads[i];
+   thread->entries = RB_ROOT;
+   pthread_rwlock_init(>lock, NULL);
+   thread->nr = 0;
+   INIT_LIST_HEAD(>dead);
+   thread->last_match = NULL;
+   }
+}
+
 int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
 {
memset(machine, 0, sizeof(*machine));
@@ -40,11 +55,7 @@ int machine__init(struct machine *machine, const char 
*root_dir, pid_t pid)
RB_CLEAR_NODE(>rb_node);
dsos__init(>dsos);
 
-   machine->threads = RB_ROOT;
-   pthread_rwlock_init(>threads_lock, NULL);
-   machine->nr_threads = 0;
-   INIT_LIST_HEAD(>dead_threads);
-   machine->last_match = NULL;
+   machine__thread_init(machine);
 
machine->vdso_info = NULL;
machine->env = NULL;
@@ -140,28 +151,39 @@ static void dsos__exit(struct dsos *dsos)
 
 void machine__delete_threads(struct machine *machine)
 {
+   struct threads *thread;
struct rb_node *nd;
+   int i;
 
-   pthread_rwlock_wrlock(>threads_lock);
-   nd = rb_first(>threads);
-   while (nd) {
-   struct thread *t = rb_entry(nd, struct thread, rb_node);
+   for (i = 0; i < THREADS__TABLE_SIZE; i++) {
+   thread = >threads[i];
+   pthread_rwlock_wrlock(>lock);
+   nd = rb_first(>entries);
+   while (nd) {
+   struct thread *t = rb_entry(nd, struct thread, rb_node);
 
-   nd = rb_next(nd);
-   __machine__remove_thread(machine, t, false);
+   nd = rb_next(nd);
+   __machine__remove_thread(machine, t, false);
+   }
+   pthread_rwlock_unlock(>lock);
}
-   pthread_rwlock_unlock(>threads_lock);
 }
 
 void machine__exit(struct machine *machine)
 {
+   struct threads *thread;
+   int i;
+
machine__destroy_kernel_maps(machine);
map_groups__exit(>kmaps);
dsos__exit(>dsos);
machine__exit_vdso(machine);
zfree(>root_dir);
zfree(>current_tid);
-   pthread_rwlock_destroy(>threads_lock);
+   for (i = 0; i < THREADS__TABLE_SIZE; i++) {
+   thread = >threads[i];
+   pthread_rwlock_destroy(>lock);
+   }
 }
 
 void

[PATCH RFC V2 09/10] perf top: add option to set the number of thread for event synthesize

2017-09-10 Thread kan . liang

From: Kan Liang 

Using UINT_MAX to indicate the default thread#, which is the max number
of online CPU.

Signed-off-by: Kan Liang 
---
 tools/perf/builtin-top.c | 5 -
 tools/perf/util/event.c  | 5 -
 tools/perf/util/top.h| 1 +
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 4b8fdc1..5f59aa7 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -961,7 +961,7 @@ static int __cmd_top(struct perf_top *top)
machine__synthesize_threads(>session->machines.host, >target,
top->evlist->threads, false,
opts->proc_map_timeout,
-   (unsigned 
int)sysconf(_SC_NPROCESSORS_ONLN));
+   top->nr_threads_synthesize);
 
if (perf_hpp_list.socket) {
ret = perf_env__read_cpu_topology_map(_env);
@@ -1114,6 +1114,7 @@ int cmd_top(int argc, const char **argv)
},
.max_stack   = sysctl_perf_event_max_stack,
.sym_pcnt_filter = 5,
+   .nr_threads_synthesize = UINT_MAX,
};
struct record_opts *opts = _opts;
struct target *target = >target;
@@ -1223,6 +1224,8 @@ int cmd_top(int argc, const char **argv)
OPT_BOOLEAN(0, "hierarchy", _conf.report_hierarchy,
"Show entries in a hierarchy"),
OPT_BOOLEAN(0, "force", _conf.force, "don't complain, do it"),
+   OPT_UINTEGER(0, "num-thread-synthesize", _threads_synthesize,
+   "number of thread to run event synthesize"),
OPT_END()
};
const char * const top_usage[] = {
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 8c4e072..ecef279 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -778,7 +778,10 @@ int perf_event__synthesize_threads(struct perf_tool *tool,
if (n < 0)
return err;
 
-   thread_nr = nr_threads_synthesize;
+   if (nr_threads_synthesize == UINT_MAX)
+   thread_nr = sysconf(_SC_NPROCESSORS_ONLN);
+   else
+   thread_nr = nr_threads_synthesize;
if (thread_nr <= 0)
thread_nr = 1;
if (thread_nr > n)
diff --git a/tools/perf/util/top.h b/tools/perf/util/top.h
index 9bdfb78..f4296e1 100644
--- a/tools/perf/util/top.h
+++ b/tools/perf/util/top.h
@@ -37,6 +37,7 @@ struct perf_top {
intsym_pcnt_filter;
const char *sym_filter;
float  min_percent;
+   unsigned int   nr_threads_synthesize;
 };
 
 #define CONSOLE_CLEAR "[H[2J"
-- 
2.5.5

[PATCH RFC V2 09/10] perf top: add option to set the number of thread for event synthesize

2017-09-10 Thread kan . liang

From: Kan Liang 

Using UINT_MAX to indicate the default thread#, which is the max number
of online CPU.

Signed-off-by: Kan Liang 
---
 tools/perf/builtin-top.c | 5 -
 tools/perf/util/event.c  | 5 -
 tools/perf/util/top.h| 1 +
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 4b8fdc1..5f59aa7 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -961,7 +961,7 @@ static int __cmd_top(struct perf_top *top)
machine__synthesize_threads(>session->machines.host, >target,
top->evlist->threads, false,
opts->proc_map_timeout,
-   (unsigned 
int)sysconf(_SC_NPROCESSORS_ONLN));
+   top->nr_threads_synthesize);
 
if (perf_hpp_list.socket) {
ret = perf_env__read_cpu_topology_map(_env);
@@ -1114,6 +1114,7 @@ int cmd_top(int argc, const char **argv)
},
.max_stack   = sysctl_perf_event_max_stack,
.sym_pcnt_filter = 5,
+   .nr_threads_synthesize = UINT_MAX,
};
struct record_opts *opts = _opts;
struct target *target = >target;
@@ -1223,6 +1224,8 @@ int cmd_top(int argc, const char **argv)
OPT_BOOLEAN(0, "hierarchy", _conf.report_hierarchy,
"Show entries in a hierarchy"),
OPT_BOOLEAN(0, "force", _conf.force, "don't complain, do it"),
+   OPT_UINTEGER(0, "num-thread-synthesize", _threads_synthesize,
+   "number of thread to run event synthesize"),
OPT_END()
};
const char * const top_usage[] = {
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 8c4e072..ecef279 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -778,7 +778,10 @@ int perf_event__synthesize_threads(struct perf_tool *tool,
if (n < 0)
return err;
 
-   thread_nr = nr_threads_synthesize;
+   if (nr_threads_synthesize == UINT_MAX)
+   thread_nr = sysconf(_SC_NPROCESSORS_ONLN);
+   else
+   thread_nr = nr_threads_synthesize;
if (thread_nr <= 0)
thread_nr = 1;
if (thread_nr > n)
diff --git a/tools/perf/util/top.h b/tools/perf/util/top.h
index 9bdfb78..f4296e1 100644
--- a/tools/perf/util/top.h
+++ b/tools/perf/util/top.h
@@ -37,6 +37,7 @@ struct perf_top {
intsym_pcnt_filter;
const char *sym_filter;
float  min_percent;
+   unsigned int   nr_threads_synthesize;
 };
 
 #define CONSOLE_CLEAR "[H[2J"
-- 
2.5.5

[PATCH RFC V2 02/10] perf tools: using scandir to replace readdir

2017-09-10 Thread kan . liang

From: Kan Liang 

For perf_event__synthesize_threads, perf goes through all proc files
serially by readdir.
scandir did a snapshoot of proc, which is multithreading friendly.

It's possible that some threads which are added during event synthesize.
But the number of lost threads should be small.
They should not impact the final analysis.

Signed-off-by: Kan Liang 
---
 tools/perf/util/event.c | 46 ++
 1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 1c905ba..c31f678 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -683,12 +683,14 @@ int perf_event__synthesize_threads(struct perf_tool *tool,
   bool mmap_data,
   unsigned int proc_map_timeout)
 {
-   DIR *proc;
-   char proc_path[PATH_MAX];
-   struct dirent *dirent;
union perf_event *comm_event, *mmap_event, *fork_event;
union perf_event *namespaces_event;
+   char proc_path[PATH_MAX];
+   struct dirent **dirent;
int err = -1;
+   char *end;
+   pid_t pid;
+   int n, i;
 
if (machine__is_default_guest(machine))
return 0;
@@ -712,29 +714,33 @@ int perf_event__synthesize_threads(struct perf_tool *tool,
goto out_free_fork;
 
snprintf(proc_path, sizeof(proc_path), "%s/proc", machine->root_dir);
-   proc = opendir(proc_path);
+   n = scandir(proc_path, , 0, alphasort);
 
-   if (proc == NULL)
+   if (n < 0)
goto out_free_namespaces;
 
-   while ((dirent = readdir(proc)) != NULL) {
-   char *end;
-   pid_t pid = strtol(dirent->d_name, , 10);
-
-   if (*end) /* only interested in proper numerical dirents */
+   for (i = 0; i < n; i++) {
+   if (!isdigit(dirent[i]->d_name[0]))
continue;
-   /*
-* We may race with exiting thread, so don't stop just because
-* one thread couldn't be synthesized.
-*/
-   __event__synthesize_thread(comm_event, mmap_event, fork_event,
-  namespaces_event, pid, 1, process,
-  tool, machine, mmap_data,
-  proc_map_timeout);
-   }
 
+   pid = (pid_t)strtol(dirent[i]->d_name, , 10);
+   /* only interested in proper numerical dirents */
+   if (!*end) {
+   /*
+* We may race with exiting thread, so don't stop
+* just because one thread couldn't be synthesized.
+*/
+   __event__synthesize_thread(comm_event, mmap_event,
+  fork_event, namespaces_event,
+  pid, 1, process, tool,
+  machine, mmap_data,
+  proc_map_timeout);
+   }
+   free(dirent[i]);
+   }
+   free(dirent);
err = 0;
-   closedir(proc);
+
 out_free_namespaces:
free(namespaces_event);
 out_free_fork:
-- 
2.5.5

[PATCH RFC V2 02/10] perf tools: using scandir to replace readdir

2017-09-10 Thread kan . liang

From: Kan Liang 

For perf_event__synthesize_threads, perf goes through all proc files
serially by readdir.
scandir did a snapshoot of proc, which is multithreading friendly.

It's possible that some threads which are added during event synthesize.
But the number of lost threads should be small.
They should not impact the final analysis.

Signed-off-by: Kan Liang 
---
 tools/perf/util/event.c | 46 ++
 1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 1c905ba..c31f678 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -683,12 +683,14 @@ int perf_event__synthesize_threads(struct perf_tool *tool,
   bool mmap_data,
   unsigned int proc_map_timeout)
 {
-   DIR *proc;
-   char proc_path[PATH_MAX];
-   struct dirent *dirent;
union perf_event *comm_event, *mmap_event, *fork_event;
union perf_event *namespaces_event;
+   char proc_path[PATH_MAX];
+   struct dirent **dirent;
int err = -1;
+   char *end;
+   pid_t pid;
+   int n, i;
 
if (machine__is_default_guest(machine))
return 0;
@@ -712,29 +714,33 @@ int perf_event__synthesize_threads(struct perf_tool *tool,
goto out_free_fork;
 
snprintf(proc_path, sizeof(proc_path), "%s/proc", machine->root_dir);
-   proc = opendir(proc_path);
+   n = scandir(proc_path, , 0, alphasort);
 
-   if (proc == NULL)
+   if (n < 0)
goto out_free_namespaces;
 
-   while ((dirent = readdir(proc)) != NULL) {
-   char *end;
-   pid_t pid = strtol(dirent->d_name, , 10);
-
-   if (*end) /* only interested in proper numerical dirents */
+   for (i = 0; i < n; i++) {
+   if (!isdigit(dirent[i]->d_name[0]))
continue;
-   /*
-* We may race with exiting thread, so don't stop just because
-* one thread couldn't be synthesized.
-*/
-   __event__synthesize_thread(comm_event, mmap_event, fork_event,
-  namespaces_event, pid, 1, process,
-  tool, machine, mmap_data,
-  proc_map_timeout);
-   }
 
+   pid = (pid_t)strtol(dirent[i]->d_name, , 10);
+   /* only interested in proper numerical dirents */
+   if (!*end) {
+   /*
+* We may race with exiting thread, so don't stop
+* just because one thread couldn't be synthesized.
+*/
+   __event__synthesize_thread(comm_event, mmap_event,
+  fork_event, namespaces_event,
+  pid, 1, process, tool,
+  machine, mmap_data,
+  proc_map_timeout);
+   }
+   free(dirent[i]);
+   }
+   free(dirent);
err = 0;
-   closedir(proc);
+
 out_free_namespaces:
free(namespaces_event);
 out_free_fork:
-- 
2.5.5

[PATCH RFC V2 07/10] perf tools: change machine comm_exec type to atomic

2017-09-10 Thread kan . liang

From: Kan Liang 

In case there are two or more threads want to change it.

Signed-off-by: Kan Liang 
---
 tools/perf/util/machine.c | 11 ++-
 tools/perf/util/machine.h |  2 +-
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 2f1ad29..bbfb9e0 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -64,8 +64,8 @@ int machine__init(struct machine *machine, const char 
*root_dir, pid_t pid)
 
machine->id_hdr_size = 0;
machine->kptr_restrict_warned = false;
-   machine->comm_exec = false;
machine->kernel_start = 0;
+   atomic_set(>comm_exec, 0);
 
memset(machine->vmlinux_maps, 0, sizeof(machine->vmlinux_maps));
 
@@ -238,14 +238,15 @@ struct machine *machines__add(struct machines *machines, 
pid_t pid,
 
 void machines__set_comm_exec(struct machines *machines, bool comm_exec)
 {
+   int exec = comm_exec ? 1 : 0;
struct rb_node *nd;
 
-   machines->host.comm_exec = comm_exec;
+   atomic_set(>host.comm_exec, exec);
 
for (nd = rb_first(>guests); nd; nd = rb_next(nd)) {
struct machine *machine = rb_entry(nd, struct machine, rb_node);
 
-   machine->comm_exec = comm_exec;
+   atomic_set(>comm_exec, exec);
}
 }
 
@@ -505,7 +506,7 @@ struct thread *machine__find_thread(struct machine 
*machine, pid_t pid,
 struct comm *machine__thread_exec_comm(struct machine *machine,
   struct thread *thread)
 {
-   if (machine->comm_exec)
+   if (atomic_read(>comm_exec))
return thread__exec_comm(thread);
else
return thread__comm(thread);
@@ -521,7 +522,7 @@ int machine__process_comm_event(struct machine *machine, 
union perf_event *event
int err = 0;
 
if (exec)
-   machine->comm_exec = true;
+   atomic_set(>comm_exec, 1);
 
if (dump_trace)
perf_event__fprintf_comm(event, stdout);
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 64663eb..fb3c2a2 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -38,7 +38,7 @@ struct machine {
struct rb_noderb_node;
pid_t pid;
u16   id_hdr_size;
-   bool  comm_exec;
+   atomic_t  comm_exec;
bool  kptr_restrict_warned;
char  *root_dir;
struct threadsthreads[THREADS__TABLE_SIZE];
-- 
2.5.5

[PATCH RFC V2 07/10] perf tools: change machine comm_exec type to atomic

2017-09-10 Thread kan . liang

From: Kan Liang 

In case there are two or more threads want to change it.

Signed-off-by: Kan Liang 
---
 tools/perf/util/machine.c | 11 ++-
 tools/perf/util/machine.h |  2 +-
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 2f1ad29..bbfb9e0 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -64,8 +64,8 @@ int machine__init(struct machine *machine, const char 
*root_dir, pid_t pid)
 
machine->id_hdr_size = 0;
machine->kptr_restrict_warned = false;
-   machine->comm_exec = false;
machine->kernel_start = 0;
+   atomic_set(>comm_exec, 0);
 
memset(machine->vmlinux_maps, 0, sizeof(machine->vmlinux_maps));
 
@@ -238,14 +238,15 @@ struct machine *machines__add(struct machines *machines, 
pid_t pid,
 
 void machines__set_comm_exec(struct machines *machines, bool comm_exec)
 {
+   int exec = comm_exec ? 1 : 0;
struct rb_node *nd;
 
-   machines->host.comm_exec = comm_exec;
+   atomic_set(>host.comm_exec, exec);
 
for (nd = rb_first(>guests); nd; nd = rb_next(nd)) {
struct machine *machine = rb_entry(nd, struct machine, rb_node);
 
-   machine->comm_exec = comm_exec;
+   atomic_set(>comm_exec, exec);
}
 }
 
@@ -505,7 +506,7 @@ struct thread *machine__find_thread(struct machine 
*machine, pid_t pid,
 struct comm *machine__thread_exec_comm(struct machine *machine,
   struct thread *thread)
 {
-   if (machine->comm_exec)
+   if (atomic_read(>comm_exec))
return thread__exec_comm(thread);
else
return thread__comm(thread);
@@ -521,7 +522,7 @@ int machine__process_comm_event(struct machine *machine, 
union perf_event *event
int err = 0;
 
if (exec)
-   machine->comm_exec = true;
+   atomic_set(>comm_exec, 1);
 
if (dump_trace)
perf_event__fprintf_comm(event, stdout);
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 64663eb..fb3c2a2 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -38,7 +38,7 @@ struct machine {
struct rb_noderb_node;
pid_t pid;
u16   id_hdr_size;
-   bool  comm_exec;
+   atomic_t  comm_exec;
bool  kptr_restrict_warned;
char  *root_dir;
struct threadsthreads[THREADS__TABLE_SIZE];
-- 
2.5.5

[PATCH 2/4] dt-bindings: timer: Add Actions Semi S700

2017-09-10 Thread Andreas Färber

Define a compatible string for the Actions Semi S700 SoC timer.

Signed-off-by: Andreas Färber 
---
 Documentation/devicetree/bindings/timer/actions,owl-timer.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/timer/actions,owl-timer.txt 
b/Documentation/devicetree/bindings/timer/actions,owl-timer.txt
index e3c28da80cb2..977054f87563 100644
--- a/Documentation/devicetree/bindings/timer/actions,owl-timer.txt
+++ b/Documentation/devicetree/bindings/timer/actions,owl-timer.txt
@@ -2,6 +2,7 @@ Actions Semi Owl Timer
 
 Required properties:
 - compatible  :  "actions,s500-timer" for S500
+ "actions,s700-timer" for S700
  "actions,s900-timer" for S900
 - reg :  Offset and length of the register set for the device.
 - interrupts  :  Should contain the interrupts.
-- 
2.13.5

[PATCH 2/4] dt-bindings: timer: Add Actions Semi S700

2017-09-10 Thread Andreas Färber

Define a compatible string for the Actions Semi S700 SoC timer.

Signed-off-by: Andreas Färber 
---
 Documentation/devicetree/bindings/timer/actions,owl-timer.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/timer/actions,owl-timer.txt 
b/Documentation/devicetree/bindings/timer/actions,owl-timer.txt
index e3c28da80cb2..977054f87563 100644
--- a/Documentation/devicetree/bindings/timer/actions,owl-timer.txt
+++ b/Documentation/devicetree/bindings/timer/actions,owl-timer.txt
@@ -2,6 +2,7 @@ Actions Semi Owl Timer
 
 Required properties:
 - compatible  :  "actions,s500-timer" for S500
+ "actions,s700-timer" for S700
  "actions,s900-timer" for S900
 - reg :  Offset and length of the register set for the device.
 - interrupts  :  Should contain the interrupts.
-- 
2.13.5

[PATCH 0/4] arm64: Add initial Actions Semi S700 and CubieBoard7 support

2017-09-10 Thread Andreas Färber

Hello,

This series prepares the Actions Semi S700 SoC and Cubietech CubieBoard7.
It boots equally far as the S900 based Bubblegum-96 these patches are based on,
i.e. into an initrd with serial and all four cores up.

I have not yet managed to download the CubieBoard7 sources from Baidu, and the
Mega download has not yet been fixed, so this is based on the S700 datasheet
and /proc/device-tree/ from the preinstalled Android.

Lacking sources and instructions to replace the bootloader on eMMC or to try
booting from SD, the only working way I've found to boot mainline kernels
appears to be booting into Android, then issuing "reboot bootloader". U-Boot
is lacking the saveenv command, so it wasn't possible to change the bootdelay
to facilitate this.

Cf. https://en.opensuse.org/HCL:CubieBoard7

Work branch:
https://github.com/afaerber/linux/commits/bg96-next

Have a lot of fun!

Cheers,
Andreas

Cc: Ahha Lee 
Cc: supp...@cubietech.com

Cc: devicet...@vger.kernel.org
Cc: Daniel Lezcano 
Cc: Thomas Gleixner 

Andreas Färber (4):
  dt-bindings: arm: actions: Add S700 and CubieBoard7
  dt-bindings: timer: Add Actions Semi S700
  clocksource: owl: Prepare S700
  arm64: dts: actions: Add S700 and CubieBoard7

 Documentation/devicetree/bindings/arm/actions.txt  |  15 ++
 .../bindings/timer/actions,owl-timer.txt   |   1 +
 arch/arm64/boot/dts/actions/Makefile   |   2 +
 arch/arm64/boot/dts/actions/s700-cubieboard7.dts   |  47 ++
 arch/arm64/boot/dts/actions/s700.dtsi  | 164 +
 drivers/clocksource/owl-timer.c|   1 +
 6 files changed, 230 insertions(+)
 create mode 100644 arch/arm64/boot/dts/actions/s700-cubieboard7.dts
 create mode 100644 arch/arm64/boot/dts/actions/s700.dtsi

-- 
2.13.5

[PATCH 0/4] arm64: Add initial Actions Semi S700 and CubieBoard7 support

2017-09-10 Thread Andreas Färber

Hello,

This series prepares the Actions Semi S700 SoC and Cubietech CubieBoard7.
It boots equally far as the S900 based Bubblegum-96 these patches are based on,
i.e. into an initrd with serial and all four cores up.

I have not yet managed to download the CubieBoard7 sources from Baidu, and the
Mega download has not yet been fixed, so this is based on the S700 datasheet
and /proc/device-tree/ from the preinstalled Android.

Lacking sources and instructions to replace the bootloader on eMMC or to try
booting from SD, the only working way I've found to boot mainline kernels
appears to be booting into Android, then issuing "reboot bootloader". U-Boot
is lacking the saveenv command, so it wasn't possible to change the bootdelay
to facilitate this.

Cf. https://en.opensuse.org/HCL:CubieBoard7

Work branch:
https://github.com/afaerber/linux/commits/bg96-next

Have a lot of fun!

Cheers,
Andreas

Cc: Ahha Lee 
Cc: supp...@cubietech.com

Cc: devicet...@vger.kernel.org
Cc: Daniel Lezcano 
Cc: Thomas Gleixner 

Andreas Färber (4):
  dt-bindings: arm: actions: Add S700 and CubieBoard7
  dt-bindings: timer: Add Actions Semi S700
  clocksource: owl: Prepare S700
  arm64: dts: actions: Add S700 and CubieBoard7

 Documentation/devicetree/bindings/arm/actions.txt  |  15 ++
 .../bindings/timer/actions,owl-timer.txt   |   1 +
 arch/arm64/boot/dts/actions/Makefile   |   2 +
 arch/arm64/boot/dts/actions/s700-cubieboard7.dts   |  47 ++
 arch/arm64/boot/dts/actions/s700.dtsi  | 164 +
 drivers/clocksource/owl-timer.c|   1 +
 6 files changed, 230 insertions(+)
 create mode 100644 arch/arm64/boot/dts/actions/s700-cubieboard7.dts
 create mode 100644 arch/arm64/boot/dts/actions/s700.dtsi

-- 
2.13.5

[PATCH 3/4] clocksource: owl: Prepare S700

2017-09-10 Thread Andreas Färber

S700 has two 2Hz timers like S500, and four TIMx timers like S900.

Signed-off-by: Andreas Färber 
---
 drivers/clocksource/owl-timer.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/clocksource/owl-timer.c b/drivers/clocksource/owl-timer.c
index d19c53c11094..d5f3f72c10ee 100644
--- a/drivers/clocksource/owl-timer.c
+++ b/drivers/clocksource/owl-timer.c
@@ -169,4 +169,5 @@ static int __init owl_timer_init(struct device_node *node)
return 0;
 }
 CLOCKSOURCE_OF_DECLARE(owl_s500, "actions,s500-timer", owl_timer_init);
+CLOCKSOURCE_OF_DECLARE(owl_s700, "actions,s700-timer", owl_timer_init);
 CLOCKSOURCE_OF_DECLARE(owl_s900, "actions,s900-timer", owl_timer_init);
-- 
2.13.5

[PATCH 3/4] clocksource: owl: Prepare S700

2017-09-10 Thread Andreas Färber

S700 has two 2Hz timers like S500, and four TIMx timers like S900.

Signed-off-by: Andreas Färber 
---
 drivers/clocksource/owl-timer.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/clocksource/owl-timer.c b/drivers/clocksource/owl-timer.c
index d19c53c11094..d5f3f72c10ee 100644
--- a/drivers/clocksource/owl-timer.c
+++ b/drivers/clocksource/owl-timer.c
@@ -169,4 +169,5 @@ static int __init owl_timer_init(struct device_node *node)
return 0;
 }
 CLOCKSOURCE_OF_DECLARE(owl_s500, "actions,s500-timer", owl_timer_init);
+CLOCKSOURCE_OF_DECLARE(owl_s700, "actions,s700-timer", owl_timer_init);
 CLOCKSOURCE_OF_DECLARE(owl_s900, "actions,s900-timer", owl_timer_init);
-- 
2.13.5

[PATCH 4/4] arm64: dts: actions: Add S700 and CubieBoard7

2017-09-10 Thread Andreas Färber

Add Device Trees for S700 SoC and Cubietech CubieBoard7.

Signed-off-by: Andreas Färber 
---
 arch/arm64/boot/dts/actions/Makefile |   2 +
 arch/arm64/boot/dts/actions/s700-cubieboard7.dts |  47 +++
 arch/arm64/boot/dts/actions/s700.dtsi| 164 +++
 3 files changed, 213 insertions(+)
 create mode 100644 arch/arm64/boot/dts/actions/s700-cubieboard7.dts
 create mode 100644 arch/arm64/boot/dts/actions/s700.dtsi

diff --git a/arch/arm64/boot/dts/actions/Makefile 
b/arch/arm64/boot/dts/actions/Makefile
index 62922d688ce3..cfd7f2679feb 100644
--- a/arch/arm64/boot/dts/actions/Makefile
+++ b/arch/arm64/boot/dts/actions/Makefile
@@ -1,3 +1,5 @@
+dtb-$(CONFIG_ARCH_ACTIONS) += s700-cubieboard7.dtb
+
 dtb-$(CONFIG_ARCH_ACTIONS) += s900-bubblegum-96.dtb
 
 always := $(dtb-y)
diff --git a/arch/arm64/boot/dts/actions/s700-cubieboard7.dts 
b/arch/arm64/boot/dts/actions/s700-cubieboard7.dts
new file mode 100644
index ..e562f04c2490
--- /dev/null
+++ b/arch/arm64/boot/dts/actions/s700-cubieboard7.dts
@@ -0,0 +1,47 @@
+/*
+ * Copyright (c) 2017 Andreas Färber
+ *
+ * SPDX-License-Identifier: (GPL-2.0+ OR MIT)
+ */
+
+/dts-v1/;
+
+#include "s700.dtsi"
+
+/ {
+   compatible = "cubietech,cubieboard7", "actions,s700";
+   model = "CubieBoard7";
+
+   aliases {
+   serial3 = 
+   };
+
+   chosen {
+   stdout-path = "serial3:115200n8";
+   };
+
+   memory@0 {
+   device_type = "memory";
+   reg = <0x0 0x0 0x0 0x8000>;
+   };
+
+   memory@1,e000 {
+   device_type = "memory";
+   reg = <0x1 0xe000 0x0 0x0>;
+   };
+
+   uart3_clk: uart3-clk {
+   compatible = "fixed-clock";
+   clock-frequency = <921600>;
+   #clock-cells = <0>;
+   };
+};
+
+ {
+   clocks = <>;
+};
+
+ {
+   status = "okay";
+   clocks = <_clk>;
+};
diff --git a/arch/arm64/boot/dts/actions/s700.dtsi 
b/arch/arm64/boot/dts/actions/s700.dtsi
new file mode 100644
index ..b54df405aaeb
--- /dev/null
+++ b/arch/arm64/boot/dts/actions/s700.dtsi
@@ -0,0 +1,164 @@
+/*
+ * Copyright (c) 2017 Andreas Färber
+ *
+ * SPDX-License-Identifier: (GPL-2.0+ OR MIT)
+ */
+
+#include 
+
+/ {
+   compatible = "actions,s700";
+   interrupt-parent = <>;
+   #address-cells = <2>;
+   #size-cells = <2>;
+
+   cpus {
+   #address-cells = <2>;
+   #size-cells = <0>;
+
+   cpu0: cpu@0 {
+   device_type = "cpu";
+   compatible = "arm,cortex-a53", "arm,armv8";
+   reg = <0x0 0x0>;
+   enable-method = "psci";
+   };
+
+   cpu1: cpu@1 {
+   device_type = "cpu";
+   compatible = "arm,cortex-a53", "arm,armv8";
+   reg = <0x0 0x1>;
+   enable-method = "psci";
+   };
+
+   cpu2: cpu@2 {
+   device_type = "cpu";
+   compatible = "arm,cortex-a53", "arm,armv8";
+   reg = <0x0 0x2>;
+   enable-method = "psci";
+   };
+
+   cpu3: cpu@3 {
+   device_type = "cpu";
+   compatible = "arm,cortex-a53", "arm,armv8";
+   reg = <0x0 0x3>;
+   enable-method = "psci";
+   };
+   };
+
+   reserved-memory {
+   #address-cells = <2>;
+   #size-cells = <2>;
+   ranges;
+
+   secmon@1f00 {
+   reg = <0x0 0x1f00 0x0 0x100>;
+   no-map;
+   };
+   };
+
+   psci {
+   compatible = "arm,psci-0.2";
+   method = "smc";
+   };
+
+   arm-pmu {
+   compatible = "arm,cortex-a53-pmu";
+   interrupts = ,
+,
+,
+;
+   interrupt-affinity = <>, <>, <>, <>;
+   };
+
+   timer {
+   compatible = "arm,armv8-timer";
+   interrupts = ,
+,
+,
+;
+   };
+
+   hosc: hosc {
+   compatible = "fixed-clock";
+   clock-frequency = <2400>;
+   #clock-cells = <0>;
+   };
+
+   soc {
+   compatible = "simple-bus";
+   #address-cells = <2>;
+   #size-cells = <2>;
+   ranges;
+
+   gic: interrupt-controller@e00f1000 {
+   compatible = "arm,gic-400";
+   reg = <0x0 0xe00f1000 0x0 0x1000>,
+ <0x0 0xe00f2000 0x0 0x2000>,
+ <0x0

[PATCH 1/4] dt-bindings: arm: actions: Add S700 and CubieBoard7

2017-09-10 Thread Andreas Färber

Document the Actions Semi S700 SoC and the Cubietech CubieBoard7.

Signed-off-by: Andreas Färber 
---
 Documentation/devicetree/bindings/arm/actions.txt | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/actions.txt 
b/Documentation/devicetree/bindings/arm/actions.txt
index ced764a8549e..544a8855fad5 100644
--- a/Documentation/devicetree/bindings/arm/actions.txt
+++ b/Documentation/devicetree/bindings/arm/actions.txt
@@ -25,6 +25,21 @@ Root node property compatible must contain, depending on 
board:
  - LeMaker Guitar Base Board rev. B: "lemaker,guitar-bb-rev-b", 
"lemaker,guitar"
 
 
+S700 SoC
+
+
+Required root node properties:
+
+- compatible :  must contain "actions,s700"
+
+
+Boards:
+
+Root node property compatible must contain, depending on board:
+
+ - Cubietech CubieBoard7: "cubietech,cubieboard7"
+
+
 S900 SoC
 
 
-- 
2.13.5

[PATCH 4/4] arm64: dts: actions: Add S700 and CubieBoard7

2017-09-10 Thread Andreas Färber

Add Device Trees for S700 SoC and Cubietech CubieBoard7.

Signed-off-by: Andreas Färber 
---
 arch/arm64/boot/dts/actions/Makefile |   2 +
 arch/arm64/boot/dts/actions/s700-cubieboard7.dts |  47 +++
 arch/arm64/boot/dts/actions/s700.dtsi| 164 +++
 3 files changed, 213 insertions(+)
 create mode 100644 arch/arm64/boot/dts/actions/s700-cubieboard7.dts
 create mode 100644 arch/arm64/boot/dts/actions/s700.dtsi

diff --git a/arch/arm64/boot/dts/actions/Makefile 
b/arch/arm64/boot/dts/actions/Makefile
index 62922d688ce3..cfd7f2679feb 100644
--- a/arch/arm64/boot/dts/actions/Makefile
+++ b/arch/arm64/boot/dts/actions/Makefile
@@ -1,3 +1,5 @@
+dtb-$(CONFIG_ARCH_ACTIONS) += s700-cubieboard7.dtb
+
 dtb-$(CONFIG_ARCH_ACTIONS) += s900-bubblegum-96.dtb
 
 always := $(dtb-y)
diff --git a/arch/arm64/boot/dts/actions/s700-cubieboard7.dts 
b/arch/arm64/boot/dts/actions/s700-cubieboard7.dts
new file mode 100644
index ..e562f04c2490
--- /dev/null
+++ b/arch/arm64/boot/dts/actions/s700-cubieboard7.dts
@@ -0,0 +1,47 @@
+/*
+ * Copyright (c) 2017 Andreas Färber
+ *
+ * SPDX-License-Identifier: (GPL-2.0+ OR MIT)
+ */
+
+/dts-v1/;
+
+#include "s700.dtsi"
+
+/ {
+   compatible = "cubietech,cubieboard7", "actions,s700";
+   model = "CubieBoard7";
+
+   aliases {
+   serial3 = 
+   };
+
+   chosen {
+   stdout-path = "serial3:115200n8";
+   };
+
+   memory@0 {
+   device_type = "memory";
+   reg = <0x0 0x0 0x0 0x8000>;
+   };
+
+   memory@1,e000 {
+   device_type = "memory";
+   reg = <0x1 0xe000 0x0 0x0>;
+   };
+
+   uart3_clk: uart3-clk {
+   compatible = "fixed-clock";
+   clock-frequency = <921600>;
+   #clock-cells = <0>;
+   };
+};
+
+ {
+   clocks = <>;
+};
+
+ {
+   status = "okay";
+   clocks = <_clk>;
+};
diff --git a/arch/arm64/boot/dts/actions/s700.dtsi 
b/arch/arm64/boot/dts/actions/s700.dtsi
new file mode 100644
index ..b54df405aaeb
--- /dev/null
+++ b/arch/arm64/boot/dts/actions/s700.dtsi
@@ -0,0 +1,164 @@
+/*
+ * Copyright (c) 2017 Andreas Färber
+ *
+ * SPDX-License-Identifier: (GPL-2.0+ OR MIT)
+ */
+
+#include 
+
+/ {
+   compatible = "actions,s700";
+   interrupt-parent = <>;
+   #address-cells = <2>;
+   #size-cells = <2>;
+
+   cpus {
+   #address-cells = <2>;
+   #size-cells = <0>;
+
+   cpu0: cpu@0 {
+   device_type = "cpu";
+   compatible = "arm,cortex-a53", "arm,armv8";
+   reg = <0x0 0x0>;
+   enable-method = "psci";
+   };
+
+   cpu1: cpu@1 {
+   device_type = "cpu";
+   compatible = "arm,cortex-a53", "arm,armv8";
+   reg = <0x0 0x1>;
+   enable-method = "psci";
+   };
+
+   cpu2: cpu@2 {
+   device_type = "cpu";
+   compatible = "arm,cortex-a53", "arm,armv8";
+   reg = <0x0 0x2>;
+   enable-method = "psci";
+   };
+
+   cpu3: cpu@3 {
+   device_type = "cpu";
+   compatible = "arm,cortex-a53", "arm,armv8";
+   reg = <0x0 0x3>;
+   enable-method = "psci";
+   };
+   };
+
+   reserved-memory {
+   #address-cells = <2>;
+   #size-cells = <2>;
+   ranges;
+
+   secmon@1f00 {
+   reg = <0x0 0x1f00 0x0 0x100>;
+   no-map;
+   };
+   };
+
+   psci {
+   compatible = "arm,psci-0.2";
+   method = "smc";
+   };
+
+   arm-pmu {
+   compatible = "arm,cortex-a53-pmu";
+   interrupts = ,
+,
+,
+;
+   interrupt-affinity = <>, <>, <>, <>;
+   };
+
+   timer {
+   compatible = "arm,armv8-timer";
+   interrupts = ,
+,
+,
+;
+   };
+
+   hosc: hosc {
+   compatible = "fixed-clock";
+   clock-frequency = <2400>;
+   #clock-cells = <0>;
+   };
+
+   soc {
+   compatible = "simple-bus";
+   #address-cells = <2>;
+   #size-cells = <2>;
+   ranges;
+
+   gic: interrupt-controller@e00f1000 {
+   compatible = "arm,gic-400";
+   reg = <0x0 0xe00f1000 0x0 0x1000>,
+ <0x0 0xe00f2000 0x0 0x2000>,
+ <0x0 0xe00f4000 0x0 0x2000>,
+

[PATCH 1/4] dt-bindings: arm: actions: Add S700 and CubieBoard7

2017-09-10 Thread Andreas Färber

Document the Actions Semi S700 SoC and the Cubietech CubieBoard7.

Signed-off-by: Andreas Färber 
---
 Documentation/devicetree/bindings/arm/actions.txt | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/actions.txt 
b/Documentation/devicetree/bindings/arm/actions.txt
index ced764a8549e..544a8855fad5 100644
--- a/Documentation/devicetree/bindings/arm/actions.txt
+++ b/Documentation/devicetree/bindings/arm/actions.txt
@@ -25,6 +25,21 @@ Root node property compatible must contain, depending on 
board:
  - LeMaker Guitar Base Board rev. B: "lemaker,guitar-bb-rev-b", 
"lemaker,guitar"
 
 
+S700 SoC
+
+
+Required root node properties:
+
+- compatible :  must contain "actions,s700"
+
+
+Boards:
+
+Root node property compatible must contain, depending on board:
+
+ - Cubietech CubieBoard7: "cubietech,cubieboard7"
+
+
 S900 SoC
 
 
-- 
2.13.5

[PATCH] net: tcp_input: Neaten DBGUNDO

2017-09-10 Thread Joe Perches

Move the #ifdef into the static void function so that the use
of DBGUNDO is validated when FASTRETRANS_DEBUG <= 1.

Remove the now unnecessary #else and #define DBGUNDO.

Signed-off-by: Joe Perches 
---
 net/ipv4/tcp_input.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index c5d7656b..bddf724f5c02 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2333,9 +2333,9 @@ static bool tcp_any_retrans_done(const struct sock *sk)
return false;
 }
 
-#if FASTRETRANS_DEBUG > 1
 static void DBGUNDO(struct sock *sk, const char *msg)
 {
+#if FASTRETRANS_DEBUG > 1
struct tcp_sock *tp = tcp_sk(sk);
struct inet_sock *inet = inet_sk(sk);
 
@@ -2357,10 +2357,8 @@ static void DBGUNDO(struct sock *sk, const char *msg)
 tp->packets_out);
}
 #endif
-}
-#else
-#define DBGUNDO(x...) do { } while (0)
 #endif
+}
 
 static void tcp_undo_cwnd_reduction(struct sock *sk, bool unmark_loss)
 {
-- 
2.10.0.rc2.1.g053435c

[PATCH] net: tcp_input: Neaten DBGUNDO

2017-09-10 Thread Joe Perches

Move the #ifdef into the static void function so that the use
of DBGUNDO is validated when FASTRETRANS_DEBUG <= 1.

Remove the now unnecessary #else and #define DBGUNDO.

Signed-off-by: Joe Perches 
---
 net/ipv4/tcp_input.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index c5d7656b..bddf724f5c02 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2333,9 +2333,9 @@ static bool tcp_any_retrans_done(const struct sock *sk)
return false;
 }
 
-#if FASTRETRANS_DEBUG > 1
 static void DBGUNDO(struct sock *sk, const char *msg)
 {
+#if FASTRETRANS_DEBUG > 1
struct tcp_sock *tp = tcp_sk(sk);
struct inet_sock *inet = inet_sk(sk);
 
@@ -2357,10 +2357,8 @@ static void DBGUNDO(struct sock *sk, const char *msg)
 tp->packets_out);
}
 #endif
-}
-#else
-#define DBGUNDO(x...) do { } while (0)
 #endif
+}
 
 static void tcp_undo_cwnd_reduction(struct sock *sk, bool unmark_loss)
 {
-- 
2.10.0.rc2.1.g053435c

Re: [PATCH] media: i2c: OV5647: gate clock lane before stream on

2017-09-10 Thread Jacob Chen

Hi Luis,

2017-08-07 22:48 GMT+08:00 Luis Oliveira :
> Hi all,
>
> I'm new here, I got to be Maintainer of this driver by the old Maintainer
> recommendation. Still getting the hang of it :)
>
> On 07-Aug-17 13:26, Philipp Zabel wrote:
>> Hi Jacob,
>>
>> On Mon, 2017-08-07 at 19:06 +0800, Jacob Chen wrote:
>> [...]
>> --- a/drivers/media/i2c/ov5647.c
>> +++ b/drivers/media/i2c/ov5647.c
>> @@ -253,6 +253,10 @@ static int ov5647_stream_on(struct v4l2_subdev *sd)
>>  {
>> int ret;
>>
>> +   ret = ov5647_write(sd, 0x4800, 0x04);
>> +   if (ret < 0)
>> +   return ret;
>> +
>>
>> So this clears BIT(1) (force clock lane to low power mode) and BIT(5)
>> (gate clock lane while idle) that were set by ov5647_stream_off() during
>> __sensor_init() due to the change below.
>>
>> Is there a reason, btw, that this driver is full of magic register
>> addresses and values? A few #defines would make this a lot more
>> readable.
>>
>
> For what I can see I agree that a few register name setting could be done.
>
>> ret = ov5647_write(sd, 0x4202, 0x00);
>> if (ret < 0)
>> return ret;
>> @@ -264,6 +268,10 @@ static int ov5647_stream_off(struct v4l2_subdev *sd)
>>  {
>> int ret;
>>
>> +   ret = ov5647_write(sd, 0x4800, 0x25);
>> +   if (ret < 0)
>> +   return ret;
>> +
>> ret = ov5647_write(sd, 0x4202, 0x0f);
>> if (ret < 0)
>> return ret;
>> @@ -320,7 +328,7 @@ static int __sensor_init(struct v4l2_subdev *sd)
>> return ret;
>> }
>>
>> -   return ov5647_write(sd, 0x4800, 0x04);
>> +   return ov5647_stream_off(sd);
>>
>> I see now that BIT(2) (keep bus in LP-11 while idle) is and was always
>> set. So the change is that initially, additionally to LP-11 mode, the
>> clock lane is gated and forced into low power mode, as well?
>>
>
> This is my interpretation as well.
>
>>  }
>>
>>  static int ov5647_sensor_power(struct v4l2_subdev *sd, int on)
>> --
>> 2.7.4
>>
>
> Can anyone comment on it?
>
> I saw there is a same discussion in  
> https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.kernel.org_patch_9569031_=DwICaQ=DPL6_X_6JkXFx7AXWqB0tg=eMn12aiiNuIDjtRi5xEzC7tWJkpra2vl_XYFVvfxIGE=eortcRXje2uLyZNI_-Uw3Ur_z24tb-e4pZfom7WhdE0=6sLc76bhjR0IdaA3ArZ7F7slgtcyGz8pDTzAF_CBLno=
> There is a comment in i.MX CSI2 driver.
> "
> Configure MIPI Camera Sensor to put all Tx lanes in LP-11 state.
> This must be carried out by the MIPI sensor's s_power(ON) subdev
> op.
> "
> That's what this patch do, sensor driver should make sure that clock
> lanes are in stop state while not streaming.

 This is not the same, as far as I can tell. BIT(5) is just clock lane
 gating, as you describe above. To put the bus into LP-11 state, BIT(2)
 needs to be set.

>>>
>>> Yeah, but i double that clock lane is not in LP11 when continue clock
>>> mode is enabled.
>
> I think by spec it shouldn't got to stopstate in continuous clock.
>
>>
>> If indeed LP-11 state is not achieved while the sensor is idle, as long
>> as BIT(5) is cleared, I think this patch is correct.
>>
>> regards
>> Philipp
>>
>
> As far as I understand, bit[5] set to 1 will force clock lane to be gated (in
> other words it will be forced to be in LP-11 if there are no packets to
> transmit). But also LP-11 must not be achieved with the BIT(5) cleared (free
> running mode)?
>

Yes. When the BIT(5) is cleared, clock lane will be in continuous mode
immediately,
unless BIT(0) is set.


> Sorry if I misunderstood something.
>
> regards,
> Luis
>

Re: [PATCH] media: i2c: OV5647: gate clock lane before stream on

2017-09-10 Thread Jacob Chen

Hi Luis,

2017-08-07 22:48 GMT+08:00 Luis Oliveira :
> Hi all,
>
> I'm new here, I got to be Maintainer of this driver by the old Maintainer
> recommendation. Still getting the hang of it :)
>
> On 07-Aug-17 13:26, Philipp Zabel wrote:
>> Hi Jacob,
>>
>> On Mon, 2017-08-07 at 19:06 +0800, Jacob Chen wrote:
>> [...]
>> --- a/drivers/media/i2c/ov5647.c
>> +++ b/drivers/media/i2c/ov5647.c
>> @@ -253,6 +253,10 @@ static int ov5647_stream_on(struct v4l2_subdev *sd)
>>  {
>> int ret;
>>
>> +   ret = ov5647_write(sd, 0x4800, 0x04);
>> +   if (ret < 0)
>> +   return ret;
>> +
>>
>> So this clears BIT(1) (force clock lane to low power mode) and BIT(5)
>> (gate clock lane while idle) that were set by ov5647_stream_off() during
>> __sensor_init() due to the change below.
>>
>> Is there a reason, btw, that this driver is full of magic register
>> addresses and values? A few #defines would make this a lot more
>> readable.
>>
>
> For what I can see I agree that a few register name setting could be done.
>
>> ret = ov5647_write(sd, 0x4202, 0x00);
>> if (ret < 0)
>> return ret;
>> @@ -264,6 +268,10 @@ static int ov5647_stream_off(struct v4l2_subdev *sd)
>>  {
>> int ret;
>>
>> +   ret = ov5647_write(sd, 0x4800, 0x25);
>> +   if (ret < 0)
>> +   return ret;
>> +
>> ret = ov5647_write(sd, 0x4202, 0x0f);
>> if (ret < 0)
>> return ret;
>> @@ -320,7 +328,7 @@ static int __sensor_init(struct v4l2_subdev *sd)
>> return ret;
>> }
>>
>> -   return ov5647_write(sd, 0x4800, 0x04);
>> +   return ov5647_stream_off(sd);
>>
>> I see now that BIT(2) (keep bus in LP-11 while idle) is and was always
>> set. So the change is that initially, additionally to LP-11 mode, the
>> clock lane is gated and forced into low power mode, as well?
>>
>
> This is my interpretation as well.
>
>>  }
>>
>>  static int ov5647_sensor_power(struct v4l2_subdev *sd, int on)
>> --
>> 2.7.4
>>
>
> Can anyone comment on it?
>
> I saw there is a same discussion in  
> https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.kernel.org_patch_9569031_=DwICaQ=DPL6_X_6JkXFx7AXWqB0tg=eMn12aiiNuIDjtRi5xEzC7tWJkpra2vl_XYFVvfxIGE=eortcRXje2uLyZNI_-Uw3Ur_z24tb-e4pZfom7WhdE0=6sLc76bhjR0IdaA3ArZ7F7slgtcyGz8pDTzAF_CBLno=
> There is a comment in i.MX CSI2 driver.
> "
> Configure MIPI Camera Sensor to put all Tx lanes in LP-11 state.
> This must be carried out by the MIPI sensor's s_power(ON) subdev
> op.
> "
> That's what this patch do, sensor driver should make sure that clock
> lanes are in stop state while not streaming.

 This is not the same, as far as I can tell. BIT(5) is just clock lane
 gating, as you describe above. To put the bus into LP-11 state, BIT(2)
 needs to be set.

>>>
>>> Yeah, but i double that clock lane is not in LP11 when continue clock
>>> mode is enabled.
>
> I think by spec it shouldn't got to stopstate in continuous clock.
>
>>
>> If indeed LP-11 state is not achieved while the sensor is idle, as long
>> as BIT(5) is cleared, I think this patch is correct.
>>
>> regards
>> Philipp
>>
>
> As far as I understand, bit[5] set to 1 will force clock lane to be gated (in
> other words it will be forced to be in LP-11 if there are no packets to
> transmit). But also LP-11 must not be achieved with the BIT(5) cleared (free
> running mode)?
>

Yes. When the BIT(5) is cleared, clock lane will be in continuous mode
immediately,
unless BIT(0) is set.


> Sorry if I misunderstood something.
>
> regards,
> Luis
>

Re: xtensa compiler error triggered by 'locking/lockdep: Implement the 'crossrelease' feature'

2017-09-10 Thread Max Filippov

On Sun, Sep 10, 2017 at 11:31 AM, Guenter Roeck  wrote:
> xtensa:allmodconfig fails to build in mainline with compiler errors
> as follows.
>
> drivers/staging/rtl8723bs/core/rtw_ap.c: In function ‘expire_timeout_chk’:
> drivers/staging/rtl8723bs/core/rtw_ap.c:442:1: internal compiler error:
> in change_address_1, at emit-rtl.c:2126
>
> drivers/staging/rtl8188eu/core/rtw_ap.c: In function ‘expire_timeout_chk’:
> drivers/staging/rtl8188eu/core/rtw_ap.c:445:1: internal compiler error:
> in change_address_1, at emit-rtl.c:2150
>
> I tried gcc 6.3, 6.4, and 7.2; they all have the same problem.
>
> Bisect points to commit b09be676e0f ("locking/lockdep: Implement the
> 'crossrelease' feature").
>
> The compile error is still seen if I disable CONFIG_LOCKDEP_CROSSRELEASE.
> The image builds fine if I disable both CONFIG_RTL8723BS and CONFIG_R8188EU.

Thanks for the report. I'll look at the compiler.

-- 
Thanks.
-- Max

Re: xtensa compiler error triggered by 'locking/lockdep: Implement the 'crossrelease' feature'

2017-09-10 Thread Max Filippov

On Sun, Sep 10, 2017 at 11:31 AM, Guenter Roeck  wrote:
> xtensa:allmodconfig fails to build in mainline with compiler errors
> as follows.
>
> drivers/staging/rtl8723bs/core/rtw_ap.c: In function ‘expire_timeout_chk’:
> drivers/staging/rtl8723bs/core/rtw_ap.c:442:1: internal compiler error:
> in change_address_1, at emit-rtl.c:2126
>
> drivers/staging/rtl8188eu/core/rtw_ap.c: In function ‘expire_timeout_chk’:
> drivers/staging/rtl8188eu/core/rtw_ap.c:445:1: internal compiler error:
> in change_address_1, at emit-rtl.c:2150
>
> I tried gcc 6.3, 6.4, and 7.2; they all have the same problem.
>
> Bisect points to commit b09be676e0f ("locking/lockdep: Implement the
> 'crossrelease' feature").
>
> The compile error is still seen if I disable CONFIG_LOCKDEP_CROSSRELEASE.
> The image builds fine if I disable both CONFIG_RTL8723BS and CONFIG_R8188EU.

Thanks for the report. I'll look at the compiler.

-- 
Thanks.
-- Max

Re: [PATCH v3 1/3] arm64/ras: support sea error recovery

2017-09-10 Thread Xiongfeng Wang

Hi James,

Thanks for your comments.

On 2017/9/9 2:15, James Morse wrote:
> Hi Xie XiuQi,
> 
> (Sorry a few versions of this went past before I caught up with it)
> 
> On 07/09/17 08:45, Xie XiuQi wrote:
>> With ARM v8.2 RAS Extension, SEA are usually triggered when memory errors
>> are consumed. In some cases, if the error address is in a clean page or a
>> read-only page, there is a chance to recover. Such as error occurs in a
>> instruction page, we can reread this page from disk instead of killing
>> process.
> 
>> Because memory_failure() may sleep, we can not call it directly in SEA
>> exception context.
> 
> This is why we have memory_failure_queue() instead, it ... bother. That 
> doesn't
> look nmi-safe. (I thought this ended with an llist, but clearly I was looking 
> at
> the wrong thing).
> 
> It doesn't look like this is a problem for NOTIFY_SEA as it would only 
> interrupt
> itself on the same CPU if the memory-failure code/data were corrupt. (which is
> not a case we can handle). We need to fix this before any of the asynchronous
> NMI-like RAS notifications for arm64 get merged.
> 
> (this is one problem, but I don't think its 'the' problem you are trying to
> solve with this series).
> 
> 
>> So we saved faulting physical address associated with
>> a process in the ghes handler and set __TIF_SEA_NOTIFY.
> 
> A per-notification type TIF flag looks fishy, surely this would affect all
> NMI-like RAS notification methods?
> 
> 
>> When we return
>> from SEA exception context and get into do_notify_resume() before the
>> process running, we could check it and call memory_failure() to do
>> recovery. It's safe, because we are in process context.
> 
> I'm afraid I don't think this is the best approach for fixing this.
> Its tied to the notification type, but the notification should be irrelevant
> once we call ghes_proc().
> It adds code poking around in CPER and ACPI/GHES to the arm64 arch code, all 
> of
> this should be in the core common code.
> Most importantly: this means arm64 behaves differently with regard to handling
> memory errors to other architectures using ACPI. Two behaviours means twice 
> the
> code, review and bugs...

I think we can use arch_apei_report_mem_error() to report memory error, just 
like
what is implemented in x86. We can record the memory address in somewhere else, 
namely
report the error. Then the SEA handler can use the memory error reported.

> 
> 
> Delaying the handling until we re-enter user-space means faults that may 
> affect
> the kernel aren't handled until much later. Just because the fault was
> synchronous and user-space was running doesn't mean only user space is 
> affected.
> Some examples I've collected so far: the zero-page may be corrupt, this is
> mapped into every process and used by the kernel. Similarly corruption in the
> vdso affects all user-space. The fault may affect the page tables, this 
> affects
> all users of the mm_struct.

I am not sure if memory_failure can recognize  the zero page as a kernel page 
and
just panic. If several processes share a same page, I think we don't have to 
kill
the process until it access the address containing the errors. If the process 
access
the error address, a SEA will occur again, then we can handle it. If the process
will never access the error address, such as a syscall in vdso the process 
won't use,
we don't have to kill the process.

For vdso case, we havn't repaired the error. It will affect newly created 
processes.
I don't figure out how to handle this. Is it a kernel page mapped into user 
space
or memory_failure will just panic when handling this page.

> 
> (I'm sure we agree that an synchronous-external-abort interrupting the kernel 
> is
> fatal for the kernel, but the other way round isn't always true).
> 
> Setting a TIF flag to handle the error before re-entering user-space is a
> problem as the scheduler may choose to pre-empt this task and run all the 
> other
> tasks before this eventually gets handled.
> 
> 
> Assuming this is just a problem with memory_failure_queue(), two alternatives 
> I
> can suggest are making memory_failure_queue() nmi-safe,
Yes, I agree, but it may be hard to be achieved.

 or abstracting
> NOTIFY_NMI's estatus pool/cache to use for the arm64 NMI-like notifications 
> too.
> 
> If there is more to this, can you explain the problem you're trying to solve?
> (I suspect there may be an issue with multiple-signals being merged, or 
> exactly
> when memory_failure_queue()'s work gets run.) Can you outline the sequence of
> events?
> 
> 
> You're picking a physical address out of 'ARM Processor Error Information
> Structure', these correspond with Cache, TLB, Bus or (the mysterious) 'micro
> architectural error'. I don't see anything checking the error type.
> Given the physical address, are you adding error-handling for cache-errors 
> with
> this series?
> 
> 
> Thanks,
> 
> James
> 
> 
> .
> 

Thanks,

Xiongfeng Wang

Re: [PATCH v3 1/3] arm64/ras: support sea error recovery

2017-09-10 Thread Xiongfeng Wang

Hi James,

Thanks for your comments.

On 2017/9/9 2:15, James Morse wrote:
> Hi Xie XiuQi,
> 
> (Sorry a few versions of this went past before I caught up with it)
> 
> On 07/09/17 08:45, Xie XiuQi wrote:
>> With ARM v8.2 RAS Extension, SEA are usually triggered when memory errors
>> are consumed. In some cases, if the error address is in a clean page or a
>> read-only page, there is a chance to recover. Such as error occurs in a
>> instruction page, we can reread this page from disk instead of killing
>> process.
> 
>> Because memory_failure() may sleep, we can not call it directly in SEA
>> exception context.
> 
> This is why we have memory_failure_queue() instead, it ... bother. That 
> doesn't
> look nmi-safe. (I thought this ended with an llist, but clearly I was looking 
> at
> the wrong thing).
> 
> It doesn't look like this is a problem for NOTIFY_SEA as it would only 
> interrupt
> itself on the same CPU if the memory-failure code/data were corrupt. (which is
> not a case we can handle). We need to fix this before any of the asynchronous
> NMI-like RAS notifications for arm64 get merged.
> 
> (this is one problem, but I don't think its 'the' problem you are trying to
> solve with this series).
> 
> 
>> So we saved faulting physical address associated with
>> a process in the ghes handler and set __TIF_SEA_NOTIFY.
> 
> A per-notification type TIF flag looks fishy, surely this would affect all
> NMI-like RAS notification methods?
> 
> 
>> When we return
>> from SEA exception context and get into do_notify_resume() before the
>> process running, we could check it and call memory_failure() to do
>> recovery. It's safe, because we are in process context.
> 
> I'm afraid I don't think this is the best approach for fixing this.
> Its tied to the notification type, but the notification should be irrelevant
> once we call ghes_proc().
> It adds code poking around in CPER and ACPI/GHES to the arm64 arch code, all 
> of
> this should be in the core common code.
> Most importantly: this means arm64 behaves differently with regard to handling
> memory errors to other architectures using ACPI. Two behaviours means twice 
> the
> code, review and bugs...

I think we can use arch_apei_report_mem_error() to report memory error, just 
like
what is implemented in x86. We can record the memory address in somewhere else, 
namely
report the error. Then the SEA handler can use the memory error reported.

> 
> 
> Delaying the handling until we re-enter user-space means faults that may 
> affect
> the kernel aren't handled until much later. Just because the fault was
> synchronous and user-space was running doesn't mean only user space is 
> affected.
> Some examples I've collected so far: the zero-page may be corrupt, this is
> mapped into every process and used by the kernel. Similarly corruption in the
> vdso affects all user-space. The fault may affect the page tables, this 
> affects
> all users of the mm_struct.

I am not sure if memory_failure can recognize  the zero page as a kernel page 
and
just panic. If several processes share a same page, I think we don't have to 
kill
the process until it access the address containing the errors. If the process 
access
the error address, a SEA will occur again, then we can handle it. If the process
will never access the error address, such as a syscall in vdso the process 
won't use,
we don't have to kill the process.

For vdso case, we havn't repaired the error. It will affect newly created 
processes.
I don't figure out how to handle this. Is it a kernel page mapped into user 
space
or memory_failure will just panic when handling this page.

> 
> (I'm sure we agree that an synchronous-external-abort interrupting the kernel 
> is
> fatal for the kernel, but the other way round isn't always true).
> 
> Setting a TIF flag to handle the error before re-entering user-space is a
> problem as the scheduler may choose to pre-empt this task and run all the 
> other
> tasks before this eventually gets handled.
> 
> 
> Assuming this is just a problem with memory_failure_queue(), two alternatives 
> I
> can suggest are making memory_failure_queue() nmi-safe,
Yes, I agree, but it may be hard to be achieved.

 or abstracting
> NOTIFY_NMI's estatus pool/cache to use for the arm64 NMI-like notifications 
> too.
> 
> If there is more to this, can you explain the problem you're trying to solve?
> (I suspect there may be an issue with multiple-signals being merged, or 
> exactly
> when memory_failure_queue()'s work gets run.) Can you outline the sequence of
> events?
> 
> 
> You're picking a physical address out of 'ARM Processor Error Information
> Structure', these correspond with Cache, TLB, Bus or (the mysterious) 'micro
> architectural error'. I don't see anything checking the error type.
> Given the physical address, are you adding error-handling for cache-errors 
> with
> this series?
> 
> 
> Thanks,
> 
> James
> 
> 
> .
> 

Thanks,

Xiongfeng Wang

[PATCH] media: i2c: OV5647: ensure clock lane in LP-11 state before streaming on

2017-09-10 Thread Jacob Chen

When I was supporting Rpi Camera Module on the ASUS Tinker board,
I found this driver have some issues with rockchip's mipi-csi driver.
It didn't place clock lane in LP-11 state before performing
D-PHY initialisation.

>From our experience, on some OV sensors,
LP-11 state is not achieved while BIT(5)-0x4800 is cleared.

So let's set BIT(5) and BIT(0) both while not streaming, in order to
coax the clock lane into LP-11 state.

0x4800 : MIPI CTRL 00
BIT(5) : clock lane gate enable
0: continuous
1: none-continuous
BIT(0) : manually set clock lane
0: Not used
1: used

Changes in V2:
modify commit messages

Signed-off-by: Jacob Chen 
---
 drivers/media/i2c/ov5647.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/media/i2c/ov5647.c b/drivers/media/i2c/ov5647.c
index 95ce90f..247302d 100644
--- a/drivers/media/i2c/ov5647.c
+++ b/drivers/media/i2c/ov5647.c
@@ -253,6 +253,10 @@ static int ov5647_stream_on(struct v4l2_subdev *sd)
 {
int ret;
 
+   ret = ov5647_write(sd, 0x4800, 0x04);
+   if (ret < 0)
+   return ret;
+
ret = ov5647_write(sd, 0x4202, 0x00);
if (ret < 0)
return ret;
@@ -264,6 +268,10 @@ static int ov5647_stream_off(struct v4l2_subdev *sd)
 {
int ret;
 
+   ret = ov5647_write(sd, 0x4800, 0x25);
+   if (ret < 0)
+   return ret;
+
ret = ov5647_write(sd, 0x4202, 0x0f);
if (ret < 0)
return ret;
@@ -320,7 +328,10 @@ static int __sensor_init(struct v4l2_subdev *sd)
return ret;
}
 
-   return ov5647_write(sd, 0x4800, 0x04);
+   /*
+* stream off to make the clock lane into LP-11 state.
+*/
+   return ov5647_stream_off(sd);
 }
 
 static int ov5647_sensor_power(struct v4l2_subdev *sd, int on)
-- 
2.7.4

[PATCH] media: i2c: OV5647: ensure clock lane in LP-11 state before streaming on

2017-09-10 Thread Jacob Chen

When I was supporting Rpi Camera Module on the ASUS Tinker board,
I found this driver have some issues with rockchip's mipi-csi driver.
It didn't place clock lane in LP-11 state before performing
D-PHY initialisation.

>From our experience, on some OV sensors,
LP-11 state is not achieved while BIT(5)-0x4800 is cleared.

So let's set BIT(5) and BIT(0) both while not streaming, in order to
coax the clock lane into LP-11 state.

0x4800 : MIPI CTRL 00
BIT(5) : clock lane gate enable
0: continuous
1: none-continuous
BIT(0) : manually set clock lane
0: Not used
1: used

Changes in V2:
modify commit messages

Signed-off-by: Jacob Chen 
---
 drivers/media/i2c/ov5647.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/media/i2c/ov5647.c b/drivers/media/i2c/ov5647.c
index 95ce90f..247302d 100644
--- a/drivers/media/i2c/ov5647.c
+++ b/drivers/media/i2c/ov5647.c
@@ -253,6 +253,10 @@ static int ov5647_stream_on(struct v4l2_subdev *sd)
 {
int ret;
 
+   ret = ov5647_write(sd, 0x4800, 0x04);
+   if (ret < 0)
+   return ret;
+
ret = ov5647_write(sd, 0x4202, 0x00);
if (ret < 0)
return ret;
@@ -264,6 +268,10 @@ static int ov5647_stream_off(struct v4l2_subdev *sd)
 {
int ret;
 
+   ret = ov5647_write(sd, 0x4800, 0x25);
+   if (ret < 0)
+   return ret;
+
ret = ov5647_write(sd, 0x4202, 0x0f);
if (ret < 0)
return ret;
@@ -320,7 +328,10 @@ static int __sensor_init(struct v4l2_subdev *sd)
return ret;
}
 
-   return ov5647_write(sd, 0x4800, 0x04);
+   /*
+* stream off to make the clock lane into LP-11 state.
+*/
+   return ov5647_stream_off(sd);
 }
 
 static int ov5647_sensor_power(struct v4l2_subdev *sd, int on)
-- 
2.7.4

Re: Current mainline git (24e700e291d52bd2) hangs when building e.g. perf

2017-09-10 Thread Andy Lutomirski

On Sun, Sep 10, 2017 at 6:12 PM, Rik van Riel  wrote:
> On Sat, 2017-09-09 at 12:28 -0700, Andy Lutomirski wrote:
>> -
>> I propose the following fix.  If PCID is on, then, in
>> enter_lazy_tlb(), we switch to init_mm with the no-flush flag set.
>> (And we give init_mm its own dedicated ASID to keep it simple and
>> fast
>> -- no need to use the LRU ASID mapping to assign one
>> dynamically.)  We
>> clear the bit in mm_cpumask.  That is, we more or less just skip the
>> whole lazy TLB optimization and rely on PCID CPUs having reasonably
>> fast CR3 writes.  No extra IPIs.
>
> Avoiding the IPIs is probably what matters the most, especially
> on systems with deep C states, and virtual machines where the
> host may be running something else, causing the IPI service time
> to go through the roof for idle VCPUs.
>
>> Also, sorry Rik, this means your old increased laziness optimization
>> is dead in the water.  It will have exactly the same speculative load
>> problem.
>
> Doesn't a memory barrier solve that speculative load
> problem?
>
> The memory barrier could be added only to the path
> that potentially skips reloading the TLB, under the
> assumption that a memory barrier is cheaper than a
> TLB reload (even with ASID).

No, nothing stops the problematic speculative load.  Here's the issue.
One CPU removes a reference to a page table from a higher-level page
table, flushes, and then frees the page table.  Then it re-allocates
it and writes something unrelated there.  Another CPU that has CR3
pointing to the page hierarchy in question could have a reference to
the freed table in its paging structure cache.  Even if it's
guaranteed to not try to access the addresses in question (because
they're user addresses and the other CPU is in kernel mode, etc), but
there is never a guarantee that the CPU doesn't randomly try to fill
its TLB for the affected addresses.  This results in invalid PTEs in
the TLB, possible accesses using bogus memory types, and maybe even
reads from IO space.

It looks like we actually need to propagate flushes everywhere that
could have references to the flushed range, even if the software won't
access that range.

Re: Current mainline git (24e700e291d52bd2) hangs when building e.g. perf

2017-09-10 Thread Andy Lutomirski

On Sun, Sep 10, 2017 at 6:12 PM, Rik van Riel  wrote:
> On Sat, 2017-09-09 at 12:28 -0700, Andy Lutomirski wrote:
>> -
>> I propose the following fix.  If PCID is on, then, in
>> enter_lazy_tlb(), we switch to init_mm with the no-flush flag set.
>> (And we give init_mm its own dedicated ASID to keep it simple and
>> fast
>> -- no need to use the LRU ASID mapping to assign one
>> dynamically.)  We
>> clear the bit in mm_cpumask.  That is, we more or less just skip the
>> whole lazy TLB optimization and rely on PCID CPUs having reasonably
>> fast CR3 writes.  No extra IPIs.
>
> Avoiding the IPIs is probably what matters the most, especially
> on systems with deep C states, and virtual machines where the
> host may be running something else, causing the IPI service time
> to go through the roof for idle VCPUs.
>
>> Also, sorry Rik, this means your old increased laziness optimization
>> is dead in the water.  It will have exactly the same speculative load
>> problem.
>
> Doesn't a memory barrier solve that speculative load
> problem?
>
> The memory barrier could be added only to the path
> that potentially skips reloading the TLB, under the
> assumption that a memory barrier is cheaper than a
> TLB reload (even with ASID).

No, nothing stops the problematic speculative load.  Here's the issue.
One CPU removes a reference to a page table from a higher-level page
table, flushes, and then frees the page table.  Then it re-allocates
it and writes something unrelated there.  Another CPU that has CR3
pointing to the page hierarchy in question could have a reference to
the freed table in its paging structure cache.  Even if it's
guaranteed to not try to access the addresses in question (because
they're user addresses and the other CPU is in kernel mode, etc), but
there is never a guarantee that the CPU doesn't randomly try to fill
its TLB for the affected addresses.  This results in invalid PTEs in
the TLB, possible accesses using bogus memory types, and maybe even
reads from IO space.

It looks like we actually need to propagate flushes everywhere that
could have references to the flushed range, even if the software won't
access that range.

Re: [PATCH] scsi: shost->async_scan should be protected by mutex_lock

2017-09-10 Thread Tyrel Datwyler

On 09/07/2017 11:54 PM, Ouyangzhaowei (Charles) wrote:
> shost->async_scan should be protected by mutex_lock, otherwise the check
> of "called twice" won't work.
> 
> Signed-off-by: Ouyang Zhaowei 
> ---
>  drivers/scsi/scsi_scan.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index fd88dab..1d1df51 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -1722,6 +1722,7 @@ static struct async_scan_data
> *scsi_prep_async_scan(struct Scsi_Host *shost)
> if (strncmp(scsi_scan_type, "sync", 4) == 0)
> return NULL;
> 
> +   mutex_lock(>scan_mutex);

The mutex will not be unlocked in the event that either the host has called
scsi_prep_async_scan() twice, or a condition is meet the branches to the "err" 
label prior
to where the original mutex_lock() was located below.

-Tyrel

> if (shost->async_scan) {
> shost_printk(KERN_DEBUG, shost, "%s called twice\n",
> __func__);
> return NULL;
> @@ -1735,7 +1736,6 @@ static struct async_scan_data
> *scsi_prep_async_scan(struct Scsi_Host *shost)
> goto err;
> init_completion(>prev_finished);
> 
> -   mutex_lock(>scan_mutex);
> spin_lock_irqsave(shost->host_lock, flags);
> shost->async_scan = 1;
> spin_unlock_irqrestore(shost->host_lock, flags);
>

Re: [PATCH] scsi: shost->async_scan should be protected by mutex_lock

2017-09-10 Thread Tyrel Datwyler

On 09/07/2017 11:54 PM, Ouyangzhaowei (Charles) wrote:
> shost->async_scan should be protected by mutex_lock, otherwise the check
> of "called twice" won't work.
> 
> Signed-off-by: Ouyang Zhaowei 
> ---
>  drivers/scsi/scsi_scan.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index fd88dab..1d1df51 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -1722,6 +1722,7 @@ static struct async_scan_data
> *scsi_prep_async_scan(struct Scsi_Host *shost)
> if (strncmp(scsi_scan_type, "sync", 4) == 0)
> return NULL;
> 
> +   mutex_lock(>scan_mutex);

The mutex will not be unlocked in the event that either the host has called
scsi_prep_async_scan() twice, or a condition is meet the branches to the "err" 
label prior
to where the original mutex_lock() was located below.

-Tyrel

> if (shost->async_scan) {
> shost_printk(KERN_DEBUG, shost, "%s called twice\n",
> __func__);
> return NULL;
> @@ -1735,7 +1736,6 @@ static struct async_scan_data
> *scsi_prep_async_scan(struct Scsi_Host *shost)
> goto err;
> init_completion(>prev_finished);
> 
> -   mutex_lock(>scan_mutex);
> spin_lock_irqsave(shost->host_lock, flags);
> shost->async_scan = 1;
> spin_unlock_irqrestore(shost->host_lock, flags);
>

Re: [PATCH v1 1/4] perf annotate: create a new hists to manage multiple events samples

2017-09-10 Thread Jin, Yao




On 9/8/2017 9:43 PM, Arnaldo Carvalho de Melo wrote:

Em Wed, Aug 16, 2017 at 06:18:33PM +0800, Jin Yao escreveu:

An issue is found during using perf annotate.

perf record -e cycles,branches ...
perf annotate main --stdio

The result only shows cycles. It should show both cycles and
branches on the left side. It works with "--group", but need
this to work even without groups.

In current design, the hists is per event. So we need a new
hists to manage the samples for multiple events and use a new
hist_event data structure to save the map/symbol information
for per event.

Humm, why do we need another hists? Don't we have one per evsel, don't
we have a evlist from where to get all of those evsels, can't we just
use that to add one column per evsel?

- Arnaldo
  



Hi Arnaldo,

I'm considering a case.

Suppose we sample 2 events ("branches" and "cache-misses"). The samples 
of "branches" are hit in function A and the samples of "cache-misses" 
are hit in function B.


The branches evsel has one hists and cache-misses evsel has another hists.

The hists of branches evsel has one hist-entry which stands for the 
function A symbol. The hists of cache-misses evsel has one hist-entry 
which stands for the function B symbol.


If we start to show the instructions in function B from cache-misses 
evsel, we will lose the function A.


Because even if we get the branches evsel from the link in cache-misses 
evsel, but the function A is before function B and function B has been 
displayed yet, so the function A is lost.


Considering the number of events can be greater than 2, the code will be 
much more complicated. So using a global hists should be an easy solution.


Thanks
Jin Yao

Re: [PATCH v1 1/4] perf annotate: create a new hists to manage multiple events samples

2017-09-10 Thread Jin, Yao




On 9/8/2017 9:43 PM, Arnaldo Carvalho de Melo wrote:

Em Wed, Aug 16, 2017 at 06:18:33PM +0800, Jin Yao escreveu:

An issue is found during using perf annotate.

perf record -e cycles,branches ...
perf annotate main --stdio

The result only shows cycles. It should show both cycles and
branches on the left side. It works with "--group", but need
this to work even without groups.

In current design, the hists is per event. So we need a new
hists to manage the samples for multiple events and use a new
hist_event data structure to save the map/symbol information
for per event.

Humm, why do we need another hists? Don't we have one per evsel, don't
we have a evlist from where to get all of those evsels, can't we just
use that to add one column per evsel?

- Arnaldo
  



Hi Arnaldo,

I'm considering a case.

Suppose we sample 2 events ("branches" and "cache-misses"). The samples 
of "branches" are hit in function A and the samples of "cache-misses" 
are hit in function B.


The branches evsel has one hists and cache-misses evsel has another hists.

The hists of branches evsel has one hist-entry which stands for the 
function A symbol. The hists of cache-misses evsel has one hist-entry 
which stands for the function B symbol.


If we start to show the instructions in function B from cache-misses 
evsel, we will lose the function A.


Because even if we get the branches evsel from the link in cache-misses 
evsel, but the function A is before function B and function B has been 
displayed yet, so the function A is lost.


Considering the number of events can be greater than 2, the code will be 
much more complicated. So using a global hists should be an easy solution.


Thanks
Jin Yao

[git pull] m68knommu changes for v4.14

2017-09-10 Thread Greg Ungerer

Hi Linus,

Can you please pull the m68knommu git tree, for-next branch.

Only 2 changes. One removes unused code, the other makes local clock code
arguments consistent with generic clock code.

Regards
Greg




The following changes since commit cc4a41fe5541a73019a864883297bd5043aa6d98:

  Linux 4.13-rc7 (2017-08-27 17:20:40 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu.git for-next

for you to fetch changes up to 8a1943492bd629b7eae64535a96fcb959a01bbd0:

  m68knommu: remove dead code (2017-08-28 10:40:30 +1000)


Alexandre Belloni (1):
  m68knommu: remove dead code

Jonas Gorski (1):
  m68k: allow NULL clock for clk_get_rate

 arch/m68k/coldfire/clk.c|  3 +++
 arch/m68k/coldfire/m5441x.c | 37 -
 2 files changed, 3 insertions(+), 37 deletions(-)

[git pull] m68knommu changes for v4.14

2017-09-10 Thread Greg Ungerer

Hi Linus,

Can you please pull the m68knommu git tree, for-next branch.

Only 2 changes. One removes unused code, the other makes local clock code
arguments consistent with generic clock code.

Regards
Greg




The following changes since commit cc4a41fe5541a73019a864883297bd5043aa6d98:

  Linux 4.13-rc7 (2017-08-27 17:20:40 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu.git for-next

for you to fetch changes up to 8a1943492bd629b7eae64535a96fcb959a01bbd0:

  m68knommu: remove dead code (2017-08-28 10:40:30 +1000)


Alexandre Belloni (1):
  m68knommu: remove dead code

Jonas Gorski (1):
  m68k: allow NULL clock for clk_get_rate

 arch/m68k/coldfire/clk.c|  3 +++
 arch/m68k/coldfire/m5441x.c | 37 -
 2 files changed, 3 insertions(+), 37 deletions(-)

linux-next: manual merge of the aio tree with the vfs tree

2017-09-10 Thread Stephen Rothwell

Hi Benjamin,

Today's linux-next merge of the aio tree got a conflict in:

  fs/aio.c

between commit:

  32ec9f249d65 ("io_getevents: Use timespec64 to represent timeouts")

from the vfs tree and commit:

  eb5263749f68 ("aio: handle integer overflow in io_getevents() timespec usage")

from the aio tree.

I fixed it up (I just dropped the change in the latter commit) and can
carry the fix as necessary. This is now fixed as far as linux-next is
concerned, but any non trivial conflicts should be mentioned to your
upstream maintainer when your tree is submitted for merging.  You may
also want to consider cooperating with the maintainer of the conflicting
tree to minimise any particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell

linux-next: manual merge of the aio tree with the vfs tree

2017-09-10 Thread Stephen Rothwell

Hi Benjamin,

Today's linux-next merge of the aio tree got a conflict in:

  fs/aio.c

between commit:

  32ec9f249d65 ("io_getevents: Use timespec64 to represent timeouts")

from the vfs tree and commit:

  eb5263749f68 ("aio: handle integer overflow in io_getevents() timespec usage")

from the aio tree.

I fixed it up (I just dropped the change in the latter commit) and can
carry the fix as necessary. This is now fixed as far as linux-next is
concerned, but any non trivial conflicts should be mentioned to your
upstream maintainer when your tree is submitted for merging.  You may
also want to consider cooperating with the maintainer of the conflicting
tree to minimise any particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell

Re: btusb "firmware request while host is not available" at resume

2017-09-10 Thread Greg Kroah-Hartman

On Sun, Sep 10, 2017 at 12:26:02PM -0700, Linus Torvalds wrote:
> This seems to be a new problem at resume for the Intel btusb driver,
> but I'm not seeing anything in that driver itself that looks like a
> likely trigger, so I wonder if it's some driver core change, a generic
> resume path issue, or a workqueue change that has made it trigger for
> me.
> 
> It might also just be a timing difference, maybe it's always been there?
> 
> Does anybody have any ideas? It does't happen on every resume, and the
> machine works despite this (but no bluetooth - the *next* resume might
> bring it back, though).

Ah, it's not just me having this problem.  I don't see it happening in
4.12, and haven't had the time to bisect it.  I seem to be able to
trigger it every suspend/resume cycle, so I don't know if it's a timing
issue.



> 
>  Linus
> 
> --
> 
>   ACPI: Low-level resume complete
>   ACPI: EC: EC started
>   PM: Restoring platform NVS memory
>   Enabling non-boot CPUs ...
>   x86: Booting SMP configuration:
>   smpboot: Booting Node 0 Processor 1 APIC 0x2
>cache: parent cpu1 should not be sleeping
>   CPU1 is up
>   smpboot: Booting Node 0 Processor 2 APIC 0x1
>cache: parent cpu2 should not be sleeping
>   CPU2 is up
>   smpboot: Booting Node 0 Processor 3 APIC 0x3
>cache: parent cpu3 should not be sleeping
>   CPU3 is up
>   ACPI: Waking up from system sleep state S3
>   ACPI: EC: event unblocked
>   usb 1-3: reset full-speed USB device number 2 using xhci_hcd
>   usb 1-4: reset full-speed USB device number 3 using xhci_hcd
>   usb 1-5: reset high-speed USB device number 4 using xhci_hcd
>   usb 1-3:1.0: rebind failed: -517
>   usb 1-3:1.1: rebind failed: -517
>   Bluetooth: hci0: Bootloader revision 0.0 build 2 week 52 2014
>   OOM killer enabled.
>   Restarting tasks ...
>   Bluetooth: hci0: Device revision is 5
>   Bluetooth: hci0: Secure boot is enabled
>   Bluetooth: hci0: OTP lock is enabled
>   Bluetooth: hci0: API lock is enabled
>   Bluetooth: hci0: Debug lock is disabled
>   Bluetooth: hci0: Minimum firmware build 1 week 10 2014
>   firmware request while host is not available
>   [ cut here ]
>   WARNING: CPU: 2 PID: 621 at drivers/base/firmware_class.c:1250
> _request_firmware+0x460/0x790
>   CPU: 2 PID: 621 Comm: kworker/u9:2 Not tainted 4.13.0-10313-ge860d2c904d1 
> #11
>   Hardware name: Dell Inc. XPS 13 9350/09JHRY, BIOS 1.4.17 05/10/2017
>   Workqueue: hci0 hci_power_on [bluetooth]
>   task: 8d3767895ac0 task.stack: 9d3481efc000
>   RIP: 0010:_request_firmware+0x460/0x790
>   Call Trace:
>request_firmware+0x37/0x50
>btusb_setup_intel_new+0x227/0x7e0 [btusb]
>hci_dev_do_open+0x3da/0x570 [bluetooth]
>hci_power_on+0x52/0x1f0 [bluetooth]
>process_one_work+0x1db/0x3d0
>worker_thread+0x47/0x3e0
>kthread+0x125/0x140
>ret_from_fork+0x22/0x30
>   ---[ end trace 007b222491432927 ]---
>   Bluetooth: hci0: Failed to load Intel firmware file (-112)
>   [drm] RC6 on
>   done.
>   thermal thermal_zone11: failed to read out thermal zone (-5)
>   PM: suspend exit

Ah, I'll blame Luis for this, I think it might be due to 81f95076281f
("firmware: add sanity check on shutdown/suspend")

Luis, any ideas?  I'll try to revert this and try it out tomorrow when
I get a chance.

thanks,

greg k-h

Re: btusb "firmware request while host is not available" at resume

2017-09-10 Thread Greg Kroah-Hartman

On Sun, Sep 10, 2017 at 12:26:02PM -0700, Linus Torvalds wrote:
> This seems to be a new problem at resume for the Intel btusb driver,
> but I'm not seeing anything in that driver itself that looks like a
> likely trigger, so I wonder if it's some driver core change, a generic
> resume path issue, or a workqueue change that has made it trigger for
> me.
> 
> It might also just be a timing difference, maybe it's always been there?
> 
> Does anybody have any ideas? It does't happen on every resume, and the
> machine works despite this (but no bluetooth - the *next* resume might
> bring it back, though).

Ah, it's not just me having this problem.  I don't see it happening in
4.12, and haven't had the time to bisect it.  I seem to be able to
trigger it every suspend/resume cycle, so I don't know if it's a timing
issue.



> 
>  Linus
> 
> --
> 
>   ACPI: Low-level resume complete
>   ACPI: EC: EC started
>   PM: Restoring platform NVS memory
>   Enabling non-boot CPUs ...
>   x86: Booting SMP configuration:
>   smpboot: Booting Node 0 Processor 1 APIC 0x2
>cache: parent cpu1 should not be sleeping
>   CPU1 is up
>   smpboot: Booting Node 0 Processor 2 APIC 0x1
>cache: parent cpu2 should not be sleeping
>   CPU2 is up
>   smpboot: Booting Node 0 Processor 3 APIC 0x3
>cache: parent cpu3 should not be sleeping
>   CPU3 is up
>   ACPI: Waking up from system sleep state S3
>   ACPI: EC: event unblocked
>   usb 1-3: reset full-speed USB device number 2 using xhci_hcd
>   usb 1-4: reset full-speed USB device number 3 using xhci_hcd
>   usb 1-5: reset high-speed USB device number 4 using xhci_hcd
>   usb 1-3:1.0: rebind failed: -517
>   usb 1-3:1.1: rebind failed: -517
>   Bluetooth: hci0: Bootloader revision 0.0 build 2 week 52 2014
>   OOM killer enabled.
>   Restarting tasks ...
>   Bluetooth: hci0: Device revision is 5
>   Bluetooth: hci0: Secure boot is enabled
>   Bluetooth: hci0: OTP lock is enabled
>   Bluetooth: hci0: API lock is enabled
>   Bluetooth: hci0: Debug lock is disabled
>   Bluetooth: hci0: Minimum firmware build 1 week 10 2014
>   firmware request while host is not available
>   [ cut here ]
>   WARNING: CPU: 2 PID: 621 at drivers/base/firmware_class.c:1250
> _request_firmware+0x460/0x790
>   CPU: 2 PID: 621 Comm: kworker/u9:2 Not tainted 4.13.0-10313-ge860d2c904d1 
> #11
>   Hardware name: Dell Inc. XPS 13 9350/09JHRY, BIOS 1.4.17 05/10/2017
>   Workqueue: hci0 hci_power_on [bluetooth]
>   task: 8d3767895ac0 task.stack: 9d3481efc000
>   RIP: 0010:_request_firmware+0x460/0x790
>   Call Trace:
>request_firmware+0x37/0x50
>btusb_setup_intel_new+0x227/0x7e0 [btusb]
>hci_dev_do_open+0x3da/0x570 [bluetooth]
>hci_power_on+0x52/0x1f0 [bluetooth]
>process_one_work+0x1db/0x3d0
>worker_thread+0x47/0x3e0
>kthread+0x125/0x140
>ret_from_fork+0x22/0x30
>   ---[ end trace 007b222491432927 ]---
>   Bluetooth: hci0: Failed to load Intel firmware file (-112)
>   [drm] RC6 on
>   done.
>   thermal thermal_zone11: failed to read out thermal zone (-5)
>   PM: suspend exit

Ah, I'll blame Luis for this, I think it might be due to 81f95076281f
("firmware: add sanity check on shutdown/suspend")

Luis, any ideas?  I'll try to revert this and try it out tomorrow when
I get a chance.

thanks,

greg k-h

1 2 3 4 5 6 >

1 - 100 of 526 matches

Mail list logo