Re: [PATCH] mmc: tegra: Mark 64 bit dma broken on Tegra186
On 08/09/17 22:48, Krishna Reddy wrote: > SDHCI controllers on Tegra186 support 40 bit addressing. > IOVA addresses are 48-bit wide on Tegra186. > SDHCI host common code sets dma mask as either 32-bit or 64-bit. > To avoid access issues when SMMU is enabled, disable 64-bit dma. > > Signed-off-by: Krishna ReddyAcked-by: Adrian Hunter > --- > drivers/mmc/host/sdhci-tegra.c | 10 +- > 1 file changed, 9 insertions(+), 1 deletion(-) > > diff --git a/drivers/mmc/host/sdhci-tegra.c b/drivers/mmc/host/sdhci-tegra.c > index 0cd6fa80db66..b877c13184c2 100644 > --- a/drivers/mmc/host/sdhci-tegra.c > +++ b/drivers/mmc/host/sdhci-tegra.c > @@ -422,7 +422,15 @@ static const struct sdhci_pltfm_data > sdhci_tegra186_pdata = { > SDHCI_QUIRK_NO_HISPD_BIT | > SDHCI_QUIRK_BROKEN_ADMA_ZEROLEN_DESC | > SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN, > - .quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN, > + .quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN | > +/* SDHCI controllers on Tegra186 support 40-bit addressing. > + * IOVA addresses are 48-bit wide on Tegra186. > + * With 64-bit dma mask used for SDHCI, accesses can > + * be broken. Disable 64-bit dma, which would fall back > + * to 32-bit dma mask. Ideally 40-bit dma mask would work, > + * But it is not supported as of now. > + */ > +SDHCI_QUIRK2_BROKEN_64_BIT_DMA, > .ops = _sdhci_ops, > }; > >
Re: [PATCH] mmc: tegra: Mark 64 bit dma broken on Tegra186
On 08/09/17 22:48, Krishna Reddy wrote: > SDHCI controllers on Tegra186 support 40 bit addressing. > IOVA addresses are 48-bit wide on Tegra186. > SDHCI host common code sets dma mask as either 32-bit or 64-bit. > To avoid access issues when SMMU is enabled, disable 64-bit dma. > > Signed-off-by: Krishna Reddy Acked-by: Adrian Hunter > --- > drivers/mmc/host/sdhci-tegra.c | 10 +- > 1 file changed, 9 insertions(+), 1 deletion(-) > > diff --git a/drivers/mmc/host/sdhci-tegra.c b/drivers/mmc/host/sdhci-tegra.c > index 0cd6fa80db66..b877c13184c2 100644 > --- a/drivers/mmc/host/sdhci-tegra.c > +++ b/drivers/mmc/host/sdhci-tegra.c > @@ -422,7 +422,15 @@ static const struct sdhci_pltfm_data > sdhci_tegra186_pdata = { > SDHCI_QUIRK_NO_HISPD_BIT | > SDHCI_QUIRK_BROKEN_ADMA_ZEROLEN_DESC | > SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN, > - .quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN, > + .quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN | > +/* SDHCI controllers on Tegra186 support 40-bit addressing. > + * IOVA addresses are 48-bit wide on Tegra186. > + * With 64-bit dma mask used for SDHCI, accesses can > + * be broken. Disable 64-bit dma, which would fall back > + * to 32-bit dma mask. Ideally 40-bit dma mask would work, > + * But it is not supported as of now. > + */ > +SDHCI_QUIRK2_BROKEN_64_BIT_DMA, > .ops = _sdhci_ops, > }; > >
Re: [PATCH] bnx2i: Clean up unused pointers in bnx2i_hwi
On 10/09/17 5:48 PM, "Christos Gkekas"wrote: >Pointers bnx2i_cmd are set but never used, so they can be removed. > >Signed-off-by: Christos Gkekas >--- > drivers/scsi/bnx2i/bnx2i_hwi.c | 10 -- > 1 file changed, 10 deletions(-) > >diff --git a/drivers/scsi/bnx2i/bnx2i_hwi.c >b/drivers/scsi/bnx2i/bnx2i_hwi.c >index 42921db..e3f22cb 100644 >--- a/drivers/scsi/bnx2i/bnx2i_hwi.c >+++ b/drivers/scsi/bnx2i/bnx2i_hwi.c >@@ -332,12 +332,10 @@ static void >bnx2i_ring_dbell_update_sq_params(struct bnx2i_conn *bnx2i_conn, > int bnx2i_send_iscsi_login(struct bnx2i_conn *bnx2i_conn, > struct iscsi_task *task) > { >- struct bnx2i_cmd *bnx2i_cmd; > struct bnx2i_login_request *login_wqe; > struct iscsi_login_req *login_hdr; > u32 dword; > >- bnx2i_cmd = (struct bnx2i_cmd *)task->dd_data; > login_hdr = (struct iscsi_login_req *)task->hdr; > login_wqe = (struct bnx2i_login_request *) > bnx2i_conn->ep->qp.sq_prod_qe; >@@ -391,12 +389,10 @@ int bnx2i_send_iscsi_tmf(struct bnx2i_conn >*bnx2i_conn, > struct iscsi_tm *tmfabort_hdr; > struct scsi_cmnd *ref_sc; > struct iscsi_task *ctask; >- struct bnx2i_cmd *bnx2i_cmd; > struct bnx2i_tmf_request *tmfabort_wqe; > u32 dword; > u32 scsi_lun[2]; > >- bnx2i_cmd = (struct bnx2i_cmd *)mtask->dd_data; > tmfabort_hdr = (struct iscsi_tm *)mtask->hdr; > tmfabort_wqe = (struct bnx2i_tmf_request *) > bnx2i_conn->ep->qp.sq_prod_qe; >@@ -463,12 +459,10 @@ int bnx2i_send_iscsi_tmf(struct bnx2i_conn >*bnx2i_conn, > int bnx2i_send_iscsi_text(struct bnx2i_conn *bnx2i_conn, > struct iscsi_task *mtask) > { >- struct bnx2i_cmd *bnx2i_cmd; > struct bnx2i_text_request *text_wqe; > struct iscsi_text *text_hdr; > u32 dword; > >- bnx2i_cmd = (struct bnx2i_cmd *)mtask->dd_data; > text_hdr = (struct iscsi_text *)mtask->hdr; > text_wqe = (struct bnx2i_text_request *) bnx2i_conn->ep->qp.sq_prod_qe; > >@@ -541,11 +535,9 @@ int bnx2i_send_iscsi_nopout(struct bnx2i_conn >*bnx2i_conn, > char *datap, int data_len, int unsol) > { > struct bnx2i_endpoint *ep = bnx2i_conn->ep; >- struct bnx2i_cmd *bnx2i_cmd; > struct bnx2i_nop_out_request *nopout_wqe; > struct iscsi_nopout *nopout_hdr; > >- bnx2i_cmd = (struct bnx2i_cmd *)task->dd_data; > nopout_hdr = (struct iscsi_nopout *)task->hdr; > nopout_wqe = (struct bnx2i_nop_out_request *)ep->qp.sq_prod_qe; > >@@ -602,11 +594,9 @@ int bnx2i_send_iscsi_nopout(struct bnx2i_conn >*bnx2i_conn, > int bnx2i_send_iscsi_logout(struct bnx2i_conn *bnx2i_conn, > struct iscsi_task *task) > { >- struct bnx2i_cmd *bnx2i_cmd; > struct bnx2i_logout_request *logout_wqe; > struct iscsi_logout *logout_hdr; > >- bnx2i_cmd = (struct bnx2i_cmd *)task->dd_data; > logout_hdr = (struct iscsi_logout *)task->hdr; > > logout_wqe = (struct bnx2i_logout_request *) >-- >2.7.4 Thanks, Acked-by: Manish Rangankar
Re: [PATCH 2/2] x86/hibernate/64: Mask off CR3's PCID bits in the saved CR3
* Andy Lutomirskiwrote: > On Fri, Sep 8, 2017 at 12:59 AM, Jiri Kosina wrote: > > On Thu, 7 Sep 2017, Andy Lutomirski wrote: > > > >> Jiri reported a resume-from-hibernation failure triggered by PCID. > >> The root cause appears to be rather odd. The hibernation asm > >> restores a CR3 value that comes from the image header. If the image > >> kernel has PCID on, it's entirely reasonable for this CR3 value to > >> have one of the low 12 bits set. The restore code restores it with > >> CR4.PCIDE=0, which means that those low 12 bits are accepted by the > >> CPU but are either ignored or interpreted as a caching mode. This > >> is odd, but still works. We blow up later when the image kernel > >> restores CR4, though, since changing CR4.PCIDE with CR3[11:0] != 0 > >> is illegal. Boom! > >> > >> FWIW, it's entirely unclear to me what's supposed to happen if a PAE > >> kernel restores a non-PAE image or vice versa. Ditto for LA57. > > > > I've just performed 15 hibernation cycles with current Linus' tree > > (5969d1bb3082) with these two patches applied on top of it, and I haven't > > encountered any issue (and the warning in switch_mm_irqs_off() didn't > > trigger either). > > > >> Reported-by: Jiri Kosina > >> Fixes: 660da7c9228f ("x86/mm: Enable CR4.PCIDE on supported systems") > >> Signed-off-by: Andy Lutomirski > > > > Tested-by: Jiri Kosina > > > > Ingo, please do *not* apply this patch yet. The code is fine, but the > comment is about to become wrong. I just found a nasty initialization > order issue, and I need to rework a bunch of the way we deal with > PCIDE. Ok, I'll delay everything PCID delayed - once you've gathered it all together please send a full series against Linus's latest collecting all the fixes/cleanups. If you find unexpected complications then there will be a point in time where it might be better to just disable PCID for this release and re-try in v4.15. As the number and complexity of fixes increases so does the risk that we'll introduce some last-minute regression. Thanks, Ingo
Re: [PATCH] bnx2i: Clean up unused pointers in bnx2i_hwi
On 10/09/17 5:48 PM, "Christos Gkekas" wrote: >Pointers bnx2i_cmd are set but never used, so they can be removed. > >Signed-off-by: Christos Gkekas >--- > drivers/scsi/bnx2i/bnx2i_hwi.c | 10 -- > 1 file changed, 10 deletions(-) > >diff --git a/drivers/scsi/bnx2i/bnx2i_hwi.c >b/drivers/scsi/bnx2i/bnx2i_hwi.c >index 42921db..e3f22cb 100644 >--- a/drivers/scsi/bnx2i/bnx2i_hwi.c >+++ b/drivers/scsi/bnx2i/bnx2i_hwi.c >@@ -332,12 +332,10 @@ static void >bnx2i_ring_dbell_update_sq_params(struct bnx2i_conn *bnx2i_conn, > int bnx2i_send_iscsi_login(struct bnx2i_conn *bnx2i_conn, > struct iscsi_task *task) > { >- struct bnx2i_cmd *bnx2i_cmd; > struct bnx2i_login_request *login_wqe; > struct iscsi_login_req *login_hdr; > u32 dword; > >- bnx2i_cmd = (struct bnx2i_cmd *)task->dd_data; > login_hdr = (struct iscsi_login_req *)task->hdr; > login_wqe = (struct bnx2i_login_request *) > bnx2i_conn->ep->qp.sq_prod_qe; >@@ -391,12 +389,10 @@ int bnx2i_send_iscsi_tmf(struct bnx2i_conn >*bnx2i_conn, > struct iscsi_tm *tmfabort_hdr; > struct scsi_cmnd *ref_sc; > struct iscsi_task *ctask; >- struct bnx2i_cmd *bnx2i_cmd; > struct bnx2i_tmf_request *tmfabort_wqe; > u32 dword; > u32 scsi_lun[2]; > >- bnx2i_cmd = (struct bnx2i_cmd *)mtask->dd_data; > tmfabort_hdr = (struct iscsi_tm *)mtask->hdr; > tmfabort_wqe = (struct bnx2i_tmf_request *) > bnx2i_conn->ep->qp.sq_prod_qe; >@@ -463,12 +459,10 @@ int bnx2i_send_iscsi_tmf(struct bnx2i_conn >*bnx2i_conn, > int bnx2i_send_iscsi_text(struct bnx2i_conn *bnx2i_conn, > struct iscsi_task *mtask) > { >- struct bnx2i_cmd *bnx2i_cmd; > struct bnx2i_text_request *text_wqe; > struct iscsi_text *text_hdr; > u32 dword; > >- bnx2i_cmd = (struct bnx2i_cmd *)mtask->dd_data; > text_hdr = (struct iscsi_text *)mtask->hdr; > text_wqe = (struct bnx2i_text_request *) bnx2i_conn->ep->qp.sq_prod_qe; > >@@ -541,11 +535,9 @@ int bnx2i_send_iscsi_nopout(struct bnx2i_conn >*bnx2i_conn, > char *datap, int data_len, int unsol) > { > struct bnx2i_endpoint *ep = bnx2i_conn->ep; >- struct bnx2i_cmd *bnx2i_cmd; > struct bnx2i_nop_out_request *nopout_wqe; > struct iscsi_nopout *nopout_hdr; > >- bnx2i_cmd = (struct bnx2i_cmd *)task->dd_data; > nopout_hdr = (struct iscsi_nopout *)task->hdr; > nopout_wqe = (struct bnx2i_nop_out_request *)ep->qp.sq_prod_qe; > >@@ -602,11 +594,9 @@ int bnx2i_send_iscsi_nopout(struct bnx2i_conn >*bnx2i_conn, > int bnx2i_send_iscsi_logout(struct bnx2i_conn *bnx2i_conn, > struct iscsi_task *task) > { >- struct bnx2i_cmd *bnx2i_cmd; > struct bnx2i_logout_request *logout_wqe; > struct iscsi_logout *logout_hdr; > >- bnx2i_cmd = (struct bnx2i_cmd *)task->dd_data; > logout_hdr = (struct iscsi_logout *)task->hdr; > > logout_wqe = (struct bnx2i_logout_request *) >-- >2.7.4 Thanks, Acked-by: Manish Rangankar
Re: [PATCH 2/2] x86/hibernate/64: Mask off CR3's PCID bits in the saved CR3
* Andy Lutomirski wrote: > On Fri, Sep 8, 2017 at 12:59 AM, Jiri Kosina wrote: > > On Thu, 7 Sep 2017, Andy Lutomirski wrote: > > > >> Jiri reported a resume-from-hibernation failure triggered by PCID. > >> The root cause appears to be rather odd. The hibernation asm > >> restores a CR3 value that comes from the image header. If the image > >> kernel has PCID on, it's entirely reasonable for this CR3 value to > >> have one of the low 12 bits set. The restore code restores it with > >> CR4.PCIDE=0, which means that those low 12 bits are accepted by the > >> CPU but are either ignored or interpreted as a caching mode. This > >> is odd, but still works. We blow up later when the image kernel > >> restores CR4, though, since changing CR4.PCIDE with CR3[11:0] != 0 > >> is illegal. Boom! > >> > >> FWIW, it's entirely unclear to me what's supposed to happen if a PAE > >> kernel restores a non-PAE image or vice versa. Ditto for LA57. > > > > I've just performed 15 hibernation cycles with current Linus' tree > > (5969d1bb3082) with these two patches applied on top of it, and I haven't > > encountered any issue (and the warning in switch_mm_irqs_off() didn't > > trigger either). > > > >> Reported-by: Jiri Kosina > >> Fixes: 660da7c9228f ("x86/mm: Enable CR4.PCIDE on supported systems") > >> Signed-off-by: Andy Lutomirski > > > > Tested-by: Jiri Kosina > > > > Ingo, please do *not* apply this patch yet. The code is fine, but the > comment is about to become wrong. I just found a nasty initialization > order issue, and I need to rework a bunch of the way we deal with > PCIDE. Ok, I'll delay everything PCID delayed - once you've gathered it all together please send a full series against Linus's latest collecting all the fixes/cleanups. If you find unexpected complications then there will be a point in time where it might be better to just disable PCID for this release and re-try in v4.15. As the number and complexity of fixes increases so does the risk that we'll introduce some last-minute regression. Thanks, Ingo
Re: [PATCH v2 1/2] completion: Add support for initializing completion with lockdep_map
On Fri, Sep 08, 2017 at 05:46:24PM +0900, Byungchul Park wrote: > To peter, > > Does it work? > > Chagnes from v1 > - Add a completion initialization function with a lockdep map > - Enhance readability of the workqueue code Do not you like it becasue it introduces another init_completion_xxx()? Anyway, I think the redundant acquisitions in the workqueue code should be removed. Right? > ->8- > >From e148617e20ebc9c9eefe7bb222b9bba07cb963bc Mon Sep 17 00:00:00 2001 > From: Byungchul Park> Date: Fri, 8 Sep 2017 17:39:48 +0900 > Subject: [PATCH v2 1/2] completion: Add support for initializing completion > with lockdep_map > > Sometimes, we want to initialize completions with sparate lockdep maps > to assign lock classes under control. For example, the workqueue code > manages lockdep maps, as it can classify lockdep maps properly. > Provided a function for that purpose. > > Signed-off-by: Byungchul Park > --- > include/linux/completion.h | 8 > 1 file changed, 8 insertions(+) > > diff --git a/include/linux/completion.h b/include/linux/completion.h > index cae5400..182d56e 100644 > --- a/include/linux/completion.h > +++ b/include/linux/completion.h > @@ -49,6 +49,13 @@ static inline void complete_release_commit(struct > completion *x) > lock_commit_crosslock((struct lockdep_map *)>map); > } > > +#define init_completion_with_map(x, m) > \ > +do { \ > + lockdep_init_map_crosslock((struct lockdep_map *)&(x)->map, \ > + (m)->name, (m)->key, 0); > \ > + __init_completion(x); \ > +} while (0) > + > #define init_completion(x) \ > do { \ > static struct lock_class_key __key; \ > @@ -58,6 +65,7 @@ static inline void complete_release_commit(struct > completion *x) > __init_completion(x); \ > } while (0) > #else > +#define init_completion_with_map(x, m) __init_completion(x) > #define init_completion(x) __init_completion(x) > static inline void complete_acquire(struct completion *x) {} > static inline void complete_release(struct completion *x) {} > -- > 1.9.1
Re: [PATCH v2 1/2] completion: Add support for initializing completion with lockdep_map
On Fri, Sep 08, 2017 at 05:46:24PM +0900, Byungchul Park wrote: > To peter, > > Does it work? > > Chagnes from v1 > - Add a completion initialization function with a lockdep map > - Enhance readability of the workqueue code Do not you like it becasue it introduces another init_completion_xxx()? Anyway, I think the redundant acquisitions in the workqueue code should be removed. Right? > ->8- > >From e148617e20ebc9c9eefe7bb222b9bba07cb963bc Mon Sep 17 00:00:00 2001 > From: Byungchul Park > Date: Fri, 8 Sep 2017 17:39:48 +0900 > Subject: [PATCH v2 1/2] completion: Add support for initializing completion > with lockdep_map > > Sometimes, we want to initialize completions with sparate lockdep maps > to assign lock classes under control. For example, the workqueue code > manages lockdep maps, as it can classify lockdep maps properly. > Provided a function for that purpose. > > Signed-off-by: Byungchul Park > --- > include/linux/completion.h | 8 > 1 file changed, 8 insertions(+) > > diff --git a/include/linux/completion.h b/include/linux/completion.h > index cae5400..182d56e 100644 > --- a/include/linux/completion.h > +++ b/include/linux/completion.h > @@ -49,6 +49,13 @@ static inline void complete_release_commit(struct > completion *x) > lock_commit_crosslock((struct lockdep_map *)>map); > } > > +#define init_completion_with_map(x, m) > \ > +do { \ > + lockdep_init_map_crosslock((struct lockdep_map *)&(x)->map, \ > + (m)->name, (m)->key, 0); > \ > + __init_completion(x); \ > +} while (0) > + > #define init_completion(x) \ > do { \ > static struct lock_class_key __key; \ > @@ -58,6 +65,7 @@ static inline void complete_release_commit(struct > completion *x) > __init_completion(x); \ > } while (0) > #else > +#define init_completion_with_map(x, m) __init_completion(x) > #define init_completion(x) __init_completion(x) > static inline void complete_acquire(struct completion *x) {} > static inline void complete_release(struct completion *x) {} > -- > 1.9.1
Re: [PATCH] sysrq : fix Show Regs call trace on ARM
On 09/11/2017, 05:11 AM, Jibin Xu wrote: ... > --- a/drivers/tty/sysrq.c > +++ b/drivers/tty/sysrq.c > @@ -245,8 +245,10 @@ static void sysrq_handle_showallcpus(int key) >* architecture has no support for it: >*/ > if (!trigger_all_cpu_backtrace()) { > - struct pt_regs *regs = get_irq_regs(); > + struct pt_regs *regs = NULL; > > + if (in_irq()) > + regs = get_irq_regs(); Maybe a stupid question: how does get_irq_regs() behave in the softirq context? I.e. what about s/in_irq/in_interrupt/? thanks, -- js suse labs
Re: [PATCH] sysrq : fix Show Regs call trace on ARM
On 09/11/2017, 05:11 AM, Jibin Xu wrote: ... > --- a/drivers/tty/sysrq.c > +++ b/drivers/tty/sysrq.c > @@ -245,8 +245,10 @@ static void sysrq_handle_showallcpus(int key) >* architecture has no support for it: >*/ > if (!trigger_all_cpu_backtrace()) { > - struct pt_regs *regs = get_irq_regs(); > + struct pt_regs *regs = NULL; > > + if (in_irq()) > + regs = get_irq_regs(); Maybe a stupid question: how does get_irq_regs() behave in the softirq context? I.e. what about s/in_irq/in_interrupt/? thanks, -- js suse labs
Re: btusb "firmware request while host is not available" at resume
Hi Linus, >> Yes 81f95076281f is to blame.. After reverting it all is fine again. >> >> 15 resume cycles on the one laptop , 10 on the other without to hit the >> trace. > > Yeah, I think that disable/enable_firmware in the suspend/resume path > is basically just completely random code. There is nothing that > serializes with anything else, and it has no actual basis for saying > "I am now disabled/enabled" except for some entirely random time of > whenever the suspend/resume callbacks happen. > > I'm going to revert it. > > I wonder why 4.13 seemed to work for me. Or maybe it didn't, and I > just didn't notice, because I didn't use bluetooth and I wasn't > traveling. I test my laptop every release, but I don't necessarily > always _use_ it. we have a bug report that might go into the similar direction. So the problem is that modern Bluetooth controller require full firmware loading (not just ROM patching) and if the controller has the firmware somehow already loaded, but then looses power and needs a reload, then it is not cached (since it was never requested in the first place). Of course if the reload triggers during resume when not file system is available, it can not request the firmware. That will just fail and thus you might see this issue. We have a few RFC patches on the mailing list that I have to review. It is however not that easy all the time to find the right firmware file (at least not for Intel hardware) since the boot loader provides different information than the fully operational firmware. So I need to make sure that request the right firmware (if we do not need it initially) so that it gets cached. Regards Marcel
Re: btusb "firmware request while host is not available" at resume
Hi Linus, >> Yes 81f95076281f is to blame.. After reverting it all is fine again. >> >> 15 resume cycles on the one laptop , 10 on the other without to hit the >> trace. > > Yeah, I think that disable/enable_firmware in the suspend/resume path > is basically just completely random code. There is nothing that > serializes with anything else, and it has no actual basis for saying > "I am now disabled/enabled" except for some entirely random time of > whenever the suspend/resume callbacks happen. > > I'm going to revert it. > > I wonder why 4.13 seemed to work for me. Or maybe it didn't, and I > just didn't notice, because I didn't use bluetooth and I wasn't > traveling. I test my laptop every release, but I don't necessarily > always _use_ it. we have a bug report that might go into the similar direction. So the problem is that modern Bluetooth controller require full firmware loading (not just ROM patching) and if the controller has the firmware somehow already loaded, but then looses power and needs a reload, then it is not cached (since it was never requested in the first place). Of course if the reload triggers during resume when not file system is available, it can not request the firmware. That will just fail and thus you might see this issue. We have a few RFC patches on the mailing list that I have to review. It is however not that easy all the time to find the right firmware file (at least not for Intel hardware) since the boot loader provides different information than the fully operational firmware. So I need to make sure that request the right firmware (if we do not need it initially) so that it gets cached. Regards Marcel
Questions about NVMEM
Hi Srinivas, I have 3 questions about the nvmem sybsystem. Please correct me if something is missing from my thought. (Q1) How to allocate struct nvmem_config? I see 3 ways in allocating struct nvmem_config. What is a good / bad practice? (A) Allocate statically in .data section bcm-ocotp.c imx-ocotp.c lpc18xx_eeprom.c lpc18xx_otp.c mxs-octop.c qfprom.c rockchip-efuse.c sunxi_sid vf610-ocotp.c meson-efuse.c (B) devm_kzalloc() imx-iim.c mtk-efuse.c drivers/misc/eeprom/at24.c (C) Stack drivers/thunderbolt/switch.c I think (A) is safe only when we know the system has just one instance of the device. (A) should not be used if two or more instances exist. Is this correct? I think (B) is wasting memory because nvmem_register() copies all members of nvmem_config to nvmem_device. nvmem_config is never dereferenced after nvmem_register() finished. I do not see much sense to keep it until the driver is detached. (C) looks reasonable because nvmem_config is pretty small. (sizeof(struct nvmem_config) = 104 byte on 64bit systems) Several subsystems receive configuration data from stack, for example, "struct clk_init_data" in clk drivers, "struct uart_8250_port" in 8250 serial drivers. sizeof(struct uart_8250_port) = 528 byte, but it is still working in stack. (Q2) Is nvmem_config::read_only necessary? If .reg_write() callback is set, it is probably writable. If .reg_write() is missing, it must be read-only. I have no idea when nvmem_config::read_only is useful... (Q3) The style of drivers/nvmem/Makefile This Makefile looks ugly to me. All nvmem drivers are just single file modules. Why are they renamed when modules are created? For the name-space reason for modules, prefix "nvmem-" makes sense to me. It is true that adding "nvmem-" prefix is redundant while they are located in drivers/nvmem/ directory, but renaming in the Makefile is even more annoying to me. Having said that, we may not want to churn this. -- Best Regards Masahiro Yamada
Questions about NVMEM
Hi Srinivas, I have 3 questions about the nvmem sybsystem. Please correct me if something is missing from my thought. (Q1) How to allocate struct nvmem_config? I see 3 ways in allocating struct nvmem_config. What is a good / bad practice? (A) Allocate statically in .data section bcm-ocotp.c imx-ocotp.c lpc18xx_eeprom.c lpc18xx_otp.c mxs-octop.c qfprom.c rockchip-efuse.c sunxi_sid vf610-ocotp.c meson-efuse.c (B) devm_kzalloc() imx-iim.c mtk-efuse.c drivers/misc/eeprom/at24.c (C) Stack drivers/thunderbolt/switch.c I think (A) is safe only when we know the system has just one instance of the device. (A) should not be used if two or more instances exist. Is this correct? I think (B) is wasting memory because nvmem_register() copies all members of nvmem_config to nvmem_device. nvmem_config is never dereferenced after nvmem_register() finished. I do not see much sense to keep it until the driver is detached. (C) looks reasonable because nvmem_config is pretty small. (sizeof(struct nvmem_config) = 104 byte on 64bit systems) Several subsystems receive configuration data from stack, for example, "struct clk_init_data" in clk drivers, "struct uart_8250_port" in 8250 serial drivers. sizeof(struct uart_8250_port) = 528 byte, but it is still working in stack. (Q2) Is nvmem_config::read_only necessary? If .reg_write() callback is set, it is probably writable. If .reg_write() is missing, it must be read-only. I have no idea when nvmem_config::read_only is useful... (Q3) The style of drivers/nvmem/Makefile This Makefile looks ugly to me. All nvmem drivers are just single file modules. Why are they renamed when modules are created? For the name-space reason for modules, prefix "nvmem-" makes sense to me. It is true that adding "nvmem-" prefix is redundant while they are located in drivers/nvmem/ directory, but renaming in the Makefile is even more annoying to me. Having said that, we may not want to churn this. -- Best Regards Masahiro Yamada
Re: btusb "firmware request while host is not available" at resume
On Sun, Sep 10, 2017 at 8:49 PM, Gabriel Cwrote: > > Yes 81f95076281f is to blame.. After reverting it all is fine again. > > 15 resume cycles on the one laptop , 10 on the other without to hit the > trace. Yeah, I think that disable/enable_firmware in the suspend/resume path is basically just completely random code. There is nothing that serializes with anything else, and it has no actual basis for saying "I am now disabled/enabled" except for some entirely random time of whenever the suspend/resume callbacks happen. I'm going to revert it. I wonder why 4.13 seemed to work for me. Or maybe it didn't, and I just didn't notice, because I didn't use bluetooth and I wasn't traveling. I test my laptop every release, but I don't necessarily always _use_ it. Linus
Re: btusb "firmware request while host is not available" at resume
On Sun, Sep 10, 2017 at 8:49 PM, Gabriel C wrote: > > Yes 81f95076281f is to blame.. After reverting it all is fine again. > > 15 resume cycles on the one laptop , 10 on the other without to hit the > trace. Yeah, I think that disable/enable_firmware in the suspend/resume path is basically just completely random code. There is nothing that serializes with anything else, and it has no actual basis for saying "I am now disabled/enabled" except for some entirely random time of whenever the suspend/resume callbacks happen. I'm going to revert it. I wonder why 4.13 seemed to work for me. Or maybe it didn't, and I just didn't notice, because I didn't use bluetooth and I wasn't traveling. I test my laptop every release, but I don't necessarily always _use_ it. Linus
Re: [RFC v2 PATCH] x86/boot: Add the secdata section to the setup header
On Fri, Sep 08, 2017 at 01:59:00PM -0700, h...@zytor.com wrote: > On September 8, 2017 2:45:10 AM PDT, Gary Linwrote: > >On Thu, Sep 07, 2017 at 02:16:21PM -0700, h...@zytor.com wrote: > >> On September 7, 2017 2:44:51 AM PDT, Gary Lin wrote: > >> >On Thu, Jun 01, 2017 at 08:46:26AM +, Ard Biesheuvel wrote: > >> >> On 1 June 2017 at 08:11, Gary Lin wrote: > >> >> > On Fri, May 12, 2017 at 04:05:34PM +0800, Gary Lin wrote: > >> >> >> A new section, secdata, in the setup header is introduced to > >store > >> >the > >> >> >> distro-specific security version which is designed to help the > >> >> >> bootloader to warn the user when loading a less secure or > >> >vulnerable > >> >> >> kernel. The secdata section can be presented as the following: > >> >> >> > >> >> >> struct sec_hdr { > >> >> >> __u16 header_length; > >> >> >> __u32 distro_version; > >> >> >> __u16 security_version; > >> >> >> } __attribute__((packed)); > >> >> >> char *signer; > >> >> >> > >> >> >> It consists of a fixed size structure and a null-terminated > >> >string. > >> >> >> "header_length" is the size of "struct sec_hdr" and can be used > >as > >> >the > >> >> >> offset to "signer". It also can be a kind of the "header > >version" > >> >to > >> >> >> detect if any new member is introduced. > >> >> >> > >> >> >> The kernel packager of the distribution can put the distro name > >in > >> >> >> "signer" and the distro version in "distro_version". When a > >severe > >> >> >> vulnerability is fixed, the packager increases > >"security_version" > >> >in > >> >> >> the kernel build afterward. The bootloader can maintain a list > >of > >> >the > >> >> >> security versions of the current kernels and only allows the > >> >kernel with > >> >> >> a higher or equal security version to boot. If the user is > >going > >> >to boot > >> >> >> a kernel with a lower security version, a warning should show > >to > >> >prevent > >> >> >> the user from loading a vulnerable kernel accidentally. > >> >> >> > >> >> >> Enabling UEFI Secure Boot is recommended when using the > >security > >> >version > >> >> >> or the attacker may alter the security version stealthily. > >> >> >> > >> >> > Any comment? > >> >> > > >> >> > >> >> This is now entirely x86-specific. My preference would be to have > >a > >> >> generic solution instead. > >> >> > >> >After check the headers again, another idea came to my mind: the > >MS-DOS > >> >stub. It's designed to show a warning while the image is loaded in > >> >DOS(*), > >> >but I wonder if it still matters. In the x86 linux efi header, the > >stub > >> >is just a 3-lines message, while arm64 completely ignores the stub. > >> > > >> >Since there is a offset to the PE header at 0x3c, we can > >theoretically > >> >put any thing between 0x40 and the PE header without affecting the > >> >current settings. > >> > > >> >HPA, > >> > > >> >Does the MS-DOS stub raise any concern to you? > >> > > >> >Thanks, > >> > > >> >Gary Lin > >> > > >> >(*) > >> > >>https://msdn.microsoft.com/zh-tw/library/windows/desktop/ms680547(v=vs.85).aspx#ms-dos_stub__image_only_ > >> > > >> >> -- > >> >> Ard. > >> >> > >> >> > >> >> >> v2: > >> >> >> - Decrease the size of secdata_offset to 2 bytes since the > >setup > >> >header > >> >> >> is limited to around 32KB. > >> >> >> - Restructure the secdata section. The signer is now a > >> >null-terminated > >> >> >> string. The type of distro_version changes to u32 in case the > >> >distro > >> >> >> uses a long version. > >> >> >> - Modify the Kconfig names and add help. > >> >> >> - Remove the signer name hack in build.c. > >> >> >> > >> >> >> Cc: Ard Biesheuvel > >> >> >> Cc: "H. Peter Anvin" > >> >> >> Cc: Thomas Gleixner > >> >> >> Cc: Ingo Molnar > >> >> >> Cc: Joey Lee > >> >> >> Signed-off-by: Gary Lin > >> >> >> --- > >[snip] > >> >> >> -- > >> >> >> 2.12.2 > >> >> >> > >> >> > >> > >> I really don't think that is a good idea. I would much rather keep > >this in a space we fully own. > >Fine. I'll find another place for ARM64 (probably append the structure > >right after the PE-header and denote the 2-byte offset in the reserved > >fields in the first 64 bytes header). > > > >Thanks, > > > >Gary Lin > > Another "safe" option would be to put it in a COFF segment; then it would be > system-independent. Creating a new COFF section looks promising. Thanks for pointing the direction. Gary Lin
Re: [RFC v2 PATCH] x86/boot: Add the secdata section to the setup header
On Fri, Sep 08, 2017 at 01:59:00PM -0700, h...@zytor.com wrote: > On September 8, 2017 2:45:10 AM PDT, Gary Lin wrote: > >On Thu, Sep 07, 2017 at 02:16:21PM -0700, h...@zytor.com wrote: > >> On September 7, 2017 2:44:51 AM PDT, Gary Lin wrote: > >> >On Thu, Jun 01, 2017 at 08:46:26AM +, Ard Biesheuvel wrote: > >> >> On 1 June 2017 at 08:11, Gary Lin wrote: > >> >> > On Fri, May 12, 2017 at 04:05:34PM +0800, Gary Lin wrote: > >> >> >> A new section, secdata, in the setup header is introduced to > >store > >> >the > >> >> >> distro-specific security version which is designed to help the > >> >> >> bootloader to warn the user when loading a less secure or > >> >vulnerable > >> >> >> kernel. The secdata section can be presented as the following: > >> >> >> > >> >> >> struct sec_hdr { > >> >> >> __u16 header_length; > >> >> >> __u32 distro_version; > >> >> >> __u16 security_version; > >> >> >> } __attribute__((packed)); > >> >> >> char *signer; > >> >> >> > >> >> >> It consists of a fixed size structure and a null-terminated > >> >string. > >> >> >> "header_length" is the size of "struct sec_hdr" and can be used > >as > >> >the > >> >> >> offset to "signer". It also can be a kind of the "header > >version" > >> >to > >> >> >> detect if any new member is introduced. > >> >> >> > >> >> >> The kernel packager of the distribution can put the distro name > >in > >> >> >> "signer" and the distro version in "distro_version". When a > >severe > >> >> >> vulnerability is fixed, the packager increases > >"security_version" > >> >in > >> >> >> the kernel build afterward. The bootloader can maintain a list > >of > >> >the > >> >> >> security versions of the current kernels and only allows the > >> >kernel with > >> >> >> a higher or equal security version to boot. If the user is > >going > >> >to boot > >> >> >> a kernel with a lower security version, a warning should show > >to > >> >prevent > >> >> >> the user from loading a vulnerable kernel accidentally. > >> >> >> > >> >> >> Enabling UEFI Secure Boot is recommended when using the > >security > >> >version > >> >> >> or the attacker may alter the security version stealthily. > >> >> >> > >> >> > Any comment? > >> >> > > >> >> > >> >> This is now entirely x86-specific. My preference would be to have > >a > >> >> generic solution instead. > >> >> > >> >After check the headers again, another idea came to my mind: the > >MS-DOS > >> >stub. It's designed to show a warning while the image is loaded in > >> >DOS(*), > >> >but I wonder if it still matters. In the x86 linux efi header, the > >stub > >> >is just a 3-lines message, while arm64 completely ignores the stub. > >> > > >> >Since there is a offset to the PE header at 0x3c, we can > >theoretically > >> >put any thing between 0x40 and the PE header without affecting the > >> >current settings. > >> > > >> >HPA, > >> > > >> >Does the MS-DOS stub raise any concern to you? > >> > > >> >Thanks, > >> > > >> >Gary Lin > >> > > >> >(*) > >> > >>https://msdn.microsoft.com/zh-tw/library/windows/desktop/ms680547(v=vs.85).aspx#ms-dos_stub__image_only_ > >> > > >> >> -- > >> >> Ard. > >> >> > >> >> > >> >> >> v2: > >> >> >> - Decrease the size of secdata_offset to 2 bytes since the > >setup > >> >header > >> >> >> is limited to around 32KB. > >> >> >> - Restructure the secdata section. The signer is now a > >> >null-terminated > >> >> >> string. The type of distro_version changes to u32 in case the > >> >distro > >> >> >> uses a long version. > >> >> >> - Modify the Kconfig names and add help. > >> >> >> - Remove the signer name hack in build.c. > >> >> >> > >> >> >> Cc: Ard Biesheuvel > >> >> >> Cc: "H. Peter Anvin" > >> >> >> Cc: Thomas Gleixner > >> >> >> Cc: Ingo Molnar > >> >> >> Cc: Joey Lee > >> >> >> Signed-off-by: Gary Lin > >> >> >> --- > >[snip] > >> >> >> -- > >> >> >> 2.12.2 > >> >> >> > >> >> > >> > >> I really don't think that is a good idea. I would much rather keep > >this in a space we fully own. > >Fine. I'll find another place for ARM64 (probably append the structure > >right after the PE-header and denote the 2-byte offset in the reserved > >fields in the first 64 bytes header). > > > >Thanks, > > > >Gary Lin > > Another "safe" option would be to put it in a COFF segment; then it would be > system-independent. Creating a new COFF section looks promising. Thanks for pointing the direction. Gary Lin
Re: btusb "firmware request while host is not available" at resume
On 11.09.2017 05:15, Gabriel C wrote: On 11.09.2017 03:25, Greg Kroah-Hartman wrote: On Sun, Sep 10, 2017 at 12:26:02PM -0700, Linus Torvalds wrote: This seems to be a new problem at resume for the Intel btusb driver, but I'm not seeing anything in that driver itself that looks like a likely trigger, so I wonder if it's some driver core change, a generic resume path issue, or a workqueue change that has made it trigger for me. It might also just be a timing difference, maybe it's always been there? Does anybody have any ideas? It does't happen on every resume, and the machine works despite this (but no bluetooth - the *next* resume might bring it back, though). Ah, it's not just me having this problem. I don't see it happening in 4.12, and haven't had the time to bisect it. I seem to be able to trigger it every suspend/resume cycle, so I don't know if it's a timing issue. I see the same problem with QCA hardware.. but a bit different. On first resume cycle the firmware call is fine but the adapter dies a bit later with : 'Bluetooth: hci0: Failed to send body at 4 of 1857 (-110)' On second resume cycle I hit the trace too. Linus -- ACPI: Low-level resume complete ACPI: EC: EC started PM: Restoring platform NVS memory Enabling non-boot CPUs ... x86: Booting SMP configuration: smpboot: Booting Node 0 Processor 1 APIC 0x2 cache: parent cpu1 should not be sleeping CPU1 is up smpboot: Booting Node 0 Processor 2 APIC 0x1 cache: parent cpu2 should not be sleeping CPU2 is up smpboot: Booting Node 0 Processor 3 APIC 0x3 cache: parent cpu3 should not be sleeping CPU3 is up ACPI: Waking up from system sleep state S3 ACPI: EC: event unblocked usb 1-3: reset full-speed USB device number 2 using xhci_hcd usb 1-4: reset full-speed USB device number 3 using xhci_hcd usb 1-5: reset high-speed USB device number 4 using xhci_hcd usb 1-3:1.0: rebind failed: -517 usb 1-3:1.1: rebind failed: -517 Bluetooth: hci0: Bootloader revision 0.0 build 2 week 52 2014 OOM killer enabled. Restarting tasks ... Bluetooth: hci0: Device revision is 5 Bluetooth: hci0: Secure boot is enabled Bluetooth: hci0: OTP lock is enabled Bluetooth: hci0: API lock is enabled Bluetooth: hci0: Debug lock is disabled Bluetooth: hci0: Minimum firmware build 1 week 10 2014 firmware request while host is not available [ cut here ] WARNING: CPU: 2 PID: 621 at drivers/base/firmware_class.c:1250 _request_firmware+0x460/0x790 CPU: 2 PID: 621 Comm: kworker/u9:2 Not tainted 4.13.0-10313-ge860d2c904d1 #11 Hardware name: Dell Inc. XPS 13 9350/09JHRY, BIOS 1.4.17 05/10/2017 Workqueue: hci0 hci_power_on [bluetooth] task: 8d3767895ac0 task.stack: 9d3481efc000 RIP: 0010:_request_firmware+0x460/0x790 Call Trace: request_firmware+0x37/0x50 btusb_setup_intel_new+0x227/0x7e0 [btusb] hci_dev_do_open+0x3da/0x570 [bluetooth] hci_power_on+0x52/0x1f0 [bluetooth] process_one_work+0x1db/0x3d0 worker_thread+0x47/0x3e0 kthread+0x125/0x140 ret_from_fork+0x22/0x30 ---[ end trace 007b222491432927 ]--- Bluetooth: hci0: Failed to load Intel firmware file (-112) [drm] RC6 on done. thermal thermal_zone11: failed to read out thermal zone (-5) PM: suspend exit Ah, I'll blame Luis for this, I think it might be due to 81f95076281f ("firmware: add sanity check on shutdown/suspend") Luis, any ideas? I'll try to revert this and try it out tomorrow when I get a chance. I can revert it an fire up some testing.. Yes 81f95076281f is to blame.. After reverting it all is fine again. 15 resume cycles on the one laptop , 10 on the other without to hit the trace. Regards
Re: btusb "firmware request while host is not available" at resume
On 11.09.2017 05:15, Gabriel C wrote: On 11.09.2017 03:25, Greg Kroah-Hartman wrote: On Sun, Sep 10, 2017 at 12:26:02PM -0700, Linus Torvalds wrote: This seems to be a new problem at resume for the Intel btusb driver, but I'm not seeing anything in that driver itself that looks like a likely trigger, so I wonder if it's some driver core change, a generic resume path issue, or a workqueue change that has made it trigger for me. It might also just be a timing difference, maybe it's always been there? Does anybody have any ideas? It does't happen on every resume, and the machine works despite this (but no bluetooth - the *next* resume might bring it back, though). Ah, it's not just me having this problem. I don't see it happening in 4.12, and haven't had the time to bisect it. I seem to be able to trigger it every suspend/resume cycle, so I don't know if it's a timing issue. I see the same problem with QCA hardware.. but a bit different. On first resume cycle the firmware call is fine but the adapter dies a bit later with : 'Bluetooth: hci0: Failed to send body at 4 of 1857 (-110)' On second resume cycle I hit the trace too. Linus -- ACPI: Low-level resume complete ACPI: EC: EC started PM: Restoring platform NVS memory Enabling non-boot CPUs ... x86: Booting SMP configuration: smpboot: Booting Node 0 Processor 1 APIC 0x2 cache: parent cpu1 should not be sleeping CPU1 is up smpboot: Booting Node 0 Processor 2 APIC 0x1 cache: parent cpu2 should not be sleeping CPU2 is up smpboot: Booting Node 0 Processor 3 APIC 0x3 cache: parent cpu3 should not be sleeping CPU3 is up ACPI: Waking up from system sleep state S3 ACPI: EC: event unblocked usb 1-3: reset full-speed USB device number 2 using xhci_hcd usb 1-4: reset full-speed USB device number 3 using xhci_hcd usb 1-5: reset high-speed USB device number 4 using xhci_hcd usb 1-3:1.0: rebind failed: -517 usb 1-3:1.1: rebind failed: -517 Bluetooth: hci0: Bootloader revision 0.0 build 2 week 52 2014 OOM killer enabled. Restarting tasks ... Bluetooth: hci0: Device revision is 5 Bluetooth: hci0: Secure boot is enabled Bluetooth: hci0: OTP lock is enabled Bluetooth: hci0: API lock is enabled Bluetooth: hci0: Debug lock is disabled Bluetooth: hci0: Minimum firmware build 1 week 10 2014 firmware request while host is not available [ cut here ] WARNING: CPU: 2 PID: 621 at drivers/base/firmware_class.c:1250 _request_firmware+0x460/0x790 CPU: 2 PID: 621 Comm: kworker/u9:2 Not tainted 4.13.0-10313-ge860d2c904d1 #11 Hardware name: Dell Inc. XPS 13 9350/09JHRY, BIOS 1.4.17 05/10/2017 Workqueue: hci0 hci_power_on [bluetooth] task: 8d3767895ac0 task.stack: 9d3481efc000 RIP: 0010:_request_firmware+0x460/0x790 Call Trace: request_firmware+0x37/0x50 btusb_setup_intel_new+0x227/0x7e0 [btusb] hci_dev_do_open+0x3da/0x570 [bluetooth] hci_power_on+0x52/0x1f0 [bluetooth] process_one_work+0x1db/0x3d0 worker_thread+0x47/0x3e0 kthread+0x125/0x140 ret_from_fork+0x22/0x30 ---[ end trace 007b222491432927 ]--- Bluetooth: hci0: Failed to load Intel firmware file (-112) [drm] RC6 on done. thermal thermal_zone11: failed to read out thermal zone (-5) PM: suspend exit Ah, I'll blame Luis for this, I think it might be due to 81f95076281f ("firmware: add sanity check on shutdown/suspend") Luis, any ideas? I'll try to revert this and try it out tomorrow when I get a chance. I can revert it an fire up some testing.. Yes 81f95076281f is to blame.. After reverting it all is fine again. 15 resume cycles on the one laptop , 10 on the other without to hit the trace. Regards
Re: [PATCH v2 3/5] dmaengine: zynqmp_ps_pcie: Adding PS PCIe DMA driver
Hi Ravi, [auto build test ERROR on linus/master] [also build test ERROR on v4.13 next-20170908] [cannot apply to xlnx/master] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Ravi-Shankar-Jonnalagadda/dmaengine-ZynqMP-PS-PCIe-DMA-driver/20170911-052150 config: arm64-allmodconfig (attached as .config) compiler: aarch64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=arm64 Note: the linux-review/Ravi-Shankar-Jonnalagadda/dmaengine-ZynqMP-PS-PCIe-DMA-driver/20170911-052150 HEAD 442a00e11c4fa28ed29541773946aa1cda153e7e builds fine. It only hurts bisectibility. All errors (new ones prefixed by >>): >> make[4]: *** No rule to make target 'drivers/dma/xilinx/ps_pcie_platform.o', >> needed by 'drivers/dma/xilinx/ps_pcie_dma.o'. make[4]: Target '__build' not remade because of errors. --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip
Re: [PATCH v2 3/5] dmaengine: zynqmp_ps_pcie: Adding PS PCIe DMA driver
Hi Ravi, [auto build test ERROR on linus/master] [also build test ERROR on v4.13 next-20170908] [cannot apply to xlnx/master] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Ravi-Shankar-Jonnalagadda/dmaengine-ZynqMP-PS-PCIe-DMA-driver/20170911-052150 config: arm64-allmodconfig (attached as .config) compiler: aarch64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=arm64 Note: the linux-review/Ravi-Shankar-Jonnalagadda/dmaengine-ZynqMP-PS-PCIe-DMA-driver/20170911-052150 HEAD 442a00e11c4fa28ed29541773946aa1cda153e7e builds fine. It only hurts bisectibility. All errors (new ones prefixed by >>): >> make[4]: *** No rule to make target 'drivers/dma/xilinx/ps_pcie_platform.o', >> needed by 'drivers/dma/xilinx/ps_pcie_dma.o'. make[4]: Target '__build' not remade because of errors. --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip
[PATCH] x86/processor: Remove unused declaration of __generic_processor_info()
Commit: 2b85b3d22920 ("x86/acpi: Restore the order of CPU IDs") ... reverted the __generic_processor_info() to generic_processor_info(), but forgot to remove its declaration in mpspec.h together. Remove the declaration and make the comments consistent. Signed-off-by: Dou Liyang--- arch/x86/include/asm/mpspec.h | 1 - arch/x86/kernel/apic/apic.c | 2 +- 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/x86/include/asm/mpspec.h b/arch/x86/include/asm/mpspec.h index 831eb78..c471ca1 100644 --- a/arch/x86/include/asm/mpspec.h +++ b/arch/x86/include/asm/mpspec.h @@ -86,7 +86,6 @@ static inline void e820__memblock_alloc_reserved_mpc_new(void) { } #endif int generic_processor_info(int apicid, int version); -int __generic_processor_info(int apicid, int version, bool enabled); #define PHYSID_ARRAY_SIZE BITS_TO_LONGS(MAX_LOCAL_APIC) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 8315e2f..d705c76 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2130,7 +2130,7 @@ int generic_processor_info(int apicid, int version) * Since fixing handling of boot_cpu_physical_apicid requires * another discussion and tests on each platform, we leave it * for now and here we use read_apic_id() directly in this -* function, __generic_processor_info(). +* function, generic_processor_info(). */ if (disabled_cpu_apicid != BAD_APICID && disabled_cpu_apicid != read_apic_id() && -- 2.5.5
[PATCH] x86/processor: Remove unused declaration of __generic_processor_info()
Commit: 2b85b3d22920 ("x86/acpi: Restore the order of CPU IDs") ... reverted the __generic_processor_info() to generic_processor_info(), but forgot to remove its declaration in mpspec.h together. Remove the declaration and make the comments consistent. Signed-off-by: Dou Liyang --- arch/x86/include/asm/mpspec.h | 1 - arch/x86/kernel/apic/apic.c | 2 +- 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/x86/include/asm/mpspec.h b/arch/x86/include/asm/mpspec.h index 831eb78..c471ca1 100644 --- a/arch/x86/include/asm/mpspec.h +++ b/arch/x86/include/asm/mpspec.h @@ -86,7 +86,6 @@ static inline void e820__memblock_alloc_reserved_mpc_new(void) { } #endif int generic_processor_info(int apicid, int version); -int __generic_processor_info(int apicid, int version, bool enabled); #define PHYSID_ARRAY_SIZE BITS_TO_LONGS(MAX_LOCAL_APIC) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 8315e2f..d705c76 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2130,7 +2130,7 @@ int generic_processor_info(int apicid, int version) * Since fixing handling of boot_cpu_physical_apicid requires * another discussion and tests on each platform, we leave it * for now and here we use read_apic_id() directly in this -* function, __generic_processor_info(). +* function, generic_processor_info(). */ if (disabled_cpu_apicid != BAD_APICID && disabled_cpu_apicid != read_apic_id() && -- 2.5.5
[PATCH] f2fs: fix double count on issued discard commands
If issue_cond is true, it does double count for # of issued commands. Signed-off-by: Jaegeuk Kim--- fs/f2fs/segment.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c index 7fd742f747ce..25196ff5d587 100644 --- a/fs/f2fs/segment.c +++ b/fs/f2fs/segment.c @@ -1087,7 +1087,7 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi, bool issue_cond) issued++; __submit_discard_cmd(sbi, dc); } - if (issue_cond && iter++ > DISCARD_ISSUE_RATE) + if (issue_cond && iter > DISCARD_ISSUE_RATE) goto out; } if (list_empty(pend_list) && dcc->pend_list_tag[i] & P_TRIM) -- 2.14.0.rc1.383.gd1ce394fe2-goog
[PATCH] f2fs: fix double count on issued discard commands
If issue_cond is true, it does double count for # of issued commands. Signed-off-by: Jaegeuk Kim --- fs/f2fs/segment.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c index 7fd742f747ce..25196ff5d587 100644 --- a/fs/f2fs/segment.c +++ b/fs/f2fs/segment.c @@ -1087,7 +1087,7 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi, bool issue_cond) issued++; __submit_discard_cmd(sbi, dc); } - if (issue_cond && iter++ > DISCARD_ISSUE_RATE) + if (issue_cond && iter > DISCARD_ISSUE_RATE) goto out; } if (list_empty(pend_list) && dcc->pend_list_tag[i] & P_TRIM) -- 2.14.0.rc1.383.gd1ce394fe2-goog
Re: WARNING: CPU: 2 PID: 4277 at lib/refcount.c:186
On Fri, 2017-09-08 at 11:46 -0700, Eric Dumazet wrote: > On Fri, 2017-09-08 at 10:21 -0700, Cong Wang wrote: > > (Cc'ing netdev) > > > > On Fri, Sep 8, 2017 at 5:59 AM, Shankara Pailoor> > wrote: > > > Hi, > > > > > > I found a warning while fuzzing with Syzkaller on linux 4.13-rc7 on > > > x86_64. The full stack trace is below: > > > > > > WARNING: CPU: 2 PID: 4277 at lib/refcount.c:186 > > > refcount_sub_and_test+0x167/0x1b0 lib/refcount.c:186 > > > Kernel panic - not syncing: panic_on_warn set ... > > > > > > CPU: 2 PID: 4277 Comm: syz-executor0 Not tainted 4.13.0-rc7 #3 > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > > > Ubuntu-1.8.2-1ubuntu1 04/01/2014 > > > Call Trace: > > > > > > __dump_stack lib/dump_stack.c:16 [inline] > > > dump_stack+0xf7/0x1aa lib/dump_stack.c:52 > > > panic+0x1ae/0x3a7 kernel/panic.c:180 > > > __warn+0x1c4/0x1d9 kernel/panic.c:541 > > > report_bug+0x211/0x2d0 lib/bug.c:183 > > > fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:190 > > > do_trap_no_signal arch/x86/kernel/traps.c:224 [inline] > > > do_trap+0x260/0x390 arch/x86/kernel/traps.c:273 > > > do_error_trap+0x118/0x340 arch/x86/kernel/traps.c:310 > > > do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:323 > > > invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:846 > > > RIP: 0010:refcount_sub_and_test+0x167/0x1b0 lib/refcount.c:186 > > > RSP: 0018:88006e006b60 EFLAGS: 00010286 > > > RAX: 0026 RBX: RCX: > > > RDX: 0026 RSI: 11000dc00d2c RDI: ed000dc00d60 > > > RBP: 88006e006bf0 R08: 0001 R09: > > > R10: R11: R12: 11000dc00d6d > > > R13: R14: 0001 R15: 88006ce9d340 > > > refcount_dec_and_test+0x1a/0x20 lib/refcount.c:211 > > > reqsk_put+0x71/0x2b0 include/net/request_sock.h:123 > > > tcp_v4_rcv+0x259e/0x2e20 net/ipv4/tcp_ipv4.c:1729 > > > ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216 > > > NF_HOOK include/linux/netfilter.h:248 [inline] > > > ip_local_deliver+0x1ce/0x6d0 net/ipv4/ip_input.c:257 > > > dst_input include/net/dst.h:477 [inline] > > > ip_rcv_finish+0x8db/0x19c0 net/ipv4/ip_input.c:397 > > > NF_HOOK include/linux/netfilter.h:248 [inline] > > > ip_rcv+0xc3f/0x17d0 net/ipv4/ip_input.c:488 > > > __netif_receive_skb_core+0x1fb7/0x31f0 net/core/dev.c:4298 > > > __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4336 > > > process_backlog+0x1c5/0x6d0 net/core/dev.c:5102 > > > napi_poll net/core/dev.c:5499 [inline] > > > net_rx_action+0x6d3/0x14a0 net/core/dev.c:5565 > > > __do_softirq+0x2cb/0xb2d kernel/softirq.c:284 > > > do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:898 > > > > > > do_softirq.part.16+0x63/0x80 kernel/softirq.c:328 > > > do_softirq kernel/softirq.c:176 [inline] > > > __local_bh_enable_ip+0x84/0x90 kernel/softirq.c:181 > > > local_bh_enable include/linux/bottom_half.h:31 [inline] > > > rcu_read_unlock_bh include/linux/rcupdate.h:705 [inline] > > > ip_finish_output2+0x8ad/0x1360 net/ipv4/ip_output.c:231 > > > ip_finish_output+0x74e/0xb80 net/ipv4/ip_output.c:317 > > > NF_HOOK_COND include/linux/netfilter.h:237 [inline] > > > ip_output+0x1cc/0x850 net/ipv4/ip_output.c:405 > > > dst_output include/net/dst.h:471 [inline] > > > ip_local_out+0x95/0x160 net/ipv4/ip_output.c:124 > > > ip_queue_xmit+0x8c6/0x1810 net/ipv4/ip_output.c:504 > > > tcp_transmit_skb+0x1963/0x3320 net/ipv4/tcp_output.c:1123 > > > tcp_send_ack.part.35+0x38c/0x620 net/ipv4/tcp_output.c:3575 > > > tcp_send_ack+0x49/0x60 net/ipv4/tcp_output.c:3545 > > > tcp_rcv_synsent_state_process net/ipv4/tcp_input.c:5795 [inline] > > > tcp_rcv_state_process+0x4876/0x4b60 net/ipv4/tcp_input.c:5930 > > > tcp_v4_do_rcv+0x58a/0x820 net/ipv4/tcp_ipv4.c:1483 > > > sk_backlog_rcv include/net/sock.h:907 [inline] > > > __release_sock+0x124/0x360 net/core/sock.c:2223 > > > release_sock+0xa4/0x2a0 net/core/sock.c:2715 > > > inet_wait_for_connect net/ipv4/af_inet.c:557 [inline] > > > __inet_stream_connect+0x671/0xf00 net/ipv4/af_inet.c:643 > > > inet_stream_connect+0x58/0xa0 net/ipv4/af_inet.c:682 > > > SYSC_connect+0x204/0x470 net/socket.c:1628 > > > SyS_connect+0x24/0x30 net/socket.c:1609 > > > entry_SYSCALL_64_fastpath+0x18/0xad > > > RIP: 0033:0x451e59 > > > RSP: 002b:7f474843fc08 EFLAGS: 0216 ORIG_RAX: 002a > > > RAX: ffda RBX: 00718000 RCX: 00451e59 > > > RDX: 0010 RSI: 20002000 RDI: 0007 > > > RBP: 0046 R08: R09: > > > R10: R11: 0216 R12: > > > R13: 7ffc040a0f8f R14: 7f47484409c0 R15: > > > > > > > > > > > > > > > I found that the following program is able to reproduce the warning: > > > > > > > > > Pastebin: https://pastebin.com/B75BdYKz > > > > > > Here are my configs:
Re: WARNING: CPU: 2 PID: 4277 at lib/refcount.c:186
On Fri, 2017-09-08 at 11:46 -0700, Eric Dumazet wrote: > On Fri, 2017-09-08 at 10:21 -0700, Cong Wang wrote: > > (Cc'ing netdev) > > > > On Fri, Sep 8, 2017 at 5:59 AM, Shankara Pailoor > > wrote: > > > Hi, > > > > > > I found a warning while fuzzing with Syzkaller on linux 4.13-rc7 on > > > x86_64. The full stack trace is below: > > > > > > WARNING: CPU: 2 PID: 4277 at lib/refcount.c:186 > > > refcount_sub_and_test+0x167/0x1b0 lib/refcount.c:186 > > > Kernel panic - not syncing: panic_on_warn set ... > > > > > > CPU: 2 PID: 4277 Comm: syz-executor0 Not tainted 4.13.0-rc7 #3 > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > > > Ubuntu-1.8.2-1ubuntu1 04/01/2014 > > > Call Trace: > > > > > > __dump_stack lib/dump_stack.c:16 [inline] > > > dump_stack+0xf7/0x1aa lib/dump_stack.c:52 > > > panic+0x1ae/0x3a7 kernel/panic.c:180 > > > __warn+0x1c4/0x1d9 kernel/panic.c:541 > > > report_bug+0x211/0x2d0 lib/bug.c:183 > > > fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:190 > > > do_trap_no_signal arch/x86/kernel/traps.c:224 [inline] > > > do_trap+0x260/0x390 arch/x86/kernel/traps.c:273 > > > do_error_trap+0x118/0x340 arch/x86/kernel/traps.c:310 > > > do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:323 > > > invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:846 > > > RIP: 0010:refcount_sub_and_test+0x167/0x1b0 lib/refcount.c:186 > > > RSP: 0018:88006e006b60 EFLAGS: 00010286 > > > RAX: 0026 RBX: RCX: > > > RDX: 0026 RSI: 11000dc00d2c RDI: ed000dc00d60 > > > RBP: 88006e006bf0 R08: 0001 R09: > > > R10: R11: R12: 11000dc00d6d > > > R13: R14: 0001 R15: 88006ce9d340 > > > refcount_dec_and_test+0x1a/0x20 lib/refcount.c:211 > > > reqsk_put+0x71/0x2b0 include/net/request_sock.h:123 > > > tcp_v4_rcv+0x259e/0x2e20 net/ipv4/tcp_ipv4.c:1729 > > > ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216 > > > NF_HOOK include/linux/netfilter.h:248 [inline] > > > ip_local_deliver+0x1ce/0x6d0 net/ipv4/ip_input.c:257 > > > dst_input include/net/dst.h:477 [inline] > > > ip_rcv_finish+0x8db/0x19c0 net/ipv4/ip_input.c:397 > > > NF_HOOK include/linux/netfilter.h:248 [inline] > > > ip_rcv+0xc3f/0x17d0 net/ipv4/ip_input.c:488 > > > __netif_receive_skb_core+0x1fb7/0x31f0 net/core/dev.c:4298 > > > __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4336 > > > process_backlog+0x1c5/0x6d0 net/core/dev.c:5102 > > > napi_poll net/core/dev.c:5499 [inline] > > > net_rx_action+0x6d3/0x14a0 net/core/dev.c:5565 > > > __do_softirq+0x2cb/0xb2d kernel/softirq.c:284 > > > do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:898 > > > > > > do_softirq.part.16+0x63/0x80 kernel/softirq.c:328 > > > do_softirq kernel/softirq.c:176 [inline] > > > __local_bh_enable_ip+0x84/0x90 kernel/softirq.c:181 > > > local_bh_enable include/linux/bottom_half.h:31 [inline] > > > rcu_read_unlock_bh include/linux/rcupdate.h:705 [inline] > > > ip_finish_output2+0x8ad/0x1360 net/ipv4/ip_output.c:231 > > > ip_finish_output+0x74e/0xb80 net/ipv4/ip_output.c:317 > > > NF_HOOK_COND include/linux/netfilter.h:237 [inline] > > > ip_output+0x1cc/0x850 net/ipv4/ip_output.c:405 > > > dst_output include/net/dst.h:471 [inline] > > > ip_local_out+0x95/0x160 net/ipv4/ip_output.c:124 > > > ip_queue_xmit+0x8c6/0x1810 net/ipv4/ip_output.c:504 > > > tcp_transmit_skb+0x1963/0x3320 net/ipv4/tcp_output.c:1123 > > > tcp_send_ack.part.35+0x38c/0x620 net/ipv4/tcp_output.c:3575 > > > tcp_send_ack+0x49/0x60 net/ipv4/tcp_output.c:3545 > > > tcp_rcv_synsent_state_process net/ipv4/tcp_input.c:5795 [inline] > > > tcp_rcv_state_process+0x4876/0x4b60 net/ipv4/tcp_input.c:5930 > > > tcp_v4_do_rcv+0x58a/0x820 net/ipv4/tcp_ipv4.c:1483 > > > sk_backlog_rcv include/net/sock.h:907 [inline] > > > __release_sock+0x124/0x360 net/core/sock.c:2223 > > > release_sock+0xa4/0x2a0 net/core/sock.c:2715 > > > inet_wait_for_connect net/ipv4/af_inet.c:557 [inline] > > > __inet_stream_connect+0x671/0xf00 net/ipv4/af_inet.c:643 > > > inet_stream_connect+0x58/0xa0 net/ipv4/af_inet.c:682 > > > SYSC_connect+0x204/0x470 net/socket.c:1628 > > > SyS_connect+0x24/0x30 net/socket.c:1609 > > > entry_SYSCALL_64_fastpath+0x18/0xad > > > RIP: 0033:0x451e59 > > > RSP: 002b:7f474843fc08 EFLAGS: 0216 ORIG_RAX: 002a > > > RAX: ffda RBX: 00718000 RCX: 00451e59 > > > RDX: 0010 RSI: 20002000 RDI: 0007 > > > RBP: 0046 R08: R09: > > > R10: R11: 0216 R12: > > > R13: 7ffc040a0f8f R14: 7f47484409c0 R15: > > > > > > > > > > > > > > > I found that the following program is able to reproduce the warning: > > > > > > > > > Pastebin: https://pastebin.com/B75BdYKz > > > > > > Here are my configs: https://pastebin.com/zRYCXbak
Re: iov_iter_pipe warning.
On Mon, Sep 11, 2017 at 10:31:13AM +1000, Dave Chinner wrote: > splice does not go down the direct IO path, so iomap_dio_actor() > should never be handled a pipe as the destination for the IO data. > Indeed, splice read has to supply the pages to be put into the pipe, > which the DIO path does not do - it requires pages be supplied to > it. So I'm not sure why we'd care about pipe destination limitations > in the DIO path? splice doesn't give a rat's arse for direct IO; it's up to filesystem. generic_file_splice_read() simply sets up a pipe-backed iov_iter and calls ->read_iter(), period. iov_iter_get_pages() for pipe-backed destination does page allocation and inserts freshly allocated pages into pipe. copy_to_iter() does the same + copies data; copy_page_to_iter() grabs an extra reference to page and inserts it into pipe, not that O_DIRECT ->read_iter() had been likely to use the last one. Normally O_DIRECT would work just fine - pages get allocated, references to them put into pipe cyclic buffer *and* into a number of bio, bio would get submitted and once the IO is completed we unlock the pipe, making those pages available for readers. With minimal care it works just fine - all you really need is * cope with failing copy_to_... / iov_iter_get_pages(). Short read if we'd already gotten something, -EFAULT otherwise. That goes for pipe-backed same as for iovec-backed - any ->read_iter() that fails to handle that is already in trouble. * make sure that iov_iter_get_pages()/iov_iter_get_pages_alloc() is followed by iov_iter_advance() for the amount you've actually filled, before any subsequent copy_to_iter()/copy_page_to_iter() or return from ->read_iter(), whichever comes first. That includes the situation when you actually hadn't filled anything at all - just remember to do iov_iter_advance(to, 0) in that case. That's about the only extra requirement imposed by pipes and it's not hard to satisfy. Combination of iov_iter_advance() with iov_iter_revert() works as usual. Normally a filesystem doesn't need to care about splice at all - just use generic_file_splice_read() and be done with that. It will use the normal ->read_iter(), with whatever locking, etc., your filesystem would do on a normal read.
Re: iov_iter_pipe warning.
On Mon, Sep 11, 2017 at 10:31:13AM +1000, Dave Chinner wrote: > splice does not go down the direct IO path, so iomap_dio_actor() > should never be handled a pipe as the destination for the IO data. > Indeed, splice read has to supply the pages to be put into the pipe, > which the DIO path does not do - it requires pages be supplied to > it. So I'm not sure why we'd care about pipe destination limitations > in the DIO path? splice doesn't give a rat's arse for direct IO; it's up to filesystem. generic_file_splice_read() simply sets up a pipe-backed iov_iter and calls ->read_iter(), period. iov_iter_get_pages() for pipe-backed destination does page allocation and inserts freshly allocated pages into pipe. copy_to_iter() does the same + copies data; copy_page_to_iter() grabs an extra reference to page and inserts it into pipe, not that O_DIRECT ->read_iter() had been likely to use the last one. Normally O_DIRECT would work just fine - pages get allocated, references to them put into pipe cyclic buffer *and* into a number of bio, bio would get submitted and once the IO is completed we unlock the pipe, making those pages available for readers. With minimal care it works just fine - all you really need is * cope with failing copy_to_... / iov_iter_get_pages(). Short read if we'd already gotten something, -EFAULT otherwise. That goes for pipe-backed same as for iovec-backed - any ->read_iter() that fails to handle that is already in trouble. * make sure that iov_iter_get_pages()/iov_iter_get_pages_alloc() is followed by iov_iter_advance() for the amount you've actually filled, before any subsequent copy_to_iter()/copy_page_to_iter() or return from ->read_iter(), whichever comes first. That includes the situation when you actually hadn't filled anything at all - just remember to do iov_iter_advance(to, 0) in that case. That's about the only extra requirement imposed by pipes and it's not hard to satisfy. Combination of iov_iter_advance() with iov_iter_revert() works as usual. Normally a filesystem doesn't need to care about splice at all - just use generic_file_splice_read() and be done with that. It will use the normal ->read_iter(), with whatever locking, etc., your filesystem would do on a normal read.
Re: [RFC PATCH v5 0/5] vfio-pci: Add support for mmapping MSI-X table
On 29/08/17 12:58, Alexey Kardashevskiy wrote: > On 21/08/17 12:47, Alexey Kardashevskiy wrote: >> Folks, >> >> Ok, people did talk, exchanged ideas, lovely :) What happens now? Do I >> repost this or go back to PCI bus flags or something else? Thanks. > > > Anyone, any help? How do we proceed with this? Thanks. Ping, anyone? Or everybody is just waiting for the kvm forum to discuss it in person? Thanks. > > > >> >> >> >> On 14/08/17 19:45, Alexey Kardashevskiy wrote: >>> Folks, >>> >>> Is there anything to change besides those compiler errors and David's >>> comment in 5/5? Or the while patchset is too bad? Thanks. >>> >>> >>> >>> On 07/08/17 17:25, Alexey Kardashevskiy wrote: This is a followup for "[PATCH kernel v4 0/6] vfio-pci: Add support for mmapping MSI-X table" http://www.spinics.net/lists/kvm/msg152232.html This time it is using "caps" in IOMMU groups. The main question is if PCI bus flags or IOMMU domains are still better (and which one). >>> Here is some background: Current vfio-pci implementation disallows to mmap the page containing MSI-X table in case that users can write directly to MSI-X table and generate an incorrect MSIs. However, this will cause some performance issue when there are some critical device registers in the same page as the MSI-X table. We have to handle the mmio access to these registers in QEMU emulation rather than in guest. To solve this issue, this series allows to expose MSI-X table to userspace when hardware enables the capability of interrupt remapping which can ensure that a given PCI device can only shoot the MSIs assigned for it. And we introduce a new bus_flags PCI_BUS_FLAGS_MSI_REMAP to test this capability on PCI side for different archs. This is based on sha1 26c5cebfdb6c "Merge branch 'parisc-4.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux" Please comment. Thanks. Changelog: v5: * redid the whole thing via so-called IOMMU group capabilities v4: * rebased on recent upstream * got all 6 patches from v2 (v3 was missing some) Alexey Kardashevskiy (5): iommu: Add capabilities to a group iommu: Set IOMMU_GROUP_CAP_ISOLATE_MSIX if MSI controller enables IRQ remapping iommu/intel/amd: Set IOMMU_GROUP_CAP_ISOLATE_MSIX if IRQ remapping is enabled powerpc/iommu: Set IOMMU_GROUP_CAP_ISOLATE_MSIX vfio-pci: Allow to expose MSI-X table to userspace when safe include/linux/iommu.h| 20 include/linux/vfio.h | 1 + arch/powerpc/kernel/iommu.c | 1 + drivers/iommu/amd_iommu.c| 3 +++ drivers/iommu/intel-iommu.c | 3 +++ drivers/iommu/iommu.c| 35 +++ drivers/vfio/pci/vfio_pci.c | 20 +--- drivers/vfio/pci/vfio_pci_rdwr.c | 5 - drivers/vfio/vfio.c | 15 +++ 9 files changed, 99 insertions(+), 4 deletions(-) >>> >>> >> >> > > -- Alexey
Re: [RFC PATCH v5 0/5] vfio-pci: Add support for mmapping MSI-X table
On 29/08/17 12:58, Alexey Kardashevskiy wrote: > On 21/08/17 12:47, Alexey Kardashevskiy wrote: >> Folks, >> >> Ok, people did talk, exchanged ideas, lovely :) What happens now? Do I >> repost this or go back to PCI bus flags or something else? Thanks. > > > Anyone, any help? How do we proceed with this? Thanks. Ping, anyone? Or everybody is just waiting for the kvm forum to discuss it in person? Thanks. > > > >> >> >> >> On 14/08/17 19:45, Alexey Kardashevskiy wrote: >>> Folks, >>> >>> Is there anything to change besides those compiler errors and David's >>> comment in 5/5? Or the while patchset is too bad? Thanks. >>> >>> >>> >>> On 07/08/17 17:25, Alexey Kardashevskiy wrote: This is a followup for "[PATCH kernel v4 0/6] vfio-pci: Add support for mmapping MSI-X table" http://www.spinics.net/lists/kvm/msg152232.html This time it is using "caps" in IOMMU groups. The main question is if PCI bus flags or IOMMU domains are still better (and which one). >>> Here is some background: Current vfio-pci implementation disallows to mmap the page containing MSI-X table in case that users can write directly to MSI-X table and generate an incorrect MSIs. However, this will cause some performance issue when there are some critical device registers in the same page as the MSI-X table. We have to handle the mmio access to these registers in QEMU emulation rather than in guest. To solve this issue, this series allows to expose MSI-X table to userspace when hardware enables the capability of interrupt remapping which can ensure that a given PCI device can only shoot the MSIs assigned for it. And we introduce a new bus_flags PCI_BUS_FLAGS_MSI_REMAP to test this capability on PCI side for different archs. This is based on sha1 26c5cebfdb6c "Merge branch 'parisc-4.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux" Please comment. Thanks. Changelog: v5: * redid the whole thing via so-called IOMMU group capabilities v4: * rebased on recent upstream * got all 6 patches from v2 (v3 was missing some) Alexey Kardashevskiy (5): iommu: Add capabilities to a group iommu: Set IOMMU_GROUP_CAP_ISOLATE_MSIX if MSI controller enables IRQ remapping iommu/intel/amd: Set IOMMU_GROUP_CAP_ISOLATE_MSIX if IRQ remapping is enabled powerpc/iommu: Set IOMMU_GROUP_CAP_ISOLATE_MSIX vfio-pci: Allow to expose MSI-X table to userspace when safe include/linux/iommu.h| 20 include/linux/vfio.h | 1 + arch/powerpc/kernel/iommu.c | 1 + drivers/iommu/amd_iommu.c| 3 +++ drivers/iommu/intel-iommu.c | 3 +++ drivers/iommu/iommu.c| 35 +++ drivers/vfio/pci/vfio_pci.c | 20 +--- drivers/vfio/pci/vfio_pci_rdwr.c | 5 - drivers/vfio/vfio.c | 15 +++ 9 files changed, 99 insertions(+), 4 deletions(-) >>> >>> >> >> > > -- Alexey
[PATCH v3] dt-bindings: add nutsboard vendor prefix
From: YuanCheng ChengNutsBoard (http://nutsboard.org/) is an organization and focus on ARM based embedded boards with open source software. Signed-off-by: YuanCheng Cheng --- Documentation/devicetree/bindings/vendor-prefixes.txt | 1 + 1 file changed, 1 insertion(+) diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt index 401ed98..3a16be8 100644 --- a/Documentation/devicetree/bindings/vendor-prefixes.txt +++ b/Documentation/devicetree/bindings/vendor-prefixes.txt @@ -234,6 +234,7 @@ nintendoNintendo nltNLT Technologies, Ltd. nokia Nokia nordic Nordic Semiconductor +nutsboard NutsBoard nuvotonNuvoton Technology Corporation nvdNew Vision Display nvidia NVIDIA -- 2.7.4
[PATCH v3] dt-bindings: add nutsboard vendor prefix
From: YuanCheng Cheng NutsBoard (http://nutsboard.org/) is an organization and focus on ARM based embedded boards with open source software. Signed-off-by: YuanCheng Cheng --- Documentation/devicetree/bindings/vendor-prefixes.txt | 1 + 1 file changed, 1 insertion(+) diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt index 401ed98..3a16be8 100644 --- a/Documentation/devicetree/bindings/vendor-prefixes.txt +++ b/Documentation/devicetree/bindings/vendor-prefixes.txt @@ -234,6 +234,7 @@ nintendoNintendo nltNLT Technologies, Ltd. nokia Nokia nordic Nordic Semiconductor +nutsboard NutsBoard nuvotonNuvoton Technology Corporation nvdNew Vision Display nvidia NVIDIA -- 2.7.4
Re: btusb "firmware request while host is not available" at resume
On 11.09.2017 03:25, Greg Kroah-Hartman wrote: On Sun, Sep 10, 2017 at 12:26:02PM -0700, Linus Torvalds wrote: This seems to be a new problem at resume for the Intel btusb driver, but I'm not seeing anything in that driver itself that looks like a likely trigger, so I wonder if it's some driver core change, a generic resume path issue, or a workqueue change that has made it trigger for me. It might also just be a timing difference, maybe it's always been there? Does anybody have any ideas? It does't happen on every resume, and the machine works despite this (but no bluetooth - the *next* resume might bring it back, though). Ah, it's not just me having this problem. I don't see it happening in 4.12, and haven't had the time to bisect it. I seem to be able to trigger it every suspend/resume cycle, so I don't know if it's a timing issue. I see the same problem with QCA hardware.. but a bit different. On first resume cycle the firmware call is fine but the adapter dies a bit later with : 'Bluetooth: hci0: Failed to send body at 4 of 1857 (-110)' On second resume cycle I hit the trace too. Linus -- ACPI: Low-level resume complete ACPI: EC: EC started PM: Restoring platform NVS memory Enabling non-boot CPUs ... x86: Booting SMP configuration: smpboot: Booting Node 0 Processor 1 APIC 0x2 cache: parent cpu1 should not be sleeping CPU1 is up smpboot: Booting Node 0 Processor 2 APIC 0x1 cache: parent cpu2 should not be sleeping CPU2 is up smpboot: Booting Node 0 Processor 3 APIC 0x3 cache: parent cpu3 should not be sleeping CPU3 is up ACPI: Waking up from system sleep state S3 ACPI: EC: event unblocked usb 1-3: reset full-speed USB device number 2 using xhci_hcd usb 1-4: reset full-speed USB device number 3 using xhci_hcd usb 1-5: reset high-speed USB device number 4 using xhci_hcd usb 1-3:1.0: rebind failed: -517 usb 1-3:1.1: rebind failed: -517 Bluetooth: hci0: Bootloader revision 0.0 build 2 week 52 2014 OOM killer enabled. Restarting tasks ... Bluetooth: hci0: Device revision is 5 Bluetooth: hci0: Secure boot is enabled Bluetooth: hci0: OTP lock is enabled Bluetooth: hci0: API lock is enabled Bluetooth: hci0: Debug lock is disabled Bluetooth: hci0: Minimum firmware build 1 week 10 2014 firmware request while host is not available [ cut here ] WARNING: CPU: 2 PID: 621 at drivers/base/firmware_class.c:1250 _request_firmware+0x460/0x790 CPU: 2 PID: 621 Comm: kworker/u9:2 Not tainted 4.13.0-10313-ge860d2c904d1 #11 Hardware name: Dell Inc. XPS 13 9350/09JHRY, BIOS 1.4.17 05/10/2017 Workqueue: hci0 hci_power_on [bluetooth] task: 8d3767895ac0 task.stack: 9d3481efc000 RIP: 0010:_request_firmware+0x460/0x790 Call Trace: request_firmware+0x37/0x50 btusb_setup_intel_new+0x227/0x7e0 [btusb] hci_dev_do_open+0x3da/0x570 [bluetooth] hci_power_on+0x52/0x1f0 [bluetooth] process_one_work+0x1db/0x3d0 worker_thread+0x47/0x3e0 kthread+0x125/0x140 ret_from_fork+0x22/0x30 ---[ end trace 007b222491432927 ]--- Bluetooth: hci0: Failed to load Intel firmware file (-112) [drm] RC6 on done. thermal thermal_zone11: failed to read out thermal zone (-5) PM: suspend exit Ah, I'll blame Luis for this, I think it might be due to 81f95076281f ("firmware: add sanity check on shutdown/suspend") Luis, any ideas? I'll try to revert this and try it out tomorrow when I get a chance. I can revert it an fire up some testing.. This time with 4.13.x I hit all kind bugs on this box anyway :)
Re: btusb "firmware request while host is not available" at resume
On 11.09.2017 03:25, Greg Kroah-Hartman wrote: On Sun, Sep 10, 2017 at 12:26:02PM -0700, Linus Torvalds wrote: This seems to be a new problem at resume for the Intel btusb driver, but I'm not seeing anything in that driver itself that looks like a likely trigger, so I wonder if it's some driver core change, a generic resume path issue, or a workqueue change that has made it trigger for me. It might also just be a timing difference, maybe it's always been there? Does anybody have any ideas? It does't happen on every resume, and the machine works despite this (but no bluetooth - the *next* resume might bring it back, though). Ah, it's not just me having this problem. I don't see it happening in 4.12, and haven't had the time to bisect it. I seem to be able to trigger it every suspend/resume cycle, so I don't know if it's a timing issue. I see the same problem with QCA hardware.. but a bit different. On first resume cycle the firmware call is fine but the adapter dies a bit later with : 'Bluetooth: hci0: Failed to send body at 4 of 1857 (-110)' On second resume cycle I hit the trace too. Linus -- ACPI: Low-level resume complete ACPI: EC: EC started PM: Restoring platform NVS memory Enabling non-boot CPUs ... x86: Booting SMP configuration: smpboot: Booting Node 0 Processor 1 APIC 0x2 cache: parent cpu1 should not be sleeping CPU1 is up smpboot: Booting Node 0 Processor 2 APIC 0x1 cache: parent cpu2 should not be sleeping CPU2 is up smpboot: Booting Node 0 Processor 3 APIC 0x3 cache: parent cpu3 should not be sleeping CPU3 is up ACPI: Waking up from system sleep state S3 ACPI: EC: event unblocked usb 1-3: reset full-speed USB device number 2 using xhci_hcd usb 1-4: reset full-speed USB device number 3 using xhci_hcd usb 1-5: reset high-speed USB device number 4 using xhci_hcd usb 1-3:1.0: rebind failed: -517 usb 1-3:1.1: rebind failed: -517 Bluetooth: hci0: Bootloader revision 0.0 build 2 week 52 2014 OOM killer enabled. Restarting tasks ... Bluetooth: hci0: Device revision is 5 Bluetooth: hci0: Secure boot is enabled Bluetooth: hci0: OTP lock is enabled Bluetooth: hci0: API lock is enabled Bluetooth: hci0: Debug lock is disabled Bluetooth: hci0: Minimum firmware build 1 week 10 2014 firmware request while host is not available [ cut here ] WARNING: CPU: 2 PID: 621 at drivers/base/firmware_class.c:1250 _request_firmware+0x460/0x790 CPU: 2 PID: 621 Comm: kworker/u9:2 Not tainted 4.13.0-10313-ge860d2c904d1 #11 Hardware name: Dell Inc. XPS 13 9350/09JHRY, BIOS 1.4.17 05/10/2017 Workqueue: hci0 hci_power_on [bluetooth] task: 8d3767895ac0 task.stack: 9d3481efc000 RIP: 0010:_request_firmware+0x460/0x790 Call Trace: request_firmware+0x37/0x50 btusb_setup_intel_new+0x227/0x7e0 [btusb] hci_dev_do_open+0x3da/0x570 [bluetooth] hci_power_on+0x52/0x1f0 [bluetooth] process_one_work+0x1db/0x3d0 worker_thread+0x47/0x3e0 kthread+0x125/0x140 ret_from_fork+0x22/0x30 ---[ end trace 007b222491432927 ]--- Bluetooth: hci0: Failed to load Intel firmware file (-112) [drm] RC6 on done. thermal thermal_zone11: failed to read out thermal zone (-5) PM: suspend exit Ah, I'll blame Luis for this, I think it might be due to 81f95076281f ("firmware: add sanity check on shutdown/suspend") Luis, any ideas? I'll try to revert this and try it out tomorrow when I get a chance. I can revert it an fire up some testing.. This time with 4.13.x I hit all kind bugs on this box anyway :)
[PATCH] sysrq : fix Show Regs call trace on ARM
When kernel configuration SMP,PREEMPT and DEBUG_PREEMPT are enabled, echo 1 >/proc/sys/kernel/sysrq echo p >/proc/sysrq-trigger kernel will print call trace as below: sysrq: SysRq : Show Regs BUG: using __this_cpu_read() in preemptible [] code: sh/435 caller is __this_cpu_preempt_check+0x18/0x20 Call trace: [] dump_backtrace+0x0/0x1d0 [] show_stack+0x24/0x30 [] dump_stack+0x90/0xb0 [] check_preemption_disabled+0x100/0x108 [] __this_cpu_preempt_check+0x18/0x20 [] sysrq_handle_showregs+0x1c/0x40 [] __handle_sysrq+0x12c/0x1a0 [] write_sysrq_trigger+0x60/0x70 [] proc_reg_write+0x90/0xd0 [] __vfs_write+0x48/0x90 [] vfs_write+0xa4/0x190 [] SyS_write+0x54/0xb0 [] el0_svc_naked+0x24/0x28 This can be seen on a common board like an r-pi3. This happens because when echo p >/proc/sysrq-trigger, get_irq_regs() is called outside of IRQ context, if preemption is enabled in this situation,kernel will print the call trace. Since many prior discussions on the mailing lists have made it clear that get_irq_regs either just returns NULL or stale data when used outside of IRQ context,we simply avoid calling it outside of IRQ context. Signed-off-by: Jibin Xu--- drivers/tty/sysrq.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c index 3ffc1ce29023..6ed8c47312a8 100644 --- a/drivers/tty/sysrq.c +++ b/drivers/tty/sysrq.c @@ -245,8 +245,10 @@ static void sysrq_handle_showallcpus(int key) * architecture has no support for it: */ if (!trigger_all_cpu_backtrace()) { - struct pt_regs *regs = get_irq_regs(); + struct pt_regs *regs = NULL; + if (in_irq()) + regs = get_irq_regs(); if (regs) { pr_info("CPU%d:\n", smp_processor_id()); show_regs(regs); @@ -265,7 +267,10 @@ static struct sysrq_key_op sysrq_showallcpus_op = { static void sysrq_handle_showregs(int key) { - struct pt_regs *regs = get_irq_regs(); + struct pt_regs *regs = NULL; + + if (in_irq()) + regs = get_irq_regs(); if (regs) show_regs(regs); perf_event_print_debug(); -- 2.13.0
[PATCH] sysrq : fix Show Regs call trace on ARM
When kernel configuration SMP,PREEMPT and DEBUG_PREEMPT are enabled, echo 1 >/proc/sys/kernel/sysrq echo p >/proc/sysrq-trigger kernel will print call trace as below: sysrq: SysRq : Show Regs BUG: using __this_cpu_read() in preemptible [] code: sh/435 caller is __this_cpu_preempt_check+0x18/0x20 Call trace: [] dump_backtrace+0x0/0x1d0 [] show_stack+0x24/0x30 [] dump_stack+0x90/0xb0 [] check_preemption_disabled+0x100/0x108 [] __this_cpu_preempt_check+0x18/0x20 [] sysrq_handle_showregs+0x1c/0x40 [] __handle_sysrq+0x12c/0x1a0 [] write_sysrq_trigger+0x60/0x70 [] proc_reg_write+0x90/0xd0 [] __vfs_write+0x48/0x90 [] vfs_write+0xa4/0x190 [] SyS_write+0x54/0xb0 [] el0_svc_naked+0x24/0x28 This can be seen on a common board like an r-pi3. This happens because when echo p >/proc/sysrq-trigger, get_irq_regs() is called outside of IRQ context, if preemption is enabled in this situation,kernel will print the call trace. Since many prior discussions on the mailing lists have made it clear that get_irq_regs either just returns NULL or stale data when used outside of IRQ context,we simply avoid calling it outside of IRQ context. Signed-off-by: Jibin Xu --- drivers/tty/sysrq.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c index 3ffc1ce29023..6ed8c47312a8 100644 --- a/drivers/tty/sysrq.c +++ b/drivers/tty/sysrq.c @@ -245,8 +245,10 @@ static void sysrq_handle_showallcpus(int key) * architecture has no support for it: */ if (!trigger_all_cpu_backtrace()) { - struct pt_regs *regs = get_irq_regs(); + struct pt_regs *regs = NULL; + if (in_irq()) + regs = get_irq_regs(); if (regs) { pr_info("CPU%d:\n", smp_processor_id()); show_regs(regs); @@ -265,7 +267,10 @@ static struct sysrq_key_op sysrq_showallcpus_op = { static void sysrq_handle_showregs(int key) { - struct pt_regs *regs = get_irq_regs(); + struct pt_regs *regs = NULL; + + if (in_irq()) + regs = get_irq_regs(); if (regs) show_regs(regs); perf_event_print_debug(); -- 2.13.0
linux-next: Tree for Sep 11
Hi all, Please do not add any v4.15 related material to your linux-next included branches until after v4.14-rc1 has been released. Changes since 20170908: The watchdog tree gained a conflict against the arm-soc tree. The aio tree gained a conflict against the vfs tree. The akpm-current tree still had its build failure for which I applied a fix patch. Non-merge commits (relative to Linus' tree): 1830 2014 files changed, 77242 insertions(+), 18986 deletions(-) I have created today's linux-next tree at git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git (patches at http://www.kernel.org/pub/linux/kernel/next/ ). If you are tracking the linux-next tree using git, you should not use "git pull" to do so as that will try to merge the new linux-next release with the old one. You should use "git fetch" and checkout or reset to the new master. You can see which trees have been included by looking in the Next/Trees file in the source. There are also quilt-import.log and merge.log files in the Next directory. Between each merge, the tree was built with a ppc64_defconfig for powerpc and an allmodconfig (with CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a native build of tools/perf. After the final fixups (if any), I do an x86_64 modules_install followed by builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc and sparc64 defconfig. And finally, a simple boot test of the powerpc pseries_le_defconfig kernel in qemu. Below is a summary of the state of the merge. I am currently merging 267 trees (counting Linus' and 41 trees of bug fix patches pending for the current merge release). Stats about the size of the tree over time can be seen at http://neuling.org/linux-next-size.html . Status of my local build tests will be at http://kisskb.ellerman.id.au/linux-next . If maintainers want to give advice about cross compilers/configs that work, we are always open to add more builds. Thanks to Randy Dunlap for doing many randconfig builds. And to Paul Gortmaker for triage and bug fixes. -- Cheers, Stephen Rothwell $ git checkout master $ git reset --hard stable Merging origin/master (d719518d9ce9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next) Merging fixes/master (b4b8cbf679c4 Cavium CNN55XX: fix broken default Kconfig entry) Merging kbuild-current/fixes (64236e315955 kbuild: update comments of Makefile.asm-generic) Merging arc-current/for-curr (1ee55a8f7f6b ARC: Re-enable MMU upon Machine Check exception) Merging arm-current/fixes (746a272e4414 ARM: 8692/1: mm: abort uaccess retries upon fatal signal) Merging m68k-current/for-linus (558d5ad276c9 m68k/mac: Avoid soft-lockup warning after mach_power_off) Merging metag-fixes/fixes (b884a190afce metag/usercopy: Add missing fixups) Merging powerpc-fixes/fixes (1a92a80ad386 powerpc/mm: Ensure cpumask update is ordered) Merging sparc/master (6470812e2226 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc) Merging fscrypt-current/for-stable (42d97eb0ade3 fscrypt: fix renaming and linking special files) Merging net/master (1f4f554a72be net: qualcomm: rmnet: Fix a double free) Merging ipsec/master (f581a0dd744f wl1251: add a missing spin_lock_init()) Merging netfilter/master (90c4ae4e2c1d netfilter: xt_hashlimit: fix build error caused by 64bit division) Merging ipvs/master (f7fb77fc1235 netfilter: nft_compat: check extension hook mask only if set) Merging wireless-drivers/master (f957dd3c8db2 brcmfmac: feature check for multi-scheduled scan fails on bcm4345 devices) Merging mac80211/master (126f760ca94d rds: Fix incorrect statistics counting) Merging sound-current/for-linus (2a0d85d9add7 ALSA: asihpi: Kill BUG_ON() usages) Merging pci-current/for-linus (8e1101d25164 PCI/MSI: Don't warn when irq_create_affinity_masks() returns NULL) Merging driver-core.current/driver-core-linus (d34fc1adf01f Merge branch 'akpm' (patches from Andrew)) Merging tty.current/tty-linus (d34fc1adf01f Merge branch 'akpm' (patches from Andrew)) Merging usb.current/usb-linus (d34fc1adf01f Merge branch 'akpm' (patches from Andrew)) Merging usb-gadget-fixes/fixes (b7d44c36a6f6 usb: renesas_usbhs: gadget: fix unused-but-set-variable warning) Merging usb-serial-fixes/usb-linus (fd1b8668af59 USB: serial: option: add D-Link DWM-222 device ID) Merging usb-chipidea-fixes/ci-for-usb-stable (cbb22ebcfb99 usb: chipidea: core: check before accessing ci_role in ci_role_show) Merging phy/fixes (5771a8c08880 Linux v4.13-rc1) Merging staging.current/staging-linus (d34fc1adf01f Merge branch 'akpm' (patches from Andrew)) Merging char-misc.current/char-misc-linus (d34fc1adf01f Merge branch 'akpm' (patches from Andrew)) Merging input-current/for-linus (a6cbfa1e6d38 Merge branch 'next' into for-linus) Merging crypto-current/master (2d45a7e89833 crypto:
linux-next: Tree for Sep 11
Hi all, Please do not add any v4.15 related material to your linux-next included branches until after v4.14-rc1 has been released. Changes since 20170908: The watchdog tree gained a conflict against the arm-soc tree. The aio tree gained a conflict against the vfs tree. The akpm-current tree still had its build failure for which I applied a fix patch. Non-merge commits (relative to Linus' tree): 1830 2014 files changed, 77242 insertions(+), 18986 deletions(-) I have created today's linux-next tree at git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git (patches at http://www.kernel.org/pub/linux/kernel/next/ ). If you are tracking the linux-next tree using git, you should not use "git pull" to do so as that will try to merge the new linux-next release with the old one. You should use "git fetch" and checkout or reset to the new master. You can see which trees have been included by looking in the Next/Trees file in the source. There are also quilt-import.log and merge.log files in the Next directory. Between each merge, the tree was built with a ppc64_defconfig for powerpc and an allmodconfig (with CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a native build of tools/perf. After the final fixups (if any), I do an x86_64 modules_install followed by builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc and sparc64 defconfig. And finally, a simple boot test of the powerpc pseries_le_defconfig kernel in qemu. Below is a summary of the state of the merge. I am currently merging 267 trees (counting Linus' and 41 trees of bug fix patches pending for the current merge release). Stats about the size of the tree over time can be seen at http://neuling.org/linux-next-size.html . Status of my local build tests will be at http://kisskb.ellerman.id.au/linux-next . If maintainers want to give advice about cross compilers/configs that work, we are always open to add more builds. Thanks to Randy Dunlap for doing many randconfig builds. And to Paul Gortmaker for triage and bug fixes. -- Cheers, Stephen Rothwell $ git checkout master $ git reset --hard stable Merging origin/master (d719518d9ce9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next) Merging fixes/master (b4b8cbf679c4 Cavium CNN55XX: fix broken default Kconfig entry) Merging kbuild-current/fixes (64236e315955 kbuild: update comments of Makefile.asm-generic) Merging arc-current/for-curr (1ee55a8f7f6b ARC: Re-enable MMU upon Machine Check exception) Merging arm-current/fixes (746a272e4414 ARM: 8692/1: mm: abort uaccess retries upon fatal signal) Merging m68k-current/for-linus (558d5ad276c9 m68k/mac: Avoid soft-lockup warning after mach_power_off) Merging metag-fixes/fixes (b884a190afce metag/usercopy: Add missing fixups) Merging powerpc-fixes/fixes (1a92a80ad386 powerpc/mm: Ensure cpumask update is ordered) Merging sparc/master (6470812e2226 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc) Merging fscrypt-current/for-stable (42d97eb0ade3 fscrypt: fix renaming and linking special files) Merging net/master (1f4f554a72be net: qualcomm: rmnet: Fix a double free) Merging ipsec/master (f581a0dd744f wl1251: add a missing spin_lock_init()) Merging netfilter/master (90c4ae4e2c1d netfilter: xt_hashlimit: fix build error caused by 64bit division) Merging ipvs/master (f7fb77fc1235 netfilter: nft_compat: check extension hook mask only if set) Merging wireless-drivers/master (f957dd3c8db2 brcmfmac: feature check for multi-scheduled scan fails on bcm4345 devices) Merging mac80211/master (126f760ca94d rds: Fix incorrect statistics counting) Merging sound-current/for-linus (2a0d85d9add7 ALSA: asihpi: Kill BUG_ON() usages) Merging pci-current/for-linus (8e1101d25164 PCI/MSI: Don't warn when irq_create_affinity_masks() returns NULL) Merging driver-core.current/driver-core-linus (d34fc1adf01f Merge branch 'akpm' (patches from Andrew)) Merging tty.current/tty-linus (d34fc1adf01f Merge branch 'akpm' (patches from Andrew)) Merging usb.current/usb-linus (d34fc1adf01f Merge branch 'akpm' (patches from Andrew)) Merging usb-gadget-fixes/fixes (b7d44c36a6f6 usb: renesas_usbhs: gadget: fix unused-but-set-variable warning) Merging usb-serial-fixes/usb-linus (fd1b8668af59 USB: serial: option: add D-Link DWM-222 device ID) Merging usb-chipidea-fixes/ci-for-usb-stable (cbb22ebcfb99 usb: chipidea: core: check before accessing ci_role in ci_role_show) Merging phy/fixes (5771a8c08880 Linux v4.13-rc1) Merging staging.current/staging-linus (d34fc1adf01f Merge branch 'akpm' (patches from Andrew)) Merging char-misc.current/char-misc-linus (d34fc1adf01f Merge branch 'akpm' (patches from Andrew)) Merging input-current/for-linus (a6cbfa1e6d38 Merge branch 'next' into for-linus) Merging crypto-current/master (2d45a7e89833 crypto:
Re: [lkp-robot] [sched/fair] 6d46bd3d97: netperf.Throughput_tps -11.3% regression
On Sun, 2017-09-10 at 09:53 -0700, Joel Fernandes wrote: > > Anyone know what in the netperf test triggers use of the sync flag? homer:..kernel/linux-master # git grep wake_up_interruptible_sync_poll net net/core/sock.c:wake_up_interruptible_sync_poll(>wait, POLLIN | POLLPRI | net/core/sock.c: wake_up_interruptible_sync_poll(>wait, POLLOUT | net/sctp/socket.c: wake_up_interruptible_sync_poll(>wait, POLLIN | net/smc/smc_rx.c: wake_up_interruptible_sync_poll(>wait, POLLIN | POLLPRI | net/tipc/socket.c: wake_up_interruptible_sync_poll(>wait, POLLOUT | net/tipc/socket.c: wake_up_interruptible_sync_poll(>wait, POLLIN | net/unix/af_unix.c: wake_up_interruptible_sync_poll(>wait, net/unix/af_unix.c: wake_up_interruptible_sync_poll(>peer_wait, The same as metric tons of other stuff. Once upon a time, we had avg_overlap to help decide whether to wake core affine or not, on top of the wake_affine() imbalance constraint, but instrumentation showed it to be too error prone, so it had to die. These days, an affine wakeup generally means cache affine, and the sync hint gives you a wee bit more chance of migration near to tasty hot data being approved. The sync hint was born back in the bad old days, when communicating tasks not sharing L2 may as well have been talking over two tin cans and a limp string. These days, things are oodles better, but truly synchronous stuff could still benefit from core affinity (up to hugely for very fast/light stuff) if it weren't for all the caveats that can lead to tossing concurrency opportunities out the window. -Mike
Re: [lkp-robot] [sched/fair] 6d46bd3d97: netperf.Throughput_tps -11.3% regression
On Sun, 2017-09-10 at 09:53 -0700, Joel Fernandes wrote: > > Anyone know what in the netperf test triggers use of the sync flag? homer:..kernel/linux-master # git grep wake_up_interruptible_sync_poll net net/core/sock.c:wake_up_interruptible_sync_poll(>wait, POLLIN | POLLPRI | net/core/sock.c: wake_up_interruptible_sync_poll(>wait, POLLOUT | net/sctp/socket.c: wake_up_interruptible_sync_poll(>wait, POLLIN | net/smc/smc_rx.c: wake_up_interruptible_sync_poll(>wait, POLLIN | POLLPRI | net/tipc/socket.c: wake_up_interruptible_sync_poll(>wait, POLLOUT | net/tipc/socket.c: wake_up_interruptible_sync_poll(>wait, POLLIN | net/unix/af_unix.c: wake_up_interruptible_sync_poll(>wait, net/unix/af_unix.c: wake_up_interruptible_sync_poll(>peer_wait, The same as metric tons of other stuff. Once upon a time, we had avg_overlap to help decide whether to wake core affine or not, on top of the wake_affine() imbalance constraint, but instrumentation showed it to be too error prone, so it had to die. These days, an affine wakeup generally means cache affine, and the sync hint gives you a wee bit more chance of migration near to tasty hot data being approved. The sync hint was born back in the bad old days, when communicating tasks not sharing L2 may as well have been talking over two tin cans and a limp string. These days, things are oodles better, but truly synchronous stuff could still benefit from core affinity (up to hugely for very fast/light stuff) if it weren't for all the caveats that can lead to tossing concurrency opportunities out the window. -Mike
Re: [PATCH 01/10] spi-nor: intel-spi: Fix number of protected range registers for BYT/LPT
Hi Marek, On Fri, Sep 1, 2017 at 5:52 PM, Marek Vasutwrote: > On 09/01/2017 10:00 AM, Bin Meng wrote: >> The number of protected range registers is not the same on BYT/LPT/ >> BXT. GPR0 only exists on Apollo Lake and its offset is reserved on >> other platforms. >> >> Signed-off-by: Bin Meng > > Can this use regmap ? What you're implementing here seems like regmap to me. No. regmap does not apply here. Regards, Bin
Re: [PATCH 01/10] spi-nor: intel-spi: Fix number of protected range registers for BYT/LPT
Hi Marek, On Fri, Sep 1, 2017 at 5:52 PM, Marek Vasut wrote: > On 09/01/2017 10:00 AM, Bin Meng wrote: >> The number of protected range registers is not the same on BYT/LPT/ >> BXT. GPR0 only exists on Apollo Lake and its offset is reserved on >> other platforms. >> >> Signed-off-by: Bin Meng > > Can this use regmap ? What you're implementing here seems like regmap to me. No. regmap does not apply here. Regards, Bin
Re: [PATCH] scsi: ufs: Make use of UFS_BIT macro wherever possible
Hi, Ping!!! Should I drop this patch and send another one which removes UFS_BIT() macro? On Tue, Aug 29, 2017 at 4:35 PM, Alim Akhtarwrote: > Hi Bart, > Thanks for your review. > > On 08/28/2017 09:15 PM, Bart Van Assche wrote: >> On Mon, 2017-08-28 at 17:49 +0530, Alim Akhtar wrote: >>> This entire file uses UFS_BIT macro for bits definition, expect for few >>> places. This patch convert those defines to use UFS_BIT macro to be aligned >>> with reset of the file. >> >> This is the definition of the UFS_BIT() macro I found in >> drivers/scsi/ufs/ufshci.h: >> >> #define UFS_BIT(x)(1L << (x)) >> >> Using this macro makes code longer instead of shorter and does not improve >> code readability. Is this macro really useful? Wouldn't it be better to >> remove the UFS_BIT() macro instead of introducing more uses of it? >> > Well, the intension of this patch is to make use of already existing > UFS_BIT() macro. > > I am not aware of the history why this macro was created at first place. > > Well, it does improve code readability, for me at least, no need for one > to do a calculation to see which bit it is, as we pass _bit_ number to > UFS_BIT. > > I am totally okay, if you or other reviewers suggests me to change > UFS_BIT to actual bit position, something like the original case, which > this patch is trying to change. > >> Thanks, >> >> Bart. >> > Thanks! > Alim -- Regards, Alim
Re: [PATCH] scsi: ufs: Make use of UFS_BIT macro wherever possible
Hi, Ping!!! Should I drop this patch and send another one which removes UFS_BIT() macro? On Tue, Aug 29, 2017 at 4:35 PM, Alim Akhtar wrote: > Hi Bart, > Thanks for your review. > > On 08/28/2017 09:15 PM, Bart Van Assche wrote: >> On Mon, 2017-08-28 at 17:49 +0530, Alim Akhtar wrote: >>> This entire file uses UFS_BIT macro for bits definition, expect for few >>> places. This patch convert those defines to use UFS_BIT macro to be aligned >>> with reset of the file. >> >> This is the definition of the UFS_BIT() macro I found in >> drivers/scsi/ufs/ufshci.h: >> >> #define UFS_BIT(x)(1L << (x)) >> >> Using this macro makes code longer instead of shorter and does not improve >> code readability. Is this macro really useful? Wouldn't it be better to >> remove the UFS_BIT() macro instead of introducing more uses of it? >> > Well, the intension of this patch is to make use of already existing > UFS_BIT() macro. > > I am not aware of the history why this macro was created at first place. > > Well, it does improve code readability, for me at least, no need for one > to do a calculation to see which bit it is, as we pass _bit_ number to > UFS_BIT. > > I am totally okay, if you or other reviewers suggests me to change > UFS_BIT to actual bit position, something like the original case, which > this patch is trying to change. > >> Thanks, >> >> Bart. >> > Thanks! > Alim -- Regards, Alim
[PATCH RFC V2 05/10] perf tools: lock to protect thread list
From: Kan LiangAdd two locks to protect namespaces_list and comm_list. The comm which is used in db-export are not protected. Because the multithread code will not touch it. It can be added later if required. Signed-off-by: Kan Liang --- tools/perf/ui/browsers/hists.c | 2 +- tools/perf/util/thread.c | 60 +- tools/perf/util/thread.h | 6 +++-- 3 files changed, 52 insertions(+), 16 deletions(-) diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c index 13dfb0a..429e1dd 100644 --- a/tools/perf/ui/browsers/hists.c +++ b/tools/perf/ui/browsers/hists.c @@ -2370,7 +2370,7 @@ static int perf_evsel_browser_title(struct hist_browser *browser, char unit; int printed; const struct dso *dso = hists->dso_filter; - const struct thread *thread = hists->thread_filter; + struct thread *thread = hists->thread_filter; int socket_id = hists->socket_filter; unsigned long nr_samples = hists->stats.nr_events[PERF_RECORD_SAMPLE]; u64 nr_events = hists->stats.total_period; diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c index b7957da..25830b2 100644 --- a/tools/perf/util/thread.c +++ b/tools/perf/util/thread.c @@ -45,6 +45,8 @@ struct thread *thread__new(pid_t pid, pid_t tid) thread->cpu = -1; INIT_LIST_HEAD(>namespaces_list); INIT_LIST_HEAD(>comm_list); + pthread_mutex_init(>namespaces_lock, NULL); + pthread_mutex_init(>comm_lock, NULL); comm_str = malloc(32); if (!comm_str) @@ -83,15 +85,21 @@ void thread__delete(struct thread *thread) map_groups__put(thread->mg); thread->mg = NULL; } + pthread_mutex_lock(>namespaces_lock); list_for_each_entry_safe(namespaces, tmp_namespaces, >namespaces_list, list) { list_del(>list); namespaces__free(namespaces); } + pthread_mutex_unlock(>namespaces_lock); + + pthread_mutex_lock(>comm_lock); list_for_each_entry_safe(comm, tmp_comm, >comm_list, list) { list_del(>list); comm__free(comm); } + pthread_mutex_unlock(>comm_lock); + unwind__finish_access(thread); nsinfo__zput(thread->nsinfo); @@ -128,11 +136,17 @@ struct namespaces *thread__namespaces(const struct thread *thread) int thread__set_namespaces(struct thread *thread, u64 timestamp, struct namespaces_event *event) { - struct namespaces *new, *curr = thread__namespaces(thread); + struct namespaces *new, *curr; + + pthread_mutex_lock(>namespaces_lock); + + curr = thread__namespaces(thread); new = namespaces__new(event); - if (!new) + if (!new) { + pthread_mutex_unlock(>namespaces_lock); return -ENOMEM; + } list_add(>list, >namespaces_list); @@ -146,17 +160,21 @@ int thread__set_namespaces(struct thread *thread, u64 timestamp, curr->end_time = timestamp; } + pthread_mutex_unlock(>namespaces_lock); + return 0; } -void thread__namespaces_id(const struct thread *thread, +void thread__namespaces_id(struct thread *thread, u64 *dev, u64 *ino) { struct namespaces *ns; + pthread_mutex_lock(>namespaces_lock); ns = thread__namespaces(thread); *dev = ns ? ns->link_info[CGROUP_NS_INDEX].dev : 0; *ino = ns ? ns->link_info[CGROUP_NS_INDEX].ino : 0; + pthread_mutex_unlock(>namespaces_lock); } struct comm *thread__comm(const struct thread *thread) @@ -183,17 +201,23 @@ struct comm *thread__exec_comm(const struct thread *thread) int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp, bool exec) { - struct comm *new, *curr = thread__comm(thread); + struct comm *new, *curr; + int err = 0; + + pthread_mutex_lock(>comm_lock); + curr = thread__comm(thread); /* Override the default :tid entry */ if (!thread->comm_set) { - int err = comm__override(curr, str, timestamp, exec); + err = comm__override(curr, str, timestamp, exec); if (err) - return err; + goto unlock; } else { new = comm__new(str, timestamp, exec); - if (!new) - return -ENOMEM; + if (!new) { + err = -ENOMEM; + goto unlock; + } list_add(>list, >comm_list); if (exec) @@ -202,7 +226,9 @@ int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp, thread->comm_set = true; -
[PATCH RFC V2 05/10] perf tools: lock to protect thread list
From: Kan Liang Add two locks to protect namespaces_list and comm_list. The comm which is used in db-export are not protected. Because the multithread code will not touch it. It can be added later if required. Signed-off-by: Kan Liang --- tools/perf/ui/browsers/hists.c | 2 +- tools/perf/util/thread.c | 60 +- tools/perf/util/thread.h | 6 +++-- 3 files changed, 52 insertions(+), 16 deletions(-) diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c index 13dfb0a..429e1dd 100644 --- a/tools/perf/ui/browsers/hists.c +++ b/tools/perf/ui/browsers/hists.c @@ -2370,7 +2370,7 @@ static int perf_evsel_browser_title(struct hist_browser *browser, char unit; int printed; const struct dso *dso = hists->dso_filter; - const struct thread *thread = hists->thread_filter; + struct thread *thread = hists->thread_filter; int socket_id = hists->socket_filter; unsigned long nr_samples = hists->stats.nr_events[PERF_RECORD_SAMPLE]; u64 nr_events = hists->stats.total_period; diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c index b7957da..25830b2 100644 --- a/tools/perf/util/thread.c +++ b/tools/perf/util/thread.c @@ -45,6 +45,8 @@ struct thread *thread__new(pid_t pid, pid_t tid) thread->cpu = -1; INIT_LIST_HEAD(>namespaces_list); INIT_LIST_HEAD(>comm_list); + pthread_mutex_init(>namespaces_lock, NULL); + pthread_mutex_init(>comm_lock, NULL); comm_str = malloc(32); if (!comm_str) @@ -83,15 +85,21 @@ void thread__delete(struct thread *thread) map_groups__put(thread->mg); thread->mg = NULL; } + pthread_mutex_lock(>namespaces_lock); list_for_each_entry_safe(namespaces, tmp_namespaces, >namespaces_list, list) { list_del(>list); namespaces__free(namespaces); } + pthread_mutex_unlock(>namespaces_lock); + + pthread_mutex_lock(>comm_lock); list_for_each_entry_safe(comm, tmp_comm, >comm_list, list) { list_del(>list); comm__free(comm); } + pthread_mutex_unlock(>comm_lock); + unwind__finish_access(thread); nsinfo__zput(thread->nsinfo); @@ -128,11 +136,17 @@ struct namespaces *thread__namespaces(const struct thread *thread) int thread__set_namespaces(struct thread *thread, u64 timestamp, struct namespaces_event *event) { - struct namespaces *new, *curr = thread__namespaces(thread); + struct namespaces *new, *curr; + + pthread_mutex_lock(>namespaces_lock); + + curr = thread__namespaces(thread); new = namespaces__new(event); - if (!new) + if (!new) { + pthread_mutex_unlock(>namespaces_lock); return -ENOMEM; + } list_add(>list, >namespaces_list); @@ -146,17 +160,21 @@ int thread__set_namespaces(struct thread *thread, u64 timestamp, curr->end_time = timestamp; } + pthread_mutex_unlock(>namespaces_lock); + return 0; } -void thread__namespaces_id(const struct thread *thread, +void thread__namespaces_id(struct thread *thread, u64 *dev, u64 *ino) { struct namespaces *ns; + pthread_mutex_lock(>namespaces_lock); ns = thread__namespaces(thread); *dev = ns ? ns->link_info[CGROUP_NS_INDEX].dev : 0; *ino = ns ? ns->link_info[CGROUP_NS_INDEX].ino : 0; + pthread_mutex_unlock(>namespaces_lock); } struct comm *thread__comm(const struct thread *thread) @@ -183,17 +201,23 @@ struct comm *thread__exec_comm(const struct thread *thread) int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp, bool exec) { - struct comm *new, *curr = thread__comm(thread); + struct comm *new, *curr; + int err = 0; + + pthread_mutex_lock(>comm_lock); + curr = thread__comm(thread); /* Override the default :tid entry */ if (!thread->comm_set) { - int err = comm__override(curr, str, timestamp, exec); + err = comm__override(curr, str, timestamp, exec); if (err) - return err; + goto unlock; } else { new = comm__new(str, timestamp, exec); - if (!new) - return -ENOMEM; + if (!new) { + err = -ENOMEM; + goto unlock; + } list_add(>list, >comm_list); if (exec) @@ -202,7 +226,9 @@ int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp, thread->comm_set = true; - return 0; +unlock: +
[PATCH RFC V2 04/10] petf tools: introduce a new function to set namespaces id
From: Kan LiangFinish the namespaces id setting job in thread.c. Signed-off-by: Kan Liang --- tools/perf/util/hist.c | 7 ++- tools/perf/util/thread.c | 10 ++ tools/perf/util/thread.h | 2 ++ 3 files changed, 14 insertions(+), 5 deletions(-) diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c index 0f00dd9..e332c4f 100644 --- a/tools/perf/util/hist.c +++ b/tools/perf/util/hist.c @@ -584,14 +584,9 @@ __hists__add_entry(struct hists *hists, bool sample_self, struct hist_entry_ops *ops) { - struct namespaces *ns = thread__namespaces(al->thread); struct hist_entry entry = { .thread = al->thread, .comm_str = thread__comm_str(al->thread), - .cgroup_id = { - .dev = ns ? ns->link_info[CGROUP_NS_INDEX].dev : 0, - .ino = ns ? ns->link_info[CGROUP_NS_INDEX].ino : 0, - }, .ms = { .map= al->map, .sym= al->sym, @@ -617,6 +612,8 @@ __hists__add_entry(struct hists *hists, .ops = ops, }; + thread__namespaces_id(al->thread, _id.dev, + _id.ino); return hists__findnew_entry(hists, , al, sample_self); } diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c index aee9a42..b7957da 100644 --- a/tools/perf/util/thread.c +++ b/tools/perf/util/thread.c @@ -149,6 +149,16 @@ int thread__set_namespaces(struct thread *thread, u64 timestamp, return 0; } +void thread__namespaces_id(const struct thread *thread, + u64 *dev, u64 *ino) +{ + struct namespaces *ns; + + ns = thread__namespaces(thread); + *dev = ns ? ns->link_info[CGROUP_NS_INDEX].dev : 0; + *ino = ns ? ns->link_info[CGROUP_NS_INDEX].ino : 0; +} + struct comm *thread__comm(const struct thread *thread) { if (list_empty(>comm_list)) diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h index cb1a5dd..fd7bd41 100644 --- a/tools/perf/util/thread.h +++ b/tools/perf/util/thread.h @@ -68,6 +68,8 @@ static inline void thread__exited(struct thread *thread) struct namespaces *thread__namespaces(const struct thread *thread); int thread__set_namespaces(struct thread *thread, u64 timestamp, struct namespaces_event *event); +void thread__namespaces_id(const struct thread *thread, + u64 *dev, u64 *ino); int __thread__set_comm(struct thread *thread, const char *comm, u64 timestamp, bool exec); -- 2.5.5
[PATCH RFC V2 04/10] petf tools: introduce a new function to set namespaces id
From: Kan Liang Finish the namespaces id setting job in thread.c. Signed-off-by: Kan Liang --- tools/perf/util/hist.c | 7 ++- tools/perf/util/thread.c | 10 ++ tools/perf/util/thread.h | 2 ++ 3 files changed, 14 insertions(+), 5 deletions(-) diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c index 0f00dd9..e332c4f 100644 --- a/tools/perf/util/hist.c +++ b/tools/perf/util/hist.c @@ -584,14 +584,9 @@ __hists__add_entry(struct hists *hists, bool sample_self, struct hist_entry_ops *ops) { - struct namespaces *ns = thread__namespaces(al->thread); struct hist_entry entry = { .thread = al->thread, .comm_str = thread__comm_str(al->thread), - .cgroup_id = { - .dev = ns ? ns->link_info[CGROUP_NS_INDEX].dev : 0, - .ino = ns ? ns->link_info[CGROUP_NS_INDEX].ino : 0, - }, .ms = { .map= al->map, .sym= al->sym, @@ -617,6 +612,8 @@ __hists__add_entry(struct hists *hists, .ops = ops, }; + thread__namespaces_id(al->thread, _id.dev, + _id.ino); return hists__findnew_entry(hists, , al, sample_self); } diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c index aee9a42..b7957da 100644 --- a/tools/perf/util/thread.c +++ b/tools/perf/util/thread.c @@ -149,6 +149,16 @@ int thread__set_namespaces(struct thread *thread, u64 timestamp, return 0; } +void thread__namespaces_id(const struct thread *thread, + u64 *dev, u64 *ino) +{ + struct namespaces *ns; + + ns = thread__namespaces(thread); + *dev = ns ? ns->link_info[CGROUP_NS_INDEX].dev : 0; + *ino = ns ? ns->link_info[CGROUP_NS_INDEX].ino : 0; +} + struct comm *thread__comm(const struct thread *thread) { if (list_empty(>comm_list)) diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h index cb1a5dd..fd7bd41 100644 --- a/tools/perf/util/thread.h +++ b/tools/perf/util/thread.h @@ -68,6 +68,8 @@ static inline void thread__exited(struct thread *thread) struct namespaces *thread__namespaces(const struct thread *thread); int thread__set_namespaces(struct thread *thread, u64 timestamp, struct namespaces_event *event); +void thread__namespaces_id(const struct thread *thread, + u64 *dev, u64 *ino); int __thread__set_comm(struct thread *thread, const char *comm, u64 timestamp, bool exec); -- 2.5.5
[PATCH RFC V2 00/10] perf top optimization
From: Kan LiangThe patch series intends to fix the severe performance issue in Knights Landing/Mill, when monitoring in heavy load system. perf top costs a few minutes to show the result, which is unacceptable. With the patch series applied, the latency will reduces to several seconds. machine__synthesize_threads and perf_top__mmap_read costs most of the perf top time (> 99%). Patch 1-9 do the optimization for machine__synthesize_threads. Patch 10 does the optimization for perf_top__mmap_read. Optimization for machine__synthesize_threads - Multithreading the whole process. - The threads number is set to the max online CPU# by default. User can change the threads number through the new option. - Introduces hashtable for machine threads to reduce the lock contention. - The optimization can also benefit other platforms and other perf tools, like perf record. But this patch series doesn't do the optimization for other tools. It can be done later separately. - With this optimization applied, there is a 1.56x speedup in Knights Mill with heavy workload. Optimization for perf_top__mmap_read - switch back to overwrite mode For non overwrite mode, it tries to read everything in the ring buffer and does not check the messup. Once there are lots of samples delivered shortly, the processing time could be very long. Considering the real time requirement for perf top, it should switch back to overwrite mode. - With this optimization applied, there is a huge 53.8x speedup in Knights Mill with heavy workload. - With this optimization applied, the latency of perf_top__mmap_read is less than the default perf top fresh time (2s) in Knights Mill with heavy workload. Here are perf top latency test result on Knights Mill and Skylake server The heavy workload is to compile Linux kernel as below "sudo nice make -j$(grep -c '^processor' /proc/cpuinfo)" Then, "sudo perf top" The latency period is the time between perf top launched and the first profiling result shown. - Latency on Knights Mill (272 CPUs) Original(s) With patch(s) Speedup 272.68 16.48 16.54x - Latency on Skylake server (192 CPUs) Original(s) With patch(s) Speedup 12.28 2.964.15x Changes since V1: - Patch 1: machine threads and hashtable related renaming (Arnaldo) - Patch 6: use a smaller locked section for comm_str__put add a locked wrapper for comm_str__findnew (Arnaldo) Kan Liang (10): perf tools: hashtable for machine threads perf tools: using scandir to replace readdir petf tools: using comm_str to replace comm in hist_entry petf tools: introduce a new function to set namespaces id perf tools: lock to protect thread list perf tools: lock to protect comm_str rb tree perf tools: change machine comm_exec type to atomic perf top: implement multithreading for perf_event__synthesize_threads perf top: add option to set the number of thread for event synthesize perf top: switch back to overwrite mode tools/perf/builtin-kvm.c | 3 +- tools/perf/builtin-record.c | 2 +- tools/perf/builtin-top.c | 9 +- tools/perf/builtin-trace.c| 21 +++-- tools/perf/tests/mmap-thread-lookup.c | 2 +- tools/perf/ui/browsers/hists.c| 2 +- tools/perf/util/comm.c| 18 +++- tools/perf/util/event.c | 149 +--- tools/perf/util/event.h | 14 ++- tools/perf/util/evlist.c | 5 +- tools/perf/util/hist.c| 11 +-- tools/perf/util/machine.c | 158 +- tools/perf/util/machine.h | 34 ++-- tools/perf/util/rb_resort.h | 5 +- tools/perf/util/sort.c| 8 +- tools/perf/util/sort.h| 2 +- tools/perf/util/thread.c | 68 --- tools/perf/util/thread.h | 6 +- tools/perf/util/top.h | 1 + 19 files changed, 376 insertions(+), 142 deletions(-) -- 2.5.5
[PATCH RFC V2 00/10] perf top optimization
From: Kan Liang The patch series intends to fix the severe performance issue in Knights Landing/Mill, when monitoring in heavy load system. perf top costs a few minutes to show the result, which is unacceptable. With the patch series applied, the latency will reduces to several seconds. machine__synthesize_threads and perf_top__mmap_read costs most of the perf top time (> 99%). Patch 1-9 do the optimization for machine__synthesize_threads. Patch 10 does the optimization for perf_top__mmap_read. Optimization for machine__synthesize_threads - Multithreading the whole process. - The threads number is set to the max online CPU# by default. User can change the threads number through the new option. - Introduces hashtable for machine threads to reduce the lock contention. - The optimization can also benefit other platforms and other perf tools, like perf record. But this patch series doesn't do the optimization for other tools. It can be done later separately. - With this optimization applied, there is a 1.56x speedup in Knights Mill with heavy workload. Optimization for perf_top__mmap_read - switch back to overwrite mode For non overwrite mode, it tries to read everything in the ring buffer and does not check the messup. Once there are lots of samples delivered shortly, the processing time could be very long. Considering the real time requirement for perf top, it should switch back to overwrite mode. - With this optimization applied, there is a huge 53.8x speedup in Knights Mill with heavy workload. - With this optimization applied, the latency of perf_top__mmap_read is less than the default perf top fresh time (2s) in Knights Mill with heavy workload. Here are perf top latency test result on Knights Mill and Skylake server The heavy workload is to compile Linux kernel as below "sudo nice make -j$(grep -c '^processor' /proc/cpuinfo)" Then, "sudo perf top" The latency period is the time between perf top launched and the first profiling result shown. - Latency on Knights Mill (272 CPUs) Original(s) With patch(s) Speedup 272.68 16.48 16.54x - Latency on Skylake server (192 CPUs) Original(s) With patch(s) Speedup 12.28 2.964.15x Changes since V1: - Patch 1: machine threads and hashtable related renaming (Arnaldo) - Patch 6: use a smaller locked section for comm_str__put add a locked wrapper for comm_str__findnew (Arnaldo) Kan Liang (10): perf tools: hashtable for machine threads perf tools: using scandir to replace readdir petf tools: using comm_str to replace comm in hist_entry petf tools: introduce a new function to set namespaces id perf tools: lock to protect thread list perf tools: lock to protect comm_str rb tree perf tools: change machine comm_exec type to atomic perf top: implement multithreading for perf_event__synthesize_threads perf top: add option to set the number of thread for event synthesize perf top: switch back to overwrite mode tools/perf/builtin-kvm.c | 3 +- tools/perf/builtin-record.c | 2 +- tools/perf/builtin-top.c | 9 +- tools/perf/builtin-trace.c| 21 +++-- tools/perf/tests/mmap-thread-lookup.c | 2 +- tools/perf/ui/browsers/hists.c| 2 +- tools/perf/util/comm.c| 18 +++- tools/perf/util/event.c | 149 +--- tools/perf/util/event.h | 14 ++- tools/perf/util/evlist.c | 5 +- tools/perf/util/hist.c| 11 +-- tools/perf/util/machine.c | 158 +- tools/perf/util/machine.h | 34 ++-- tools/perf/util/rb_resort.h | 5 +- tools/perf/util/sort.c| 8 +- tools/perf/util/sort.h| 2 +- tools/perf/util/thread.c | 68 --- tools/perf/util/thread.h | 6 +- tools/perf/util/top.h | 1 + 19 files changed, 376 insertions(+), 142 deletions(-) -- 2.5.5
[PATCH RFC V2 03/10] petf tools: using comm_str to replace comm in hist_entry
From: Kan LiangFor hist_entry, it only needs comm_str for cmp. Signed-off-by: Kan Liang --- tools/perf/util/hist.c | 4 ++-- tools/perf/util/sort.c | 8 tools/perf/util/sort.h | 2 +- 3 files changed, 7 insertions(+), 7 deletions(-) diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c index e60d8d8..0f00dd9 100644 --- a/tools/perf/util/hist.c +++ b/tools/perf/util/hist.c @@ -587,7 +587,7 @@ __hists__add_entry(struct hists *hists, struct namespaces *ns = thread__namespaces(al->thread); struct hist_entry entry = { .thread = al->thread, - .comm = thread__comm(al->thread), + .comm_str = thread__comm_str(al->thread), .cgroup_id = { .dev = ns ? ns->link_info[CGROUP_NS_INDEX].dev : 0, .ino = ns ? ns->link_info[CGROUP_NS_INDEX].ino : 0, @@ -944,7 +944,7 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter, .hists = evsel__hists(evsel), .cpu = al->cpu, .thread = al->thread, - .comm = thread__comm(al->thread), + .comm_str = thread__comm_str(al->thread), .ip = al->addr, .ms = { .map = al->map, diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c index eb3ab90..99ab411 100644 --- a/tools/perf/util/sort.c +++ b/tools/perf/util/sort.c @@ -114,26 +114,26 @@ static int64_t sort__comm_cmp(struct hist_entry *left, struct hist_entry *right) { /* Compare the addr that should be unique among comm */ - return strcmp(comm__str(right->comm), comm__str(left->comm)); + return strcmp(right->comm_str, left->comm_str); } static int64_t sort__comm_collapse(struct hist_entry *left, struct hist_entry *right) { /* Compare the addr that should be unique among comm */ - return strcmp(comm__str(right->comm), comm__str(left->comm)); + return strcmp(right->comm_str, left->comm_str); } static int64_t sort__comm_sort(struct hist_entry *left, struct hist_entry *right) { - return strcmp(comm__str(right->comm), comm__str(left->comm)); + return strcmp(right->comm_str, left->comm_str); } static int hist_entry__comm_snprintf(struct hist_entry *he, char *bf, size_t size, unsigned int width) { - return repsep_snprintf(bf, size, "%-*.*s", width, width, comm__str(he->comm)); + return repsep_snprintf(bf, size, "%-*.*s", width, width, he->comm_str); } struct sort_entry sort_comm = { diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h index f36dc49..861cbe7 100644 --- a/tools/perf/util/sort.h +++ b/tools/perf/util/sort.h @@ -96,7 +96,7 @@ struct hist_entry { struct he_stat *stat_acc; struct map_symbol ms; struct thread *thread; - struct comm *comm; + const char *comm_str; struct namespace_id cgroup_id; u64 ip; u64 transaction; -- 2.5.5
[PATCH RFC V2 03/10] petf tools: using comm_str to replace comm in hist_entry
From: Kan Liang For hist_entry, it only needs comm_str for cmp. Signed-off-by: Kan Liang --- tools/perf/util/hist.c | 4 ++-- tools/perf/util/sort.c | 8 tools/perf/util/sort.h | 2 +- 3 files changed, 7 insertions(+), 7 deletions(-) diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c index e60d8d8..0f00dd9 100644 --- a/tools/perf/util/hist.c +++ b/tools/perf/util/hist.c @@ -587,7 +587,7 @@ __hists__add_entry(struct hists *hists, struct namespaces *ns = thread__namespaces(al->thread); struct hist_entry entry = { .thread = al->thread, - .comm = thread__comm(al->thread), + .comm_str = thread__comm_str(al->thread), .cgroup_id = { .dev = ns ? ns->link_info[CGROUP_NS_INDEX].dev : 0, .ino = ns ? ns->link_info[CGROUP_NS_INDEX].ino : 0, @@ -944,7 +944,7 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter, .hists = evsel__hists(evsel), .cpu = al->cpu, .thread = al->thread, - .comm = thread__comm(al->thread), + .comm_str = thread__comm_str(al->thread), .ip = al->addr, .ms = { .map = al->map, diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c index eb3ab90..99ab411 100644 --- a/tools/perf/util/sort.c +++ b/tools/perf/util/sort.c @@ -114,26 +114,26 @@ static int64_t sort__comm_cmp(struct hist_entry *left, struct hist_entry *right) { /* Compare the addr that should be unique among comm */ - return strcmp(comm__str(right->comm), comm__str(left->comm)); + return strcmp(right->comm_str, left->comm_str); } static int64_t sort__comm_collapse(struct hist_entry *left, struct hist_entry *right) { /* Compare the addr that should be unique among comm */ - return strcmp(comm__str(right->comm), comm__str(left->comm)); + return strcmp(right->comm_str, left->comm_str); } static int64_t sort__comm_sort(struct hist_entry *left, struct hist_entry *right) { - return strcmp(comm__str(right->comm), comm__str(left->comm)); + return strcmp(right->comm_str, left->comm_str); } static int hist_entry__comm_snprintf(struct hist_entry *he, char *bf, size_t size, unsigned int width) { - return repsep_snprintf(bf, size, "%-*.*s", width, width, comm__str(he->comm)); + return repsep_snprintf(bf, size, "%-*.*s", width, width, he->comm_str); } struct sort_entry sort_comm = { diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h index f36dc49..861cbe7 100644 --- a/tools/perf/util/sort.h +++ b/tools/perf/util/sort.h @@ -96,7 +96,7 @@ struct hist_entry { struct he_stat *stat_acc; struct map_symbol ms; struct thread *thread; - struct comm *comm; + const char *comm_str; struct namespace_id cgroup_id; u64 ip; u64 transaction; -- 2.5.5
[PATCH RFC V2 08/10] perf top: implement multithreading for perf_event__synthesize_threads
From: Kan LiangThe proc files which is sorted with alphabetical order are evenly assigned to several synthesize threads to be processed in parallel. For perf top, the threads number hard code to online CPU number. The following patch will introduce an option to set it. For other perf tools, the thread number is 1. Because the process function is not ready for multithreading, e.g. process_synthesized_event. This patch series only support event synthesize multithreading for perf top. For other tools, it can be done separately later. With multithread applied, the total processing time can get up to 1.56x speedup on Knights Mill for perf top. For specific single event processing, the processing time could increase because of the lock contention. So proc_map_timeout may need to be increased. Otherwise some proc maps will be truncated. Based on my test, increasing the proc_map_timeout has small impact on the total processing time. The total processing time still get 1.49x speedup on Knights Mill after increasing the proc_map_timeout. The patch itself doesn't increase the proc_map_timeout. Doesn't need to implement multithreading for perf_event__synthesize_thread_map. It doesn't have performance issue. Signed-off-by: Kan Liang --- tools/perf/builtin-kvm.c | 3 +- tools/perf/builtin-record.c | 2 +- tools/perf/builtin-top.c | 4 +- tools/perf/builtin-trace.c| 2 +- tools/perf/tests/mmap-thread-lookup.c | 2 +- tools/perf/util/event.c | 146 ++ tools/perf/util/event.h | 14 +++- tools/perf/util/machine.c | 8 +- tools/perf/util/machine.h | 9 ++- 9 files changed, 146 insertions(+), 44 deletions(-) diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c index f309c37..edd14cb 100644 --- a/tools/perf/builtin-kvm.c +++ b/tools/perf/builtin-kvm.c @@ -1442,7 +1442,8 @@ static int kvm_events_live(struct perf_kvm_stat *kvm, perf_session__set_id_hdr_size(kvm->session); ordered_events__set_copy_on_queue(>session->ordered_events, true); machine__synthesize_threads(>session->machines.host, >opts.target, - kvm->evlist->threads, false, kvm->opts.proc_map_timeout); + kvm->evlist->threads, false, + kvm->opts.proc_map_timeout, 1); err = kvm_live_open_events(kvm); if (err) goto out; diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c index 56f8142..f8a6c89 100644 --- a/tools/perf/builtin-record.c +++ b/tools/perf/builtin-record.c @@ -863,7 +863,7 @@ static int record__synthesize(struct record *rec, bool tail) err = __machine__synthesize_threads(machine, tool, >target, rec->evlist->threads, process_synthesized_event, opts->sample_address, - opts->proc_map_timeout); + opts->proc_map_timeout, 1); out: return err; } diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c index ee954bd..4b8fdc1 100644 --- a/tools/perf/builtin-top.c +++ b/tools/perf/builtin-top.c @@ -959,7 +959,9 @@ static int __cmd_top(struct perf_top *top) goto out_delete; machine__synthesize_threads(>session->machines.host, >target, - top->evlist->threads, false, opts->proc_map_timeout); + top->evlist->threads, false, + opts->proc_map_timeout, + (unsigned int)sysconf(_SC_NPROCESSORS_ONLN)); if (perf_hpp_list.socket) { ret = perf_env__read_cpu_topology_map(_env); diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c index ee8c6e8..c6acd91 100644 --- a/tools/perf/builtin-trace.c +++ b/tools/perf/builtin-trace.c @@ -1131,7 +1131,7 @@ static int trace__symbols_init(struct trace *trace, struct perf_evlist *evlist) err = __machine__synthesize_threads(trace->host, >tool, >opts.target, evlist->threads, trace__tool_process, false, - trace->opts.proc_map_timeout); + trace->opts.proc_map_timeout, 1); if (err) symbol__exit(); diff --git a/tools/perf/tests/mmap-thread-lookup.c b/tools/perf/tests/mmap-thread-lookup.c index f94a419..2a0068a 100644 --- a/tools/perf/tests/mmap-thread-lookup.c +++ b/tools/perf/tests/mmap-thread-lookup.c @@ -131,7 +131,7 @@ static int synth_all(struct machine *machine) { return perf_event__synthesize_threads(NULL, perf_event__process, - machine, 0,
[PATCH RFC V2 08/10] perf top: implement multithreading for perf_event__synthesize_threads
From: Kan Liang The proc files which is sorted with alphabetical order are evenly assigned to several synthesize threads to be processed in parallel. For perf top, the threads number hard code to online CPU number. The following patch will introduce an option to set it. For other perf tools, the thread number is 1. Because the process function is not ready for multithreading, e.g. process_synthesized_event. This patch series only support event synthesize multithreading for perf top. For other tools, it can be done separately later. With multithread applied, the total processing time can get up to 1.56x speedup on Knights Mill for perf top. For specific single event processing, the processing time could increase because of the lock contention. So proc_map_timeout may need to be increased. Otherwise some proc maps will be truncated. Based on my test, increasing the proc_map_timeout has small impact on the total processing time. The total processing time still get 1.49x speedup on Knights Mill after increasing the proc_map_timeout. The patch itself doesn't increase the proc_map_timeout. Doesn't need to implement multithreading for perf_event__synthesize_thread_map. It doesn't have performance issue. Signed-off-by: Kan Liang --- tools/perf/builtin-kvm.c | 3 +- tools/perf/builtin-record.c | 2 +- tools/perf/builtin-top.c | 4 +- tools/perf/builtin-trace.c| 2 +- tools/perf/tests/mmap-thread-lookup.c | 2 +- tools/perf/util/event.c | 146 ++ tools/perf/util/event.h | 14 +++- tools/perf/util/machine.c | 8 +- tools/perf/util/machine.h | 9 ++- 9 files changed, 146 insertions(+), 44 deletions(-) diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c index f309c37..edd14cb 100644 --- a/tools/perf/builtin-kvm.c +++ b/tools/perf/builtin-kvm.c @@ -1442,7 +1442,8 @@ static int kvm_events_live(struct perf_kvm_stat *kvm, perf_session__set_id_hdr_size(kvm->session); ordered_events__set_copy_on_queue(>session->ordered_events, true); machine__synthesize_threads(>session->machines.host, >opts.target, - kvm->evlist->threads, false, kvm->opts.proc_map_timeout); + kvm->evlist->threads, false, + kvm->opts.proc_map_timeout, 1); err = kvm_live_open_events(kvm); if (err) goto out; diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c index 56f8142..f8a6c89 100644 --- a/tools/perf/builtin-record.c +++ b/tools/perf/builtin-record.c @@ -863,7 +863,7 @@ static int record__synthesize(struct record *rec, bool tail) err = __machine__synthesize_threads(machine, tool, >target, rec->evlist->threads, process_synthesized_event, opts->sample_address, - opts->proc_map_timeout); + opts->proc_map_timeout, 1); out: return err; } diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c index ee954bd..4b8fdc1 100644 --- a/tools/perf/builtin-top.c +++ b/tools/perf/builtin-top.c @@ -959,7 +959,9 @@ static int __cmd_top(struct perf_top *top) goto out_delete; machine__synthesize_threads(>session->machines.host, >target, - top->evlist->threads, false, opts->proc_map_timeout); + top->evlist->threads, false, + opts->proc_map_timeout, + (unsigned int)sysconf(_SC_NPROCESSORS_ONLN)); if (perf_hpp_list.socket) { ret = perf_env__read_cpu_topology_map(_env); diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c index ee8c6e8..c6acd91 100644 --- a/tools/perf/builtin-trace.c +++ b/tools/perf/builtin-trace.c @@ -1131,7 +1131,7 @@ static int trace__symbols_init(struct trace *trace, struct perf_evlist *evlist) err = __machine__synthesize_threads(trace->host, >tool, >opts.target, evlist->threads, trace__tool_process, false, - trace->opts.proc_map_timeout); + trace->opts.proc_map_timeout, 1); if (err) symbol__exit(); diff --git a/tools/perf/tests/mmap-thread-lookup.c b/tools/perf/tests/mmap-thread-lookup.c index f94a419..2a0068a 100644 --- a/tools/perf/tests/mmap-thread-lookup.c +++ b/tools/perf/tests/mmap-thread-lookup.c @@ -131,7 +131,7 @@ static int synth_all(struct machine *machine) { return perf_event__synthesize_threads(NULL, perf_event__process, - machine, 0, 500); +
[PATCH RFC V2 06/10] perf tools: lock to protect comm_str rb tree
From: Kan LiangAdd comm_str_lock to protect comm_str rb tree. Signed-off-by: Kan Liang --- tools/perf/util/comm.c | 18 +- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/tools/perf/util/comm.c b/tools/perf/util/comm.c index 7bc981b..cb7f98c 100644 --- a/tools/perf/util/comm.c +++ b/tools/perf/util/comm.c @@ -5,6 +5,7 @@ #include #include #include +#include struct comm_str { char *str; @@ -14,6 +15,7 @@ struct comm_str { /* Should perhaps be moved to struct machine */ static struct rb_root comm_str_root; +static pthread_mutex_t comm_str_lock = PTHREAD_MUTEX_INITIALIZER; static struct comm_str *comm_str__get(struct comm_str *cs) { @@ -25,7 +27,9 @@ static struct comm_str *comm_str__get(struct comm_str *cs) static void comm_str__put(struct comm_str *cs) { if (cs && refcount_dec_and_test(>refcnt)) { + pthread_mutex_lock(_str_lock); rb_erase(>rb_node, _str_root); + pthread_mutex_unlock(_str_lock); zfree(>str); free(cs); } @@ -50,7 +54,8 @@ static struct comm_str *comm_str__alloc(const char *str) return cs; } -static struct comm_str *comm_str__findnew(const char *str, struct rb_root *root) +static +struct comm_str *__comm_str__findnew(const char *str, struct rb_root *root) { struct rb_node **p = >rb_node; struct rb_node *parent = NULL; @@ -81,6 +86,17 @@ static struct comm_str *comm_str__findnew(const char *str, struct rb_root *root) return new; } +static struct comm_str *comm_str__findnew(const char *str, struct rb_root *root) +{ + struct comm_str *cs; + + pthread_mutex_lock(_str_lock); + cs = __comm_str__findnew(str, root); + pthread_mutex_unlock(_str_lock); + + return cs; +} + struct comm *comm__new(const char *str, u64 timestamp, bool exec) { struct comm *comm = zalloc(sizeof(*comm)); -- 2.5.5
[PATCH RFC V2 10/10] perf top: switch back to overwrite mode
From: Kan Liangperf_top__mmap_read has severe performance issue in Knights Landing/Mill, when monitoring in heavy load system. It costs several minutes to finish, which is unacceptable. perf top was overwrite mode. But it is changed to non overwrite mode since commit 93fc64f14472 ("perf top: Switch to non overwrite mode"). For non overwrite mode, it tries to read everything in the ring buffer and does not check the messup. Once there are lots of samples delivered shortly, the processing time could be very long. Knights Landing/Mill as a manycore processor contains a large number of small cores. Because of the huge core number, it will generated lots of samples in a heavy load system. Also, since the huge sample#, the mmap writer probably bite the tail and mess up the samples. Switching to overwrite mode, which dropping the unsure mmap entries, significantly speeds up the whole progress. Considering the real time requirement for perf top, it should switch back to overwrite mode. Only warning once if the messup is detected. Providing some hints to users. Signed-off-by: Kan Liang --- tools/perf/builtin-top.c | 2 +- tools/perf/util/evlist.c | 5 - 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c index 5f59aa7..8124b8f 100644 --- a/tools/perf/builtin-top.c +++ b/tools/perf/builtin-top.c @@ -902,7 +902,7 @@ static int perf_top__start_counters(struct perf_top *top) } } - if (perf_evlist__mmap(evlist, opts->mmap_pages, false) < 0) { + if (perf_evlist__mmap(evlist, opts->mmap_pages, true) < 0) { ui__error("Failed to mmap with %d (%s)\n", errno, str_error_r(errno, msg, sizeof(msg))); goto out_err; diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c index 6a0d7ff..ef0b6b1 100644 --- a/tools/perf/util/evlist.c +++ b/tools/perf/util/evlist.c @@ -723,7 +723,10 @@ perf_mmap__read(struct perf_mmap *md, bool check_messup, u64 start, * In either case, truncate and restart at 'end'. */ if (diff > md->mask / 2 || diff < 0) { - fprintf(stderr, "WARNING: failed to keep up with mmap data.\n"); + WARN_ONCE(1, "WARNING: failed to keep up with mmap data.\n" +"Please try increasing the period (-c) or\n" +"decreasing the freq (-F) or\n" +"limiting the number of CPUs"); /* * 'end' points to a known good entry, start there. -- 2.5.5
[PATCH RFC V2 06/10] perf tools: lock to protect comm_str rb tree
From: Kan Liang Add comm_str_lock to protect comm_str rb tree. Signed-off-by: Kan Liang --- tools/perf/util/comm.c | 18 +- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/tools/perf/util/comm.c b/tools/perf/util/comm.c index 7bc981b..cb7f98c 100644 --- a/tools/perf/util/comm.c +++ b/tools/perf/util/comm.c @@ -5,6 +5,7 @@ #include #include #include +#include struct comm_str { char *str; @@ -14,6 +15,7 @@ struct comm_str { /* Should perhaps be moved to struct machine */ static struct rb_root comm_str_root; +static pthread_mutex_t comm_str_lock = PTHREAD_MUTEX_INITIALIZER; static struct comm_str *comm_str__get(struct comm_str *cs) { @@ -25,7 +27,9 @@ static struct comm_str *comm_str__get(struct comm_str *cs) static void comm_str__put(struct comm_str *cs) { if (cs && refcount_dec_and_test(>refcnt)) { + pthread_mutex_lock(_str_lock); rb_erase(>rb_node, _str_root); + pthread_mutex_unlock(_str_lock); zfree(>str); free(cs); } @@ -50,7 +54,8 @@ static struct comm_str *comm_str__alloc(const char *str) return cs; } -static struct comm_str *comm_str__findnew(const char *str, struct rb_root *root) +static +struct comm_str *__comm_str__findnew(const char *str, struct rb_root *root) { struct rb_node **p = >rb_node; struct rb_node *parent = NULL; @@ -81,6 +86,17 @@ static struct comm_str *comm_str__findnew(const char *str, struct rb_root *root) return new; } +static struct comm_str *comm_str__findnew(const char *str, struct rb_root *root) +{ + struct comm_str *cs; + + pthread_mutex_lock(_str_lock); + cs = __comm_str__findnew(str, root); + pthread_mutex_unlock(_str_lock); + + return cs; +} + struct comm *comm__new(const char *str, u64 timestamp, bool exec) { struct comm *comm = zalloc(sizeof(*comm)); -- 2.5.5
[PATCH RFC V2 10/10] perf top: switch back to overwrite mode
From: Kan Liang perf_top__mmap_read has severe performance issue in Knights Landing/Mill, when monitoring in heavy load system. It costs several minutes to finish, which is unacceptable. perf top was overwrite mode. But it is changed to non overwrite mode since commit 93fc64f14472 ("perf top: Switch to non overwrite mode"). For non overwrite mode, it tries to read everything in the ring buffer and does not check the messup. Once there are lots of samples delivered shortly, the processing time could be very long. Knights Landing/Mill as a manycore processor contains a large number of small cores. Because of the huge core number, it will generated lots of samples in a heavy load system. Also, since the huge sample#, the mmap writer probably bite the tail and mess up the samples. Switching to overwrite mode, which dropping the unsure mmap entries, significantly speeds up the whole progress. Considering the real time requirement for perf top, it should switch back to overwrite mode. Only warning once if the messup is detected. Providing some hints to users. Signed-off-by: Kan Liang --- tools/perf/builtin-top.c | 2 +- tools/perf/util/evlist.c | 5 - 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c index 5f59aa7..8124b8f 100644 --- a/tools/perf/builtin-top.c +++ b/tools/perf/builtin-top.c @@ -902,7 +902,7 @@ static int perf_top__start_counters(struct perf_top *top) } } - if (perf_evlist__mmap(evlist, opts->mmap_pages, false) < 0) { + if (perf_evlist__mmap(evlist, opts->mmap_pages, true) < 0) { ui__error("Failed to mmap with %d (%s)\n", errno, str_error_r(errno, msg, sizeof(msg))); goto out_err; diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c index 6a0d7ff..ef0b6b1 100644 --- a/tools/perf/util/evlist.c +++ b/tools/perf/util/evlist.c @@ -723,7 +723,10 @@ perf_mmap__read(struct perf_mmap *md, bool check_messup, u64 start, * In either case, truncate and restart at 'end'. */ if (diff > md->mask / 2 || diff < 0) { - fprintf(stderr, "WARNING: failed to keep up with mmap data.\n"); + WARN_ONCE(1, "WARNING: failed to keep up with mmap data.\n" +"Please try increasing the period (-c) or\n" +"decreasing the freq (-F) or\n" +"limiting the number of CPUs"); /* * 'end' points to a known good entry, start there. -- 2.5.5
[PATCH RFC V2 01/10] perf tools: hashtable for machine threads
From: Kan LiangTo process any events, it needs to find the thread in the machine first. The machine maintains a rb tree to store all threads. The rb tree is protected by a rw lock. It is not a problem for current perf which serially processing events. However, it will have scalability performance issue to process events in parallel, especially on a heavy load system which have many threads. Introduce a hashtable to divide the big rb tree into many samll rb tree for threads. The index is thread id % hashtable size. It can reduce the lock contention. Signed-off-by: Kan Liang --- tools/perf/builtin-trace.c | 19 +++--- tools/perf/util/machine.c | 139 tools/perf/util/machine.h | 23 ++-- tools/perf/util/rb_resort.h | 5 +- 4 files changed, 120 insertions(+), 66 deletions(-) diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c index 771ddab..ee8c6e8 100644 --- a/tools/perf/builtin-trace.c +++ b/tools/perf/builtin-trace.c @@ -2730,20 +2730,23 @@ DEFINE_RESORT_RB(threads, (thread__nr_events(a->thread->priv) < thread__nr_event static size_t trace__fprintf_thread_summary(struct trace *trace, FILE *fp) { - DECLARE_RESORT_RB_MACHINE_THREADS(threads, trace->host); size_t printed = trace__fprintf_threads_header(fp); struct rb_node *nd; + int i; - if (threads == NULL) { - fprintf(fp, "%s", "Error sorting output by nr_events!\n"); - return 0; - } + for (i = 0; i < THREADS__TABLE_SIZE; i++) { + DECLARE_RESORT_RB_MACHINE_THREADS(threads, trace->host, i); - resort_rb__for_each_entry(nd, threads) - printed += trace__fprintf_thread(fp, threads_entry->thread, trace); + if (threads == NULL) { + fprintf(fp, "%s", "Error sorting output by nr_events!\n"); + return 0; + } - resort_rb__delete(threads); + resort_rb__for_each_entry(nd, threads) + printed += trace__fprintf_thread(fp, threads_entry->thread, trace); + resort_rb__delete(threads); + } return printed; } diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c index df70936..2f1ad29 100644 --- a/tools/perf/util/machine.c +++ b/tools/perf/util/machine.c @@ -33,6 +33,21 @@ static void dsos__init(struct dsos *dsos) pthread_rwlock_init(>lock, NULL); } +static void machine__thread_init(struct machine *machine) +{ + struct threads *thread; + int i; + + for (i = 0; i < THREADS__TABLE_SIZE; i++) { + thread = >threads[i]; + thread->entries = RB_ROOT; + pthread_rwlock_init(>lock, NULL); + thread->nr = 0; + INIT_LIST_HEAD(>dead); + thread->last_match = NULL; + } +} + int machine__init(struct machine *machine, const char *root_dir, pid_t pid) { memset(machine, 0, sizeof(*machine)); @@ -40,11 +55,7 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid) RB_CLEAR_NODE(>rb_node); dsos__init(>dsos); - machine->threads = RB_ROOT; - pthread_rwlock_init(>threads_lock, NULL); - machine->nr_threads = 0; - INIT_LIST_HEAD(>dead_threads); - machine->last_match = NULL; + machine__thread_init(machine); machine->vdso_info = NULL; machine->env = NULL; @@ -140,28 +151,39 @@ static void dsos__exit(struct dsos *dsos) void machine__delete_threads(struct machine *machine) { + struct threads *thread; struct rb_node *nd; + int i; - pthread_rwlock_wrlock(>threads_lock); - nd = rb_first(>threads); - while (nd) { - struct thread *t = rb_entry(nd, struct thread, rb_node); + for (i = 0; i < THREADS__TABLE_SIZE; i++) { + thread = >threads[i]; + pthread_rwlock_wrlock(>lock); + nd = rb_first(>entries); + while (nd) { + struct thread *t = rb_entry(nd, struct thread, rb_node); - nd = rb_next(nd); - __machine__remove_thread(machine, t, false); + nd = rb_next(nd); + __machine__remove_thread(machine, t, false); + } + pthread_rwlock_unlock(>lock); } - pthread_rwlock_unlock(>threads_lock); } void machine__exit(struct machine *machine) { + struct threads *thread; + int i; + machine__destroy_kernel_maps(machine); map_groups__exit(>kmaps); dsos__exit(>dsos); machine__exit_vdso(machine); zfree(>root_dir); zfree(>current_tid); - pthread_rwlock_destroy(>threads_lock); + for (i = 0; i < THREADS__TABLE_SIZE; i++) { + thread = >threads[i]; +
[PATCH RFC V2 01/10] perf tools: hashtable for machine threads
From: Kan Liang To process any events, it needs to find the thread in the machine first. The machine maintains a rb tree to store all threads. The rb tree is protected by a rw lock. It is not a problem for current perf which serially processing events. However, it will have scalability performance issue to process events in parallel, especially on a heavy load system which have many threads. Introduce a hashtable to divide the big rb tree into many samll rb tree for threads. The index is thread id % hashtable size. It can reduce the lock contention. Signed-off-by: Kan Liang --- tools/perf/builtin-trace.c | 19 +++--- tools/perf/util/machine.c | 139 tools/perf/util/machine.h | 23 ++-- tools/perf/util/rb_resort.h | 5 +- 4 files changed, 120 insertions(+), 66 deletions(-) diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c index 771ddab..ee8c6e8 100644 --- a/tools/perf/builtin-trace.c +++ b/tools/perf/builtin-trace.c @@ -2730,20 +2730,23 @@ DEFINE_RESORT_RB(threads, (thread__nr_events(a->thread->priv) < thread__nr_event static size_t trace__fprintf_thread_summary(struct trace *trace, FILE *fp) { - DECLARE_RESORT_RB_MACHINE_THREADS(threads, trace->host); size_t printed = trace__fprintf_threads_header(fp); struct rb_node *nd; + int i; - if (threads == NULL) { - fprintf(fp, "%s", "Error sorting output by nr_events!\n"); - return 0; - } + for (i = 0; i < THREADS__TABLE_SIZE; i++) { + DECLARE_RESORT_RB_MACHINE_THREADS(threads, trace->host, i); - resort_rb__for_each_entry(nd, threads) - printed += trace__fprintf_thread(fp, threads_entry->thread, trace); + if (threads == NULL) { + fprintf(fp, "%s", "Error sorting output by nr_events!\n"); + return 0; + } - resort_rb__delete(threads); + resort_rb__for_each_entry(nd, threads) + printed += trace__fprintf_thread(fp, threads_entry->thread, trace); + resort_rb__delete(threads); + } return printed; } diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c index df70936..2f1ad29 100644 --- a/tools/perf/util/machine.c +++ b/tools/perf/util/machine.c @@ -33,6 +33,21 @@ static void dsos__init(struct dsos *dsos) pthread_rwlock_init(>lock, NULL); } +static void machine__thread_init(struct machine *machine) +{ + struct threads *thread; + int i; + + for (i = 0; i < THREADS__TABLE_SIZE; i++) { + thread = >threads[i]; + thread->entries = RB_ROOT; + pthread_rwlock_init(>lock, NULL); + thread->nr = 0; + INIT_LIST_HEAD(>dead); + thread->last_match = NULL; + } +} + int machine__init(struct machine *machine, const char *root_dir, pid_t pid) { memset(machine, 0, sizeof(*machine)); @@ -40,11 +55,7 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid) RB_CLEAR_NODE(>rb_node); dsos__init(>dsos); - machine->threads = RB_ROOT; - pthread_rwlock_init(>threads_lock, NULL); - machine->nr_threads = 0; - INIT_LIST_HEAD(>dead_threads); - machine->last_match = NULL; + machine__thread_init(machine); machine->vdso_info = NULL; machine->env = NULL; @@ -140,28 +151,39 @@ static void dsos__exit(struct dsos *dsos) void machine__delete_threads(struct machine *machine) { + struct threads *thread; struct rb_node *nd; + int i; - pthread_rwlock_wrlock(>threads_lock); - nd = rb_first(>threads); - while (nd) { - struct thread *t = rb_entry(nd, struct thread, rb_node); + for (i = 0; i < THREADS__TABLE_SIZE; i++) { + thread = >threads[i]; + pthread_rwlock_wrlock(>lock); + nd = rb_first(>entries); + while (nd) { + struct thread *t = rb_entry(nd, struct thread, rb_node); - nd = rb_next(nd); - __machine__remove_thread(machine, t, false); + nd = rb_next(nd); + __machine__remove_thread(machine, t, false); + } + pthread_rwlock_unlock(>lock); } - pthread_rwlock_unlock(>threads_lock); } void machine__exit(struct machine *machine) { + struct threads *thread; + int i; + machine__destroy_kernel_maps(machine); map_groups__exit(>kmaps); dsos__exit(>dsos); machine__exit_vdso(machine); zfree(>root_dir); zfree(>current_tid); - pthread_rwlock_destroy(>threads_lock); + for (i = 0; i < THREADS__TABLE_SIZE; i++) { + thread = >threads[i]; + pthread_rwlock_destroy(>lock); + } } void
[PATCH RFC V2 09/10] perf top: add option to set the number of thread for event synthesize
From: Kan LiangUsing UINT_MAX to indicate the default thread#, which is the max number of online CPU. Signed-off-by: Kan Liang --- tools/perf/builtin-top.c | 5 - tools/perf/util/event.c | 5 - tools/perf/util/top.h| 1 + 3 files changed, 9 insertions(+), 2 deletions(-) diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c index 4b8fdc1..5f59aa7 100644 --- a/tools/perf/builtin-top.c +++ b/tools/perf/builtin-top.c @@ -961,7 +961,7 @@ static int __cmd_top(struct perf_top *top) machine__synthesize_threads(>session->machines.host, >target, top->evlist->threads, false, opts->proc_map_timeout, - (unsigned int)sysconf(_SC_NPROCESSORS_ONLN)); + top->nr_threads_synthesize); if (perf_hpp_list.socket) { ret = perf_env__read_cpu_topology_map(_env); @@ -1114,6 +1114,7 @@ int cmd_top(int argc, const char **argv) }, .max_stack = sysctl_perf_event_max_stack, .sym_pcnt_filter = 5, + .nr_threads_synthesize = UINT_MAX, }; struct record_opts *opts = _opts; struct target *target = >target; @@ -1223,6 +1224,8 @@ int cmd_top(int argc, const char **argv) OPT_BOOLEAN(0, "hierarchy", _conf.report_hierarchy, "Show entries in a hierarchy"), OPT_BOOLEAN(0, "force", _conf.force, "don't complain, do it"), + OPT_UINTEGER(0, "num-thread-synthesize", _threads_synthesize, + "number of thread to run event synthesize"), OPT_END() }; const char * const top_usage[] = { diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c index 8c4e072..ecef279 100644 --- a/tools/perf/util/event.c +++ b/tools/perf/util/event.c @@ -778,7 +778,10 @@ int perf_event__synthesize_threads(struct perf_tool *tool, if (n < 0) return err; - thread_nr = nr_threads_synthesize; + if (nr_threads_synthesize == UINT_MAX) + thread_nr = sysconf(_SC_NPROCESSORS_ONLN); + else + thread_nr = nr_threads_synthesize; if (thread_nr <= 0) thread_nr = 1; if (thread_nr > n) diff --git a/tools/perf/util/top.h b/tools/perf/util/top.h index 9bdfb78..f4296e1 100644 --- a/tools/perf/util/top.h +++ b/tools/perf/util/top.h @@ -37,6 +37,7 @@ struct perf_top { intsym_pcnt_filter; const char *sym_filter; float min_percent; + unsigned int nr_threads_synthesize; }; #define CONSOLE_CLEAR "[H[2J" -- 2.5.5
[PATCH RFC V2 09/10] perf top: add option to set the number of thread for event synthesize
From: Kan Liang Using UINT_MAX to indicate the default thread#, which is the max number of online CPU. Signed-off-by: Kan Liang --- tools/perf/builtin-top.c | 5 - tools/perf/util/event.c | 5 - tools/perf/util/top.h| 1 + 3 files changed, 9 insertions(+), 2 deletions(-) diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c index 4b8fdc1..5f59aa7 100644 --- a/tools/perf/builtin-top.c +++ b/tools/perf/builtin-top.c @@ -961,7 +961,7 @@ static int __cmd_top(struct perf_top *top) machine__synthesize_threads(>session->machines.host, >target, top->evlist->threads, false, opts->proc_map_timeout, - (unsigned int)sysconf(_SC_NPROCESSORS_ONLN)); + top->nr_threads_synthesize); if (perf_hpp_list.socket) { ret = perf_env__read_cpu_topology_map(_env); @@ -1114,6 +1114,7 @@ int cmd_top(int argc, const char **argv) }, .max_stack = sysctl_perf_event_max_stack, .sym_pcnt_filter = 5, + .nr_threads_synthesize = UINT_MAX, }; struct record_opts *opts = _opts; struct target *target = >target; @@ -1223,6 +1224,8 @@ int cmd_top(int argc, const char **argv) OPT_BOOLEAN(0, "hierarchy", _conf.report_hierarchy, "Show entries in a hierarchy"), OPT_BOOLEAN(0, "force", _conf.force, "don't complain, do it"), + OPT_UINTEGER(0, "num-thread-synthesize", _threads_synthesize, + "number of thread to run event synthesize"), OPT_END() }; const char * const top_usage[] = { diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c index 8c4e072..ecef279 100644 --- a/tools/perf/util/event.c +++ b/tools/perf/util/event.c @@ -778,7 +778,10 @@ int perf_event__synthesize_threads(struct perf_tool *tool, if (n < 0) return err; - thread_nr = nr_threads_synthesize; + if (nr_threads_synthesize == UINT_MAX) + thread_nr = sysconf(_SC_NPROCESSORS_ONLN); + else + thread_nr = nr_threads_synthesize; if (thread_nr <= 0) thread_nr = 1; if (thread_nr > n) diff --git a/tools/perf/util/top.h b/tools/perf/util/top.h index 9bdfb78..f4296e1 100644 --- a/tools/perf/util/top.h +++ b/tools/perf/util/top.h @@ -37,6 +37,7 @@ struct perf_top { intsym_pcnt_filter; const char *sym_filter; float min_percent; + unsigned int nr_threads_synthesize; }; #define CONSOLE_CLEAR "[H[2J" -- 2.5.5
[PATCH RFC V2 02/10] perf tools: using scandir to replace readdir
From: Kan LiangFor perf_event__synthesize_threads, perf goes through all proc files serially by readdir. scandir did a snapshoot of proc, which is multithreading friendly. It's possible that some threads which are added during event synthesize. But the number of lost threads should be small. They should not impact the final analysis. Signed-off-by: Kan Liang --- tools/perf/util/event.c | 46 ++ 1 file changed, 26 insertions(+), 20 deletions(-) diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c index 1c905ba..c31f678 100644 --- a/tools/perf/util/event.c +++ b/tools/perf/util/event.c @@ -683,12 +683,14 @@ int perf_event__synthesize_threads(struct perf_tool *tool, bool mmap_data, unsigned int proc_map_timeout) { - DIR *proc; - char proc_path[PATH_MAX]; - struct dirent *dirent; union perf_event *comm_event, *mmap_event, *fork_event; union perf_event *namespaces_event; + char proc_path[PATH_MAX]; + struct dirent **dirent; int err = -1; + char *end; + pid_t pid; + int n, i; if (machine__is_default_guest(machine)) return 0; @@ -712,29 +714,33 @@ int perf_event__synthesize_threads(struct perf_tool *tool, goto out_free_fork; snprintf(proc_path, sizeof(proc_path), "%s/proc", machine->root_dir); - proc = opendir(proc_path); + n = scandir(proc_path, , 0, alphasort); - if (proc == NULL) + if (n < 0) goto out_free_namespaces; - while ((dirent = readdir(proc)) != NULL) { - char *end; - pid_t pid = strtol(dirent->d_name, , 10); - - if (*end) /* only interested in proper numerical dirents */ + for (i = 0; i < n; i++) { + if (!isdigit(dirent[i]->d_name[0])) continue; - /* -* We may race with exiting thread, so don't stop just because -* one thread couldn't be synthesized. -*/ - __event__synthesize_thread(comm_event, mmap_event, fork_event, - namespaces_event, pid, 1, process, - tool, machine, mmap_data, - proc_map_timeout); - } + pid = (pid_t)strtol(dirent[i]->d_name, , 10); + /* only interested in proper numerical dirents */ + if (!*end) { + /* +* We may race with exiting thread, so don't stop +* just because one thread couldn't be synthesized. +*/ + __event__synthesize_thread(comm_event, mmap_event, + fork_event, namespaces_event, + pid, 1, process, tool, + machine, mmap_data, + proc_map_timeout); + } + free(dirent[i]); + } + free(dirent); err = 0; - closedir(proc); + out_free_namespaces: free(namespaces_event); out_free_fork: -- 2.5.5
[PATCH RFC V2 02/10] perf tools: using scandir to replace readdir
From: Kan Liang For perf_event__synthesize_threads, perf goes through all proc files serially by readdir. scandir did a snapshoot of proc, which is multithreading friendly. It's possible that some threads which are added during event synthesize. But the number of lost threads should be small. They should not impact the final analysis. Signed-off-by: Kan Liang --- tools/perf/util/event.c | 46 ++ 1 file changed, 26 insertions(+), 20 deletions(-) diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c index 1c905ba..c31f678 100644 --- a/tools/perf/util/event.c +++ b/tools/perf/util/event.c @@ -683,12 +683,14 @@ int perf_event__synthesize_threads(struct perf_tool *tool, bool mmap_data, unsigned int proc_map_timeout) { - DIR *proc; - char proc_path[PATH_MAX]; - struct dirent *dirent; union perf_event *comm_event, *mmap_event, *fork_event; union perf_event *namespaces_event; + char proc_path[PATH_MAX]; + struct dirent **dirent; int err = -1; + char *end; + pid_t pid; + int n, i; if (machine__is_default_guest(machine)) return 0; @@ -712,29 +714,33 @@ int perf_event__synthesize_threads(struct perf_tool *tool, goto out_free_fork; snprintf(proc_path, sizeof(proc_path), "%s/proc", machine->root_dir); - proc = opendir(proc_path); + n = scandir(proc_path, , 0, alphasort); - if (proc == NULL) + if (n < 0) goto out_free_namespaces; - while ((dirent = readdir(proc)) != NULL) { - char *end; - pid_t pid = strtol(dirent->d_name, , 10); - - if (*end) /* only interested in proper numerical dirents */ + for (i = 0; i < n; i++) { + if (!isdigit(dirent[i]->d_name[0])) continue; - /* -* We may race with exiting thread, so don't stop just because -* one thread couldn't be synthesized. -*/ - __event__synthesize_thread(comm_event, mmap_event, fork_event, - namespaces_event, pid, 1, process, - tool, machine, mmap_data, - proc_map_timeout); - } + pid = (pid_t)strtol(dirent[i]->d_name, , 10); + /* only interested in proper numerical dirents */ + if (!*end) { + /* +* We may race with exiting thread, so don't stop +* just because one thread couldn't be synthesized. +*/ + __event__synthesize_thread(comm_event, mmap_event, + fork_event, namespaces_event, + pid, 1, process, tool, + machine, mmap_data, + proc_map_timeout); + } + free(dirent[i]); + } + free(dirent); err = 0; - closedir(proc); + out_free_namespaces: free(namespaces_event); out_free_fork: -- 2.5.5
[PATCH RFC V2 07/10] perf tools: change machine comm_exec type to atomic
From: Kan LiangIn case there are two or more threads want to change it. Signed-off-by: Kan Liang --- tools/perf/util/machine.c | 11 ++- tools/perf/util/machine.h | 2 +- 2 files changed, 7 insertions(+), 6 deletions(-) diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c index 2f1ad29..bbfb9e0 100644 --- a/tools/perf/util/machine.c +++ b/tools/perf/util/machine.c @@ -64,8 +64,8 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid) machine->id_hdr_size = 0; machine->kptr_restrict_warned = false; - machine->comm_exec = false; machine->kernel_start = 0; + atomic_set(>comm_exec, 0); memset(machine->vmlinux_maps, 0, sizeof(machine->vmlinux_maps)); @@ -238,14 +238,15 @@ struct machine *machines__add(struct machines *machines, pid_t pid, void machines__set_comm_exec(struct machines *machines, bool comm_exec) { + int exec = comm_exec ? 1 : 0; struct rb_node *nd; - machines->host.comm_exec = comm_exec; + atomic_set(>host.comm_exec, exec); for (nd = rb_first(>guests); nd; nd = rb_next(nd)) { struct machine *machine = rb_entry(nd, struct machine, rb_node); - machine->comm_exec = comm_exec; + atomic_set(>comm_exec, exec); } } @@ -505,7 +506,7 @@ struct thread *machine__find_thread(struct machine *machine, pid_t pid, struct comm *machine__thread_exec_comm(struct machine *machine, struct thread *thread) { - if (machine->comm_exec) + if (atomic_read(>comm_exec)) return thread__exec_comm(thread); else return thread__comm(thread); @@ -521,7 +522,7 @@ int machine__process_comm_event(struct machine *machine, union perf_event *event int err = 0; if (exec) - machine->comm_exec = true; + atomic_set(>comm_exec, 1); if (dump_trace) perf_event__fprintf_comm(event, stdout); diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h index 64663eb..fb3c2a2 100644 --- a/tools/perf/util/machine.h +++ b/tools/perf/util/machine.h @@ -38,7 +38,7 @@ struct machine { struct rb_noderb_node; pid_t pid; u16 id_hdr_size; - bool comm_exec; + atomic_t comm_exec; bool kptr_restrict_warned; char *root_dir; struct threadsthreads[THREADS__TABLE_SIZE]; -- 2.5.5
[PATCH RFC V2 07/10] perf tools: change machine comm_exec type to atomic
From: Kan Liang In case there are two or more threads want to change it. Signed-off-by: Kan Liang --- tools/perf/util/machine.c | 11 ++- tools/perf/util/machine.h | 2 +- 2 files changed, 7 insertions(+), 6 deletions(-) diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c index 2f1ad29..bbfb9e0 100644 --- a/tools/perf/util/machine.c +++ b/tools/perf/util/machine.c @@ -64,8 +64,8 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid) machine->id_hdr_size = 0; machine->kptr_restrict_warned = false; - machine->comm_exec = false; machine->kernel_start = 0; + atomic_set(>comm_exec, 0); memset(machine->vmlinux_maps, 0, sizeof(machine->vmlinux_maps)); @@ -238,14 +238,15 @@ struct machine *machines__add(struct machines *machines, pid_t pid, void machines__set_comm_exec(struct machines *machines, bool comm_exec) { + int exec = comm_exec ? 1 : 0; struct rb_node *nd; - machines->host.comm_exec = comm_exec; + atomic_set(>host.comm_exec, exec); for (nd = rb_first(>guests); nd; nd = rb_next(nd)) { struct machine *machine = rb_entry(nd, struct machine, rb_node); - machine->comm_exec = comm_exec; + atomic_set(>comm_exec, exec); } } @@ -505,7 +506,7 @@ struct thread *machine__find_thread(struct machine *machine, pid_t pid, struct comm *machine__thread_exec_comm(struct machine *machine, struct thread *thread) { - if (machine->comm_exec) + if (atomic_read(>comm_exec)) return thread__exec_comm(thread); else return thread__comm(thread); @@ -521,7 +522,7 @@ int machine__process_comm_event(struct machine *machine, union perf_event *event int err = 0; if (exec) - machine->comm_exec = true; + atomic_set(>comm_exec, 1); if (dump_trace) perf_event__fprintf_comm(event, stdout); diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h index 64663eb..fb3c2a2 100644 --- a/tools/perf/util/machine.h +++ b/tools/perf/util/machine.h @@ -38,7 +38,7 @@ struct machine { struct rb_noderb_node; pid_t pid; u16 id_hdr_size; - bool comm_exec; + atomic_t comm_exec; bool kptr_restrict_warned; char *root_dir; struct threadsthreads[THREADS__TABLE_SIZE]; -- 2.5.5
[PATCH 2/4] dt-bindings: timer: Add Actions Semi S700
Define a compatible string for the Actions Semi S700 SoC timer. Signed-off-by: Andreas Färber--- Documentation/devicetree/bindings/timer/actions,owl-timer.txt | 1 + 1 file changed, 1 insertion(+) diff --git a/Documentation/devicetree/bindings/timer/actions,owl-timer.txt b/Documentation/devicetree/bindings/timer/actions,owl-timer.txt index e3c28da80cb2..977054f87563 100644 --- a/Documentation/devicetree/bindings/timer/actions,owl-timer.txt +++ b/Documentation/devicetree/bindings/timer/actions,owl-timer.txt @@ -2,6 +2,7 @@ Actions Semi Owl Timer Required properties: - compatible : "actions,s500-timer" for S500 + "actions,s700-timer" for S700 "actions,s900-timer" for S900 - reg : Offset and length of the register set for the device. - interrupts : Should contain the interrupts. -- 2.13.5
[PATCH 2/4] dt-bindings: timer: Add Actions Semi S700
Define a compatible string for the Actions Semi S700 SoC timer. Signed-off-by: Andreas Färber --- Documentation/devicetree/bindings/timer/actions,owl-timer.txt | 1 + 1 file changed, 1 insertion(+) diff --git a/Documentation/devicetree/bindings/timer/actions,owl-timer.txt b/Documentation/devicetree/bindings/timer/actions,owl-timer.txt index e3c28da80cb2..977054f87563 100644 --- a/Documentation/devicetree/bindings/timer/actions,owl-timer.txt +++ b/Documentation/devicetree/bindings/timer/actions,owl-timer.txt @@ -2,6 +2,7 @@ Actions Semi Owl Timer Required properties: - compatible : "actions,s500-timer" for S500 + "actions,s700-timer" for S700 "actions,s900-timer" for S900 - reg : Offset and length of the register set for the device. - interrupts : Should contain the interrupts. -- 2.13.5
[PATCH 0/4] arm64: Add initial Actions Semi S700 and CubieBoard7 support
Hello, This series prepares the Actions Semi S700 SoC and Cubietech CubieBoard7. It boots equally far as the S900 based Bubblegum-96 these patches are based on, i.e. into an initrd with serial and all four cores up. I have not yet managed to download the CubieBoard7 sources from Baidu, and the Mega download has not yet been fixed, so this is based on the S700 datasheet and /proc/device-tree/ from the preinstalled Android. Lacking sources and instructions to replace the bootloader on eMMC or to try booting from SD, the only working way I've found to boot mainline kernels appears to be booting into Android, then issuing "reboot bootloader". U-Boot is lacking the saveenv command, so it wasn't possible to change the bootdelay to facilitate this. Cf. https://en.opensuse.org/HCL:CubieBoard7 Work branch: https://github.com/afaerber/linux/commits/bg96-next Have a lot of fun! Cheers, Andreas Cc: Ahha LeeCc: supp...@cubietech.com Cc: devicet...@vger.kernel.org Cc: Daniel Lezcano Cc: Thomas Gleixner Andreas Färber (4): dt-bindings: arm: actions: Add S700 and CubieBoard7 dt-bindings: timer: Add Actions Semi S700 clocksource: owl: Prepare S700 arm64: dts: actions: Add S700 and CubieBoard7 Documentation/devicetree/bindings/arm/actions.txt | 15 ++ .../bindings/timer/actions,owl-timer.txt | 1 + arch/arm64/boot/dts/actions/Makefile | 2 + arch/arm64/boot/dts/actions/s700-cubieboard7.dts | 47 ++ arch/arm64/boot/dts/actions/s700.dtsi | 164 + drivers/clocksource/owl-timer.c| 1 + 6 files changed, 230 insertions(+) create mode 100644 arch/arm64/boot/dts/actions/s700-cubieboard7.dts create mode 100644 arch/arm64/boot/dts/actions/s700.dtsi -- 2.13.5
[PATCH 0/4] arm64: Add initial Actions Semi S700 and CubieBoard7 support
Hello, This series prepares the Actions Semi S700 SoC and Cubietech CubieBoard7. It boots equally far as the S900 based Bubblegum-96 these patches are based on, i.e. into an initrd with serial and all four cores up. I have not yet managed to download the CubieBoard7 sources from Baidu, and the Mega download has not yet been fixed, so this is based on the S700 datasheet and /proc/device-tree/ from the preinstalled Android. Lacking sources and instructions to replace the bootloader on eMMC or to try booting from SD, the only working way I've found to boot mainline kernels appears to be booting into Android, then issuing "reboot bootloader". U-Boot is lacking the saveenv command, so it wasn't possible to change the bootdelay to facilitate this. Cf. https://en.opensuse.org/HCL:CubieBoard7 Work branch: https://github.com/afaerber/linux/commits/bg96-next Have a lot of fun! Cheers, Andreas Cc: Ahha Lee Cc: supp...@cubietech.com Cc: devicet...@vger.kernel.org Cc: Daniel Lezcano Cc: Thomas Gleixner Andreas Färber (4): dt-bindings: arm: actions: Add S700 and CubieBoard7 dt-bindings: timer: Add Actions Semi S700 clocksource: owl: Prepare S700 arm64: dts: actions: Add S700 and CubieBoard7 Documentation/devicetree/bindings/arm/actions.txt | 15 ++ .../bindings/timer/actions,owl-timer.txt | 1 + arch/arm64/boot/dts/actions/Makefile | 2 + arch/arm64/boot/dts/actions/s700-cubieboard7.dts | 47 ++ arch/arm64/boot/dts/actions/s700.dtsi | 164 + drivers/clocksource/owl-timer.c| 1 + 6 files changed, 230 insertions(+) create mode 100644 arch/arm64/boot/dts/actions/s700-cubieboard7.dts create mode 100644 arch/arm64/boot/dts/actions/s700.dtsi -- 2.13.5
[PATCH 3/4] clocksource: owl: Prepare S700
S700 has two 2Hz timers like S500, and four TIMx timers like S900. Signed-off-by: Andreas Färber--- drivers/clocksource/owl-timer.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/clocksource/owl-timer.c b/drivers/clocksource/owl-timer.c index d19c53c11094..d5f3f72c10ee 100644 --- a/drivers/clocksource/owl-timer.c +++ b/drivers/clocksource/owl-timer.c @@ -169,4 +169,5 @@ static int __init owl_timer_init(struct device_node *node) return 0; } CLOCKSOURCE_OF_DECLARE(owl_s500, "actions,s500-timer", owl_timer_init); +CLOCKSOURCE_OF_DECLARE(owl_s700, "actions,s700-timer", owl_timer_init); CLOCKSOURCE_OF_DECLARE(owl_s900, "actions,s900-timer", owl_timer_init); -- 2.13.5
[PATCH 3/4] clocksource: owl: Prepare S700
S700 has two 2Hz timers like S500, and four TIMx timers like S900. Signed-off-by: Andreas Färber --- drivers/clocksource/owl-timer.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/clocksource/owl-timer.c b/drivers/clocksource/owl-timer.c index d19c53c11094..d5f3f72c10ee 100644 --- a/drivers/clocksource/owl-timer.c +++ b/drivers/clocksource/owl-timer.c @@ -169,4 +169,5 @@ static int __init owl_timer_init(struct device_node *node) return 0; } CLOCKSOURCE_OF_DECLARE(owl_s500, "actions,s500-timer", owl_timer_init); +CLOCKSOURCE_OF_DECLARE(owl_s700, "actions,s700-timer", owl_timer_init); CLOCKSOURCE_OF_DECLARE(owl_s900, "actions,s900-timer", owl_timer_init); -- 2.13.5
[PATCH 4/4] arm64: dts: actions: Add S700 and CubieBoard7
Add Device Trees for S700 SoC and Cubietech CubieBoard7. Signed-off-by: Andreas Färber--- arch/arm64/boot/dts/actions/Makefile | 2 + arch/arm64/boot/dts/actions/s700-cubieboard7.dts | 47 +++ arch/arm64/boot/dts/actions/s700.dtsi| 164 +++ 3 files changed, 213 insertions(+) create mode 100644 arch/arm64/boot/dts/actions/s700-cubieboard7.dts create mode 100644 arch/arm64/boot/dts/actions/s700.dtsi diff --git a/arch/arm64/boot/dts/actions/Makefile b/arch/arm64/boot/dts/actions/Makefile index 62922d688ce3..cfd7f2679feb 100644 --- a/arch/arm64/boot/dts/actions/Makefile +++ b/arch/arm64/boot/dts/actions/Makefile @@ -1,3 +1,5 @@ +dtb-$(CONFIG_ARCH_ACTIONS) += s700-cubieboard7.dtb + dtb-$(CONFIG_ARCH_ACTIONS) += s900-bubblegum-96.dtb always := $(dtb-y) diff --git a/arch/arm64/boot/dts/actions/s700-cubieboard7.dts b/arch/arm64/boot/dts/actions/s700-cubieboard7.dts new file mode 100644 index ..e562f04c2490 --- /dev/null +++ b/arch/arm64/boot/dts/actions/s700-cubieboard7.dts @@ -0,0 +1,47 @@ +/* + * Copyright (c) 2017 Andreas Färber + * + * SPDX-License-Identifier: (GPL-2.0+ OR MIT) + */ + +/dts-v1/; + +#include "s700.dtsi" + +/ { + compatible = "cubietech,cubieboard7", "actions,s700"; + model = "CubieBoard7"; + + aliases { + serial3 = + }; + + chosen { + stdout-path = "serial3:115200n8"; + }; + + memory@0 { + device_type = "memory"; + reg = <0x0 0x0 0x0 0x8000>; + }; + + memory@1,e000 { + device_type = "memory"; + reg = <0x1 0xe000 0x0 0x0>; + }; + + uart3_clk: uart3-clk { + compatible = "fixed-clock"; + clock-frequency = <921600>; + #clock-cells = <0>; + }; +}; + + { + clocks = <>; +}; + + { + status = "okay"; + clocks = <_clk>; +}; diff --git a/arch/arm64/boot/dts/actions/s700.dtsi b/arch/arm64/boot/dts/actions/s700.dtsi new file mode 100644 index ..b54df405aaeb --- /dev/null +++ b/arch/arm64/boot/dts/actions/s700.dtsi @@ -0,0 +1,164 @@ +/* + * Copyright (c) 2017 Andreas Färber + * + * SPDX-License-Identifier: (GPL-2.0+ OR MIT) + */ + +#include + +/ { + compatible = "actions,s700"; + interrupt-parent = <>; + #address-cells = <2>; + #size-cells = <2>; + + cpus { + #address-cells = <2>; + #size-cells = <0>; + + cpu0: cpu@0 { + device_type = "cpu"; + compatible = "arm,cortex-a53", "arm,armv8"; + reg = <0x0 0x0>; + enable-method = "psci"; + }; + + cpu1: cpu@1 { + device_type = "cpu"; + compatible = "arm,cortex-a53", "arm,armv8"; + reg = <0x0 0x1>; + enable-method = "psci"; + }; + + cpu2: cpu@2 { + device_type = "cpu"; + compatible = "arm,cortex-a53", "arm,armv8"; + reg = <0x0 0x2>; + enable-method = "psci"; + }; + + cpu3: cpu@3 { + device_type = "cpu"; + compatible = "arm,cortex-a53", "arm,armv8"; + reg = <0x0 0x3>; + enable-method = "psci"; + }; + }; + + reserved-memory { + #address-cells = <2>; + #size-cells = <2>; + ranges; + + secmon@1f00 { + reg = <0x0 0x1f00 0x0 0x100>; + no-map; + }; + }; + + psci { + compatible = "arm,psci-0.2"; + method = "smc"; + }; + + arm-pmu { + compatible = "arm,cortex-a53-pmu"; + interrupts = , +, +, +; + interrupt-affinity = <>, <>, <>, <>; + }; + + timer { + compatible = "arm,armv8-timer"; + interrupts = , +, +, +; + }; + + hosc: hosc { + compatible = "fixed-clock"; + clock-frequency = <2400>; + #clock-cells = <0>; + }; + + soc { + compatible = "simple-bus"; + #address-cells = <2>; + #size-cells = <2>; + ranges; + + gic: interrupt-controller@e00f1000 { + compatible = "arm,gic-400"; + reg = <0x0 0xe00f1000 0x0 0x1000>, + <0x0 0xe00f2000 0x0 0x2000>, + <0x0
[PATCH 1/4] dt-bindings: arm: actions: Add S700 and CubieBoard7
Document the Actions Semi S700 SoC and the Cubietech CubieBoard7. Signed-off-by: Andreas Färber--- Documentation/devicetree/bindings/arm/actions.txt | 15 +++ 1 file changed, 15 insertions(+) diff --git a/Documentation/devicetree/bindings/arm/actions.txt b/Documentation/devicetree/bindings/arm/actions.txt index ced764a8549e..544a8855fad5 100644 --- a/Documentation/devicetree/bindings/arm/actions.txt +++ b/Documentation/devicetree/bindings/arm/actions.txt @@ -25,6 +25,21 @@ Root node property compatible must contain, depending on board: - LeMaker Guitar Base Board rev. B: "lemaker,guitar-bb-rev-b", "lemaker,guitar" +S700 SoC + + +Required root node properties: + +- compatible : must contain "actions,s700" + + +Boards: + +Root node property compatible must contain, depending on board: + + - Cubietech CubieBoard7: "cubietech,cubieboard7" + + S900 SoC -- 2.13.5
[PATCH 4/4] arm64: dts: actions: Add S700 and CubieBoard7
Add Device Trees for S700 SoC and Cubietech CubieBoard7. Signed-off-by: Andreas Färber --- arch/arm64/boot/dts/actions/Makefile | 2 + arch/arm64/boot/dts/actions/s700-cubieboard7.dts | 47 +++ arch/arm64/boot/dts/actions/s700.dtsi| 164 +++ 3 files changed, 213 insertions(+) create mode 100644 arch/arm64/boot/dts/actions/s700-cubieboard7.dts create mode 100644 arch/arm64/boot/dts/actions/s700.dtsi diff --git a/arch/arm64/boot/dts/actions/Makefile b/arch/arm64/boot/dts/actions/Makefile index 62922d688ce3..cfd7f2679feb 100644 --- a/arch/arm64/boot/dts/actions/Makefile +++ b/arch/arm64/boot/dts/actions/Makefile @@ -1,3 +1,5 @@ +dtb-$(CONFIG_ARCH_ACTIONS) += s700-cubieboard7.dtb + dtb-$(CONFIG_ARCH_ACTIONS) += s900-bubblegum-96.dtb always := $(dtb-y) diff --git a/arch/arm64/boot/dts/actions/s700-cubieboard7.dts b/arch/arm64/boot/dts/actions/s700-cubieboard7.dts new file mode 100644 index ..e562f04c2490 --- /dev/null +++ b/arch/arm64/boot/dts/actions/s700-cubieboard7.dts @@ -0,0 +1,47 @@ +/* + * Copyright (c) 2017 Andreas Färber + * + * SPDX-License-Identifier: (GPL-2.0+ OR MIT) + */ + +/dts-v1/; + +#include "s700.dtsi" + +/ { + compatible = "cubietech,cubieboard7", "actions,s700"; + model = "CubieBoard7"; + + aliases { + serial3 = + }; + + chosen { + stdout-path = "serial3:115200n8"; + }; + + memory@0 { + device_type = "memory"; + reg = <0x0 0x0 0x0 0x8000>; + }; + + memory@1,e000 { + device_type = "memory"; + reg = <0x1 0xe000 0x0 0x0>; + }; + + uart3_clk: uart3-clk { + compatible = "fixed-clock"; + clock-frequency = <921600>; + #clock-cells = <0>; + }; +}; + + { + clocks = <>; +}; + + { + status = "okay"; + clocks = <_clk>; +}; diff --git a/arch/arm64/boot/dts/actions/s700.dtsi b/arch/arm64/boot/dts/actions/s700.dtsi new file mode 100644 index ..b54df405aaeb --- /dev/null +++ b/arch/arm64/boot/dts/actions/s700.dtsi @@ -0,0 +1,164 @@ +/* + * Copyright (c) 2017 Andreas Färber + * + * SPDX-License-Identifier: (GPL-2.0+ OR MIT) + */ + +#include + +/ { + compatible = "actions,s700"; + interrupt-parent = <>; + #address-cells = <2>; + #size-cells = <2>; + + cpus { + #address-cells = <2>; + #size-cells = <0>; + + cpu0: cpu@0 { + device_type = "cpu"; + compatible = "arm,cortex-a53", "arm,armv8"; + reg = <0x0 0x0>; + enable-method = "psci"; + }; + + cpu1: cpu@1 { + device_type = "cpu"; + compatible = "arm,cortex-a53", "arm,armv8"; + reg = <0x0 0x1>; + enable-method = "psci"; + }; + + cpu2: cpu@2 { + device_type = "cpu"; + compatible = "arm,cortex-a53", "arm,armv8"; + reg = <0x0 0x2>; + enable-method = "psci"; + }; + + cpu3: cpu@3 { + device_type = "cpu"; + compatible = "arm,cortex-a53", "arm,armv8"; + reg = <0x0 0x3>; + enable-method = "psci"; + }; + }; + + reserved-memory { + #address-cells = <2>; + #size-cells = <2>; + ranges; + + secmon@1f00 { + reg = <0x0 0x1f00 0x0 0x100>; + no-map; + }; + }; + + psci { + compatible = "arm,psci-0.2"; + method = "smc"; + }; + + arm-pmu { + compatible = "arm,cortex-a53-pmu"; + interrupts = , +, +, +; + interrupt-affinity = <>, <>, <>, <>; + }; + + timer { + compatible = "arm,armv8-timer"; + interrupts = , +, +, +; + }; + + hosc: hosc { + compatible = "fixed-clock"; + clock-frequency = <2400>; + #clock-cells = <0>; + }; + + soc { + compatible = "simple-bus"; + #address-cells = <2>; + #size-cells = <2>; + ranges; + + gic: interrupt-controller@e00f1000 { + compatible = "arm,gic-400"; + reg = <0x0 0xe00f1000 0x0 0x1000>, + <0x0 0xe00f2000 0x0 0x2000>, + <0x0 0xe00f4000 0x0 0x2000>, +
[PATCH 1/4] dt-bindings: arm: actions: Add S700 and CubieBoard7
Document the Actions Semi S700 SoC and the Cubietech CubieBoard7. Signed-off-by: Andreas Färber --- Documentation/devicetree/bindings/arm/actions.txt | 15 +++ 1 file changed, 15 insertions(+) diff --git a/Documentation/devicetree/bindings/arm/actions.txt b/Documentation/devicetree/bindings/arm/actions.txt index ced764a8549e..544a8855fad5 100644 --- a/Documentation/devicetree/bindings/arm/actions.txt +++ b/Documentation/devicetree/bindings/arm/actions.txt @@ -25,6 +25,21 @@ Root node property compatible must contain, depending on board: - LeMaker Guitar Base Board rev. B: "lemaker,guitar-bb-rev-b", "lemaker,guitar" +S700 SoC + + +Required root node properties: + +- compatible : must contain "actions,s700" + + +Boards: + +Root node property compatible must contain, depending on board: + + - Cubietech CubieBoard7: "cubietech,cubieboard7" + + S900 SoC -- 2.13.5
[PATCH] net: tcp_input: Neaten DBGUNDO
Move the #ifdef into the static void function so that the use of DBGUNDO is validated when FASTRETRANS_DEBUG <= 1. Remove the now unnecessary #else and #define DBGUNDO. Signed-off-by: Joe Perches--- net/ipv4/tcp_input.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index c5d7656b..bddf724f5c02 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -2333,9 +2333,9 @@ static bool tcp_any_retrans_done(const struct sock *sk) return false; } -#if FASTRETRANS_DEBUG > 1 static void DBGUNDO(struct sock *sk, const char *msg) { +#if FASTRETRANS_DEBUG > 1 struct tcp_sock *tp = tcp_sk(sk); struct inet_sock *inet = inet_sk(sk); @@ -2357,10 +2357,8 @@ static void DBGUNDO(struct sock *sk, const char *msg) tp->packets_out); } #endif -} -#else -#define DBGUNDO(x...) do { } while (0) #endif +} static void tcp_undo_cwnd_reduction(struct sock *sk, bool unmark_loss) { -- 2.10.0.rc2.1.g053435c
[PATCH] net: tcp_input: Neaten DBGUNDO
Move the #ifdef into the static void function so that the use of DBGUNDO is validated when FASTRETRANS_DEBUG <= 1. Remove the now unnecessary #else and #define DBGUNDO. Signed-off-by: Joe Perches --- net/ipv4/tcp_input.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index c5d7656b..bddf724f5c02 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -2333,9 +2333,9 @@ static bool tcp_any_retrans_done(const struct sock *sk) return false; } -#if FASTRETRANS_DEBUG > 1 static void DBGUNDO(struct sock *sk, const char *msg) { +#if FASTRETRANS_DEBUG > 1 struct tcp_sock *tp = tcp_sk(sk); struct inet_sock *inet = inet_sk(sk); @@ -2357,10 +2357,8 @@ static void DBGUNDO(struct sock *sk, const char *msg) tp->packets_out); } #endif -} -#else -#define DBGUNDO(x...) do { } while (0) #endif +} static void tcp_undo_cwnd_reduction(struct sock *sk, bool unmark_loss) { -- 2.10.0.rc2.1.g053435c
Re: [PATCH] media: i2c: OV5647: gate clock lane before stream on
Hi Luis, 2017-08-07 22:48 GMT+08:00 Luis Oliveira: > Hi all, > > I'm new here, I got to be Maintainer of this driver by the old Maintainer > recommendation. Still getting the hang of it :) > > On 07-Aug-17 13:26, Philipp Zabel wrote: >> Hi Jacob, >> >> On Mon, 2017-08-07 at 19:06 +0800, Jacob Chen wrote: >> [...] >> --- a/drivers/media/i2c/ov5647.c >> +++ b/drivers/media/i2c/ov5647.c >> @@ -253,6 +253,10 @@ static int ov5647_stream_on(struct v4l2_subdev *sd) >> { >> int ret; >> >> + ret = ov5647_write(sd, 0x4800, 0x04); >> + if (ret < 0) >> + return ret; >> + >> >> So this clears BIT(1) (force clock lane to low power mode) and BIT(5) >> (gate clock lane while idle) that were set by ov5647_stream_off() during >> __sensor_init() due to the change below. >> >> Is there a reason, btw, that this driver is full of magic register >> addresses and values? A few #defines would make this a lot more >> readable. >> > > For what I can see I agree that a few register name setting could be done. > >> ret = ov5647_write(sd, 0x4202, 0x00); >> if (ret < 0) >> return ret; >> @@ -264,6 +268,10 @@ static int ov5647_stream_off(struct v4l2_subdev *sd) >> { >> int ret; >> >> + ret = ov5647_write(sd, 0x4800, 0x25); >> + if (ret < 0) >> + return ret; >> + >> ret = ov5647_write(sd, 0x4202, 0x0f); >> if (ret < 0) >> return ret; >> @@ -320,7 +328,7 @@ static int __sensor_init(struct v4l2_subdev *sd) >> return ret; >> } >> >> - return ov5647_write(sd, 0x4800, 0x04); >> + return ov5647_stream_off(sd); >> >> I see now that BIT(2) (keep bus in LP-11 while idle) is and was always >> set. So the change is that initially, additionally to LP-11 mode, the >> clock lane is gated and forced into low power mode, as well? >> > > This is my interpretation as well. > >> } >> >> static int ov5647_sensor_power(struct v4l2_subdev *sd, int on) >> -- >> 2.7.4 >> > > Can anyone comment on it? > > I saw there is a same discussion in > https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.kernel.org_patch_9569031_=DwICaQ=DPL6_X_6JkXFx7AXWqB0tg=eMn12aiiNuIDjtRi5xEzC7tWJkpra2vl_XYFVvfxIGE=eortcRXje2uLyZNI_-Uw3Ur_z24tb-e4pZfom7WhdE0=6sLc76bhjR0IdaA3ArZ7F7slgtcyGz8pDTzAF_CBLno= > There is a comment in i.MX CSI2 driver. > " > Configure MIPI Camera Sensor to put all Tx lanes in LP-11 state. > This must be carried out by the MIPI sensor's s_power(ON) subdev > op. > " > That's what this patch do, sensor driver should make sure that clock > lanes are in stop state while not streaming. This is not the same, as far as I can tell. BIT(5) is just clock lane gating, as you describe above. To put the bus into LP-11 state, BIT(2) needs to be set. >>> >>> Yeah, but i double that clock lane is not in LP11 when continue clock >>> mode is enabled. > > I think by spec it shouldn't got to stopstate in continuous clock. > >> >> If indeed LP-11 state is not achieved while the sensor is idle, as long >> as BIT(5) is cleared, I think this patch is correct. >> >> regards >> Philipp >> > > As far as I understand, bit[5] set to 1 will force clock lane to be gated (in > other words it will be forced to be in LP-11 if there are no packets to > transmit). But also LP-11 must not be achieved with the BIT(5) cleared (free > running mode)? > Yes. When the BIT(5) is cleared, clock lane will be in continuous mode immediately, unless BIT(0) is set. > Sorry if I misunderstood something. > > regards, > Luis >
Re: [PATCH] media: i2c: OV5647: gate clock lane before stream on
Hi Luis, 2017-08-07 22:48 GMT+08:00 Luis Oliveira : > Hi all, > > I'm new here, I got to be Maintainer of this driver by the old Maintainer > recommendation. Still getting the hang of it :) > > On 07-Aug-17 13:26, Philipp Zabel wrote: >> Hi Jacob, >> >> On Mon, 2017-08-07 at 19:06 +0800, Jacob Chen wrote: >> [...] >> --- a/drivers/media/i2c/ov5647.c >> +++ b/drivers/media/i2c/ov5647.c >> @@ -253,6 +253,10 @@ static int ov5647_stream_on(struct v4l2_subdev *sd) >> { >> int ret; >> >> + ret = ov5647_write(sd, 0x4800, 0x04); >> + if (ret < 0) >> + return ret; >> + >> >> So this clears BIT(1) (force clock lane to low power mode) and BIT(5) >> (gate clock lane while idle) that were set by ov5647_stream_off() during >> __sensor_init() due to the change below. >> >> Is there a reason, btw, that this driver is full of magic register >> addresses and values? A few #defines would make this a lot more >> readable. >> > > For what I can see I agree that a few register name setting could be done. > >> ret = ov5647_write(sd, 0x4202, 0x00); >> if (ret < 0) >> return ret; >> @@ -264,6 +268,10 @@ static int ov5647_stream_off(struct v4l2_subdev *sd) >> { >> int ret; >> >> + ret = ov5647_write(sd, 0x4800, 0x25); >> + if (ret < 0) >> + return ret; >> + >> ret = ov5647_write(sd, 0x4202, 0x0f); >> if (ret < 0) >> return ret; >> @@ -320,7 +328,7 @@ static int __sensor_init(struct v4l2_subdev *sd) >> return ret; >> } >> >> - return ov5647_write(sd, 0x4800, 0x04); >> + return ov5647_stream_off(sd); >> >> I see now that BIT(2) (keep bus in LP-11 while idle) is and was always >> set. So the change is that initially, additionally to LP-11 mode, the >> clock lane is gated and forced into low power mode, as well? >> > > This is my interpretation as well. > >> } >> >> static int ov5647_sensor_power(struct v4l2_subdev *sd, int on) >> -- >> 2.7.4 >> > > Can anyone comment on it? > > I saw there is a same discussion in > https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.kernel.org_patch_9569031_=DwICaQ=DPL6_X_6JkXFx7AXWqB0tg=eMn12aiiNuIDjtRi5xEzC7tWJkpra2vl_XYFVvfxIGE=eortcRXje2uLyZNI_-Uw3Ur_z24tb-e4pZfom7WhdE0=6sLc76bhjR0IdaA3ArZ7F7slgtcyGz8pDTzAF_CBLno= > There is a comment in i.MX CSI2 driver. > " > Configure MIPI Camera Sensor to put all Tx lanes in LP-11 state. > This must be carried out by the MIPI sensor's s_power(ON) subdev > op. > " > That's what this patch do, sensor driver should make sure that clock > lanes are in stop state while not streaming. This is not the same, as far as I can tell. BIT(5) is just clock lane gating, as you describe above. To put the bus into LP-11 state, BIT(2) needs to be set. >>> >>> Yeah, but i double that clock lane is not in LP11 when continue clock >>> mode is enabled. > > I think by spec it shouldn't got to stopstate in continuous clock. > >> >> If indeed LP-11 state is not achieved while the sensor is idle, as long >> as BIT(5) is cleared, I think this patch is correct. >> >> regards >> Philipp >> > > As far as I understand, bit[5] set to 1 will force clock lane to be gated (in > other words it will be forced to be in LP-11 if there are no packets to > transmit). But also LP-11 must not be achieved with the BIT(5) cleared (free > running mode)? > Yes. When the BIT(5) is cleared, clock lane will be in continuous mode immediately, unless BIT(0) is set. > Sorry if I misunderstood something. > > regards, > Luis >
Re: xtensa compiler error triggered by 'locking/lockdep: Implement the 'crossrelease' feature'
On Sun, Sep 10, 2017 at 11:31 AM, Guenter Roeckwrote: > xtensa:allmodconfig fails to build in mainline with compiler errors > as follows. > > drivers/staging/rtl8723bs/core/rtw_ap.c: In function ‘expire_timeout_chk’: > drivers/staging/rtl8723bs/core/rtw_ap.c:442:1: internal compiler error: > in change_address_1, at emit-rtl.c:2126 > > drivers/staging/rtl8188eu/core/rtw_ap.c: In function ‘expire_timeout_chk’: > drivers/staging/rtl8188eu/core/rtw_ap.c:445:1: internal compiler error: > in change_address_1, at emit-rtl.c:2150 > > I tried gcc 6.3, 6.4, and 7.2; they all have the same problem. > > Bisect points to commit b09be676e0f ("locking/lockdep: Implement the > 'crossrelease' feature"). > > The compile error is still seen if I disable CONFIG_LOCKDEP_CROSSRELEASE. > The image builds fine if I disable both CONFIG_RTL8723BS and CONFIG_R8188EU. Thanks for the report. I'll look at the compiler. -- Thanks. -- Max
Re: xtensa compiler error triggered by 'locking/lockdep: Implement the 'crossrelease' feature'
On Sun, Sep 10, 2017 at 11:31 AM, Guenter Roeck wrote: > xtensa:allmodconfig fails to build in mainline with compiler errors > as follows. > > drivers/staging/rtl8723bs/core/rtw_ap.c: In function ‘expire_timeout_chk’: > drivers/staging/rtl8723bs/core/rtw_ap.c:442:1: internal compiler error: > in change_address_1, at emit-rtl.c:2126 > > drivers/staging/rtl8188eu/core/rtw_ap.c: In function ‘expire_timeout_chk’: > drivers/staging/rtl8188eu/core/rtw_ap.c:445:1: internal compiler error: > in change_address_1, at emit-rtl.c:2150 > > I tried gcc 6.3, 6.4, and 7.2; they all have the same problem. > > Bisect points to commit b09be676e0f ("locking/lockdep: Implement the > 'crossrelease' feature"). > > The compile error is still seen if I disable CONFIG_LOCKDEP_CROSSRELEASE. > The image builds fine if I disable both CONFIG_RTL8723BS and CONFIG_R8188EU. Thanks for the report. I'll look at the compiler. -- Thanks. -- Max
Re: [PATCH v3 1/3] arm64/ras: support sea error recovery
Hi James, Thanks for your comments. On 2017/9/9 2:15, James Morse wrote: > Hi Xie XiuQi, > > (Sorry a few versions of this went past before I caught up with it) > > On 07/09/17 08:45, Xie XiuQi wrote: >> With ARM v8.2 RAS Extension, SEA are usually triggered when memory errors >> are consumed. In some cases, if the error address is in a clean page or a >> read-only page, there is a chance to recover. Such as error occurs in a >> instruction page, we can reread this page from disk instead of killing >> process. > >> Because memory_failure() may sleep, we can not call it directly in SEA >> exception context. > > This is why we have memory_failure_queue() instead, it ... bother. That > doesn't > look nmi-safe. (I thought this ended with an llist, but clearly I was looking > at > the wrong thing). > > It doesn't look like this is a problem for NOTIFY_SEA as it would only > interrupt > itself on the same CPU if the memory-failure code/data were corrupt. (which is > not a case we can handle). We need to fix this before any of the asynchronous > NMI-like RAS notifications for arm64 get merged. > > (this is one problem, but I don't think its 'the' problem you are trying to > solve with this series). > > >> So we saved faulting physical address associated with >> a process in the ghes handler and set __TIF_SEA_NOTIFY. > > A per-notification type TIF flag looks fishy, surely this would affect all > NMI-like RAS notification methods? > > >> When we return >> from SEA exception context and get into do_notify_resume() before the >> process running, we could check it and call memory_failure() to do >> recovery. It's safe, because we are in process context. > > I'm afraid I don't think this is the best approach for fixing this. > Its tied to the notification type, but the notification should be irrelevant > once we call ghes_proc(). > It adds code poking around in CPER and ACPI/GHES to the arm64 arch code, all > of > this should be in the core common code. > Most importantly: this means arm64 behaves differently with regard to handling > memory errors to other architectures using ACPI. Two behaviours means twice > the > code, review and bugs... I think we can use arch_apei_report_mem_error() to report memory error, just like what is implemented in x86. We can record the memory address in somewhere else, namely report the error. Then the SEA handler can use the memory error reported. > > > Delaying the handling until we re-enter user-space means faults that may > affect > the kernel aren't handled until much later. Just because the fault was > synchronous and user-space was running doesn't mean only user space is > affected. > Some examples I've collected so far: the zero-page may be corrupt, this is > mapped into every process and used by the kernel. Similarly corruption in the > vdso affects all user-space. The fault may affect the page tables, this > affects > all users of the mm_struct. I am not sure if memory_failure can recognize the zero page as a kernel page and just panic. If several processes share a same page, I think we don't have to kill the process until it access the address containing the errors. If the process access the error address, a SEA will occur again, then we can handle it. If the process will never access the error address, such as a syscall in vdso the process won't use, we don't have to kill the process. For vdso case, we havn't repaired the error. It will affect newly created processes. I don't figure out how to handle this. Is it a kernel page mapped into user space or memory_failure will just panic when handling this page. > > (I'm sure we agree that an synchronous-external-abort interrupting the kernel > is > fatal for the kernel, but the other way round isn't always true). > > Setting a TIF flag to handle the error before re-entering user-space is a > problem as the scheduler may choose to pre-empt this task and run all the > other > tasks before this eventually gets handled. > > > Assuming this is just a problem with memory_failure_queue(), two alternatives > I > can suggest are making memory_failure_queue() nmi-safe, Yes, I agree, but it may be hard to be achieved. or abstracting > NOTIFY_NMI's estatus pool/cache to use for the arm64 NMI-like notifications > too. > > If there is more to this, can you explain the problem you're trying to solve? > (I suspect there may be an issue with multiple-signals being merged, or > exactly > when memory_failure_queue()'s work gets run.) Can you outline the sequence of > events? > > > You're picking a physical address out of 'ARM Processor Error Information > Structure', these correspond with Cache, TLB, Bus or (the mysterious) 'micro > architectural error'. I don't see anything checking the error type. > Given the physical address, are you adding error-handling for cache-errors > with > this series? > > > Thanks, > > James > > > . > Thanks, Xiongfeng Wang
Re: [PATCH v3 1/3] arm64/ras: support sea error recovery
Hi James, Thanks for your comments. On 2017/9/9 2:15, James Morse wrote: > Hi Xie XiuQi, > > (Sorry a few versions of this went past before I caught up with it) > > On 07/09/17 08:45, Xie XiuQi wrote: >> With ARM v8.2 RAS Extension, SEA are usually triggered when memory errors >> are consumed. In some cases, if the error address is in a clean page or a >> read-only page, there is a chance to recover. Such as error occurs in a >> instruction page, we can reread this page from disk instead of killing >> process. > >> Because memory_failure() may sleep, we can not call it directly in SEA >> exception context. > > This is why we have memory_failure_queue() instead, it ... bother. That > doesn't > look nmi-safe. (I thought this ended with an llist, but clearly I was looking > at > the wrong thing). > > It doesn't look like this is a problem for NOTIFY_SEA as it would only > interrupt > itself on the same CPU if the memory-failure code/data were corrupt. (which is > not a case we can handle). We need to fix this before any of the asynchronous > NMI-like RAS notifications for arm64 get merged. > > (this is one problem, but I don't think its 'the' problem you are trying to > solve with this series). > > >> So we saved faulting physical address associated with >> a process in the ghes handler and set __TIF_SEA_NOTIFY. > > A per-notification type TIF flag looks fishy, surely this would affect all > NMI-like RAS notification methods? > > >> When we return >> from SEA exception context and get into do_notify_resume() before the >> process running, we could check it and call memory_failure() to do >> recovery. It's safe, because we are in process context. > > I'm afraid I don't think this is the best approach for fixing this. > Its tied to the notification type, but the notification should be irrelevant > once we call ghes_proc(). > It adds code poking around in CPER and ACPI/GHES to the arm64 arch code, all > of > this should be in the core common code. > Most importantly: this means arm64 behaves differently with regard to handling > memory errors to other architectures using ACPI. Two behaviours means twice > the > code, review and bugs... I think we can use arch_apei_report_mem_error() to report memory error, just like what is implemented in x86. We can record the memory address in somewhere else, namely report the error. Then the SEA handler can use the memory error reported. > > > Delaying the handling until we re-enter user-space means faults that may > affect > the kernel aren't handled until much later. Just because the fault was > synchronous and user-space was running doesn't mean only user space is > affected. > Some examples I've collected so far: the zero-page may be corrupt, this is > mapped into every process and used by the kernel. Similarly corruption in the > vdso affects all user-space. The fault may affect the page tables, this > affects > all users of the mm_struct. I am not sure if memory_failure can recognize the zero page as a kernel page and just panic. If several processes share a same page, I think we don't have to kill the process until it access the address containing the errors. If the process access the error address, a SEA will occur again, then we can handle it. If the process will never access the error address, such as a syscall in vdso the process won't use, we don't have to kill the process. For vdso case, we havn't repaired the error. It will affect newly created processes. I don't figure out how to handle this. Is it a kernel page mapped into user space or memory_failure will just panic when handling this page. > > (I'm sure we agree that an synchronous-external-abort interrupting the kernel > is > fatal for the kernel, but the other way round isn't always true). > > Setting a TIF flag to handle the error before re-entering user-space is a > problem as the scheduler may choose to pre-empt this task and run all the > other > tasks before this eventually gets handled. > > > Assuming this is just a problem with memory_failure_queue(), two alternatives > I > can suggest are making memory_failure_queue() nmi-safe, Yes, I agree, but it may be hard to be achieved. or abstracting > NOTIFY_NMI's estatus pool/cache to use for the arm64 NMI-like notifications > too. > > If there is more to this, can you explain the problem you're trying to solve? > (I suspect there may be an issue with multiple-signals being merged, or > exactly > when memory_failure_queue()'s work gets run.) Can you outline the sequence of > events? > > > You're picking a physical address out of 'ARM Processor Error Information > Structure', these correspond with Cache, TLB, Bus or (the mysterious) 'micro > architectural error'. I don't see anything checking the error type. > Given the physical address, are you adding error-handling for cache-errors > with > this series? > > > Thanks, > > James > > > . > Thanks, Xiongfeng Wang
[PATCH] media: i2c: OV5647: ensure clock lane in LP-11 state before streaming on
When I was supporting Rpi Camera Module on the ASUS Tinker board, I found this driver have some issues with rockchip's mipi-csi driver. It didn't place clock lane in LP-11 state before performing D-PHY initialisation. >From our experience, on some OV sensors, LP-11 state is not achieved while BIT(5)-0x4800 is cleared. So let's set BIT(5) and BIT(0) both while not streaming, in order to coax the clock lane into LP-11 state. 0x4800 : MIPI CTRL 00 BIT(5) : clock lane gate enable 0: continuous 1: none-continuous BIT(0) : manually set clock lane 0: Not used 1: used Changes in V2: modify commit messages Signed-off-by: Jacob Chen--- drivers/media/i2c/ov5647.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/drivers/media/i2c/ov5647.c b/drivers/media/i2c/ov5647.c index 95ce90f..247302d 100644 --- a/drivers/media/i2c/ov5647.c +++ b/drivers/media/i2c/ov5647.c @@ -253,6 +253,10 @@ static int ov5647_stream_on(struct v4l2_subdev *sd) { int ret; + ret = ov5647_write(sd, 0x4800, 0x04); + if (ret < 0) + return ret; + ret = ov5647_write(sd, 0x4202, 0x00); if (ret < 0) return ret; @@ -264,6 +268,10 @@ static int ov5647_stream_off(struct v4l2_subdev *sd) { int ret; + ret = ov5647_write(sd, 0x4800, 0x25); + if (ret < 0) + return ret; + ret = ov5647_write(sd, 0x4202, 0x0f); if (ret < 0) return ret; @@ -320,7 +328,10 @@ static int __sensor_init(struct v4l2_subdev *sd) return ret; } - return ov5647_write(sd, 0x4800, 0x04); + /* +* stream off to make the clock lane into LP-11 state. +*/ + return ov5647_stream_off(sd); } static int ov5647_sensor_power(struct v4l2_subdev *sd, int on) -- 2.7.4
[PATCH] media: i2c: OV5647: ensure clock lane in LP-11 state before streaming on
When I was supporting Rpi Camera Module on the ASUS Tinker board, I found this driver have some issues with rockchip's mipi-csi driver. It didn't place clock lane in LP-11 state before performing D-PHY initialisation. >From our experience, on some OV sensors, LP-11 state is not achieved while BIT(5)-0x4800 is cleared. So let's set BIT(5) and BIT(0) both while not streaming, in order to coax the clock lane into LP-11 state. 0x4800 : MIPI CTRL 00 BIT(5) : clock lane gate enable 0: continuous 1: none-continuous BIT(0) : manually set clock lane 0: Not used 1: used Changes in V2: modify commit messages Signed-off-by: Jacob Chen --- drivers/media/i2c/ov5647.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/drivers/media/i2c/ov5647.c b/drivers/media/i2c/ov5647.c index 95ce90f..247302d 100644 --- a/drivers/media/i2c/ov5647.c +++ b/drivers/media/i2c/ov5647.c @@ -253,6 +253,10 @@ static int ov5647_stream_on(struct v4l2_subdev *sd) { int ret; + ret = ov5647_write(sd, 0x4800, 0x04); + if (ret < 0) + return ret; + ret = ov5647_write(sd, 0x4202, 0x00); if (ret < 0) return ret; @@ -264,6 +268,10 @@ static int ov5647_stream_off(struct v4l2_subdev *sd) { int ret; + ret = ov5647_write(sd, 0x4800, 0x25); + if (ret < 0) + return ret; + ret = ov5647_write(sd, 0x4202, 0x0f); if (ret < 0) return ret; @@ -320,7 +328,10 @@ static int __sensor_init(struct v4l2_subdev *sd) return ret; } - return ov5647_write(sd, 0x4800, 0x04); + /* +* stream off to make the clock lane into LP-11 state. +*/ + return ov5647_stream_off(sd); } static int ov5647_sensor_power(struct v4l2_subdev *sd, int on) -- 2.7.4
Re: Current mainline git (24e700e291d52bd2) hangs when building e.g. perf
On Sun, Sep 10, 2017 at 6:12 PM, Rik van Rielwrote: > On Sat, 2017-09-09 at 12:28 -0700, Andy Lutomirski wrote: >> - >> I propose the following fix. If PCID is on, then, in >> enter_lazy_tlb(), we switch to init_mm with the no-flush flag set. >> (And we give init_mm its own dedicated ASID to keep it simple and >> fast >> -- no need to use the LRU ASID mapping to assign one >> dynamically.) We >> clear the bit in mm_cpumask. That is, we more or less just skip the >> whole lazy TLB optimization and rely on PCID CPUs having reasonably >> fast CR3 writes. No extra IPIs. > > Avoiding the IPIs is probably what matters the most, especially > on systems with deep C states, and virtual machines where the > host may be running something else, causing the IPI service time > to go through the roof for idle VCPUs. > >> Also, sorry Rik, this means your old increased laziness optimization >> is dead in the water. It will have exactly the same speculative load >> problem. > > Doesn't a memory barrier solve that speculative load > problem? > > The memory barrier could be added only to the path > that potentially skips reloading the TLB, under the > assumption that a memory barrier is cheaper than a > TLB reload (even with ASID). No, nothing stops the problematic speculative load. Here's the issue. One CPU removes a reference to a page table from a higher-level page table, flushes, and then frees the page table. Then it re-allocates it and writes something unrelated there. Another CPU that has CR3 pointing to the page hierarchy in question could have a reference to the freed table in its paging structure cache. Even if it's guaranteed to not try to access the addresses in question (because they're user addresses and the other CPU is in kernel mode, etc), but there is never a guarantee that the CPU doesn't randomly try to fill its TLB for the affected addresses. This results in invalid PTEs in the TLB, possible accesses using bogus memory types, and maybe even reads from IO space. It looks like we actually need to propagate flushes everywhere that could have references to the flushed range, even if the software won't access that range.
Re: Current mainline git (24e700e291d52bd2) hangs when building e.g. perf
On Sun, Sep 10, 2017 at 6:12 PM, Rik van Riel wrote: > On Sat, 2017-09-09 at 12:28 -0700, Andy Lutomirski wrote: >> - >> I propose the following fix. If PCID is on, then, in >> enter_lazy_tlb(), we switch to init_mm with the no-flush flag set. >> (And we give init_mm its own dedicated ASID to keep it simple and >> fast >> -- no need to use the LRU ASID mapping to assign one >> dynamically.) We >> clear the bit in mm_cpumask. That is, we more or less just skip the >> whole lazy TLB optimization and rely on PCID CPUs having reasonably >> fast CR3 writes. No extra IPIs. > > Avoiding the IPIs is probably what matters the most, especially > on systems with deep C states, and virtual machines where the > host may be running something else, causing the IPI service time > to go through the roof for idle VCPUs. > >> Also, sorry Rik, this means your old increased laziness optimization >> is dead in the water. It will have exactly the same speculative load >> problem. > > Doesn't a memory barrier solve that speculative load > problem? > > The memory barrier could be added only to the path > that potentially skips reloading the TLB, under the > assumption that a memory barrier is cheaper than a > TLB reload (even with ASID). No, nothing stops the problematic speculative load. Here's the issue. One CPU removes a reference to a page table from a higher-level page table, flushes, and then frees the page table. Then it re-allocates it and writes something unrelated there. Another CPU that has CR3 pointing to the page hierarchy in question could have a reference to the freed table in its paging structure cache. Even if it's guaranteed to not try to access the addresses in question (because they're user addresses and the other CPU is in kernel mode, etc), but there is never a guarantee that the CPU doesn't randomly try to fill its TLB for the affected addresses. This results in invalid PTEs in the TLB, possible accesses using bogus memory types, and maybe even reads from IO space. It looks like we actually need to propagate flushes everywhere that could have references to the flushed range, even if the software won't access that range.
Re: [PATCH] scsi: shost->async_scan should be protected by mutex_lock
On 09/07/2017 11:54 PM, Ouyangzhaowei (Charles) wrote: > shost->async_scan should be protected by mutex_lock, otherwise the check > of "called twice" won't work. > > Signed-off-by: Ouyang Zhaowei> --- > drivers/scsi/scsi_scan.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c > index fd88dab..1d1df51 100644 > --- a/drivers/scsi/scsi_scan.c > +++ b/drivers/scsi/scsi_scan.c > @@ -1722,6 +1722,7 @@ static struct async_scan_data > *scsi_prep_async_scan(struct Scsi_Host *shost) > if (strncmp(scsi_scan_type, "sync", 4) == 0) > return NULL; > > + mutex_lock(>scan_mutex); The mutex will not be unlocked in the event that either the host has called scsi_prep_async_scan() twice, or a condition is meet the branches to the "err" label prior to where the original mutex_lock() was located below. -Tyrel > if (shost->async_scan) { > shost_printk(KERN_DEBUG, shost, "%s called twice\n", > __func__); > return NULL; > @@ -1735,7 +1736,6 @@ static struct async_scan_data > *scsi_prep_async_scan(struct Scsi_Host *shost) > goto err; > init_completion(>prev_finished); > > - mutex_lock(>scan_mutex); > spin_lock_irqsave(shost->host_lock, flags); > shost->async_scan = 1; > spin_unlock_irqrestore(shost->host_lock, flags); >
Re: [PATCH] scsi: shost->async_scan should be protected by mutex_lock
On 09/07/2017 11:54 PM, Ouyangzhaowei (Charles) wrote: > shost->async_scan should be protected by mutex_lock, otherwise the check > of "called twice" won't work. > > Signed-off-by: Ouyang Zhaowei > --- > drivers/scsi/scsi_scan.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c > index fd88dab..1d1df51 100644 > --- a/drivers/scsi/scsi_scan.c > +++ b/drivers/scsi/scsi_scan.c > @@ -1722,6 +1722,7 @@ static struct async_scan_data > *scsi_prep_async_scan(struct Scsi_Host *shost) > if (strncmp(scsi_scan_type, "sync", 4) == 0) > return NULL; > > + mutex_lock(>scan_mutex); The mutex will not be unlocked in the event that either the host has called scsi_prep_async_scan() twice, or a condition is meet the branches to the "err" label prior to where the original mutex_lock() was located below. -Tyrel > if (shost->async_scan) { > shost_printk(KERN_DEBUG, shost, "%s called twice\n", > __func__); > return NULL; > @@ -1735,7 +1736,6 @@ static struct async_scan_data > *scsi_prep_async_scan(struct Scsi_Host *shost) > goto err; > init_completion(>prev_finished); > > - mutex_lock(>scan_mutex); > spin_lock_irqsave(shost->host_lock, flags); > shost->async_scan = 1; > spin_unlock_irqrestore(shost->host_lock, flags); >
Re: [PATCH v1 1/4] perf annotate: create a new hists to manage multiple events samples
On 9/8/2017 9:43 PM, Arnaldo Carvalho de Melo wrote: Em Wed, Aug 16, 2017 at 06:18:33PM +0800, Jin Yao escreveu: An issue is found during using perf annotate. perf record -e cycles,branches ... perf annotate main --stdio The result only shows cycles. It should show both cycles and branches on the left side. It works with "--group", but need this to work even without groups. In current design, the hists is per event. So we need a new hists to manage the samples for multiple events and use a new hist_event data structure to save the map/symbol information for per event. Humm, why do we need another hists? Don't we have one per evsel, don't we have a evlist from where to get all of those evsels, can't we just use that to add one column per evsel? - Arnaldo Hi Arnaldo, I'm considering a case. Suppose we sample 2 events ("branches" and "cache-misses"). The samples of "branches" are hit in function A and the samples of "cache-misses" are hit in function B. The branches evsel has one hists and cache-misses evsel has another hists. The hists of branches evsel has one hist-entry which stands for the function A symbol. The hists of cache-misses evsel has one hist-entry which stands for the function B symbol. If we start to show the instructions in function B from cache-misses evsel, we will lose the function A. Because even if we get the branches evsel from the link in cache-misses evsel, but the function A is before function B and function B has been displayed yet, so the function A is lost. Considering the number of events can be greater than 2, the code will be much more complicated. So using a global hists should be an easy solution. Thanks Jin Yao
Re: [PATCH v1 1/4] perf annotate: create a new hists to manage multiple events samples
On 9/8/2017 9:43 PM, Arnaldo Carvalho de Melo wrote: Em Wed, Aug 16, 2017 at 06:18:33PM +0800, Jin Yao escreveu: An issue is found during using perf annotate. perf record -e cycles,branches ... perf annotate main --stdio The result only shows cycles. It should show both cycles and branches on the left side. It works with "--group", but need this to work even without groups. In current design, the hists is per event. So we need a new hists to manage the samples for multiple events and use a new hist_event data structure to save the map/symbol information for per event. Humm, why do we need another hists? Don't we have one per evsel, don't we have a evlist from where to get all of those evsels, can't we just use that to add one column per evsel? - Arnaldo Hi Arnaldo, I'm considering a case. Suppose we sample 2 events ("branches" and "cache-misses"). The samples of "branches" are hit in function A and the samples of "cache-misses" are hit in function B. The branches evsel has one hists and cache-misses evsel has another hists. The hists of branches evsel has one hist-entry which stands for the function A symbol. The hists of cache-misses evsel has one hist-entry which stands for the function B symbol. If we start to show the instructions in function B from cache-misses evsel, we will lose the function A. Because even if we get the branches evsel from the link in cache-misses evsel, but the function A is before function B and function B has been displayed yet, so the function A is lost. Considering the number of events can be greater than 2, the code will be much more complicated. So using a global hists should be an easy solution. Thanks Jin Yao
[git pull] m68knommu changes for v4.14
Hi Linus, Can you please pull the m68knommu git tree, for-next branch. Only 2 changes. One removes unused code, the other makes local clock code arguments consistent with generic clock code. Regards Greg The following changes since commit cc4a41fe5541a73019a864883297bd5043aa6d98: Linux 4.13-rc7 (2017-08-27 17:20:40 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu.git for-next for you to fetch changes up to 8a1943492bd629b7eae64535a96fcb959a01bbd0: m68knommu: remove dead code (2017-08-28 10:40:30 +1000) Alexandre Belloni (1): m68knommu: remove dead code Jonas Gorski (1): m68k: allow NULL clock for clk_get_rate arch/m68k/coldfire/clk.c| 3 +++ arch/m68k/coldfire/m5441x.c | 37 - 2 files changed, 3 insertions(+), 37 deletions(-)
[git pull] m68knommu changes for v4.14
Hi Linus, Can you please pull the m68knommu git tree, for-next branch. Only 2 changes. One removes unused code, the other makes local clock code arguments consistent with generic clock code. Regards Greg The following changes since commit cc4a41fe5541a73019a864883297bd5043aa6d98: Linux 4.13-rc7 (2017-08-27 17:20:40 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu.git for-next for you to fetch changes up to 8a1943492bd629b7eae64535a96fcb959a01bbd0: m68knommu: remove dead code (2017-08-28 10:40:30 +1000) Alexandre Belloni (1): m68knommu: remove dead code Jonas Gorski (1): m68k: allow NULL clock for clk_get_rate arch/m68k/coldfire/clk.c| 3 +++ arch/m68k/coldfire/m5441x.c | 37 - 2 files changed, 3 insertions(+), 37 deletions(-)
linux-next: manual merge of the aio tree with the vfs tree
Hi Benjamin, Today's linux-next merge of the aio tree got a conflict in: fs/aio.c between commit: 32ec9f249d65 ("io_getevents: Use timespec64 to represent timeouts") from the vfs tree and commit: eb5263749f68 ("aio: handle integer overflow in io_getevents() timespec usage") from the aio tree. I fixed it up (I just dropped the change in the latter commit) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell
linux-next: manual merge of the aio tree with the vfs tree
Hi Benjamin, Today's linux-next merge of the aio tree got a conflict in: fs/aio.c between commit: 32ec9f249d65 ("io_getevents: Use timespec64 to represent timeouts") from the vfs tree and commit: eb5263749f68 ("aio: handle integer overflow in io_getevents() timespec usage") from the aio tree. I fixed it up (I just dropped the change in the latter commit) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell
Re: btusb "firmware request while host is not available" at resume
On Sun, Sep 10, 2017 at 12:26:02PM -0700, Linus Torvalds wrote: > This seems to be a new problem at resume for the Intel btusb driver, > but I'm not seeing anything in that driver itself that looks like a > likely trigger, so I wonder if it's some driver core change, a generic > resume path issue, or a workqueue change that has made it trigger for > me. > > It might also just be a timing difference, maybe it's always been there? > > Does anybody have any ideas? It does't happen on every resume, and the > machine works despite this (but no bluetooth - the *next* resume might > bring it back, though). Ah, it's not just me having this problem. I don't see it happening in 4.12, and haven't had the time to bisect it. I seem to be able to trigger it every suspend/resume cycle, so I don't know if it's a timing issue. > > Linus > > -- > > ACPI: Low-level resume complete > ACPI: EC: EC started > PM: Restoring platform NVS memory > Enabling non-boot CPUs ... > x86: Booting SMP configuration: > smpboot: Booting Node 0 Processor 1 APIC 0x2 >cache: parent cpu1 should not be sleeping > CPU1 is up > smpboot: Booting Node 0 Processor 2 APIC 0x1 >cache: parent cpu2 should not be sleeping > CPU2 is up > smpboot: Booting Node 0 Processor 3 APIC 0x3 >cache: parent cpu3 should not be sleeping > CPU3 is up > ACPI: Waking up from system sleep state S3 > ACPI: EC: event unblocked > usb 1-3: reset full-speed USB device number 2 using xhci_hcd > usb 1-4: reset full-speed USB device number 3 using xhci_hcd > usb 1-5: reset high-speed USB device number 4 using xhci_hcd > usb 1-3:1.0: rebind failed: -517 > usb 1-3:1.1: rebind failed: -517 > Bluetooth: hci0: Bootloader revision 0.0 build 2 week 52 2014 > OOM killer enabled. > Restarting tasks ... > Bluetooth: hci0: Device revision is 5 > Bluetooth: hci0: Secure boot is enabled > Bluetooth: hci0: OTP lock is enabled > Bluetooth: hci0: API lock is enabled > Bluetooth: hci0: Debug lock is disabled > Bluetooth: hci0: Minimum firmware build 1 week 10 2014 > firmware request while host is not available > [ cut here ] > WARNING: CPU: 2 PID: 621 at drivers/base/firmware_class.c:1250 > _request_firmware+0x460/0x790 > CPU: 2 PID: 621 Comm: kworker/u9:2 Not tainted 4.13.0-10313-ge860d2c904d1 > #11 > Hardware name: Dell Inc. XPS 13 9350/09JHRY, BIOS 1.4.17 05/10/2017 > Workqueue: hci0 hci_power_on [bluetooth] > task: 8d3767895ac0 task.stack: 9d3481efc000 > RIP: 0010:_request_firmware+0x460/0x790 > Call Trace: >request_firmware+0x37/0x50 >btusb_setup_intel_new+0x227/0x7e0 [btusb] >hci_dev_do_open+0x3da/0x570 [bluetooth] >hci_power_on+0x52/0x1f0 [bluetooth] >process_one_work+0x1db/0x3d0 >worker_thread+0x47/0x3e0 >kthread+0x125/0x140 >ret_from_fork+0x22/0x30 > ---[ end trace 007b222491432927 ]--- > Bluetooth: hci0: Failed to load Intel firmware file (-112) > [drm] RC6 on > done. > thermal thermal_zone11: failed to read out thermal zone (-5) > PM: suspend exit Ah, I'll blame Luis for this, I think it might be due to 81f95076281f ("firmware: add sanity check on shutdown/suspend") Luis, any ideas? I'll try to revert this and try it out tomorrow when I get a chance. thanks, greg k-h
Re: btusb "firmware request while host is not available" at resume
On Sun, Sep 10, 2017 at 12:26:02PM -0700, Linus Torvalds wrote: > This seems to be a new problem at resume for the Intel btusb driver, > but I'm not seeing anything in that driver itself that looks like a > likely trigger, so I wonder if it's some driver core change, a generic > resume path issue, or a workqueue change that has made it trigger for > me. > > It might also just be a timing difference, maybe it's always been there? > > Does anybody have any ideas? It does't happen on every resume, and the > machine works despite this (but no bluetooth - the *next* resume might > bring it back, though). Ah, it's not just me having this problem. I don't see it happening in 4.12, and haven't had the time to bisect it. I seem to be able to trigger it every suspend/resume cycle, so I don't know if it's a timing issue. > > Linus > > -- > > ACPI: Low-level resume complete > ACPI: EC: EC started > PM: Restoring platform NVS memory > Enabling non-boot CPUs ... > x86: Booting SMP configuration: > smpboot: Booting Node 0 Processor 1 APIC 0x2 >cache: parent cpu1 should not be sleeping > CPU1 is up > smpboot: Booting Node 0 Processor 2 APIC 0x1 >cache: parent cpu2 should not be sleeping > CPU2 is up > smpboot: Booting Node 0 Processor 3 APIC 0x3 >cache: parent cpu3 should not be sleeping > CPU3 is up > ACPI: Waking up from system sleep state S3 > ACPI: EC: event unblocked > usb 1-3: reset full-speed USB device number 2 using xhci_hcd > usb 1-4: reset full-speed USB device number 3 using xhci_hcd > usb 1-5: reset high-speed USB device number 4 using xhci_hcd > usb 1-3:1.0: rebind failed: -517 > usb 1-3:1.1: rebind failed: -517 > Bluetooth: hci0: Bootloader revision 0.0 build 2 week 52 2014 > OOM killer enabled. > Restarting tasks ... > Bluetooth: hci0: Device revision is 5 > Bluetooth: hci0: Secure boot is enabled > Bluetooth: hci0: OTP lock is enabled > Bluetooth: hci0: API lock is enabled > Bluetooth: hci0: Debug lock is disabled > Bluetooth: hci0: Minimum firmware build 1 week 10 2014 > firmware request while host is not available > [ cut here ] > WARNING: CPU: 2 PID: 621 at drivers/base/firmware_class.c:1250 > _request_firmware+0x460/0x790 > CPU: 2 PID: 621 Comm: kworker/u9:2 Not tainted 4.13.0-10313-ge860d2c904d1 > #11 > Hardware name: Dell Inc. XPS 13 9350/09JHRY, BIOS 1.4.17 05/10/2017 > Workqueue: hci0 hci_power_on [bluetooth] > task: 8d3767895ac0 task.stack: 9d3481efc000 > RIP: 0010:_request_firmware+0x460/0x790 > Call Trace: >request_firmware+0x37/0x50 >btusb_setup_intel_new+0x227/0x7e0 [btusb] >hci_dev_do_open+0x3da/0x570 [bluetooth] >hci_power_on+0x52/0x1f0 [bluetooth] >process_one_work+0x1db/0x3d0 >worker_thread+0x47/0x3e0 >kthread+0x125/0x140 >ret_from_fork+0x22/0x30 > ---[ end trace 007b222491432927 ]--- > Bluetooth: hci0: Failed to load Intel firmware file (-112) > [drm] RC6 on > done. > thermal thermal_zone11: failed to read out thermal zone (-5) > PM: suspend exit Ah, I'll blame Luis for this, I think it might be due to 81f95076281f ("firmware: add sanity check on shutdown/suspend") Luis, any ideas? I'll try to revert this and try it out tomorrow when I get a chance. thanks, greg k-h