date:20180827

Re: [PATCH v2 1/6] dt-bindings: reset: Add PDC Global binding for SDM845 SoCs

2018-08-27 Thread Matthias Kaehlcke

Hi Sibi,

On Fri, Aug 24, 2018 at 06:48:55PM +0530, Sibi Sankar wrote:
> Add PDC Global(Power Domain Controller) binding for SDM845 SoCs.

nit: missing blank before the opening parenthesis.

> 
> Signed-off-by: Sibi Sankar 
> ---
>  .../bindings/reset/qcom,pdc-global.txt| 52 +++
>  include/dt-bindings/reset/qcom,sdm845-pdc.h   | 20 +++
>  2 files changed, 72 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/reset/qcom,pdc-global.txt
>  create mode 100644 include/dt-bindings/reset/qcom,sdm845-pdc.h
> 
> diff --git a/Documentation/devicetree/bindings/reset/qcom,pdc-global.txt 
> b/Documentation/devicetree/bindings/reset/qcom,pdc-global.txt
> new file mode 100644
> index ..69f9edca9503
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/reset/qcom,pdc-global.txt
> @@ -0,0 +1,52 @@
> +PDC Global
> +==
> +
> +This binding describes a reset-controller found on PDC-Global(Power Domain
> +Controller) block for Qualcomm Technologies Inc SDM845 SoCs.

Are there other PDC reset controllers that aren't 'global'? If not,
I'd suggest using 'pdc-reset' instead of 'pdc-global', which is more
specific and in line with the name of the driver added by this series.
If there are other controllers, maybe something like
'pdc-reset-global' or 'pdc-reset-main'?

> +Required properties:
> +- compatible:
> + Usage: required
> + Value type: 
> + Definition: must be:
> + "qcom,sdm845-pdc-global"
> +
> +- reg:
> + Usage: required
> + Value type: 
> + Definition: must specify the base address and size of the register
> + space.
> +
> +- #reset-cells:
> + Usage: required
> + Value type: 
> + Definition: must be 1; cell entry represents the reset index.
> +
> +Example:
> +
> +pdc_reset: reset-controller@b2e {
> + compatible = "qcom,sdm845-pdc-global";
> + reg = <0xb2e 0x2>;
> + #reset-cells = <1>;
> +};
> +
> +PDC reset clients
> +==
> +
> +Device nodes that need access to reset lines should
> +specify them as a reset phandle in their corresponding node as
> +specified in reset.txt.
> +
> +For list of all valid reset indicies see

s/indicies/indices/ (or s/indicies/lines/ ?)

Cheers

Matthias

Re: [PATCH 4/7] mm/hmm: properly handle migration pmd

2018-08-27 Thread Jerome Glisse

On Fri, Aug 24, 2018 at 08:05:46PM -0400, Zi Yan wrote:
> Hi Jérôme,
> 
> On 24 Aug 2018, at 15:25, jgli...@redhat.com wrote:
> 
> > From: Jérôme Glisse 
> >
> > Before this patch migration pmd entry (!pmd_present()) would have
> > been treated as a bad entry (pmd_bad() returns true on migration
> > pmd entry). The outcome was that device driver would believe that
> > the range covered by the pmd was bad and would either SIGBUS or
> > simply kill all the device's threads (each device driver decide
> > how to react when the device tries to access poisonnous or invalid
> > range of memory).
> >
> > This patch explicitly handle the case of migration pmd entry which
> > are non present pmd entry and either wait for the migration to
> > finish or report empty range (when device is just trying to pre-
> > fill a range of virtual address and thus do not want to wait or
> > trigger page fault).
> >
> > Signed-off-by: Aneesh Kumar K.V 
> > Signed-off-by: Jérôme Glisse 
> > Cc: Ralph Campbell 
> > Cc: John Hubbard 
> > Cc: Andrew Morton 
> > ---
> >  mm/hmm.c | 45 +++--
> >  1 file changed, 39 insertions(+), 6 deletions(-)
> >
> > diff --git a/mm/hmm.c b/mm/hmm.c
> > index a16678d08127..659efc9aada6 100644
> > --- a/mm/hmm.c
> > +++ b/mm/hmm.c
> > @@ -577,22 +577,47 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
> >  {
> > struct hmm_vma_walk *hmm_vma_walk = walk->private;
> > struct hmm_range *range = hmm_vma_walk->range;
> > +   struct vm_area_struct *vma = walk->vma;
> > uint64_t *pfns = range->pfns;
> > unsigned long addr = start, i;
> > pte_t *ptep;
> > +   pmd_t pmd;
> >
> > -   i = (addr - range->start) >> PAGE_SHIFT;
> >
> >  again:
> > -   if (pmd_none(*pmdp))
> > +   pmd = READ_ONCE(*pmdp);
> > +   if (pmd_none(pmd))
> > return hmm_vma_walk_hole(start, end, walk);
> >
> > -   if (pmd_huge(*pmdp) && (range->vma->vm_flags & VM_HUGETLB))
> > +   if (pmd_huge(pmd) && (range->vma->vm_flags & VM_HUGETLB))
> > return hmm_pfns_bad(start, end, walk);
> >
> > -   if (pmd_devmap(*pmdp) || pmd_trans_huge(*pmdp)) {
> > -   pmd_t pmd;
> > +   if (!pmd_present(pmd)) {
> > +   swp_entry_t entry = pmd_to_swp_entry(pmd);
> > +
> > +   if (is_migration_entry(entry)) {
> 
> I think you should check thp_migration_supported() here, since PMD migration 
> is only enabled in x86_64 systems.
> Other architectures should treat PMD migration entries as bad.

You are right. Andrew, should this be reposted, or can you edit the
'if' above to:

if (thp_migration_supported() && is_migration_entry(entry)) {

Cheers,
Jérôme
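
For readers following the thread, the effect of the extra check can be modelled in plain userspace C. This is only a sketch under stated assumptions: the two booleans stand in for thp_migration_supported() and is_migration_entry(entry), and the enum and function names are hypothetical, not taken from mm/hmm.c. It also assumes, per Zi Yan's comment, that anything failing the guarded check ends up on the bad-entry path.

```c
#include <stdbool.h>

/* Possible outcomes for a non-present PMD in this simplified model. */
enum pmd_outcome {
    PMD_HANDLE_MIGRATION, /* wait for migration or report an empty range */
    PMD_TREAT_AS_BAD      /* fall through to the bad-entry path */
};

/*
 * Hypothetical userspace model of the guarded check. The arguments stand
 * in for thp_migration_supported() and is_migration_entry(entry).
 */
static enum pmd_outcome classify_nonpresent_pmd(bool thp_migration_ok,
                                                bool migration_entry)
{
    if (thp_migration_ok && migration_entry)
        return PMD_HANDLE_MIGRATION;
    return PMD_TREAT_AS_BAD;
}
```

With the guard in place, an architecture without THP migration support never takes the migration branch, even if the entry decodes as a migration entry.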

Re: [PATCH v2 1/6] dt-bindings: reset: Add PDC Global binding for SDM845 SoCs

2018-08-27 Thread Bjorn Andersson

On Fri 24 Aug 06:18 PDT 2018, Sibi Sankar wrote:

> Add PDC Global(Power Domain Controller) binding for SDM845 SoCs.
> 
> Signed-off-by: Sibi Sankar 

Reviewed-by: Bjorn Andersson 

Regards,
Bjorn

> ---
>  .../bindings/reset/qcom,pdc-global.txt| 52 +++
>  include/dt-bindings/reset/qcom,sdm845-pdc.h   | 20 +++
>  2 files changed, 72 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/reset/qcom,pdc-global.txt
>  create mode 100644 include/dt-bindings/reset/qcom,sdm845-pdc.h
> 
> diff --git a/Documentation/devicetree/bindings/reset/qcom,pdc-global.txt 
> b/Documentation/devicetree/bindings/reset/qcom,pdc-global.txt
> new file mode 100644
> index ..69f9edca9503
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/reset/qcom,pdc-global.txt
> @@ -0,0 +1,52 @@
> +PDC Global
> +==
> +
> +This binding describes a reset-controller found on PDC-Global(Power Domain
> +Controller) block for Qualcomm Technologies Inc SDM845 SoCs.
> +
> +Required properties:
> +- compatible:
> + Usage: required
> + Value type: 
> + Definition: must be:
> + "qcom,sdm845-pdc-global"
> +
> +- reg:
> + Usage: required
> + Value type: 
> + Definition: must specify the base address and size of the register
> + space.
> +
> +- #reset-cells:
> + Usage: required
> + Value type: 
> + Definition: must be 1; cell entry represents the reset index.
> +
> +Example:
> +
> +pdc_reset: reset-controller@b2e {
> + compatible = "qcom,sdm845-pdc-global";
> + reg = <0xb2e 0x2>;
> + #reset-cells = <1>;
> +};
> +
> +PDC reset clients
> +==
> +
> +Device nodes that need access to reset lines should
> +specify them as a reset phandle in their corresponding node as
> +specified in reset.txt.
> +
> +For list of all valid reset indicies see
> +
> +
> +Example:
> +
> +modem-pil@408 {
> + ...
> +
> + resets = <&pdc_reset PDC_MODEM_SYNC_RESET>;
> + reset-names = "pdc_reset";
> +
> + ...
> +};
> diff --git a/include/dt-bindings/reset/qcom,sdm845-pdc.h 
> b/include/dt-bindings/reset/qcom,sdm845-pdc.h
> new file mode 100644
> index ..53c37f9c319a
> --- /dev/null
> +++ b/include/dt-bindings/reset/qcom,sdm845-pdc.h
> @@ -0,0 +1,20 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2018 The Linux Foundation. All rights reserved.
> + */
> +
> +#ifndef _DT_BINDINGS_RESET_PDC_SDM_845_H
> +#define _DT_BINDINGS_RESET_PDC_SDM_845_H
> +
> +#define PDC_APPS_SYNC_RESET  0
> +#define PDC_SP_SYNC_RESET1
> +#define PDC_AUDIO_SYNC_RESET 2
> +#define PDC_SENSORS_SYNC_RESET   3
> +#define PDC_AOP_SYNC_RESET   4
> +#define PDC_DEBUG_SYNC_RESET 5
> +#define PDC_GPU_SYNC_RESET   6
> +#define PDC_DISPLAY_SYNC_RESET   7
> +#define PDC_COMPUTE_SYNC_RESET   8
> +#define PDC_MODEM_SYNC_RESET 9
> +
> +#endif
> -- 
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>

Please Dear, I Need An Investment Partner

2018-08-27 Thread Aisha Gaddafi




--
Dear Assalamu Alaikum,
I came across your contact during my private search
Mrs Aisha Al-Qaddafi is my name, the only daughter of late Libyan
president, I have funds the sum
of $27.5 million USD for investment, I am interested in you for
investment project assistance in your country,
i shall compensate you 30% of the total sum after the funds are
transfer into your account,
Greetings from Mrs Aisha Al-Qaddafi
Mrs Aisha Al-Qaddafi

--

Re: [PATCH v2 4/5] drivers: pinctrl: qcom: sdm845: support GPIO wakeup from suspend

2018-08-27 Thread Bjorn Andersson

On Fri 24 Aug 13:01 PDT 2018, Lina Iyer wrote:

> Enable TLMM IRQs to be sensed by PDC when we enter suspend. It is
> possible that the TLMM may be powered off and not detect GPIOs that are
> configured as wake up interrupts. By hooking into suspend callbacks, we
> allow PDC IRQs to take over and wake up the system if wakeup interrupts
> are triggered.
> 
> Signed-off-by: Lina Iyer 
> ---
>  drivers/pinctrl/qcom/pinctrl-sdm845.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/pinctrl/qcom/pinctrl-sdm845.c 
> b/drivers/pinctrl/qcom/pinctrl-sdm845.c
> index 2ab7a8885757..cc333b8afb99 100644
> --- a/drivers/pinctrl/qcom/pinctrl-sdm845.c
> +++ b/drivers/pinctrl/qcom/pinctrl-sdm845.c
> @@ -1297,10 +1297,16 @@ static const struct of_device_id 
> sdm845_pinctrl_of_match[] = {
>   { },
>  };
>  
> +static const struct dev_pm_ops msm_pinctrl_dev_pm_ops = {
> + SET_LATE_SYSTEM_SLEEP_PM_OPS(msm_pinctrl_suspend_late,
> +  msm_pinctrl_resume_late)
> +};
> +

I expect these four lines to be duplicated in every platform file, so I
think it would be better to move the definition to pinctrl-msm.c and
declare it extern in pinctrl-msm.h.

>  static struct platform_driver sdm845_pinctrl_driver = {
>   .driver = {
>   .name = "sdm845-pinctrl",
>   .of_match_table = sdm845_pinctrl_of_match,
> + .pm = &msm_pinctrl_dev_pm_ops,
>   },
>   .probe = sdm845_pinctrl_probe,
>   .remove = msm_pinctrl_remove,

Regards,
Bjorn
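
The "define once, declare extern" layout suggested above could look roughly like the following. This is a single-file userspace sketch, not kernel code: struct dev_pm_ops and struct platform_driver are pared-down hypothetical stand-ins, the callbacks take no arguments (the real ones take a struct device *), and the comments mark which file each part would presumably live in.

```c
/* Hypothetical, pared-down stand-ins for the kernel types. */
struct dev_pm_ops {
    int (*suspend_late)(void);
    int (*resume_early)(void);
};

struct platform_driver {
    const char *name;
    const struct dev_pm_ops *pm;
};

/* --- would live in pinctrl-msm.h --- */
extern const struct dev_pm_ops msm_pinctrl_dev_pm_ops;

/* --- would live in pinctrl-msm.c --- */
static int msm_pinctrl_suspend_late(void) { return 0; }
static int msm_pinctrl_resume_late(void) { return 0; }

const struct dev_pm_ops msm_pinctrl_dev_pm_ops = {
    .suspend_late = msm_pinctrl_suspend_late,
    .resume_early = msm_pinctrl_resume_late,
};

/* --- each platform file then only references the shared object --- */
static struct platform_driver sdm845_pinctrl_driver = {
    .name = "sdm845-pinctrl",
    .pm = &msm_pinctrl_dev_pm_ops,
};
```

Every platform driver would then carry a single .pm line instead of its own copy of the ops table.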

[PATCH 2/2] math-emu/soft-fp.h: (_FP_ROUND_ZERO) cast 0 to void to fix warning

2018-08-27 Thread Vincent Chen

_FP_ROUND_ZERO is defined as 0 and used as a statement in the macro
_FP_ROUND. This makes gcc emit "error: statement with no effect
[-Werror=unused-value]" when compiling. Define _FP_ROUND_ZERO as
(void)0 to fix it.

This modification references the content of glibc 'commit 
(8ed1e7d5894000c155acbd06f)'

Signed-off-by: Vincent Chen 
---
 include/math-emu/soft-fp.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/math-emu/soft-fp.h b/include/math-emu/soft-fp.h
index 3f284bc..5650c16 100644
--- a/include/math-emu/soft-fp.h
+++ b/include/math-emu/soft-fp.h
@@ -138,7 +138,7 @@
   _FP_FRAC_ADDI_##wc(X, _FP_WORK_ROUND);   \
 } while (0)
 
-#define _FP_ROUND_ZERO(wc, X)  0
+#define _FP_ROUND_ZERO(wc, X)  (void)0
 
 #define _FP_ROUND_PINF(wc, X)  \
 do {   \
-- 
1.7.1
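
To illustrate the warning outside of soft-fp, here is a reduced sketch of the same pattern. The macro and enum names are hypothetical and ROUND_NEAREST is a placeholder that just adds one; only the shape (a no-op macro used as a statement inside a switch) mirrors what the commit message describes.

```c
enum round_mode { ROUND_MODE_NEAREST, ROUND_MODE_ZERO };

/*
 * Before the change the no-op was spelled as a bare constant:
 *   #define ROUND_ZERO(x)  0
 * which expands to the statement "0;" and trips -Wunused-value.
 * Casting to void marks the discarded value as intentional.
 */
#define ROUND_ZERO(x)     (void)0
#define ROUND_NEAREST(x)  do { (x) += 1; } while (0) /* placeholder only */

#define ROUND(mode, x)                  \
    do {                                \
        switch (mode) {                 \
        case ROUND_MODE_NEAREST:        \
            ROUND_NEAREST(x);           \
            break;                      \
        case ROUND_MODE_ZERO:           \
            ROUND_ZERO(x);              \
            break;                      \
        }                               \
    } while (0)

/* Wrapper so the macro can be exercised like a function. */
static int apply_round(enum round_mode mode, int x)
{
    ROUND(mode, x);
    return x;
}
```

Swapping the ROUND_ZERO definition back to a bare 0 and building with -Wall -Werror should reproduce the "statement with no effect" error quoted above.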

[PATCH 1/2] math-emu/op-2.h: Use statement expressions to prevent negative constant shift

2018-08-27 Thread Vincent Chen

This modification references the content of glibc 'commit <
sysdeps/unix/sysv/linux/sparc/sparc64/dl-procinfo.c: Moved to>
(fe0b1e854ad32a69b260)'

Signed-off-by: Vincent Chen 
---
 include/math-emu/op-2.h |   97 ++
 1 files changed, 46 insertions(+), 51 deletions(-)

diff --git a/include/math-emu/op-2.h b/include/math-emu/op-2.h
index 4f26ecc..13a374f 100644
--- a/include/math-emu/op-2.h
+++ b/include/math-emu/op-2.h
@@ -31,61 +31,56 @@
 #define _FP_FRAC_HIGH_2(X) (X##_f1)
 #define _FP_FRAC_LOW_2(X)  (X##_f0)
 #define _FP_FRAC_WORD_2(X,w)   (X##_f##w)
+#define _FP_FRAC_SLL_2(X, N) (\
+   (void) (((N) < _FP_W_TYPE_SIZE)\
+ ? ({ \
+   if (__builtin_constant_p(N) && (N) == 1) { \
+   X##_f1 = X##_f1 + X##_f1 + \
+   (((_FP_WS_TYPE) (X##_f0)) < 0);\
+   X##_f0 += X##_f0;  \
+   } else {   \
+   X##_f1 = X##_f1 << (N) | X##_f0 >> \
+   (_FP_W_TYPE_SIZE - (N));   \
+   X##_f0 <<= (N);\
+   }  \
+   0; \
+   }) \
+ : ({ \
+ X##_f1 = X##_f0 << ((N) - _FP_W_TYPE_SIZE);  \
+ X##_f0 = 0;  \
+ })))
+
+
+#define _FP_FRAC_SRL_2(X, N) (\
+   (void) (((N) < _FP_W_TYPE_SIZE)\
+ ? ({ \
+ X##_f0 = X##_f0 >> (N) | X##_f1 << (_FP_W_TYPE_SIZE - (N));  \
+ X##_f1 >>= (N);  \
+   }) \
+ : ({ \
+ X##_f0 = X##_f1 >> ((N) - _FP_W_TYPE_SIZE);  \
+ X##_f1 = 0;  \
+   })))
 
-#define _FP_FRAC_SLL_2(X,N)\
-  do { \
-if ((N) < _FP_W_TYPE_SIZE) \
-  { \
-   if (__builtin_constant_p(N) && (N) == 1)\
- { \
-   X##_f1 = X##_f1 + X##_f1 + (((_FP_WS_TYPE)(X##_f0)) < 0);   \
-   X##_f0 += X##_f0;   \
- } \
-   else\
- { \
-   X##_f1 = X##_f1 << (N) | X##_f0 >> (_FP_W_TYPE_SIZE - (N)); \
-   X##_f0 <<= (N); \
- } \
-  } \
-else   \
-  { \
-   X##_f1 = X##_f0 << ((N) - _FP_W_TYPE_SIZE); \
-   X##_f0 = 0; \
-  } \
-  } while (0)
-
-#define _FP_FRAC_SRL_2(X,N)\
-  do { \
-if ((N) < _FP_W_TYPE_SIZE) \
-  { \
-   X##_f0 = X##_f0 >> (N) | X##_f1 << (_FP_W_TYPE_SIZE - (N)); \
-   X##_f1 >>= (N); \
-  } \
-else   \
-  { \
-   X##_f0 =
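
The archived message is cut off here. As a rough illustration of the technique named in the subject line, the sketch below rewrites a two-word left shift as a conditional expression whose arms are GNU C statement expressions. Everything in it is an assumption for illustration: WORD_BITS, FRAC_SLL_2 and shl_2word are hypothetical names, the words are fixed at 32 bits, and the shift count is assumed to be in the range 1..63. It needs gcc or clang because statement expressions are a GNU extension.

```c
#include <stdint.h>

#define WORD_BITS 32

/*
 * Two-word shift left, written as one expression. Each arm is a GNU C
 * statement expression ending in "0;" so both arms have type int, and
 * the whole result is cast to void because only the side effects matter.
 */
#define FRAC_SLL_2(hi, lo, n)                                   \
    ((void)(((n) < WORD_BITS)                                   \
        ? ({                                                    \
            (hi) = (hi) << (n) | (lo) >> (WORD_BITS - (n));     \
            (lo) <<= (n);                                       \
            0;                                                  \
          })                                                    \
        : ({                                                    \
            (hi) = (lo) << ((n) - WORD_BITS);                   \
            (lo) = 0;                                           \
            0;                                                  \
          })))

/* Function wrapper so the macro can be called with a runtime count. */
static void shl_2word(uint32_t *hi, uint32_t *lo, unsigned int n)
{
    FRAC_SLL_2(*hi, *lo, n);
}
```

For example, shifting hi=0x1, lo=0x80000000 left by one carries the top bit of the low word into the high word, and a count of 36 moves the low word into the high word shifted by four.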

Re: [PATCH v2 1/5] drivers: pinctrl: qcom: add wakeup capability to GPIO

2018-08-27 Thread Bjorn Andersson

On Mon 27 Aug 09:56 PDT 2018, Lina Iyer wrote:

> On Sun, Aug 26 2018 at 08:33 -0600, Linus Walleij wrote:
> > On Fri, Aug 17, 2018 at 6:39 PM Lina Iyer  wrote:
> > 
> > > QCOM SoC's that have Power Domain Controller (PDC) chip in the always-on
> > > domain can wakeup the SoC, when interrupts and GPIOs are routed to the
> > > its interrupt controller. Only select GPIOs that are deemed wakeup
> > > capable are routed to specific PDC pins. During low power state, the
> > > pinmux interrupt controller may be non-functional but the PDC would be.
> > > The PDC can detect the wakeup GPIO is triggered and bring the TLMM to an
> > > operational state.
> > > 
> > > Interrupts that are level triggered will be detected at the TLMM when
> > > the controller becomes operational. Edge interrupts however need to be
> > > replayed again.
> > > 
> > > Request the corresponding PDC IRQ, when the GPIO is requested as an IRQ,
> > > but keep it disabled. During suspend, we can enable the PDC IRQ instead
> > > of the GPIO IRQ, which may or not be detected.
> > > 
> > > Signed-off-by: Lina Iyer 
> > > ---
> > > Changes in v1:
> > > - Trigger GPIO in h/w from PDC IRQ handler
> > > - Avoid big tables for GPIO-PDC map, pick from DT instead
> > > - Use handler_data
> > 
> > Just for the record this is an impressive and much needed patch
> > set, no other SoC developer has yet taken on the task of making this
> > work so I very much appreciate that Qualcomm show the way.
> > 
> > > +static int msm_gpio_pdc_pin_request(struct irq_data *d)
> > > +static int msm_gpio_pdc_pin_release(struct irq_data *d)
> > > +static int msm_gpio_irq_reqres(struct irq_data *d)
> > > +{
> > (...)
> > > +   if (gpiochip_lock_as_irq(gc, irqd_to_hwirq(d))) {
> > (...)
> > > +static void msm_gpio_irq_relres(struct irq_data *d)
> > > +{
> > > +   gpiochip_unlock_as_irq(gc, irqd_to_hwirq(d));
> > > +}
> > 
> > FYI Hans Verkuil is working on a patch set that moves the
> > lock/unlock as IRQ call to the irqchip request() and release()
> > functions so we can switch a GPIO irqchip line from IRQ
> > mode to say output at runtime without too much trouble.
> > (CEC needs this.)
> > 
> Thanks, I will look into Hans's RFCv2. But what would help me would be
> to avoid creating the IRQ for the GPIO itself (I have the latent IRQ),
> if I could just return that instead in gpio_to_irq(), it might be
> easier. I understand ->to_irq() is supposed to be a translate function
> only, I can avoid the dance of enabling and diabling the PDC IRQ on
> suspend and resume.
> 

I did implement gpio_to_irq() like this in the PMIC gpio/mpp drivers and
we've since concluded that we need to move this to some hierarchical
interrupt controller, because people like Linus expect to be able to say

  interrupts = <_controller 1 IRQ_TYPE_EDGE_RISING> 

which is something used all over the place with the TLMM driver today.

Regards,
Bjorn

Re: [PATCH v2 1/5] drivers: pinctrl: qcom: add wakeup capability to GPIO

2018-08-27 Thread Bjorn Andersson

On Mon 27 Aug 09:56 PDT 2018, Lina Iyer wrote:

> On Sun, Aug 26 2018 at 08:33 -0600, Linus Walleij wrote:
> > On Fri, Aug 17, 2018 at 6:39 PM Lina Iyer  wrote:
> > 
> > > QCOM SoC's that have Power Domain Controller (PDC) chip in the always-on
> > > domain can wakeup the SoC, when interrupts and GPIOs are routed to the
> > > its interrupt controller. Only select GPIOs that are deemed wakeup
> > > capable are routed to specific PDC pins. During low power state, the
> > > pinmux interrupt controller may be non-functional but the PDC would be.
> > > The PDC can detect the wakeup GPIO is triggered and bring the TLMM to an
> > > operational state.
> > > 
> > > Interrupts that are level triggered will be detected at the TLMM when
> > > the controller becomes operational. Edge interrupts however need to be
> > > replayed again.
> > > 
> > > Request the corresponding PDC IRQ, when the GPIO is requested as an IRQ,
> > > but keep it disabled. During suspend, we can enable the PDC IRQ instead
> > > of the GPIO IRQ, which may or not be detected.
> > > 
> > > Signed-off-by: Lina Iyer 
> > > ---
> > > Changes in v1:
> > > - Trigger GPIO in h/w from PDC IRQ handler
> > > - Avoid big tables for GPIO-PDC map, pick from DT instead
> > > - Use handler_data
> > 
> > Just for the record this is an impressive and much needed patch
> > set, no other SoC developer has yet taken on the task of making this
> > work so I very much appreciate that Qualcomm show the way.
> > 
> > > +static int msm_gpio_pdc_pin_request(struct irq_data *d)
> > > +static int msm_gpio_pdc_pin_release(struct irq_data *d)
> > > +static int msm_gpio_irq_reqres(struct irq_data *d)
> > > +{
> > (...)
> > > +   if (gpiochip_lock_as_irq(gc, irqd_to_hwirq(d))) {
> > (...)
> > > +static void msm_gpio_irq_relres(struct irq_data *d)
> > > +{
> > > +   gpiochip_unlock_as_irq(gc, irqd_to_hwirq(d));
> > > +}
> > 
> > FYI Hans Verkuil is working on a patch set that moves the
> > lock/unlock as IRQ call to the irqchip request() and release()
> > functions so we can switch a GPIO irqchip line from IRQ
> > mode to say output at runtime without too much trouble.
> > (CEC needs this.)
> > 
> Thanks, I will look into Hans's RFCv2. But what would help me would be
> to avoid creating the IRQ for the GPIO itself (I have the latent IRQ),
> if I could just return that instead in gpio_to_irq(), it might be
> easier. I understand ->to_irq() is supposed to be a translate function
> only, I can avoid the dance of enabling and disabling the PDC IRQ on
> suspend and resume.
> 

I did implement gpio_to_irq() like this in the PMIC gpio/mpp drivers and
we've since concluded that we need to move this to some hierarchical
interrupt controller, because people like Linus expect to be able to say

  interrupts = <_controller 1 IRQ_TYPE_EDGE_RISING> 

which is something used all over the place with the TLMM driver today.

Regards,
Bjorn

Re: [PATCH v2 2/6] reset: qcom: PDC Global (Power Domain Controller) reset controller

2018-08-27 Thread Matthias Kaehlcke

Hi Sibi,

On Fri, Aug 24, 2018 at 06:48:56PM +0530, Sibi Sankar wrote:
> Add reset controller for SDM845 SoCs to control reset signals provided
> by PDC Global for Modem, Compute, Display, GPU, Debug, AOP, Sensors,
> Audio, SP and APPS
> 
> Signed-off-by: Sibi Sankar 
> ---
>  drivers/reset/Kconfig  |   9 +++
>  drivers/reset/Makefile |   1 +
>  drivers/reset/reset-qcom-pdc.c | 142 +
>  3 files changed, 152 insertions(+)
>  create mode 100644 drivers/reset/reset-qcom-pdc.c
> 
> diff --git a/drivers/reset/Kconfig b/drivers/reset/Kconfig
> index 13d28fdbdbb5..c21da9fe51ec 100644
> --- a/drivers/reset/Kconfig
> +++ b/drivers/reset/Kconfig
> @@ -98,6 +98,15 @@ config RESET_QCOM_AOSS
> reset signals provided by AOSS for Modem, Venus, ADSP,
> GPU, Camera, Wireless, Display subsystem. Otherwise, say N.
>  
> +config RESET_QCOM_PDC
> + tristate "Qualcomm PDC Reset Driver"
> + depends on ARCH_QCOM || COMPILE_TEST
> + help
> +   This enables the PDC (Power Domain Controller) reset driver
> +   for Qualcomm Technologies Inc SDM845 SoCs. Say Y if you want
> +   to control reset signals provided by PDC for Modem, Compute,
> +   Display, GPU, Debug, AOP, Sensors, Audio, SP and APPS.

What exactly does APPS mean? The AP cores, the entire SoC, something
else?

> +
>  config RESET_SIMPLE
>   bool "Simple Reset Controller Driver" if COMPILE_TEST
>   default ARCH_SOCFPGA || ARCH_STM32 || ARCH_STRATIX10 || ARCH_SUNXI || 
> ARCH_ZX || ARCH_ASPEED
> diff --git a/drivers/reset/Makefile b/drivers/reset/Makefile
> index 4243c38228e2..d08e8b90046a 100644
> --- a/drivers/reset/Makefile
> +++ b/drivers/reset/Makefile
> @@ -16,6 +16,7 @@ obj-$(CONFIG_RESET_MESON_AUDIO_ARB) += 
> reset-meson-audio-arb.o
>  obj-$(CONFIG_RESET_OXNAS) += reset-oxnas.o
>  obj-$(CONFIG_RESET_PISTACHIO) += reset-pistachio.o
>  obj-$(CONFIG_RESET_QCOM_AOSS) += reset-qcom-aoss.o
> +obj-$(CONFIG_RESET_QCOM_PDC) += reset-qcom-pdc.o
>  obj-$(CONFIG_RESET_SIMPLE) += reset-simple.o
>  obj-$(CONFIG_RESET_STM32MP157) += reset-stm32mp1.o
>  obj-$(CONFIG_RESET_SUNXI) += reset-sunxi.o
> diff --git a/drivers/reset/reset-qcom-pdc.c b/drivers/reset/reset-qcom-pdc.c
> new file mode 100644
> index ..bb6a5e5ee0f8
> --- /dev/null
> +++ b/drivers/reset/reset-qcom-pdc.c
> @@ -0,0 +1,142 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2018 The Linux Foundation. All rights reserved.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 

Headers should be sorted in alphabetical order.

> +
> +#define RPMH_PDC_SYNC_RESET  0x100
> +
> +struct qcom_pdc_reset_map {
> + u8 bit;
> +};
> +
> +struct qcom_pdc_desc {
> + const struct regmap_config *config;
> + const struct qcom_pdc_reset_map *resets;
> + size_t num_resets;
> +};

Not sure if this structure adds much value or just a layer of
indirection:

- .config is only accessed in _probe(), sdm845_pdc_regmap_config could
  be used directly
- .resets is used in _(de)assert(), sdm845_pdc_resets could be used
  directly
- .num_resets is only accessed in _probe(),
  ARRAY_SIZE(sdm845_pdc_resets) could be used instead

It probably makes sense if it is planned to support reset controllers
of other SoCs with this driver.
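
To illustrate the simplification suggested above, here is a minimal
userspace sketch, not the driver itself. The reset table is indexed
directly, without a descriptor struct in between. The register is
simulated by a plain variable, fake_update_bits() and pdc_control() are
hypothetical names, the three enum values are assumed stand-ins for the
dt-bindings header, and the bounds check is an addition of this sketch.
It also assumes that asserting a reset means setting its bit, which the
truncated quote does not confirm.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-ins for the dt-bindings reset indices (values guessed). */
enum {
	PDC_APPS_SYNC_RESET,
	PDC_SP_SYNC_RESET,
	PDC_AUDIO_SYNC_RESET,
};

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct qcom_pdc_reset_map {
	uint8_t bit;
};

/* The table is used directly, without a descriptor pointing at it. */
static const struct qcom_pdc_reset_map sdm845_pdc_resets[] = {
	[PDC_APPS_SYNC_RESET] = {0},
	[PDC_SP_SYNC_RESET] = {1},
	[PDC_AUDIO_SYNC_RESET] = {2},
};

/* Simulated register; stands in for the regmap access in the driver. */
static uint32_t fake_sync_reset;

/* Masked read-modify-write: only the bits in 'mask' are changed. */
static void fake_update_bits(uint32_t *reg, uint32_t mask, uint32_t val)
{
	*reg = (*reg & ~mask) | (val & mask);
}

/* Hypothetical helper: set the reset bit to assert, clear it to deassert. */
static int pdc_control(unsigned long idx, int assert_reset)
{
	uint32_t mask;

	if (idx >= ARRAY_SIZE(sdm845_pdc_resets))
		return -1;

	mask = UINT32_C(1) << sdm845_pdc_resets[idx].bit;
	fake_update_bits(&fake_sync_reset, mask, assert_reset ? mask : 0);
	return 0;
}
```

With this layout, pdc_control(PDC_SP_SYNC_RESET, 1) sets bit 1 of the
simulated register and pdc_control(PDC_SP_SYNC_RESET, 0) clears it again.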

> +struct qcom_pdc_reset_data {
> + struct reset_controller_dev rcdev;
> + struct regmap *regmap;
> + const struct qcom_pdc_desc *desc;
> +};
> +
> +static const struct regmap_config sdm845_pdc_regmap_config = {
> + .name   = "pdc-reset",
> + .reg_bits   = 32,
> + .reg_stride = 4,
> + .val_bits   = 32,
> + .max_register   = 0x2,
> + .fast_io= true,
> +};
> +
> +static const struct qcom_pdc_reset_map sdm845_pdc_resets[] = {
> + [PDC_APPS_SYNC_RESET] = {0},
> + [PDC_SP_SYNC_RESET] = {1},
> + [PDC_AUDIO_SYNC_RESET] = {2},
> + [PDC_SENSORS_SYNC_RESET] = {3},
> + [PDC_AOP_SYNC_RESET] = {4},
> + [PDC_DEBUG_SYNC_RESET] = {5},
> + [PDC_GPU_SYNC_RESET] = {6},
> + [PDC_DISPLAY_SYNC_RESET] = {7},
> + [PDC_COMPUTE_SYNC_RESET] = {8},
> + [PDC_MODEM_SYNC_RESET] = {9},
> +};
> +
> +static const struct qcom_pdc_desc sdm845_pdc_desc = {
> + .config = &sdm845_pdc_regmap_config,
> + .resets = sdm845_pdc_resets,
> + .num_resets = ARRAY_SIZE(sdm845_pdc_resets),
> +};
> +
> +static inline struct qcom_pdc_reset_data *to_qcom_pdc_reset_data(
> + struct reset_controller_dev *rcdev)
> +{
> + return container_of(rcdev, struct qcom_pdc_reset_data, rcdev);
> +}
> +
> +static int qcom_pdc_control_assert(struct reset_controller_dev *rcdev,
> + unsigned long idx)
> +{
> + struct qcom_pdc_reset_data *data = to_qcom_pdc_reset_data(rcdev);
> + const struct qcom_pdc_reset_map *map = &data->desc->resets[idx];
> +
> + return regmap_update_bits(data->regmap, RPMH_PDC_SYNC_RESET,
> +

RE: [PATCH v13 02/13] x86/cpufeature: Add SGX and SGX_LC CPU features

2018-08-27 Thread Huang, Kai

> +#define X86_FEATURE_SGX_LC   (16*32+30) /* supports SGX launch
> configuration */

Sorry if it was me who wrote the comment "SGX launch configuration". I think we 
should just use "SGX launch control". :)

Thanks,
-Kai
> 
>  /* AMD-defined CPU features, CPUID level 0x80000007 (EBX), word 17 */
>  #define X86_FEATURE_OVERFLOW_RECOV   (17*32+ 0) /* MCA overflow
> recovery support */
> --
> 2.17.1

Re: [PATCH] x86/nmi: Fix some races in NMI uaccess

2018-08-27 Thread Jann Horn

On Tue, Aug 28, 2018 at 1:26 AM Andy Lutomirski  wrote:
>
> On Mon, Aug 27, 2018 at 4:12 PM, Jann Horn  wrote:
> > On Tue, Aug 28, 2018 at 1:04 AM Andy Lutomirski  wrote:
> >>
> >> In NMI context, we might be in the middle of context switching or in
> >> the middle of switch_mm_irqs_off().  In either case, CR3 might not
> >> match current->mm, which could cause copy_from_user_nmi() and
> >> friends to read the wrong memory.
> >>
> >> Fix it by adding a new nmi_uaccess_okay() helper and checking it in
> >> copy_from_user_nmi() and in __copy_from_user_nmi()'s callers.
> >
> > What about eBPF probes (which I think can be attached to kprobe points
> > / tracepoints / perf events) that perform userspace reads / userspace
> > writes / kernel reads? Can those run in NMI context, and if so, do
> > they also need special handling?
>
> I assume they can run in NMI context, which might be problematic in
> and of themselves.  For example, does BPF adequately protect against a
> BPF program accessing a map while bpf(2) is modifying it?  It seems
> like bpf_prog_active is intended to serve this purpose.
>
> But I don't see any obvious mechanism for eBPF programs to read user memory.

Look in kernel/trace/bpf_trace.c, which defines a bunch of eBPF
helpers that can only be called from privileged eBPF code. Ah, but I
misremembered, the userspace write helper does have a guard against
interrupts, just the arbitrary read helper doesn't.

BPF_CALL_3(bpf_probe_read, void *, dst, u32, size, const void *, unsafe_ptr)
{
int ret;

ret = probe_kernel_read(dst, unsafe_ptr, size);
if (unlikely(ret < 0))
memset(dst, 0, size);

return ret;
}
[...]
BPF_CALL_3(bpf_probe_write_user, void *, unsafe_ptr, const void *, src,
   u32, size)
{
/*
 * Ensure we're in user context which is safe for the helper to
 * run. This helper has no business in a kthread.
 *
 * access_ok() should prevent writing to non-user memory, but in
 * some situations (nommu, temporary switch, etc) access_ok() does
 * not provide enough validation, hence the check on KERNEL_DS.
 */

if (unlikely(in_interrupt() ||
 current->flags & (PF_KTHREAD | PF_EXITING)))
return -EPERM;
if (unlikely(uaccess_kernel()))
return -EPERM;
if (!access_ok(VERIFY_WRITE, unsafe_ptr, size))
return -EPERM;

return probe_kernel_write(unsafe_ptr, src, size);
}

Re: [PATCH 2/2] mm: zero remaining unavailable struct pages

2018-08-27 Thread Pasha Tatashin

On 8/23/18 2:25 PM, Masayoshi Mizuma wrote:
> From: Naoya Horiguchi 
> 
> There is a kernel panic that is triggered when reading /proc/kpageflags
> on the kernel booted with kernel parameter 'memmap=nn[KMG]!ss[KMG]':
> 
>   BUG: unable to handle kernel paging request at fffe
>   PGD 9b20e067 P4D 9b20e067 PUD 9b210067 PMD 0
>   Oops:  [#1] SMP PTI
>   CPU: 2 PID: 1728 Comm: page-types Not tainted 
> 4.17.0-rc6-mm1-v4.17-rc6-180605-0816-00236-g2dfb086ef02c+ #160
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.fc28 
> 04/01/2014
>   RIP: 0010:stable_page_flags+0x27/0x3c0
>   Code: 00 00 00 0f 1f 44 00 00 48 85 ff 0f 84 a0 03 00 00 41 54 55 49 89 fc 
> 53 48 8b 57 08 48 8b 2f 48 8d 42 ff 83 e2 01 48 0f 44 c7 <48> 8b 00 f6 c4 01 
> 0f 84 10 03 00 00 31 db 49 8b 54 24 08 4c 89 e7
>   RSP: 0018:bbd44111fde0 EFLAGS: 00010202
>   RAX: fffe RBX: 7fffeff9 RCX: 
>   RDX: 0001 RSI: 0202 RDI: ed1182fff5c0
>   RBP:  R08: 0001 R09: 0001
>   R10: bbd44111fed8 R11:  R12: ed1182fff5c0
>   R13: 000bffd7 R14: 02fff5c0 R15: bbd44111ff10
>   FS:  7efc4335a500() GS:93a5bfc0() knlGS:
>   CS:  0010 DS:  ES:  CR0: 80050033
>   CR2: fffe CR3: b2a58000 CR4: 001406e0
>   Call Trace:
>kpageflags_read+0xc7/0x120
>proc_reg_read+0x3c/0x60
>__vfs_read+0x36/0x170
>vfs_read+0x89/0x130
>ksys_pread64+0x71/0x90
>do_syscall_64+0x5b/0x160
>entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   RIP: 0033:0x7efc42e75e23
>   Code: 09 00 ba 9f 01 00 00 e8 ab 81 f4 ff 66 2e 0f 1f 84 00 00 00 00 00 90 
> 83 3d 29 0a 2d 00 00 75 13 49 89 ca b8 11 00 00 00 0f 05 <48> 3d 01 f0 ff ff 
> 73 34 c3 48 83 ec 08 e8 db d3 01 00 48 89 04 24
> 
> According to kernel bisection, this problem became visible due to commit
> f7f99100d8d9 which changes how struct pages are initialized.
> 
> Memblock layout affects the pfn ranges covered by node/zone. Consider
> that we have a VM with 2 NUMA nodes and each node has 4GB memory, and
> the default (no memmap= given) memblock layout is like below:
> 
>   MEMBLOCK configuration:
>memory size = 0x0001fff75c00 reserved size = 0x0300c000
>memory.cnt  = 0x4
>memory[0x0] [0x1000-0x0009efff], 
> 0x0009e000 bytes on node 0 flags: 0x0
>memory[0x1] [0x0010-0xbffd6fff], 
> 0xbfed7000 bytes on node 0 flags: 0x0
>memory[0x2] [0x0001-0x00013fff], 
> 0x4000 bytes on node 0 flags: 0x0
>memory[0x3] [0x00014000-0x00023fff], 
> 0x0001 bytes on node 1 flags: 0x0
>...
> 
> If you give memmap=1G!4G (so it just covers memory[0x2]),
> the range [0x1-0x13fff] is gone:
> 
>   MEMBLOCK configuration:
>memory size = 0x0001bff75c00 reserved size = 0x0300c000
>memory.cnt  = 0x3
>memory[0x0] [0x1000-0x0009efff], 
> 0x0009e000 bytes on node 0 flags: 0x0
>memory[0x1] [0x0010-0xbffd6fff], 
> 0xbfed7000 bytes on node 0 flags: 0x0
>memory[0x2] [0x00014000-0x00023fff], 
> 0x0001 bytes on node 1 flags: 0x0
>...
> 
> This causes shrinking node 0's pfn range because it is calculated by
> the address range of memblock.memory. So some of struct pages in the
> gap range are left uninitialized.
> 
> We have a function zero_resv_unavail() which zeroes the struct
> pages outside memblock.memory, but currently it covers only the reserved
> unavailable range (i.e. memblock.memory && !memblock.reserved).
> This patch extends it to cover all unavailable range, which fixes
> the reported issue.
> 
> Fixes: f7f99100d8d9 ("mm: stop zeroing memory during allocation in vmemmap")
> Signed-off-by: Naoya Horiguchi 
> Tested-by: Oscar Salvador 
> Tested-by: Masayoshi Mizuma 

Reviewed-by: Pavel Tatashin 

Also, please review and add the following patch to this series:

From 6d23e66e979244734a06c1b636742c2568121b39 Mon Sep 17 00:00:00 2001
From: Pavel Tatashin 
Date: Mon, 27 Aug 2018 19:10:35 -0400
Subject: [PATCH] mm: return zero_resv_unavail optimization

When checking for valid pfns in zero_resv_unavail(), it is not necessary to
verify that pfns within pageblock_nr_pages ranges are valid, only the first
one needs to be checked. This is because memory for pages is allocated in
contiguous chunks that contain pageblock_nr_pages struct pages.

Signed-off-by: Pavel Tatashin 
---
 mm/page_alloc.c | 46 ++
 1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 650d8f16a67e..5dfc206db40e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6441,6 +6441,29 @@ void __init free_area_init_node(int
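
The hunk is cut off in this archive, so the following is only a
userspace sketch of the idea described in the commit message, not the
patch itself. It assumes that struct page backing exists either for a
whole pageblock or not at all, so one validity check per block is
enough. BLOCK_PAGES, block_has_memmap, fake_pfn_valid() and
zero_pfn_range() are hypothetical names and the block size of 4 is
chosen only to keep the example small.

```c
#include <stdbool.h>
#include <stddef.h>

#define BLOCK_PAGES 4UL		/* assumed stand-in for pageblock_nr_pages */
#define NR_BLOCKS   4UL
#define NR_PFNS     (BLOCK_PAGES * NR_BLOCKS)

/* One flag per block: struct pages exist for the whole block or not at all. */
static const bool block_has_memmap[NR_BLOCKS] = { true, false, true, true };

/* One byte per pfn stands in for a struct page. */
static unsigned char fake_struct_page[NR_PFNS];

/* Counts how often validity was checked, to show the saving. */
static unsigned long valid_checks;

static bool fake_pfn_valid(unsigned long pfn)
{
	valid_checks++;
	return block_has_memmap[pfn / BLOCK_PAGES];
}

/*
 * Zero the fake struct pages in [spfn, epfn), checking validity only for
 * the first pfn of each block. The caller must keep epfn <= NR_PFNS.
 * Returns the number of entries zeroed.
 */
static unsigned long zero_pfn_range(unsigned long spfn, unsigned long epfn)
{
	unsigned long pfn = spfn, zeroed = 0;

	while (pfn < epfn) {
		unsigned long block_start = pfn - pfn % BLOCK_PAGES;
		unsigned long block_end = block_start + BLOCK_PAGES;

		if (block_end > epfn)
			block_end = epfn;

		if (!fake_pfn_valid(block_start)) {
			/* No backing for this block: skip it entirely. */
			pfn = block_end;
			continue;
		}

		for (; pfn < block_end; pfn++) {
			fake_struct_page[pfn] = 0;
			zeroed++;
		}
	}
	return zeroed;
}
```

For the range [1, 14) this sketch performs four validity checks instead
of thirteen and skips the block without backing in one step.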

Re: [PATCH v3 3/3] mm: don't miss the last page because of round-off error

2018-08-27 Thread Roman Gushchin

On Mon, Aug 27, 2018 at 02:04:32PM -0700, Andrew Morton wrote:
> On Mon, 27 Aug 2018 09:26:21 -0700 Roman Gushchin  wrote:
> 
> > I've noticed that dying memory cgroups are often pinned
> > in memory by a single pagecache page. Even under moderate
> > memory pressure they sometimes stayed in such state
> > for a long time. That looked strange.
> > 
> > My investigation showed that the problem is caused by
> > applying the LRU pressure balancing math:
> > 
> >   scan = div64_u64(scan * fraction[lru], denominator),
> > 
> > where
> > 
> >   denominator = fraction[anon] + fraction[file] + 1.
> > 
> > Because fraction[lru] is always less than denominator,
> > if the initial scan size is 1, the result is always 0.
> > 
> > This means the last page is not scanned and has
> > no chances to be reclaimed.
> > 
> > Fix this by rounding up the result of the division.
> > 
> > In practice this change significantly improves the speed
> > of dying cgroups reclaim.
> > 
> > ...
> >
> > --- a/include/linux/math64.h
> > +++ b/include/linux/math64.h
> > @@ -281,4 +281,6 @@ static inline u64 mul_u64_u32_div(u64 a, u32 mul, u32 
> > divisor)
> >  }
> >  #endif /* mul_u64_u32_div */
> >  
> > +#define DIV64_U64_ROUND_UP(ll, d)  div64_u64((ll) + (d) - 1, (d))
> 
> This macro references arg `d' more than once.  That can cause problems
> if the passed expression has side-effects and is poor practice.  Can
> we please redo this with a temporary?

Sure. This was copy-pasted to match the existing DIV_ROUND_UP
(probably, not the best idea).

So let me fix them both in a separate patch.
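
For reference, a userspace sketch of what "redo this with a temporary"
could look like. This is one possible shape, not necessarily the one
the follow-up patch will use. div64_u64() here is a plain stand-in for
the kernel helper, the _MULTI and _ONCE suffixes and next_divisor() are
hypothetical names, and the second macro relies on the GNU C statement
expression extension (GCC and Clang).

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's div64_u64(). */
static inline uint64_t div64_u64(uint64_t dividend, uint64_t divisor)
{
	return dividend / divisor;
}

/* As posted: the argument 'd' is expanded, and so evaluated, twice. */
#define DIV64_U64_ROUND_UP_MULTI(ll, d) \
	div64_u64((ll) + (d) - 1, (d))

/* With a temporary: 'd' is evaluated exactly once (GNU C extension). */
#define DIV64_U64_ROUND_UP_ONCE(ll, d) \
	({ uint64_t _tmp = (d); div64_u64((ll) + _tmp - 1, _tmp); })

/* A divisor expression with a side effect, to make the difference visible. */
static unsigned int calls;

static uint64_t next_divisor(void)
{
	calls++;
	return 4;
}
```

Passing next_divisor() as the divisor bumps the counter twice with the
first macro and once with the second, while both return the same
quotient.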

Thanks!
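
The round-off problem from the commit message can be reproduced with
plain integer arithmetic. This is a userspace sketch only:
scan_round_down() and scan_round_up() are hypothetical names, and the
fractions in the usage note below (30 for anon, 70 for file) are made up
for illustration.

```c
#include <stdint.h>

/* Truncating division, as in the original balancing math. */
static uint64_t scan_round_down(uint64_t scan, uint64_t fraction,
				uint64_t denominator)
{
	return scan * fraction / denominator;
}

/* Rounding up: any non-zero product yields at least one page to scan. */
static uint64_t scan_round_up(uint64_t scan, uint64_t fraction,
			      uint64_t denominator)
{
	return (scan * fraction + denominator - 1) / denominator;
}
```

With fraction[anon] = 30 and fraction[file] = 70 the denominator is 101.
For scan = 1 on the file LRU the truncating version gives 70 / 101 = 0,
while the rounded-up version gives (70 + 100) / 101 = 1, so the last page
gets scanned.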

Re: [PATCH] x86/nmi: Fix some races in NMI uaccess

2018-08-27 Thread Andy Lutomirski

On Mon, Aug 27, 2018 at 4:12 PM, Jann Horn  wrote:
> On Tue, Aug 28, 2018 at 1:04 AM Andy Lutomirski  wrote:
>>
>> In NMI context, we might be in the middle of context switching or in
>> the middle of switch_mm_irqs_off().  In either case, CR3 might not
>> match current->mm, which could cause copy_from_user_nmi() and
>> friends to read the wrong memory.
>>
>> Fix it by adding a new nmi_uaccess_okay() helper and checking it in
>> copy_from_user_nmi() and in __copy_from_user_nmi()'s callers.
>
> What about eBPF probes (which I think can be attached to kprobe points
> / tracepoints / perf events) that perform userspace reads / userspace
> writes / kernel reads? Can those run in NMI context, and if so, do
> they also need special handling?

I assume they can run in NMI context, which might be problematic in
and of themselves.  For example, does BPF adequately protect against a
BPF program accessing a map while bpf(2) is modifying it?  It seems
like bpf_prog_active is intended to serve this purpose.

But I don't see any obvious mechanism for eBPF programs to read user memory.

Re: [PATCH v3 1/3] mm: rework memcg kernel stack accounting

2018-08-27 Thread Roman Gushchin

On Mon, Aug 27, 2018 at 02:01:43PM -0700, Andrew Morton wrote:
> On Mon, 27 Aug 2018 09:26:19 -0700 Roman Gushchin  wrote:
> 
> > If CONFIG_VMAP_STACK is set, kernel stacks are allocated
> > using __vmalloc_node_range() with __GFP_ACCOUNT. So kernel
> > stack pages are charged against corresponding memory cgroups
> > on allocation and uncharged on releasing them.
> > 
> > The problem is that we do cache kernel stacks in small
> > per-cpu caches and do reuse them for new tasks, which can
> > belong to different memory cgroups.
> > 
> > Each stack page still holds a reference to the original cgroup,
> > so the cgroup can't be released until the vmap area is released.
> > 
> > To make this happen we need more than two subsequent exits
> > without forks in between on the current cpu, which makes it
> > very unlikely to happen. As a result, I saw a significant number
> > of dying cgroups (in theory, up to 2 * number_of_cpu +
> > number_of_tasks), which can't be released even by significant
> > memory pressure.
> > 
> > As a cgroup structure can take a significant amount of memory
> > (first of all, per-cpu data like memcg statistics), it leads
> > to a noticeable waste of memory.
> 
> OK, but this doesn't describe how the patch addresses this issue?

Sorry, missed this part. Let's add the following paragraph to the
commit message (the full updated patch is below):

To address the issue, let's charge thread stacks on assigning
them to tasks, and uncharge on releasing them and putting into
the per-cpu cache. So, cached stacks will not be assigned to
any memcg and will not hold any memcg reference.
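
To illustrate the idea, here is a toy userspace model (not the kernel
code: struct toy_memcg, struct toy_stack and the toy_* helpers are
made-up names, and a plain counter stands in for the memcg reference).
A stack is charged when it is handed to a task and uncharged before it
is parked in the cache, so a cached stack never pins a cgroup:

```c
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's mem_cgroup or vm_struct. */
struct toy_memcg {
	int refs;			/* references held by charged stacks */
};

struct toy_stack {
	struct toy_memcg *owner;	/* NULL while the stack sits in the cache */
};

#define TOY_NR_CACHED 2			/* mirrors "two thread stacks per CPU" */

static struct toy_stack *toy_cache[TOY_NR_CACHED];

/* Charge the stack to the memcg of the task that is about to use it. */
void toy_assign_stack(struct toy_stack *s, struct toy_memcg *cg)
{
	s->owner = cg;
	cg->refs++;
}

/*
 * Uncharge first, then try to park the stack in the cache.
 * Returns 1 if the stack was cached, 0 if the cache was full.
 */
int toy_release_stack(struct toy_stack *s)
{
	int i;

	if (s->owner) {
		s->owner->refs--;
		s->owner = NULL;
	}
	for (i = 0; i < TOY_NR_CACHED; i++) {
		if (!toy_cache[i]) {
			toy_cache[i] = s;
			return 1;
		}
	}
	return 0;
}

/* Take a stack out of the cache, or return NULL if the cache is empty. */
struct toy_stack *toy_get_cached_stack(void)
{
	int i;

	for (i = 0; i < TOY_NR_CACHED; i++) {
		struct toy_stack *s = toy_cache[i];

		if (s) {
			toy_cache[i] = NULL;
			return s;
		}
	}
	return NULL;
}
```

In this model a stack released by a task of cgroup A can later be
assigned to a task of cgroup B without A's counter ever being touched
again, which is the property the paragraph above is after.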

> 
> >
> > ...
> >
> > @@ -371,6 +382,35 @@ static void account_kernel_stack(struct task_struct 
> > *tsk, int account)
> > }
> >  }
> >  
> > +static int memcg_charge_kernel_stack(struct task_struct *tsk)
> > +{
> > +#ifdef CONFIG_VMAP_STACK
> > +   struct vm_struct *vm = task_stack_vm_area(tsk);
> > +   int ret;
> > +
> > +   if (vm) {
> > +   int i;
> > +
> > +   for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++) {
> 
> Can we ever have THREAD_SIZE < PAGE_SIZE?  64k pages?

Hm, good question!
We can, but I doubt that anyone is using 64k pages AND CONFIG_VMAP_STACK,
and I *suspect* that it would trigger the BUG_ON() in account_kernel_stack():

static void account_kernel_stack(struct task_struct *tsk, int account) {
...

if (vm) {
...

BUG_ON(vm->nr_pages != THREAD_SIZE / PAGE_SIZE);

But I don't see anything that makes such a config illegitimate.
Does it make any sense to use vmap if THREAD_SIZE < PAGE_SIZE?
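
To make the arithmetic concrete, a small userspace sketch (not kernel
code; the helper names are made up, and 16 KiB stacks with 64 KiB pages
are just an illustrative combination). It assumes the vmap allocation is
rounded up to whole pages, so nr_pages would be 1 while the loop bound
truncates to 0:

```c
#include <stddef.h>

/* Iterations of "for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++)". */
size_t stack_loop_iterations(size_t thread_size, size_t page_size)
{
	return thread_size / page_size;	/* integer division truncates */
}

/* Pages backing thread_size bytes, assuming rounding up to whole pages. */
size_t stack_backing_pages(size_t thread_size, size_t page_size)
{
	return (thread_size + page_size - 1) / page_size;
}
```

With 4 KiB pages both helpers give 4. With 64 KiB pages the first gives
0 and the second gives 1, i.e. the charging loop would not run at all
and a check of the form nr_pages != THREAD_SIZE / PAGE_SIZE would be
true.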

> 
> > +   /*
> > +* If memcg_kmem_charge() fails, page->mem_cgroup
> > +* pointer is NULL, and both memcg_kmem_uncharge()
> > +* and mod_memcg_page_state() in free_thread_stack()
> > +* will ignore this page. So it's safe.
> > +*/
> > +   ret = memcg_kmem_charge(vm->pages[i], GFP_KERNEL, 0);
> > +   if (ret)
> > +   return ret;
> > +
> > +   mod_memcg_page_state(vm->pages[i],
> > +MEMCG_KERNEL_STACK_KB,
> > +PAGE_SIZE / 1024);
> > +   }
> > +   }
> > +#endif
> > +   return 0;
> > +}
> >
> > ...
> >

Thanks!

--

From 91b373bb03715dcd2393302ab1816c929ee980ae Mon Sep 17 00:00:00 2001
From: Roman Gushchin 
Date: Tue, 14 Aug 2018 16:01:02 -0700
Subject: [PATCH v3 1/3] mm: rework memcg kernel stack accounting

If CONFIG_VMAP_STACK is set, kernel stacks are allocated
using __vmalloc_node_range() with __GFP_ACCOUNT. So kernel
stack pages are charged against corresponding memory cgroups
on allocation and uncharged on releasing them.

The problem is that we do cache kernel stacks in small
per-cpu caches and do reuse them for new tasks, which can
belong to different memory cgroups.

Each stack page still holds a reference to the original cgroup,
so the cgroup can't be released until the vmap area is released.

To make this happen we need more than two subsequent exits
without forks in between on the current cpu, which makes it
very unlikely to happen. As a result, I saw a significant number
of dying cgroups (in theory, up to 2 * number_of_cpu +
number_of_tasks), which can't be released even by significant
memory pressure.

As a cgroup structure can take a significant amount of memory
(first of all, per-cpu data like memcg statistics), it leads
to a noticeable waste of memory.

To address the issue, let's charge thread stacks on assigning
them to tasks, and uncharge on releasing them and putting into
the per-cpu cache. So, cached stacks will not be assigned to
any memcg and will not hold any memcg reference.

Fixes: ac496bf48d97 ("fork: Optimize task creation by caching
two thread stacks per CPU if CONFIG_VMAP_STACK=y")
Signed-off-by: Roman Gushchin 
Reviewed-by: Shakeel Butt 
Acked-by:

Re: [PATCH 1/2] Revert "x86/e820: put !E820_TYPE_RAM regions into memblock.reserved"

2018-08-27 Thread Pasha Tatashin

On 8/23/18 2:25 PM, Masayoshi Mizuma wrote:
> From: Masayoshi Mizuma 
> 
> commit 124049decbb1 ("x86/e820: put !E820_TYPE_RAM regions into
> memblock.reserved") breaks movable_node kernel option because it
> changed the memory gap range to reserved memblock. So, the node
> is marked as Normal zone even if the SRAT has hot-pluggable affinity.
> 
> =
> kernel: BIOS-e820: [mem 0x1800-0x180f] usable
> kernel: BIOS-e820: [mem 0x1c00-0x1c0f] usable
> ...
> kernel: reserved[0x12]#011[0x1810-0x1bff], 
> 0x03f0 bytes flags: 0x0
> ...
> kernel: ACPI: SRAT: Node 2 PXM 6 [mem 0x1800-0x1bff] 
> hotplug
> kernel: ACPI: SRAT: Node 3 PXM 7 [mem 0x1c00-0x1fff] 
> hotplug
> ...
> kernel: Movable zone start for each node
> kernel:  Node 3: 0x1c00
> kernel: Early memory node ranges
> ...
> =
> 
> Naoya's v1 patch [*] fixes the original issue and this movable_node
> issue doesn't occur.
> Let's revert commit 124049decbb1 ("x86/e820: put !E820_TYPE_RAM
> regions into memblock.reserved") and apply the v1 patch.
> 
> [*] https://lkml.org/lkml/2018/6/13/27
> 
> Signed-off-by: Masayoshi Mizuma 

Reviewed-by: Pavel Tatashin

[PATCH 02/13] proc: apply seq_puts() whenever possible

2018-08-27 Thread Alexey Dobriyan

seq_puts() is faster than seq_printf() because it doesn't search for
format specifiers.
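
A rough userspace sketch of the difference (struct toy_seq and the
toy_seq_* helpers are hypothetical stand-ins, not the seq_file API): the
puts-style helper only measures and copies the string, while the
printf-style helper hands it to vsnprintf(), which has to scan every
character for '%'.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct seq_file. */
struct toy_seq {
	char buf[128];
	size_t count;
};

/* Like seq_puts(): measure the string and copy it, no parsing. */
void toy_seq_puts(struct toy_seq *m, const char *s)
{
	size_t len = strlen(s);

	if (m->count + len < sizeof(m->buf)) {
		memcpy(m->buf + m->count, s, len);
		m->count += len;
		m->buf[m->count] = '\0';
	}
}

/* Like seq_printf(): the format string is scanned for specifiers. */
void toy_seq_printf(struct toy_seq *m, const char *fmt, ...)
{
	size_t room = sizeof(m->buf) - m->count;
	va_list ap;
	int len;

	va_start(ap, fmt);
	len = vsnprintf(m->buf + m->count, room, fmt, ap);
	va_end(ap);
	if (len >= 0 && (size_t)len < room)
		m->count += (size_t)len;
	else
		m->buf[m->count] = '\0';	/* drop truncated output */
}
```

For strings without '%' both helpers produce the same bytes. A literal
"%%" is the one case where they differ: the printf-style helper
collapses it to a single '%', the puts-style helper copies both
characters.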

Signed-off-by: Alexey Dobriyan 
---
 fs/proc/array.c | 16 
 fs/proc/base.c  |  2 +-
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index 0ceb3b6b37e7..5016e03a4dba 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -343,28 +343,28 @@ static inline void task_seccomp(struct seq_file *m, 
struct task_struct *p)
 #ifdef CONFIG_SECCOMP
seq_put_decimal_ull(m, "\nSeccomp:\t", p->seccomp.mode);
 #endif
-   seq_printf(m, "\nSpeculation_Store_Bypass:\t");
+   seq_puts(m, "\nSpeculation_Store_Bypass:\t");
switch (arch_prctl_spec_ctrl_get(p, PR_SPEC_STORE_BYPASS)) {
case -EINVAL:
-   seq_printf(m, "unknown");
+   seq_puts(m, "unknown");
break;
case PR_SPEC_NOT_AFFECTED:
-   seq_printf(m, "not vulnerable");
+   seq_puts(m, "not vulnerable");
break;
case PR_SPEC_PRCTL | PR_SPEC_FORCE_DISABLE:
-   seq_printf(m, "thread force mitigated");
+   seq_puts(m, "thread force mitigated");
break;
case PR_SPEC_PRCTL | PR_SPEC_DISABLE:
-   seq_printf(m, "thread mitigated");
+   seq_puts(m, "thread mitigated");
break;
case PR_SPEC_PRCTL | PR_SPEC_ENABLE:
-   seq_printf(m, "thread vulnerable");
+   seq_puts(m, "thread vulnerable");
break;
case PR_SPEC_DISABLE:
-   seq_printf(m, "globally mitigated");
+   seq_puts(m, "globally mitigated");
break;
default:
-   seq_printf(m, "vulnerable");
+   seq_puts(m, "vulnerable");
break;
}
seq_putc(m, '\n');
diff --git a/fs/proc/base.c b/fs/proc/base.c
index ccf86f16d9f0..f96babf3cffc 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -442,7 +442,7 @@ static int proc_pid_schedstat(struct seq_file *m, struct 
pid_namespace *ns,
  struct pid *pid, struct task_struct *task)
 {
if (unlikely(!sched_info_on()))
-   seq_printf(m, "0 0 0\n");
+   seq_puts(m, "0 0 0\n");
else
seq_printf(m, "%llu %llu %lu\n",
   (unsigned long long)task->se.sum_exec_runtime,
-- 
2.16.4

[PATCH 11/13] proc: readdir /proc/*/task

2018-08-27 Thread Alexey Dobriyan

---
 fs/proc/base.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 33f444721965..668e465c86b3 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -3549,11 +3549,11 @@ static int proc_task_readdir(struct file *file, struct 
dir_context *ctx)
for (task = first_tid(proc_pid(inode), tid, ctx->pos - 2, ns);
 task;
 task = next_tid(task), ctx->pos++) {
-   char name[10 + 1];
-   unsigned int len;
+   char name[10], *p = name + sizeof(name);
+
tid = task_pid_nr_ns(task, ns);
-   len = snprintf(name, sizeof(name), "%u", tid);
-   if (!proc_fill_cache(file, ctx, name, len,
+   p = _print_integer_u32(p, tid);
+   if (!proc_fill_cache(file, ctx, p, name + sizeof(name) - p,
proc_task_instantiate, task, NULL)) {
/* returning this tgid failed, save it as the first
 * pid for the next readir call */
-- 
2.16.4

[PATCH 05/13] proc: new and improved way to print decimals

2018-08-27 Thread Alexey Dobriyan

C lacks a capable preprocessor to turn

snprintf(buf, sizeof(buf), "%u", x);

into

print_integer_u32(buf, x);

so vsnprintf() is forced to have a million branches.
Benchmark anything which uses /proc and look for format_decode().

This unfortunate situation was partially fixed by the seq_put_decimal_ull()
function, which skips the "format specifier" part. However, it still does
unnecessary copies internally and even reverses the digits before
putting them into the final buffer. It also calls strlen() at
runtime.

The following 3 functions

_print_integer_u32
_print_integer_u64
_print_integer_ul

cut all the overhead by printing backwards one character at a time:

x = 123456789

|   <|
|...123456789|

This is just as fast as current printing by 2 characters at a time,
because pids, fds, uids are small integers so emitting 2 characters
doesn't make much difference. It also generates very small code
(146 bytes total here, not counting the callers).
Current put_dec() and friends are surprisingly large.

All the functions have the following signature:

char *_print_integer_XXX(char *p, T x);

They are written in a very specific way to prevent gcc from
inlining everything and making a mess.

They aren't exported or advertised because the idiomatic way of using them
is not something you see every day:
* fixed sized buffer on stack capable of holding the worst case,
* pointer past the end of the buffer (yay 6.5.6 p8!)
* no buffer length checks (wheee),
* no NUL terminator (ha-ha-ha),
* emitting output BACKWARDS (one character at a time!),
* finally one copy to the final buffer (one copy, one!).

char buf[10 + 1 + 20 + 1], *p = buf + sizeof(buf);

*--p = '\n';
p = _print_integer_u64(p, y);
*--p = ' ';
p = _print_integer_u32(p, x);

seq_write(seq, p, buf + sizeof(buf) - p);
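
For readers who want to try the idiom outside the kernel, a userspace
sketch follows. The toy_* names are made up for illustration, <stdint.h>
types replace u32/u64, and plain % and / replace div_u64_rem(); the 10^8
chunk size follows the "0 <= x < 10^8" comment in the patch.

```c
#include <stdint.h>

/* Print x in decimal, backwards; p points one past the last free byte. */
char *toy_print_u32(char *p, uint32_t x)
{
	do {
		*--p = (char)('0' + x % 10);
		x /= 10;
	} while (x != 0);
	return p;
}

/* Emit exactly 8 digits, zero-padded on the left; requires x < 10^8. */
static char *toy_print_u32_pad8(char *p, uint32_t x)
{
	char *p0 = p - 8;

	p = toy_print_u32(p, x);
	while (p != p0)
		*--p = '0';
	return p;
}

/* 64-bit variant: peel off 8 decimal digits at a time, then the top part. */
char *toy_print_u64(char *p, uint64_t x)
{
	while (x >= 100000000) {
		uint32_t r = (uint32_t)(x % 100000000);

		p = toy_print_u32_pad8(p, r);
		x /= 100000000;
	}
	return toy_print_u32(p, (uint32_t)x);
}
```

Used exactly like the snippet above: start with p = buf + sizeof(buf),
emit the pieces in reverse order, and the output is the
buf + sizeof(buf) - p bytes starting at p, with no NUL terminator.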

As the comment says, do not tell anyone about these functions.

The plan is to use them inside /proc and only inside /proc.

Signed-off-by: Alexey Dobriyan 
---
 fs/proc/internal.h | 11 +++
 fs/proc/util.c | 47 +++
 2 files changed, 58 insertions(+)

diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 5185d7f6a51e..be4965ef8e48 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -127,6 +127,17 @@ void task_dump_owner(struct task_struct *task, umode_t 
mode,
 kuid_t *ruid, kgid_t *rgid);
 
 unsigned name_to_int(const struct qstr *qstr);
+
+char *_print_integer_u32(char *, u32);
+char *_print_integer_u64(char *, u64);
+static inline char *_print_integer_ul(char *p, unsigned long x)
+{
+   if (sizeof(unsigned long) == 4)
+   return _print_integer_u32(p, x);
+   else
+   return _print_integer_u64(p, x);
+}
+
 /*
  * Offset of the first process in the /proc root directory..
  */
diff --git a/fs/proc/util.c b/fs/proc/util.c
index b161cfa0f9fa..2d9ceab04289 100644
--- a/fs/proc/util.c
+++ b/fs/proc/util.c
@@ -1,4 +1,5 @@
 #include 
+#include 
 
 unsigned name_to_int(const struct qstr *qstr)
 {
@@ -21,3 +22,49 @@ unsigned name_to_int(const struct qstr *qstr)
 out:
return ~0U;
 }
+
+/*
+ * Print an integer in decimal.
+ * "p" initially points PAST THE END OF THE BUFFER!
+ *
+ * DO NOT USE THESE FUNCTIONS!
+ *
+ * Do not copy these functions.
+ * Do not document these functions.
+ * Do not move these functions to lib/ or elsewhere.
+ * Do not export these functions to modules.
+ * Do not tell anyone about these functions.
+ */
+noinline
+char *_print_integer_u32(char *p, u32 x)
+{
+   do {
+   *--p = '0' + (x % 10);
+   x /= 10;
+   } while (x != 0);
+   return p;
+}
+
+static char *__print_integer_u32(char *p, u32 x)
+{
+   /* 0 <= x < 10^8 */
+   char *p0 = p - 8;
+
+   p = _print_integer_u32(p, x);
+   while (p != p0)
+   *--p = '0';
+   return p;
+}
+
+char *_print_integer_u64(char *p, u64 x)
+{
+   while (x >= 100000000) {
+   u64 q;
+   u32 r;
+
+   q = div_u64_rem(x, 100000000, &r);
+   p = __print_integer_u32(p, r);
+   x = q;
+   }
+   return _print_integer_u32(p, x);
+}
-- 
2.16.4

[ANNOUNCE] linux-4.18-ck1, Multiple Queue Skiplist Scheduler version 0.173 for linux 4.18

2018-08-27 Thread Con Kolivas

Announcing a new -ck release, 4.18-ck1, with the latest version of the
Multiple Queue Skiplist Scheduler, version 0.173. These are patches designed
to improve system responsiveness and interactivity with specific emphasis on
the desktop, but configurable for any workload.

linux-4.18-ck1:
-ck1 patches:
http://ck.kolivas.org/patches/4.0/4.18/4.18-ck1/
Git tree:
https://github.com/ckolivas/linux/tree/4.18-ck

MuQSS only:
Download:
http://ck.kolivas.org/patches/muqss/4.0/4.18/0001-MultiQueue-Skiplist-Scheduler-version-v0.173.patch
Git tree:
https://github.com/ckolivas/linux/tree/4.18-muqss


This is just a resync from 4.17 MuQSS and -ck patches.


Blog: ck-hack.blogspot.com

Enjoy!
お楽しみ下さい

-- 
-ck

[PATCH 09/13] proc: convert dentry flushing on exit to _print_integer()

2018-08-27 Thread Alexey Dobriyan

Benchmark fork+exit+waitpid 2^16 times:

6.579161299 seconds time elapsed ( +-  0.24% )
6.482729157 seconds time elapsed ( +-  0.42% )

-1.5%

Dentry flushing is a very small part of exit(2); the effect should be more
visible on a tiny 1-page process which doesn't use libc.

Signed-off-by: Alexey Dobriyan 
---
 fs/proc/base.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 17666bd61ac8..79d2f7d72ad1 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -3022,11 +3022,11 @@ static const struct inode_operations 
proc_tgid_base_inode_operations = {
 static void proc_flush_task_mnt(struct vfsmount *mnt, pid_t pid, pid_t tgid)
 {
struct dentry *dentry, *leader, *dir;
-   char buf[10 + 1];
+   char buf[10];
struct qstr name;
 
-   name.name = buf;
-   name.len = snprintf(buf, sizeof(buf), "%u", pid);
+   name.name = _print_integer_u32(buf + sizeof(buf), pid);
+   name.len = buf + sizeof(buf) - (char *)name.name;
/* no ->d_hash() rejects on procfs */
dentry = d_hash_and_lookup(mnt->mnt_root, &name);
if (dentry) {
@@ -3037,8 +3037,8 @@ static void proc_flush_task_mnt(struct vfsmount *mnt, 
pid_t pid, pid_t tgid)
if (pid == tgid)
return;
 
-   name.name = buf;
-   name.len = snprintf(buf, sizeof(buf), "%u", tgid);
+   name.name = _print_integer_u32(buf + sizeof(buf), tgid);
+   name.len = buf + sizeof(buf) - (char *)name.name;
leader = d_hash_and_lookup(mnt->mnt_root, &name);
if (!leader)
goto out;
@@ -3049,8 +3049,8 @@ static void proc_flush_task_mnt(struct vfsmount *mnt, 
pid_t pid, pid_t tgid)
if (!dir)
goto out_put_leader;
 
-   name.name = buf;
-   name.len = snprintf(buf, sizeof(buf), "%u", pid);
+   name.name = _print_integer_u32(buf + sizeof(buf), pid);
+   name.len = buf + sizeof(buf) - (char *)name.name;
dentry = d_hash_and_lookup(dir, &name);
if (dentry) {
d_invalidate(dentry);
-- 
2.16.4

[PATCH 13/13] proc: convert /proc//task//children to _print_integer()

2018-08-27 Thread Alexey Dobriyan

Benchmark pread /proc/1/task/1/children 2^21 times on the same system:

6.766400479 s
4.328648442 s

-36%

(need to remeasure on a controlled set of children)

Signed-off-by: Alexey Dobriyan 
---
 fs/proc/array.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index d0565527166a..045ce2cac1dd 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -710,9 +710,11 @@ get_children_pid(struct inode *inode, struct pid 
*pid_prev, loff_t pos)
 static int children_seq_show(struct seq_file *seq, void *v)
 {
struct inode *inode = file_inode(seq->file);
+   char buf[10 + 1], *p = buf + sizeof(buf);
 
-   seq_printf(seq, "%d ", pid_nr_ns(v, proc_pid_ns(inode)));
-   return 0;
+   *--p = ' ';
+   p = _print_integer_u32(p, pid_nr_ns(v, proc_pid_ns(inode)));
+   return seq_write(seq, p, buf + sizeof(buf) - p);
 }
 
 static void *children_seq_start(struct seq_file *seq, loff_t *pos)
-- 
2.16.4

[PATCH 03/13] proc: rename "p" variable in proc_readfd_common()

2018-08-27 Thread Alexey Dobriyan

Use slightly more obvious "tsk" and prepare for changes in printing.

Signed-off-by: Alexey Dobriyan 
---
 fs/proc/fd.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/proc/fd.c b/fs/proc/fd.c
index 81882a13212d..e098302b5101 100644
--- a/fs/proc/fd.c
+++ b/fs/proc/fd.c
@@ -228,16 +228,16 @@ static struct dentry *proc_lookupfd_common(struct inode 
*dir,
 static int proc_readfd_common(struct file *file, struct dir_context *ctx,
  instantiate_t instantiate)
 {
-   struct task_struct *p = get_proc_task(file_inode(file));
+   struct task_struct *tsk = get_proc_task(file_inode(file));
struct files_struct *files;
unsigned int fd;
 
-   if (!p)
+   if (!tsk)
return -ENOENT;
 
if (!dir_emit_dots(file, ctx))
goto out;
-   files = get_files_struct(p);
+   files = get_files_struct(tsk);
if (!files)
goto out;
 
@@ -259,7 +259,7 @@ static int proc_readfd_common(struct file *file, struct 
dir_context *ctx,
 
len = snprintf(name, sizeof(name), "%u", fd);
if (!proc_fill_cache(file, ctx,
-name, len, instantiate, p,
+name, len, instantiate, tsk,
 &data))
goto out_fd_loop;
cond_resched();
@@ -269,7 +269,7 @@ static int proc_readfd_common(struct file *file, struct 
dir_context *ctx,
 out_fd_loop:
put_files_struct(files);
 out:
-   put_task_struct(p);
+   put_task_struct(tsk);
return 0;
 }
 
-- 
2.16.4

[PATCH 07/13] proc: convert /proc/thread-self to _print_integer()

2018-08-27 Thread Alexey Dobriyan

Benchmark readlink("/proc/thread-self") 2^23 times:

9.447948508 seconds time elapsed ( +-  0.06% )
7.846435274 seconds time elapsed ( +-  0.07% )

-17%
---
 fs/proc/thread_self.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/proc/thread_self.c b/fs/proc/thread_self.c
index b905010ca9eb..03dd644d8ada 100644
--- a/fs/proc/thread_self.c
+++ b/fs/proc/thread_self.c
@@ -15,6 +15,7 @@ static const char *proc_thread_self_get_link(struct dentry 
*dentry,
struct pid_namespace *ns = proc_pid_ns(inode);
pid_t tgid = task_tgid_nr_ns(current, ns);
pid_t pid = task_pid_nr_ns(current, ns);
+   char buf[10 + 6 + 10], *p = buf + sizeof(buf);
char *name;
 
if (!pid)
@@ -22,7 +23,13 @@ static const char *proc_thread_self_get_link(struct dentry 
*dentry,
name = kmalloc(10 + 6 + 10 + 1, dentry ? GFP_KERNEL : GFP_ATOMIC);
if (unlikely(!name))
return dentry ? ERR_PTR(-ENOMEM) : ERR_PTR(-ECHILD);
-   sprintf(name, "%u/task/%u", tgid, pid);
+
+   p = _print_integer_u32(p, pid);
+   p = memcpy(p - 6, "/task/", 6);
+   p = _print_integer_u32(p, tgid);
+   memcpy(name, p, buf + sizeof(buf) - p);
+   name[buf + sizeof(buf) - p] = '\0';
+
set_delayed_call(done, kfree_link, name);
return name;
 }
-- 
2.16.4
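The back-to-front composition of "tgid/task/pid" in the patch above may be easier to follow as a standalone sketch. It is a userspace approximation under stated assumptions: print_u32_backwards() is a hypothetical replacement for _print_integer_u32(), and format_thread_self() is an invented name that copies into a caller-supplied buffer instead of kmalloc()ed memory.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for _print_integer_u32(): digits end at end[-1]. */
static char *print_u32_backwards(char *end, uint32_t x)
{
	char *p = end;

	do {
		*--p = (char)('0' + x % 10);
		x /= 10;
	} while (x != 0);
	return p;
}

/*
 * Build "<tgid>/task/<pid>" in out, which must hold at least
 * 10 + 6 + 10 + 1 bytes. The scratch buffer is filled from its end,
 * so the rightmost field (pid) is written first. Returns the length
 * of the resulting string, not counting the terminator.
 */
static size_t format_thread_self(char *out, uint32_t tgid, uint32_t pid)
{
	char buf[10 + 6 + 10];
	char *end = buf + sizeof(buf);
	char *p = end;
	size_t len;

	p = print_u32_backwards(p, pid);
	p -= 6;
	memcpy(p, "/task/", 6);
	p = print_u32_backwards(p, tgid);

	len = (size_t)(end - p);
	memcpy(out, p, len);
	out[len] = '\0';
	return len;
}
```

The scratch buffer needs no room for a terminator; only the destination does, which matches the 10 + 6 + 10 versus 10 + 6 + 10 + 1 sizes in the patch.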

[PATCH 10/13] proc: convert readdir /proc to _print_integer()

2018-08-27 Thread Alexey Dobriyan

Benchmark readdir("/proc") 2^13 times with 2K processes in a pid
namespace:

850.3750 us per readdir
786.5625

-7.5%

Signed-off-by: Alexey Dobriyan 
---
 fs/proc/base.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 79d2f7d72ad1..33f444721965 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -3223,16 +3223,15 @@ int proc_pid_readdir(struct file *file, struct 
dir_context *ctx)
for (iter = next_tgid(ns, iter);
 iter.task;
 iter.tgid += 1, iter = next_tgid(ns, iter)) {
-   char name[10 + 1];
-   unsigned int len;
+   char name[10], *p = name + sizeof(name);
 
cond_resched();
if (!has_pid_permissions(ns, iter.task, HIDEPID_INVISIBLE))
continue;
 
-   len = snprintf(name, sizeof(name), "%u", iter.tgid);
+   p = _print_integer_u32(p, iter.tgid);
ctx->pos = iter.tgid + TGID_OFFSET;
-   if (!proc_fill_cache(file, ctx, name, len,
+   if (!proc_fill_cache(file, ctx, p, name + sizeof(name) - p,
 proc_pid_instantiate, iter.task, NULL)) {
put_task_struct(iter.task);
return 0;
-- 
2.16.4

[PATCH 04/13] proc: rename "p" variable in proc_map_files_readdir()

2018-08-27 Thread Alexey Dobriyan

Use "mfi", add "const" and move structure definition closer while I'm at it.

Note: moving "struct map_files_info info;" declaration to the scope
where it is used bloats the code by ~90 bytes. I'm not sure what's
going on.

Signed-off-by: Alexey Dobriyan 
---
 fs/proc/base.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index f96babf3cffc..17666bd61ac8 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2014,12 +2014,6 @@ static int map_files_get_link(struct dentry *dentry, 
struct path *path)
return rc;
 }
 
-struct map_files_info {
-   unsigned long   start;
-   unsigned long   end;
-   fmode_t mode;
-};
-
 /*
  * Only allow CAP_SYS_ADMIN to follow the links, due to concerns about how the
  * symlinks may be used to bypass permissions on ancestor directories in the
@@ -2119,6 +2113,12 @@ static const struct inode_operations 
proc_map_files_inode_operations = {
.setattr= proc_setattr,
 };
 
+struct map_files_info {
+   unsigned long   start;
+   unsigned long   end;
+   fmode_t mode;
+};
+
 static int
 proc_map_files_readdir(struct file *file, struct dir_context *ctx)
 {
@@ -2128,7 +2128,6 @@ proc_map_files_readdir(struct file *file, struct 
dir_context *ctx)
unsigned long nr_files, pos, i;
struct flex_array *fa = NULL;
struct map_files_info info;
-   struct map_files_info *p;
int ret;
 
ret = -ENOENT;
@@ -2196,16 +2195,17 @@ proc_map_files_readdir(struct file *file, struct 
dir_context *ctx)
mmput(mm);
 
for (i = 0; i < nr_files; i++) {
+   const struct map_files_info *mfi;
char buf[4 * sizeof(long) + 2]; /* max: %lx-%lx\0 */
unsigned int len;
 
-   p = flex_array_get(fa, i);
-   len = snprintf(buf, sizeof(buf), "%lx-%lx", p->start, p->end);
+   mfi = flex_array_get(fa, i);
+   len = snprintf(buf, sizeof(buf), "%lx-%lx", mfi->start, 
mfi->end);
if (!proc_fill_cache(file, ctx,
  buf, len,
  proc_map_files_instantiate,
  task,
- (void *)(unsigned long)p->mode))
+ (void *)(unsigned long)mfi->mode))
break;
ctx->pos++;
}
-- 
2.16.4

[PATCH 06/13] proc: convert /proc/self to _print_integer()

2018-08-27 Thread Alexey Dobriyan

Benchmark readlink("/proc/self") 2^23 times:

8.205992458 seconds time elapsed ( +-  0.15% )
7.535168869 seconds time elapsed ( +-  0.09% )

-8.2%

Signed-off-by: Alexey Dobriyan 
---
 fs/proc/self.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/proc/self.c b/fs/proc/self.c
index 127265e5c55f..b2279412237b 100644
--- a/fs/proc/self.c
+++ b/fs/proc/self.c
@@ -14,6 +14,7 @@ static const char *proc_self_get_link(struct dentry *dentry,
 {
struct pid_namespace *ns = proc_pid_ns(inode);
pid_t tgid = task_tgid_nr_ns(current, ns);
+   char buf[10], *p = buf + sizeof(buf);
char *name;
 
if (!tgid)
@@ -22,7 +23,11 @@ static const char *proc_self_get_link(struct dentry *dentry,
name = kmalloc(10 + 1, dentry ? GFP_KERNEL : GFP_ATOMIC);
if (unlikely(!name))
return dentry ? ERR_PTR(-ENOMEM) : ERR_PTR(-ECHILD);
-   sprintf(name, "%u", tgid);
+
+   p = _print_integer_u32(p, tgid);
+   memcpy(name, p, buf + sizeof(buf) - p);
+   name[buf + sizeof(buf) - p] = '\0';
+
set_delayed_call(done, kfree_link, name);
return name;
 }
-- 
2.16.4

[PATCH 12/13] proc: convert /proc/*/statm to _print_integer()

2018-08-27 Thread Alexey Dobriyan

Benchmark pread("/proc/self/statm") 2^23 times:

6.135596793 seconds time elapsed ( +-  0.11% )
5.685442773 seconds time elapsed ( +-  0.11% )

-7.3%

Signed-off-by: Alexey Dobriyan 
---
 fs/proc/array.c | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index 5016e03a4dba..d0565527166a 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -627,27 +627,27 @@ int proc_pid_statm(struct seq_file *m, struct 
pid_namespace *ns,
 {
unsigned long size = 0, resident = 0, shared = 0, text = 0, data = 0;
struct mm_struct *mm = get_task_mm(task);
+   /* "%lu %lu %lu %lu 0 %lu 0\n" */
+   char buf[5 * ((sizeof(long) * 5 / 2) + 1) + 2 + 2];
+   char *p = buf + sizeof(buf);
 
if (mm) {
size = task_statm(mm, , , , );
mmput(mm);
}
-   /*
-* For quick read, open code by putting numbers directly
-* expected format is
-* seq_printf(m, "%lu %lu %lu %lu 0 %lu 0\n",
-*   size, resident, shared, text, data);
-*/
-   seq_put_decimal_ull(m, "", size);
-   seq_put_decimal_ull(m, " ", resident);
-   seq_put_decimal_ull(m, " ", shared);
-   seq_put_decimal_ull(m, " ", text);
-   seq_put_decimal_ull(m, " ", 0);
-   seq_put_decimal_ull(m, " ", data);
-   seq_put_decimal_ull(m, " ", 0);
-   seq_putc(m, '\n');
 
-   return 0;
+   p = memcpy(p - 3, " 0\n", 3);
+   p = _print_integer_ul(p, data);
+   p = memcpy(p - 3, " 0 ", 3);
+   p = _print_integer_ul(p, text);
+   *--p = ' ';
+   p = _print_integer_ul(p, shared);
+   *--p = ' ';
+   p = _print_integer_ul(p, resident);
+   *--p = ' ';
+   p = _print_integer_ul(p, size);
+
+   return seq_write(m, p, buf + sizeof(buf) - p);
 }
 
 #ifdef CONFIG_PROC_CHILDREN
-- 
2.16.4
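The buffer sizing and the right-to-left assembly of the statm line can be checked with a small userspace sketch. This is an illustration, not the kernel code: print_ul_backwards() is a hypothetical stand-in for _print_integer_ul(), and format_statm() and STATM_BUF_SIZE are invented names. The size expression is taken from the patch; sizeof(long) * 5 / 2 gives 20 digits for a 64-bit long and 10 for a 32-bit long.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for _print_integer_ul(): digits end at end[-1]. */
static char *print_ul_backwards(char *end, unsigned long x)
{
	char *p = end;

	do {
		*--p = (char)('0' + x % 10);
		x /= 10;
	} while (x != 0);
	return p;
}

/*
 * Room for "%lu %lu %lu %lu 0 %lu 0\n": five numbers with one
 * separator each, plus the two constant "0" fields with theirs.
 */
#define STATM_BUF_SIZE (5 * ((sizeof(long) * 5 / 2) + 1) + 2 + 2)

/*
 * Assemble the line from right to left in a scratch buffer, then copy
 * it to out (at least STATM_BUF_SIZE bytes). The result is not
 * NUL-terminated; the length is returned instead.
 */
static size_t format_statm(char *out, unsigned long size,
			   unsigned long resident, unsigned long shared,
			   unsigned long text, unsigned long data)
{
	char buf[STATM_BUF_SIZE];
	char *end = buf + sizeof(buf);
	char *p = end;
	size_t len;

	p -= 3;
	memcpy(p, " 0\n", 3);
	p = print_ul_backwards(p, data);
	p -= 3;
	memcpy(p, " 0 ", 3);
	p = print_ul_backwards(p, text);
	*--p = ' ';
	p = print_ul_backwards(p, shared);
	*--p = ' ';
	p = print_ul_backwards(p, resident);
	*--p = ' ';
	p = print_ul_backwards(p, size);

	len = (size_t)(end - p);
	memcpy(out, p, len);
	return len;
}
```

With a 64-bit long the worst case is 5 * 20 digits plus 9 bytes of separators, constant zeros and newline, which is exactly the 109 bytes the expression yields.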

[PATCH 08/13] proc: convert /proc/*/fd to _print_integer()

2018-08-27 Thread Alexey Dobriyan

Benchmark opendir+readdir("/proc/self/fd")+closedir 2^21 times
with 4 descriptors (0, 1, 2, 3 from opendir):

11.802099126 seconds time elapsed ( +-  0.23% )
10.950810068 seconds time elapsed ( +-  0.23% )

-7.2%

Benchmark the same thing with 1000 descriptors:

362.1250 us per iteration
288.4375 us

-20%

Signed-off-by: Alexey Dobriyan 
---
 fs/proc/fd.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/proc/fd.c b/fs/proc/fd.c
index e098302b5101..60ad1935eefc 100644
--- a/fs/proc/fd.c
+++ b/fs/proc/fd.c
@@ -247,8 +247,7 @@ static int proc_readfd_common(struct file *file, struct 
dir_context *ctx,
 fd++, ctx->pos++) {
struct file *f;
struct fd_data data;
-   char name[10 + 1];
-   unsigned int len;
+   char name[10], *p = name + sizeof(name);
 
f = fcheck_files(files, fd);
if (!f)
@@ -257,9 +256,10 @@ static int proc_readfd_common(struct file *file, struct 
dir_context *ctx,
rcu_read_unlock();
data.fd = fd;
 
-   len = snprintf(name, sizeof(name), "%u", fd);
+   p = _print_integer_u32(p, fd);
if (!proc_fill_cache(file, ctx,
-name, len, instantiate, tsk,
+p, name + sizeof(name) - p,
+instantiate, tsk,
 &data))
goto out_fd_loop;
cond_resched();
-- 
2.16.4

[PATCH 01/13] seq_file: rewrite seq_puts() in terms of seq_write()

2018-08-27 Thread Alexey Dobriyan

Space savings -- 42 bytes!

seq_puts    71    29    [-42]

Signed-off-by: Alexey Dobriyan 
---
 fs/seq_file.c | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/fs/seq_file.c b/fs/seq_file.c
index 1dea7a8a5255..0c282a88a896 100644
--- a/fs/seq_file.c
+++ b/fs/seq_file.c
@@ -653,14 +653,7 @@ EXPORT_SYMBOL(seq_putc);
 
 void seq_puts(struct seq_file *m, const char *s)
 {
-   int len = strlen(s);
-
-   if (m->count + len >= m->size) {
-   seq_set_overflow(m);
-   return;
-   }
-   memcpy(m->buf + m->count, s, len);
-   m->count += len;
+   seq_write(m, s, strlen(s));
 }
 EXPORT_SYMBOL(seq_puts);
 
-- 
2.16.4
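Why the open-coded body can collapse into one call is perhaps clearer with a toy model. The sketch below is a userspace approximation, loosely modelled on seq_file and assuming that seq_write() performs the same "count + len < size" check that the removed lines did. struct outbuf and the outbuf_*() functions are invented names, not kernel APIs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for struct seq_file: a fixed buffer with a fill count. */
struct outbuf {
	char *buf;
	size_t size;
	size_t count;
};

/* Mark the buffer as overflowed by declaring it completely full. */
static void outbuf_set_overflow(struct outbuf *m)
{
	m->count = m->size;
}

static bool outbuf_has_overflowed(const struct outbuf *m)
{
	return m->count == m->size;
}

/* Append len bytes if they fit with room to spare; returns 0 or -1. */
static int outbuf_write(struct outbuf *m, const void *data, size_t len)
{
	if (m->count + len < m->size) {
		memcpy(m->buf + m->count, data, len);
		m->count += len;
		return 0;
	}
	outbuf_set_overflow(m);
	return -1;
}

/* The string variant needs nothing beyond strlen() and the write. */
static void outbuf_puts(struct outbuf *m, const char *s)
{
	outbuf_write(m, s, strlen(s));
}
```

In this model the bounds check and the overflow marking live in one place, so the string variant carries no logic of its own.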

Re: [PATCH] x86/nmi: Fix some races in NMI uaccess

2018-08-27 Thread Jann Horn

On Tue, Aug 28, 2018 at 1:04 AM Andy Lutomirski  wrote:
>
> In NMI context, we might be in the middle of context switching or in
> the middle of switch_mm_irqs_off().  In either case, CR3 might not
> match current->mm, which could cause copy_from_user_nmi() and
> friends to read the wrong memory.
>
> Fix it by adding a new nmi_uaccess_okay() helper and checking it in
> copy_from_user_nmi() and in __copy_from_user_nmi()'s callers.

What about eBPF probes (which I think can be attached to kprobe points
/ tracepoints / perf events) that perform userspace reads / userspace
writes / kernel reads? Can those run in NMI context, and if so, do
they also need special handling?

[PATCH] x86/nmi: Fix some races in NMI uaccess

2018-08-27 Thread Andy Lutomirski

In NMI context, we might be in the middle of context switching or in
the middle of switch_mm_irqs_off().  In either case, CR3 might not
match current->mm, which could cause copy_from_user_nmi() and
friends to read the wrong memory.

Fix it by adding a new nmi_uaccess_okay() helper and checking it in
copy_from_user_nmi() and in __copy_from_user_nmi()'s callers.

Cc: sta...@vger.kernel.org
Cc: Peter Zijlstra 
Cc: Nadav Amit 
Signed-off-by: Andy Lutomirski 
---

The 0day bot is still chewing on this, but I've tested it a bit locally
and it seems to do the right thing.

I've never observed the bug it fixes, but it does appear to fix a bug
unless I've missed something.  It's also a prerequisite for Nadav's
fixmap bugfix.

arch/x86/events/core.c  |  2 +-
 arch/x86/include/asm/tlbflush.h | 16 
 arch/x86/lib/usercopy.c |  5 +
 arch/x86/mm/tlb.c   |  3 +++
 4 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 5f4829f10129..dfb2f7c0d019 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2465,7 +2465,7 @@ perf_callchain_user(struct perf_callchain_entry_ctx 
*entry, struct pt_regs *regs
 
perf_callchain_store(entry, regs->ip);
 
-   if (!current->mm)
+   if (!nmi_uaccess_okay())
return;
 
if (perf_callchain_user32(regs, entry))
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 89a73bc31622..b23b2625793b 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -230,6 +230,22 @@ struct tlb_state {
 };
 DECLARE_PER_CPU_SHARED_ALIGNED(struct tlb_state, cpu_tlbstate);
 
+/*
+ * Blindly accessing user memory from NMI context can be dangerous
+ * if we're in the middle of switching the current user task or
+ * switching the loaded mm.  It can also be dangerous if we
+ * interrupted some kernel code that was temporarily using a
+ * different mm.
+ */
+static inline bool nmi_uaccess_okay(void)
+{
+   struct mm_struct *loaded_mm = this_cpu_read(cpu_tlbstate.loaded_mm);
+   struct mm_struct *current_mm = current->mm;
+
+   return current_mm && loaded_mm == current_mm &&
+   loaded_mm->pgd == __va(read_cr3_pa());
+}
+
 /* Initialize cr4 shadow for this CPU. */
 static inline void cr4_init_shadow(void)
 {
diff --git a/arch/x86/lib/usercopy.c b/arch/x86/lib/usercopy.c
index c8c6ad0d58b8..3f435d7fca5e 100644
--- a/arch/x86/lib/usercopy.c
+++ b/arch/x86/lib/usercopy.c
@@ -7,6 +7,8 @@
 #include <linux/uaccess.h>
 #include <linux/export.h>
 
+#include <asm/tlbflush.h>
+
 /*
  * We rely on the nested NMI work to allow atomic faults from the NMI path; the
  * nested NMI paths are careful to preserve CR2.
@@ -19,6 +21,9 @@ copy_from_user_nmi(void *to, const void __user *from, 
unsigned long n)
if (__range_not_ok(from, n, TASK_SIZE))
return n;
 
+   if (!nmi_uaccess_okay())
+   return n;
+
/*
 * Even though this function is typically called from NMI/IRQ context
 * disable pagefaults so that its behaviour is consistent even when
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 457b281b9339..f4b41d5a93dd 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -345,6 +345,9 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct 
mm_struct *next,
 */
trace_tlb_flush_rcuidle(TLB_FLUSH_ON_TASK_SWITCH, 
TLB_FLUSH_ALL);
} else {
+   /* Let NMI code know that CR3 may not match expectations. */
+   this_cpu_write(cpu_tlbstate.loaded_mm, NULL);
+
/* The new ASID is already up to date. */
load_new_mm_cr3(next->pgd, new_asid, false);
 
-- 
2.17.1
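The three-way check in nmi_uaccess_okay() can be modelled in plain C to see which windows it closes. Everything below is a toy under stated assumptions: struct toy_mm, struct toy_cpu and the toy_*() functions are invented for illustration, hw_mm stands in for whatever CR3 points at, and the real helper reads per-CPU state and CR3 rather than struct fields.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy address space; only its identity matters here. */
struct toy_mm {
	int id;
};

/* Toy per-CPU view of the state the helper consults. */
struct toy_cpu {
	const struct toy_mm *loaded_mm;  /* NULL while a switch is in flight */
	const struct toy_mm *hw_mm;      /* stands in for what CR3 points at */
	const struct toy_mm *current_mm; /* stands in for current->mm */
};

/* User access is allowed only when all three views agree. */
static bool toy_uaccess_okay(const struct toy_cpu *cpu)
{
	return cpu->current_mm != NULL &&
	       cpu->loaded_mm == cpu->current_mm &&
	       cpu->hw_mm == cpu->loaded_mm;
}

/* First half of a switch: invalidate the marker, then load "CR3". */
static void toy_switch_begin(struct toy_cpu *cpu, const struct toy_mm *next)
{
	cpu->loaded_mm = NULL;
	cpu->hw_mm = next;
}

/* Second half of a switch: publish the newly loaded mm. */
static void toy_switch_end(struct toy_cpu *cpu, const struct toy_mm *next)
{
	cpu->loaded_mm = next;
}
```

An "NMI" arriving between toy_switch_begin() and toy_switch_end() sees loaded_mm == NULL and is refused, and one arriving after the mm switch but before current_mm is updated is refused as well, which are the two windows the commit message describes.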

Re: [PATCH 2/2] soc: imx: gpcv2: make pgc driver more generic for other i.MX platforms

2018-08-27 Thread Andrey Smirnov

On Mon, Aug 27, 2018 at 3:51 PM Andrey Smirnov  wrote:
>
> On Sun, Aug 5, 2018 at 11:45 PM Anson Huang  wrote:
> >
> > i.MX8MQ and i.MX8MM share same gpc module with i.MX7D, they
> > can reuse gpcv2 pgc driver for power domain control, this
> > patch renames all functions and structure definitions started
> > with "imx7" to "imx", and check machine type to pass platform
> > specific power domain data for power domain driver, thus make
> > gpcv2 pgc driver more generic for i.MX platforms.
> >
>
> Just for the sake of

Oops, forgot to type out the question I had about i.MX8MQ GPC in
general. I've noticed that the vendor tree for i.MX8MQ has a separate
driver for GPC that relies on code in the ARM Trusted Firmware binary
blob to do the actual switching. Do you by any chance know the relation
between this code and the driver I describe? Are they mutually
exclusive or complementary (I assume the former)? Will the ATF-based
driver be eventually deprecated?

Thanks,
Andrey Smirnov

>
>
> > Signed-off-by: Anson Huang 
> > ---
> >  drivers/soc/imx/gpcv2.c | 68 
> > +
> >  1 file changed, 40 insertions(+), 28 deletions(-)
> >
> > diff --git a/drivers/soc/imx/gpcv2.c b/drivers/soc/imx/gpcv2.c
> > index 0e31465..0e33cb5 100644
> > --- a/drivers/soc/imx/gpcv2.c
> > +++ b/drivers/soc/imx/gpcv2.c
> > @@ -53,7 +53,7 @@
> >
> >  #define GPC_PGC_CTRL_PCR   BIT(0)
> >
> > -struct imx7_pgc_domain {
> > +struct imx_pgc_domain {
> > struct generic_pm_domain genpd;
> > struct regmap *regmap;
> > struct regulator *regulator;
> > @@ -69,11 +69,11 @@ struct imx7_pgc_domain {
> > struct device *dev;
> >  };
> >
> > -static int imx7_gpc_pu_pgc_sw_pxx_req(struct generic_pm_domain *genpd,
> > +static int imx_gpc_pu_pgc_sw_pxx_req(struct generic_pm_domain *genpd,
> >   bool on)
> >  {
> > -   struct imx7_pgc_domain *domain = container_of(genpd,
> > - struct 
> > imx7_pgc_domain,
> > +   struct imx_pgc_domain *domain = container_of(genpd,
> > + struct imx_pgc_domain,
> >   genpd);
> > unsigned int offset = on ?
> > GPC_PU_PGC_SW_PUP_REQ : GPC_PU_PGC_SW_PDN_REQ;
> > @@ -150,17 +150,17 @@ static int imx7_gpc_pu_pgc_sw_pxx_req(struct 
> > generic_pm_domain *genpd,
> > return ret;
> >  }
> >
> > -static int imx7_gpc_pu_pgc_sw_pup_req(struct generic_pm_domain *genpd)
> > +static int imx_gpc_pu_pgc_sw_pup_req(struct generic_pm_domain *genpd)
> >  {
> > -   return imx7_gpc_pu_pgc_sw_pxx_req(genpd, true);
> > +   return imx_gpc_pu_pgc_sw_pxx_req(genpd, true);
> >  }
> >
> > -static int imx7_gpc_pu_pgc_sw_pdn_req(struct generic_pm_domain *genpd)
> > +static int imx_gpc_pu_pgc_sw_pdn_req(struct generic_pm_domain *genpd)
> >  {
> > -   return imx7_gpc_pu_pgc_sw_pxx_req(genpd, false);
> > +   return imx_gpc_pu_pgc_sw_pxx_req(genpd, false);
> >  }
> >
> > -static const struct imx7_pgc_domain imx7_pgc_domains[] = {
> > +static const struct imx_pgc_domain imx7_pgc_domains[] = {
> > [IMX7_POWER_DOMAIN_MIPI_PHY] = {
> > .genpd = {
> > .name  = "mipi-phy",
> > @@ -198,9 +198,9 @@ static const struct imx7_pgc_domain imx7_pgc_domains[] 
> > = {
> > },
> >  };
> >
> > -static int imx7_pgc_domain_probe(struct platform_device *pdev)
> > +static int imx_pgc_domain_probe(struct platform_device *pdev)
> >  {
> > -   struct imx7_pgc_domain *domain = pdev->dev.platform_data;
> > +   struct imx_pgc_domain *domain = pdev->dev.platform_data;
> > int ret;
> >
> > domain->dev = &pdev->dev;
> > @@ -233,9 +233,9 @@ static int imx7_pgc_domain_probe(struct platform_device 
> > *pdev)
> > return ret;
> >  }
> >
> > -static int imx7_pgc_domain_remove(struct platform_device *pdev)
> > +static int imx_pgc_domain_remove(struct platform_device *pdev)
> >  {
> > -   struct imx7_pgc_domain *domain = pdev->dev.platform_data;
> > +   struct imx_pgc_domain *domain = pdev->dev.platform_data;
> >
> > of_genpd_del_provider(domain->dev->of_node);
> > pm_genpd_remove(&domain->genpd);
> > @@ -243,23 +243,24 @@ static int imx7_pgc_domain_remove(struct 
> > platform_device *pdev)
> > return 0;
> >  }
> >
> > -static const struct platform_device_id imx7_pgc_domain_id[] = {
> > -   { "imx7-pgc-domain", },
> > +static const struct platform_device_id imx_pgc_domain_id[] = {
> > +   { "imx-pgc-domain", },
> > { },
> >  };
> >
> > -static struct platform_driver imx7_pgc_domain_driver = {
> > +static struct platform_driver imx_pgc_domain_driver = {
> > .driver = {
> > -   .name = "imx7-pgc",
> > +   .name = "imx-pgc",
> > },
> > -   .probe= imx7_pgc_domain_probe,
> > -   .remove   =

Re: TLB flushes on fixmap changes

2018-08-27 Thread Andy Lutomirski

On Mon, Aug 27, 2018 at 3:54 PM, Nadav Amit  wrote:
> at 3:32 PM, Andy Lutomirski  wrote:
>
>> On Mon, Aug 27, 2018 at 2:55 PM, Nadav Amit  wrote:
>>> at 1:16 PM, Nadav Amit  wrote:
>>>
 at 12:58 PM, Andy Lutomirski  wrote:

> On Mon, Aug 27, 2018 at 12:43 PM, Nadav Amit  wrote:
>> at 12:10 PM, Nadav Amit  wrote:
>>
>>> at 11:58 AM, Andy Lutomirski  wrote:
>>>
 On Mon, Aug 27, 2018 at 11:54 AM, Nadav Amit  
 wrote:
>> On Mon, Aug 27, 2018 at 10:34 AM, Nadav Amit  
>> wrote:
>> What do you all think?
>
> I agree in general. But I think that current->mm would need to be 
> loaded, as
> otherwise I am afraid it would break switch_mm_irqs_off().

 What breaks?
>>>
>>> Actually nothing. I just saw the IBPB stuff regarding tsk, but it 
>>> should not
>>> matter.
>>
>> So here is what I got. It certainly needs some cleanup, but it boots.
>>
>> Let me know how crappy you find it...
>>
>>
>> diff --git a/arch/x86/include/asm/mmu_context.h 
>> b/arch/x86/include/asm/mmu_context.h
>> index bbc796eb0a3b..336779650a41 100644
>> --- a/arch/x86/include/asm/mmu_context.h
>> +++ b/arch/x86/include/asm/mmu_context.h
>> @@ -343,4 +343,24 @@ static inline unsigned long 
>> __get_current_cr3_fast(void)
>>  return cr3;
>> }
>>
>> +typedef struct {
>> +   struct mm_struct *prev;
>> +} temporary_mm_state_t;
>> +
>> +static inline temporary_mm_state_t use_temporary_mm(struct mm_struct 
>> *mm)
>> +{
>> +   temporary_mm_state_t state;
>> +
>> +   lockdep_assert_irqs_disabled();
>> +   state.prev = this_cpu_read(cpu_tlbstate.loaded_mm);
>> +   switch_mm_irqs_off(NULL, mm, current);
>> +   return state;
>> +}
>> +
>> +static inline void unuse_temporary_mm(temporary_mm_state_t prev)
>> +{
>> +   lockdep_assert_irqs_disabled();
>> +   switch_mm_irqs_off(NULL, prev.prev, current);
>> +}
>> +
>> #endif /* _ASM_X86_MMU_CONTEXT_H */
>> diff --git a/arch/x86/include/asm/pgtable.h 
>> b/arch/x86/include/asm/pgtable.h
>> index 5715647fc4fe..ef62af9a0ef7 100644
>> --- a/arch/x86/include/asm/pgtable.h
>> +++ b/arch/x86/include/asm/pgtable.h
>> @@ -976,6 +976,10 @@ static inline void __meminit 
>> init_trampoline_default(void)
>>  /* Default trampoline pgd value */
>>  trampoline_pgd_entry = init_top_pgt[pgd_index(__PAGE_OFFSET)];
>> }
>> +
>> +void __init patching_mm_init(void);
>> +#define patching_mm_init patching_mm_init
>> +
>> # ifdef CONFIG_RANDOMIZE_MEMORY
>> void __meminit init_trampoline(void);
>> # else
>> diff --git a/arch/x86/include/asm/pgtable_64_types.h 
>> b/arch/x86/include/asm/pgtable_64_types.h
>> index 054765ab2da2..9f44262abde0 100644
>> --- a/arch/x86/include/asm/pgtable_64_types.h
>> +++ b/arch/x86/include/asm/pgtable_64_types.h
>> @@ -116,6 +116,9 @@ extern unsigned int ptrs_per_p4d;
>> #define LDT_PGD_ENTRY  (pgtable_l5_enabled() ? LDT_PGD_ENTRY_L5 
>> : LDT_PGD_ENTRY_L4)
>> #define LDT_BASE_ADDR  (LDT_PGD_ENTRY << PGDIR_SHIFT)
>>
>> +#define TEXT_POKE_PGD_ENTRY -5UL
>> +#define TEXT_POKE_ADDR (TEXT_POKE_PGD_ENTRY << PGDIR_SHIFT)
>> +
>> #define __VMALLOC_BASE_L4  0xffffc90000000000UL
>> #define __VMALLOC_BASE_L5  0xffa0000000000000UL
>>
>> diff --git a/arch/x86/include/asm/pgtable_types.h 
>> b/arch/x86/include/asm/pgtable_types.h
>> index 99fff853c944..840c72ec8c4f 100644
>> --- a/arch/x86/include/asm/pgtable_types.h
>> +++ b/arch/x86/include/asm/pgtable_types.h
>> @@ -505,6 +505,9 @@ pgprot_t phys_mem_access_prot(struct file *file, 
>> unsigned long pfn,
>> /* Install a pte for a particular vaddr in kernel space. */
>> void set_pte_vaddr(unsigned long vaddr, pte_t pte);
>>
>> +struct mm_struct;
>> +void set_mm_pte_vaddr(struct mm_struct *mm, unsigned long vaddr, pte_t 
>> pte);
>> +
>> #ifdef CONFIG_X86_32
>> extern void native_pagetable_init(void);
>> #else
>> diff --git a/arch/x86/include/asm/text-patching.h 
>> b/arch/x86/include/asm/text-patching.h
>> index 2ecd34e2d46c..cb364ea5b19d 100644
>> --- a/arch/x86/include/asm/text-patching.h
>> +++ b/arch/x86/include/asm/text-patching.h
>> @@ -38,4 +38,6 @@ extern void *text_poke(void *addr, const void *opcode, 
>> size_t len);
>> extern int poke_int3_handler(struct pt_regs *regs);
>> extern void *text_poke_bp(void *addr, const void *opcode, size_t len, 
>> void *handler);
>>
>> +extern struct mm_struct *patching_mm;
>> +
>> #endif /* _ASM_X86_TEXT_PATCHING_H */
>> diff --git a/arch/x86/kernel/alternative.c 
>> b/arch/x86/kernel/alternative.c
>>

Re: [PATCH] ASoC: AMD: Change MCLK to 48Mhz

2018-08-27 Thread Daniel Kurtz

On Tue, Aug 21, 2018 at 1:00 AM Akshu Agrawal  wrote:
>
> 25Mhz MCLK which was earlier used was of spread type.
> Thus, we were not getting accurate rate. The 48Mhz system
> clk is of non-spread type and we are changing to it to get
> accurate rate.
>
> Signed-off-by: Akshu Agrawal 

Reviewed-by: Daniel Kurtz 

> ---
>  sound/soc/amd/acp-da7219-max98357a.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/sound/soc/amd/acp-da7219-max98357a.c 
> b/sound/soc/amd/acp-da7219-max98357a.c
> index cf2f648..55d7f61 100644
> --- a/sound/soc/amd/acp-da7219-max98357a.c
> +++ b/sound/soc/amd/acp-da7219-max98357a.c
> @@ -42,7 +42,7 @@
>  #include "../codecs/da7219.h"
>  #include "../codecs/da7219-aad.h"
>
> -#define CZ_PLAT_CLK 25000000
> +#define CZ_PLAT_CLK 48000000
>  #define DUAL_CHANNEL   2
>
>  static struct snd_soc_jack cz_jack;
> --
> 1.9.1
>

Re: [PATCH] ASoC: AMD: Set constraints for DMIC and MAX98357a codec

2018-08-27 Thread Daniel Kurtz

On Tue, Aug 21, 2018 at 12:55 AM Akshu Agrawal  wrote:
>
> We support dual channel, 48Khz. This constraint was set only for
> da7219. It is being extended to DMIC and MAX98357a.
>
> Signed-off-by: Akshu Agrawal 

Reviewed-by: Daniel Kurtz 

> ---
>  sound/soc/amd/acp-da7219-max98357a.c | 33 +
>  1 file changed, 33 insertions(+)
>
> diff --git a/sound/soc/amd/acp-da7219-max98357a.c 
> b/sound/soc/amd/acp-da7219-max98357a.c
> index 066d5489..cf2f648 100644
> --- a/sound/soc/amd/acp-da7219-max98357a.c
> +++ b/sound/soc/amd/acp-da7219-max98357a.c
> @@ -162,10 +162,21 @@ static void cz_da7219_shutdown(struct snd_pcm_substream 
> *substream)
>
>  static int cz_max_startup(struct snd_pcm_substream *substream)
>  {
> +   struct snd_pcm_runtime *runtime = substream->runtime;
> struct snd_soc_pcm_runtime *rtd = substream->private_data;
> struct snd_soc_card *card = rtd->card;
> struct acp_platform_info *machine = snd_soc_card_get_drvdata(card);
>
> +   /*
> +* On this platform for PCM device we support stereo
> +*/
> +
> +   runtime->hw.channels_max = DUAL_CHANNEL;
> +   snd_pcm_hw_constraint_list(runtime, 0, SNDRV_PCM_HW_PARAM_CHANNELS,
> +  &constraints_channels);
> +   snd_pcm_hw_constraint_list(runtime, 0, SNDRV_PCM_HW_PARAM_RATE,
> +  &constraints_rates);
> +
> machine->i2s_instance = I2S_BT_INSTANCE;
> return da7219_clk_enable(substream);
>  }
> @@ -177,20 +188,42 @@ static void cz_max_shutdown(struct snd_pcm_substream 
> *substream)
>
>  static int cz_dmic0_startup(struct snd_pcm_substream *substream)
>  {
> +   struct snd_pcm_runtime *runtime = substream->runtime;
> struct snd_soc_pcm_runtime *rtd = substream->private_data;
> struct snd_soc_card *card = rtd->card;
> struct acp_platform_info *machine = snd_soc_card_get_drvdata(card);
>
> +   /*
> +* On this platform for PCM device we support stereo
> +*/
> +
> +   runtime->hw.channels_max = DUAL_CHANNEL;
> +   snd_pcm_hw_constraint_list(runtime, 0, SNDRV_PCM_HW_PARAM_CHANNELS,
> +  &constraints_channels);
> +   snd_pcm_hw_constraint_list(runtime, 0, SNDRV_PCM_HW_PARAM_RATE,
> +  &constraints_rates);
> +
> machine->i2s_instance = I2S_BT_INSTANCE;
> return da7219_clk_enable(substream);
>  }
>
>  static int cz_dmic1_startup(struct snd_pcm_substream *substream)
>  {
> +   struct snd_pcm_runtime *runtime = substream->runtime;
> struct snd_soc_pcm_runtime *rtd = substream->private_data;
> struct snd_soc_card *card = rtd->card;
> struct acp_platform_info *machine = snd_soc_card_get_drvdata(card);
>
> +   /*
> +* On this platform for PCM device we support stereo
> +*/
> +
> +   runtime->hw.channels_max = DUAL_CHANNEL;
> +   snd_pcm_hw_constraint_list(runtime, 0, SNDRV_PCM_HW_PARAM_CHANNELS,
> +  &constraints_channels);
> +   snd_pcm_hw_constraint_list(runtime, 0, SNDRV_PCM_HW_PARAM_RATE,
> +  &constraints_rates);
> +
> machine->i2s_instance = I2S_SP_INSTANCE;
> machine->capture_channel = CAP_CHANNEL0;
> return da7219_clk_enable(substream);
> --
> 1.9.1
>

Re: [PATCH] clk: x86: Set default parent to 48Mhz

2018-08-27 Thread Daniel Kurtz

On Tue, Aug 21, 2018 at 12:53 AM Akshu Agrawal  wrote:
>
> System clk provided in ST soc can be set to:
> 48Mhz, non-spread
> 25Mhz, spread
> To get accurate rate, we need it to set it at non-spread
> option which is 48Mhz.
>
> Signed-off-by: Akshu Agrawal 

Reviewed-by: Daniel Kurtz 

> ---
>  drivers/clk/x86/clk-st.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/clk/x86/clk-st.c b/drivers/clk/x86/clk-st.c
> index fb62f39..3a0996f 100644
> --- a/drivers/clk/x86/clk-st.c
> +++ b/drivers/clk/x86/clk-st.c
> @@ -46,7 +46,7 @@ static int st_clk_probe(struct platform_device *pdev)
> clk_oscout1_parents, ARRAY_SIZE(clk_oscout1_parents),
> 0, st_data->base + CLKDRVSTR2, OSCOUT1CLK25MHZ, 3, 0, NULL);
>
> -   clk_set_parent(hws[ST_CLK_MUX]->clk, hws[ST_CLK_25M]->clk);
> +   clk_set_parent(hws[ST_CLK_MUX]->clk, hws[ST_CLK_48M]->clk);
>
> hws[ST_CLK_GATE] = clk_hw_register_gate(NULL, "oscout1", 
> "oscout1_mux",
> 0, st_data->base + MISCCLKCNTL1, OSCCLKENB,
> --
> 1.9.1
>

Re: [PATCH v2 3/5] drivers: pinctrl: msm: enable PDC interrupt only during suspend

2018-08-27 Thread Matthias Kaehlcke

Hi Lina,

On Fri, Aug 24, 2018 at 02:01:55PM -0600, Lina Iyer wrote:
> During suspend the system may power down some of the system rails. As a
> result, the TLMM hw block may not be operational anymore and wakeup
> capable GPIOs will not be detected. The PDC however will be operational
> and the GPIOs that are routed to the PDC as IRQs can wake the system up.
> 
> To avoid being interrupted twice (for TLMM and once for PDC IRQ) when a
> GPIO trips, use TLMM for active and switch to PDC for suspend. When
> entering suspend, disable the TLMM wakeup interrupt and instead enable
> the PDC IRQ and revert upon resume.
> 
> Signed-off-by: Lina Iyer 
> ---
> Changes in v2:
>   - Fix PDC IRQ max port, 126 is the max supported in h/w
>   - Use PDC hwirq in bitmap, linux numbers could be large
>   - Setup DISABLE_UNLAZY for both TLMM and PDC IRQs
> ---
>  drivers/pinctrl/qcom/pinctrl-msm.c | 70 +-
>  drivers/pinctrl/qcom/pinctrl-msm.h |  3 ++
>  2 files changed, 72 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pinctrl/qcom/pinctrl-msm.c 
> b/drivers/pinctrl/qcom/pinctrl-msm.c
> index b675ea56a4ff..a880cefbd248 100644
> --- a/drivers/pinctrl/qcom/pinctrl-msm.c
> +++ b/drivers/pinctrl/qcom/pinctrl-msm.c
> @@ -37,6 +37,7 @@
>  #include "../pinctrl-utils.h"
>  
>  #define MAX_NR_GPIO 300
> +#define MAX_PDC_HWIRQ 126
>  #define PS_HOLD_OFFSET 0x820
>  
>  /**
> @@ -51,6 +52,7 @@
>   * @enabled_irqs:   Bitmap of currently enabled irqs.
>   * @dual_edge_irqs: Bitmap of irqs that need sw emulated dual edge
>   *  detection.
> + * @pdc_hwirqs: Bitmap of wakeup capable irqs.
>   * @soc;Reference to soc_data of platform specific data.
>   * @regs:   Base address for the TLMM register map.
>   */
> @@ -68,11 +70,15 @@ struct msm_pinctrl {
>  
>   DECLARE_BITMAP(dual_edge_irqs, MAX_NR_GPIO);
>   DECLARE_BITMAP(enabled_irqs, MAX_NR_GPIO);
> + DECLARE_BITMAP(pdc_hwirqs, MAX_PDC_HWIRQ);
>  
>   const struct msm_pinctrl_soc_data *soc;
>   void __iomem *regs;
> + struct irq_domain *pdc_irq_domain;
>  };
>  
> +static bool in_suspend;
> +
>  static int msm_get_groups_count(struct pinctrl_dev *pctldev)
>  {
>   struct msm_pinctrl *pctrl = pinctrl_dev_get_drvdata(pctldev);
> @@ -787,8 +793,13 @@ static int msm_gpio_irq_set_wake(struct irq_data *d, 
> unsigned int on)
>  
>   raw_spin_lock_irqsave(&pctrl->lock, flags);
>  
> - if (pdc_irqd)
> + if (pdc_irqd && !in_suspend) {
>   irq_set_irq_wake(pdc_irqd->irq, on);
> + if (on)
> + set_bit(pdc_irqd->hwirq, pctrl->pdc_hwirqs);
> + else
> + clear_bit(pdc_irqd->hwirq, pctrl->pdc_hwirqs);
> + }
>  
>   irq_set_irq_wake(pctrl->irq, on);
>  
> @@ -919,7 +930,12 @@ static int msm_gpio_pdc_pin_request(struct irq_data *d)
>   }
>  
>   irq_set_handler_data(d->irq, irq_get_irq_data(irq));
> + irq_set_handler_data(irq, d);
> + irq_set_status_flags(irq, IRQ_DISABLE_UNLAZY);
> + irq_set_status_flags(d->irq, IRQ_DISABLE_UNLAZY);
>   disable_irq(irq);
> + if (!pctrl->pdc_irq_domain)
> + pctrl->pdc_irq_domain = irq_get_irq_data(irq)->domain;
>  
>   return 0;
>  }
> @@ -1069,6 +1085,58 @@ static void msm_pinctrl_setup_pm_reset(struct 
> msm_pinctrl *pctrl)
>   }
>  }
>  
> +int __maybe_unused msm_pinctrl_suspend_late(struct device *dev)
> +{
> + struct msm_pinctrl *pctrl = dev_get_drvdata(dev);
> + struct irq_data *irqd;
> + unsigned int irq;
> + int i;
> +
> + in_suspend = true;
> + for_each_set_bit(i, pctrl->pdc_hwirqs, MAX_PDC_HWIRQ) {
> + irq = irq_find_mapping(pctrl->pdc_irq_domain, i);
> + irqd = irq_get_handler_data(irq);
> + /*
> +  * We don't know if the TLMM will be functional
> +  * or not, during the suspend. If its functional,
> +  * we do not want duplicate interrupts from PDC.
> +  * Hence disable the GPIO IRQ and enable PDC IRQ.
> +  */
> + if (irqd_is_wakeup_set(irqd)) {
> + disable_irq_wake(irqd->irq);
> + disable_irq(irqd->irq);
> + enable_irq(irq);
> + }

Would it make sense to limit this to edge triggered interrupts since
the interrupt handler does nothing for level triggered ones?
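
To illustrate the idea, here is a stand-alone model in plain C rather than
driver code. Every name in it (struct wake_irq, the TRIG_* flags,
suspend_switch_to_pdc()) is made up for this sketch and is not part of the
patch; in the driver the check would presumably be based on the IRQ's
trigger type instead. It only hands wakeup-enabled, edge-triggered lines
over to the PDC and leaves everything else on the TLMM:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's trigger-type flags. */
enum {
	TRIG_EDGE_RISING  = 1 << 0,
	TRIG_EDGE_FALLING = 1 << 1,
	TRIG_LEVEL_HIGH   = 1 << 2,
	TRIG_LEVEL_LOW    = 1 << 3,
};
#define TRIG_EDGE_BOTH (TRIG_EDGE_RISING | TRIG_EDGE_FALLING)

/* Hypothetical model of one wakeup-capable GPIO routed to both blocks. */
struct wake_irq {
	unsigned int trigger;	/* TRIG_* flags */
	bool wakeup_set;	/* wake enabled for this line */
	bool tlmm_enabled;	/* TLMM (GPIO) interrupt active */
	bool pdc_enabled;	/* PDC interrupt active */
};

/*
 * On suspend, switch a line from TLMM to PDC only if it is wakeup
 * enabled AND edge triggered. Returns the number of lines switched.
 */
static size_t suspend_switch_to_pdc(struct wake_irq *irqs, size_t n)
{
	size_t switched = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		struct wake_irq *w = &irqs[i];

		if (!w->wakeup_set || !(w->trigger & TRIG_EDGE_BOTH))
			continue;

		w->tlmm_enabled = false;
		w->pdc_enabled = true;
		switched++;
	}

	return switched;
}
```

Whether skipping level-triggered lines is actually safe depends on what
the PDC handler does with them, which is why I'm asking.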

Cheers

Matthias

Re: [PATCH 1/2] soc: imx: gpc: use A_CORE instread of A7 for more i.MX platforms

2018-08-27 Thread Andrey Smirnov

On Sun, Aug 26, 2018 at 7:14 PM Shawn Guo  wrote:
>
> Andrey,
>
> Are you fine with these two patches?
>

I made a small comment on 2/2, but otherwise both patches seem
reasonable (Acks provided in separate reply).

Let me know if you need anything else from me.

Thanks,
Andrey Smirnov

Re: TLB flushes on fixmap changes

2018-08-27 Thread Nadav Amit

at 3:32 PM, Andy Lutomirski  wrote:

> On Mon, Aug 27, 2018 at 2:55 PM, Nadav Amit  wrote:
>> at 1:16 PM, Nadav Amit  wrote:
>> 
>>> at 12:58 PM, Andy Lutomirski  wrote:
>>> 
 On Mon, Aug 27, 2018 at 12:43 PM, Nadav Amit  wrote:
> at 12:10 PM, Nadav Amit  wrote:
> 
>> at 11:58 AM, Andy Lutomirski  wrote:
>> 
>>> On Mon, Aug 27, 2018 at 11:54 AM, Nadav Amit  
>>> wrote:
> On Mon, Aug 27, 2018 at 10:34 AM, Nadav Amit  
> wrote:
> What do you all think?
 
 I agree in general. But I think that current->mm would need to be 
 loaded, as
 otherwise I am afraid it would break switch_mm_irqs_off().
>>> 
>>> What breaks?
>> 
>> Actually nothing. I just saw the IBPB stuff regarding tsk, but it should 
>> not
>> matter.
> 
> So here is what I got. It certainly needs some cleanup, but it boots.
> 
> Let me know how crappy you find it...
> 
> 
> diff --git a/arch/x86/include/asm/mmu_context.h 
> b/arch/x86/include/asm/mmu_context.h
> index bbc796eb0a3b..336779650a41 100644
> --- a/arch/x86/include/asm/mmu_context.h
> +++ b/arch/x86/include/asm/mmu_context.h
> @@ -343,4 +343,24 @@ static inline unsigned long 
> __get_current_cr3_fast(void)
>  return cr3;
> }
> 
> +typedef struct {
> +   struct mm_struct *prev;
> +} temporary_mm_state_t;
> +
> +static inline temporary_mm_state_t use_temporary_mm(struct mm_struct *mm)
> +{
> +   temporary_mm_state_t state;
> +
> +   lockdep_assert_irqs_disabled();
> +   state.prev = this_cpu_read(cpu_tlbstate.loaded_mm);
> +   switch_mm_irqs_off(NULL, mm, current);
> +   return state;
> +}
> +
> +static inline void unuse_temporary_mm(temporary_mm_state_t prev)
> +{
> +   lockdep_assert_irqs_disabled();
> +   switch_mm_irqs_off(NULL, prev.prev, current);
> +}
> +
> #endif /* _ASM_X86_MMU_CONTEXT_H */
> diff --git a/arch/x86/include/asm/pgtable.h 
> b/arch/x86/include/asm/pgtable.h
> index 5715647fc4fe..ef62af9a0ef7 100644
> --- a/arch/x86/include/asm/pgtable.h
> +++ b/arch/x86/include/asm/pgtable.h
> @@ -976,6 +976,10 @@ static inline void __meminit 
> init_trampoline_default(void)
>  /* Default trampoline pgd value */
>  trampoline_pgd_entry = init_top_pgt[pgd_index(__PAGE_OFFSET)];
> }
> +
> +void __init patching_mm_init(void);
> +#define patching_mm_init patching_mm_init
> +
> # ifdef CONFIG_RANDOMIZE_MEMORY
> void __meminit init_trampoline(void);
> # else
> diff --git a/arch/x86/include/asm/pgtable_64_types.h 
> b/arch/x86/include/asm/pgtable_64_types.h
> index 054765ab2da2..9f44262abde0 100644
> --- a/arch/x86/include/asm/pgtable_64_types.h
> +++ b/arch/x86/include/asm/pgtable_64_types.h
> @@ -116,6 +116,9 @@ extern unsigned int ptrs_per_p4d;
> #define LDT_PGD_ENTRY  (pgtable_l5_enabled() ? LDT_PGD_ENTRY_L5 : 
> LDT_PGD_ENTRY_L4)
> #define LDT_BASE_ADDR  (LDT_PGD_ENTRY << PGDIR_SHIFT)
> 
> +#define TEXT_POKE_PGD_ENTRY -5UL
> +#define TEXT_POKE_ADDR (TEXT_POKE_PGD_ENTRY << PGDIR_SHIFT)
> +
> #define __VMALLOC_BASE_L4  0xc900UL
> #define __VMALLOC_BASE_L5  0xffa0UL
> 
> diff --git a/arch/x86/include/asm/pgtable_types.h 
> b/arch/x86/include/asm/pgtable_types.h
> index 99fff853c944..840c72ec8c4f 100644
> --- a/arch/x86/include/asm/pgtable_types.h
> +++ b/arch/x86/include/asm/pgtable_types.h
> @@ -505,6 +505,9 @@ pgprot_t phys_mem_access_prot(struct file *file, 
> unsigned long pfn,
> /* Install a pte for a particular vaddr in kernel space. */
> void set_pte_vaddr(unsigned long vaddr, pte_t pte);
> 
> +struct mm_struct;
> +void set_mm_pte_vaddr(struct mm_struct *mm, unsigned long vaddr, pte_t 
> pte);
> +
> #ifdef CONFIG_X86_32
> extern void native_pagetable_init(void);
> #else
> diff --git a/arch/x86/include/asm/text-patching.h 
> b/arch/x86/include/asm/text-patching.h
> index 2ecd34e2d46c..cb364ea5b19d 100644
> --- a/arch/x86/include/asm/text-patching.h
> +++ b/arch/x86/include/asm/text-patching.h
> @@ -38,4 +38,6 @@ extern void *text_poke(void *addr, const void *opcode, 
> size_t len);
> extern int poke_int3_handler(struct pt_regs *regs);
> extern void *text_poke_bp(void *addr, const void *opcode, size_t len, 
> void *handler);
> 
> +extern struct mm_struct *patching_mm;
> +
> #endif /* _ASM_X86_TEXT_PATCHING_H */
> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> index a481763a3776..fd8a950b0d62 100644
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -11,6 +11,7 @@
> #include 
> #include 
>

Re: [PATCH 1/2] soc: imx: gpc: use A_CORE instread of A7 for more i.MX platforms

2018-08-27 Thread Andrey Smirnov

On Sun, Aug 5, 2018 at 11:46 PM Anson Huang  wrote:
>
> gpcv2 driver is NOT just used on i.MX7D which has Cortex-A7
> cores, but also on i.MX8MQ/i.MX8MM platforms which use Cortex-A53
> cores, so let's use A_CORE instread of A7 to avoid confusion.
>
> Signed-off-by: Anson Huang 

Looks reasonable to me:

Acked-by: Andrey Smirnov 

> ---
>  drivers/soc/imx/gpcv2.c | 20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/soc/imx/gpcv2.c b/drivers/soc/imx/gpcv2.c
> index 6ef18cf..0e31465 100644
> --- a/drivers/soc/imx/gpcv2.c
> +++ b/drivers/soc/imx/gpcv2.c
> @@ -20,14 +20,14 @@
>  #include 
>  #include 
>
> -#define GPC_LPCR_A7_BSC 0x000
> +#define GPC_LPCR_A_CORE_BSC 0x000
>
>  #define GPC_PGC_CPU_MAPPING 0x0ec
> -#define USB_HSIC_PHY_A7_DOMAIN BIT(6)
> -#define USB_OTG2_PHY_A7_DOMAIN BIT(5)
> -#define USB_OTG1_PHY_A7_DOMAIN BIT(4)
> -#define PCIE_PHY_A7_DOMAIN BIT(3)
> -#define MIPI_PHY_A7_DOMAIN BIT(2)
> +#define USB_HSIC_PHY_A_CORE_DOMAIN BIT(6)
> +#define USB_OTG2_PHY_A_CORE_DOMAIN BIT(5)
> +#define USB_OTG1_PHY_A_CORE_DOMAIN BIT(4)
> +#define PCIE_PHY_A_CORE_DOMAIN BIT(3)
> +#define MIPI_PHY_A_CORE_DOMAIN BIT(2)
>
>  #define GPC_PU_PGC_SW_PUP_REQ  0x0f8
>  #define GPC_PU_PGC_SW_PDN_REQ  0x104
> @@ -167,7 +167,7 @@ static const struct imx7_pgc_domain imx7_pgc_domains[] = {
> },
> .bits  = {
> .pxx = MIPI_PHY_SW_Pxx_REQ,
> -   .map = MIPI_PHY_A7_DOMAIN,
> +   .map = MIPI_PHY_A_CORE_DOMAIN,
> },
> .voltage   = 100,
> .pgc   = PGC_MIPI,
> @@ -179,7 +179,7 @@ static const struct imx7_pgc_domain imx7_pgc_domains[] = {
> },
> .bits  = {
> .pxx = PCIE_PHY_SW_Pxx_REQ,
> -   .map = PCIE_PHY_A7_DOMAIN,
> +   .map = PCIE_PHY_A_CORE_DOMAIN,
> },
> .voltage   = 100,
> .pgc   = PGC_PCIE,
> @@ -191,7 +191,7 @@ static const struct imx7_pgc_domain imx7_pgc_domains[] = {
> },
> .bits  = {
> .pxx = USB_HSIC_PHY_SW_Pxx_REQ,
> -   .map = USB_HSIC_PHY_A7_DOMAIN,
> +   .map = USB_HSIC_PHY_A_CORE_DOMAIN,
> },
> .voltage   = 120,
> .pgc   = PGC_USB_HSIC,
> @@ -261,7 +261,7 @@ builtin_platform_driver(imx7_pgc_domain_driver)
>  static int imx_gpcv2_probe(struct platform_device *pdev)
>  {
> static const struct regmap_range yes_ranges[] = {
> -   regmap_reg_range(GPC_LPCR_A7_BSC,
> +   regmap_reg_range(GPC_LPCR_A_CORE_BSC,
>  GPC_M4_PU_PDN_FLG),
> regmap_reg_range(GPC_PGC_CTRL(PGC_MIPI),
>  GPC_PGC_SR(PGC_MIPI)),
> --
> 2.7.4
>

Re: [PATCH v2] selftests: membarrier: fix test by checking supported commands

2018-08-27 Thread Shuah Khan

Hi Rafael,

Thanks for the ping.

On 08/09/2018 02:21 PM, Rafael David Tinoco wrote:
> Makes membarrier_test compatible with older kernels (LTS) by checking if
> the membarrier features exist before running the tests.
> 
> Link: https://bugs.linaro.org/show_bug.cgi?id=3771
> Signed-off-by: Rafael David Tinoco 
> Cc:  #v4.17
> ---
>  .../selftests/membarrier/membarrier_test.c| 71 +++
>  1 file changed, 40 insertions(+), 31 deletions(-)
> 
> diff --git a/tools/testing/selftests/membarrier/membarrier_test.c 
> b/tools/testing/selftests/membarrier/membarrier_test.c
> index 6793f8ecc8e7..4dc263824bda 100644
> --- a/tools/testing/selftests/membarrier/membarrier_test.c
> +++ b/tools/testing/selftests/membarrier/membarrier_test.c
> @@ -223,7 +223,7 @@ static int test_membarrier_global_expedited_success(void)
>   return 0;
>  }
>  
> -static int test_membarrier(void)
> +static int test_membarrier(int supported)
>  {
>   int status;
>  
> @@ -236,21 +236,22 @@ static int test_membarrier(void)
>   status = test_membarrier_global_success();
>   if (status)
>   return status;
> - status = test_membarrier_private_expedited_fail();
> - if (status)
> - return status;
> - status = test_membarrier_register_private_expedited_success();
> - if (status)
> - return status;
> - status = test_membarrier_private_expedited_success();
> - if (status)
> - return status;
> - status = sys_membarrier(MEMBARRIER_CMD_QUERY, 0);
> - if (status < 0) {
> - ksft_test_result_fail("sys_membarrier() failed\n");
> - return status;
> +
> + /* commit 22e4ebb975822833b083533035233d128b30e98f added this feature */

Get rid of this comment.

> + if (supported & MEMBARRIER_CMD_PRIVATE_EXPEDITED) {
> + status = test_membarrier_private_expedited_fail();
> + if (status)
> + return status;
> + status = test_membarrier_register_private_expedited_success();
> + if (status)
> + return status;
> + status = test_membarrier_private_expedited_success();
> + if (status)
> + return status;
>   }

This change moves several tests under this check. These should run to test
the case when MEMBARRIER_CMD_PRIVATE_EXPEDITED isn't supported. This change
reduces coverage.

> - if (status & MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE) {
> +
> + /* commit 70216e18e519a54a2f13569e8caff99a092a92d6 added this feature */

Get rid of the above comment.

> + if (supported & MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE) {
>   status = test_membarrier_private_expedited_sync_core_fail();
>   if (status)
>   return status;
> @@ -261,23 +262,28 @@ static int test_membarrier(void)
>   if (status)
>   return status;
>   }


> - /*
> -  * It is valid to send a global membarrier from a non-registered
> -  * process.
> -  */
> - status = test_membarrier_global_expedited_success();
> - if (status)
> - return status;
> - status = test_membarrier_register_global_expedited_success();
> - if (status)
> - return status;
> - status = test_membarrier_global_expedited_success();
> - if (status)
> - return status;
> +
> + /* commit c5f58bd58f432be5d92df33c5458e0bcbee3aadf added this feature */

Get rid of the above comment.

> + if (supported & MEMBARRIER_CMD_GLOBAL_EXPEDITED) {
> + /*
> +  * It is valid to send a global membarrier from a non-registered
> +  * process.
> +  */
> + status = test_membarrier_global_expedited_success();
> + if (status)
> + return status;
> + status = test_membarrier_register_global_expedited_success();
> + if (status)
> + return status;
> + status = test_membarrier_global_expedited_success();
> + if (status)
> + return status;
> + }
> +

Skip handling is missing here. Without it, the test result reports a
pass, which is incorrect.

If a feature isn't supported, the test should report that the feature test
was skipped, not passed.

What I would like to see here is a skip for each individual test, not one
skip for all 3 tests.

This also applies to the if (supported & MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE)
case above.
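
As a rough illustration of per-test skip reporting, here is a small
standalone sketch. It does not use the kselftest helpers; report(),
run_if_supported(), always_ok() and the CMD_* values are all made up for
the example, and the TAP lines are printed directly.

```c
#include <stdio.h>

/* Illustrative command bits; not copied from the uapi header. */
#define CMD_GLOBAL_EXPEDITED          (1 << 1)
#define CMD_REGISTER_GLOBAL_EXPEDITED (1 << 2)

enum result { RES_PASS, RES_FAIL, RES_SKIP };

struct counters {
	unsigned int pass, fail, skip;
};

/* Print one TAP line per test; a skipped test is counted as skip, not pass. */
static void report(struct counters *c, enum result r, const char *name)
{
	unsigned int num = c->pass + c->fail + c->skip + 1;

	switch (r) {
	case RES_PASS:
		c->pass++;
		printf("ok %u %s\n", num, name);
		break;
	case RES_FAIL:
		c->fail++;
		printf("not ok %u %s\n", num, name);
		break;
	case RES_SKIP:
		c->skip++;
		printf("ok %u # SKIP %s\n", num, name);
		break;
	}
}

/* Run one test if its command is supported, otherwise skip just that test. */
static void run_if_supported(struct counters *c, int supported, int needed,
			     int (*test)(void), const char *name)
{
	if (!(supported & needed)) {
		report(c, RES_SKIP, name);
		return;
	}
	report(c, test() == 0 ? RES_PASS : RES_FAIL, name);
}

/* Hypothetical stand-in for a real test function that succeeds. */
static int always_ok(void)
{
	return 0;
}
```

Each test gets its own call to run_if_supported(), so an unsupported
command produces one SKIP line per test rather than a silent pass.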

When I run this test on 4.19 I see, 

TAP version 13
selftests: membarrier: membarrier_test

ok 1 sys_membarrier available
ok 2 sys membarrier invalid command test: command = -1, flags = 0, errno = 22. 
Failed as expected
ok 3 sys membarrier MEMBARRIER_CMD_QUERY invalid flags test: flags = 1, errno = 
22. Failed as expected
ok 4 sys membarrier MEMBARRIER_CMD_GLOBAL test: flags = 0
ok 5 sys

Re: [PATCH 2/2] soc: imx: gpcv2: make pgc driver more generic for other i.MX platforms

2018-08-27 Thread Andrey Smirnov

On Sun, Aug 5, 2018 at 11:45 PM Anson Huang  wrote:
>
> i.MX8MQ and i.MX8MM share same gpc module with i.MX7D, they
> can reuse gpcv2 pgc driver for power domain control, this
> patch renames all functions and structure definitions started
> with "imx7" to "imx", and check machine type to pass platform
> specific power domain data for power domain driver, thus make
> gpcv2 pgc driver more generic for i.MX platforms.
>

Just for the sake of


> Signed-off-by: Anson Huang 
> ---
>  drivers/soc/imx/gpcv2.c | 68 
> +
>  1 file changed, 40 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/soc/imx/gpcv2.c b/drivers/soc/imx/gpcv2.c
> index 0e31465..0e33cb5 100644
> --- a/drivers/soc/imx/gpcv2.c
> +++ b/drivers/soc/imx/gpcv2.c
> @@ -53,7 +53,7 @@
>
>  #define GPC_PGC_CTRL_PCR   BIT(0)
>
> -struct imx7_pgc_domain {
> +struct imx_pgc_domain {
> struct generic_pm_domain genpd;
> struct regmap *regmap;
> struct regulator *regulator;
> @@ -69,11 +69,11 @@ struct imx7_pgc_domain {
> struct device *dev;
>  };
>
> -static int imx7_gpc_pu_pgc_sw_pxx_req(struct generic_pm_domain *genpd,
> +static int imx_gpc_pu_pgc_sw_pxx_req(struct generic_pm_domain *genpd,
>   bool on)
>  {
> -   struct imx7_pgc_domain *domain = container_of(genpd,
> - struct imx7_pgc_domain,
> +   struct imx_pgc_domain *domain = container_of(genpd,
> + struct imx_pgc_domain,
>   genpd);
> unsigned int offset = on ?
> GPC_PU_PGC_SW_PUP_REQ : GPC_PU_PGC_SW_PDN_REQ;
> @@ -150,17 +150,17 @@ static int imx7_gpc_pu_pgc_sw_pxx_req(struct 
> generic_pm_domain *genpd,
> return ret;
>  }
>
> -static int imx7_gpc_pu_pgc_sw_pup_req(struct generic_pm_domain *genpd)
> +static int imx_gpc_pu_pgc_sw_pup_req(struct generic_pm_domain *genpd)
>  {
> -   return imx7_gpc_pu_pgc_sw_pxx_req(genpd, true);
> +   return imx_gpc_pu_pgc_sw_pxx_req(genpd, true);
>  }
>
> -static int imx7_gpc_pu_pgc_sw_pdn_req(struct generic_pm_domain *genpd)
> +static int imx_gpc_pu_pgc_sw_pdn_req(struct generic_pm_domain *genpd)
>  {
> -   return imx7_gpc_pu_pgc_sw_pxx_req(genpd, false);
> +   return imx_gpc_pu_pgc_sw_pxx_req(genpd, false);
>  }
>
> -static const struct imx7_pgc_domain imx7_pgc_domains[] = {
> +static const struct imx_pgc_domain imx7_pgc_domains[] = {
> [IMX7_POWER_DOMAIN_MIPI_PHY] = {
> .genpd = {
> .name  = "mipi-phy",
> @@ -198,9 +198,9 @@ static const struct imx7_pgc_domain imx7_pgc_domains[] = {
> },
>  };
>
> -static int imx7_pgc_domain_probe(struct platform_device *pdev)
> +static int imx_pgc_domain_probe(struct platform_device *pdev)
>  {
> -   struct imx7_pgc_domain *domain = pdev->dev.platform_data;
> +   struct imx_pgc_domain *domain = pdev->dev.platform_data;
> int ret;
>
>         domain->dev = &pdev->dev;
> @@ -233,9 +233,9 @@ static int imx7_pgc_domain_probe(struct platform_device 
> *pdev)
> return ret;
>  }
>
> -static int imx7_pgc_domain_remove(struct platform_device *pdev)
> +static int imx_pgc_domain_remove(struct platform_device *pdev)
>  {
> -   struct imx7_pgc_domain *domain = pdev->dev.platform_data;
> +   struct imx_pgc_domain *domain = pdev->dev.platform_data;
>
> of_genpd_del_provider(domain->dev->of_node);
>         pm_genpd_remove(&domain->genpd);
> @@ -243,23 +243,24 @@ static int imx7_pgc_domain_remove(struct 
> platform_device *pdev)
> return 0;
>  }
>
> -static const struct platform_device_id imx7_pgc_domain_id[] = {
> -   { "imx7-pgc-domain", },
> +static const struct platform_device_id imx_pgc_domain_id[] = {
> +   { "imx-pgc-domain", },
> { },
>  };
>
> -static struct platform_driver imx7_pgc_domain_driver = {
> +static struct platform_driver imx_pgc_domain_driver = {
> .driver = {
> -   .name = "imx7-pgc",
> +   .name = "imx-pgc",
> },
> -   .probe= imx7_pgc_domain_probe,
> -   .remove   = imx7_pgc_domain_remove,
> -   .id_table = imx7_pgc_domain_id,
> +   .probe= imx_pgc_domain_probe,
> +   .remove   = imx_pgc_domain_remove,
> +   .id_table = imx_pgc_domain_id,
>  };
> -builtin_platform_driver(imx7_pgc_domain_driver)
> +builtin_platform_driver(imx_pgc_domain_driver)
>
>  static int imx_gpcv2_probe(struct platform_device *pdev)
>  {
> +   static const struct imx_pgc_domain *imx_pgc_domains;
> static const struct regmap_range yes_ranges[] = {
> regmap_reg_range(GPC_LPCR_A_CORE_BSC,
>  GPC_M4_PU_PDN_FLG),
> @@ -287,6 +288,7 @@ static int imx_gpcv2_probe(struct platform_device *pdev)
> struct regmap *regmap;
> struct resource *res;
> void __iomem

Re: [PATCH v13 09/13] x86/sgx: Enclave Page Cache (EPC) memory manager

2018-08-27 Thread Dave Hansen

On 08/27/2018 11:53 AM, Jarkko Sakkinen wrote:
> +enum sgx_alloc_flags {
> + SGX_ALLOC_ATOMIC= BIT(0),
> +};

Doing this with enums is unprecedented IMNHO.  Why are you doing it this
way for simple, one-off constants?
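
For illustration only (this is not from the patch): the more conventional form for a one-off flag is a plain preprocessor constant. The sketch below builds in userspace, so BIT() is redefined locally as a stand-in for the kernel macro, and sgx_alloc_is_atomic() is a hypothetical helper rather than a function from the series.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's BIT() macro. */
#define BIT(nr) (1UL << (nr))

/* One-off flag expressed as a plain constant instead of an enum member. */
#define SGX_ALLOC_ATOMIC BIT(0)

/* Hypothetical helper: nonzero if the caller asked for a non-sleeping allocation. */
static int sgx_alloc_is_atomic(unsigned long flags)
{
	/* Only one flag is defined, so any other bit is a caller bug. */
	assert((flags & ~SGX_ALLOC_ATOMIC) == 0);
	return (flags & SGX_ALLOC_ATOMIC) != 0;
}
```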

Re: [PATCH 1/3] swap: Use __try_to_reclaim_swap() in free_swap_and_cache()

2018-08-27 Thread Andrew Morton

On Mon, 27 Aug 2018 15:55:33 +0800 Huang Ying  wrote:

> The code path to reclaim the swap entry in free_swap_and_cache() is
> almost same as that of __try_to_reclaim_swap().  The largest
> difference is just coding style.  So the support to the additional
> requirement of free_swap_and_cache() is added into
> __try_to_reclaim_swap().  free_swap_and_cache() is changed to call
> __try_to_reclaim_swap(), and delete the duplicated code.  This will
> improve code readability and reduce the potential bugs.
> 
> There are 2 functionality differences between __try_to_reclaim_swap()
> and swap entry reclaim code of free_swap_and_cache().
> 
> - free_swap_and_cache() only reclaims the swap entry if the page is
>   unmapped or swap is getting full.  The support has been added into
>   __try_to_reclaim_swap().
> 
> - try_to_free_swap() (called by __try_to_reclaim_swap()) checks
>   pm_suspended_storage(), while free_swap_and_cache() not.  I think
>   this is OK.  Because the page and the swap entry can be reclaimed
>   later eventually.

hm.  Having functions take `mode' arguments which specify their actions
in this manner isn't popular (Linus ;)) but I guess the end result is
somewhat better.
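
A rough userspace sketch of the pattern under discussion, assuming the two conditions named in the changelog (page unmapped, swap getting full). The RECLAIM_* names and both functions are made up for illustration and are not the identifiers used by the patch. A thin, named wrapper is one way to keep the mode argument out of the call sites.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mode bits; the patch may name and number these differently. */
#define RECLAIM_ANYWAY   0x1u  /* reclaim whenever possible */
#define RECLAIM_UNMAPPED 0x2u  /* reclaim if the page has no mappings left */
#define RECLAIM_FULL     0x4u  /* reclaim if swap is getting full */

/* Decide whether to reclaim, based on the mode bits and the current state. */
static bool should_reclaim(unsigned int mode, bool page_mapped, bool swap_full)
{
	assert(mode != 0);  /* a caller must ask for at least one policy */

	if (mode & RECLAIM_ANYWAY)
		return true;
	if ((mode & RECLAIM_UNMAPPED) && !page_mapped)
		return true;
	if ((mode & RECLAIM_FULL) && swap_full)
		return true;
	return false;
}

/* Named wrapper for the free path: unmapped page or swap getting full. */
static bool should_reclaim_on_free(bool page_mapped, bool swap_full)
{
	return should_reclaim(RECLAIM_UNMAPPED | RECLAIM_FULL,
			      page_mapped, swap_full);
}
```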

Re: [PATCH 1/2] x86/mm: add .data..decrypted section to hold shared variables

2018-08-27 Thread Tom Lendacky

On 08/27/2018 06:24 AM, Brijesh Singh wrote:
> kvmclock defines few static variables which are shared with hypervisor
> during the kvmclock initialization.
> 
> When SEV is active, memory is encrypted with a guest-specific key, and
> if guest OS wants to share the memory region with hypervisor then it must
> clear the C-bit before sharing it.
> 
> The '__decrypted' can be used to define shared variables; the variables
> will be put in the .data..decrypted section. This section is mapped with
> C=0 early in the boot, we also ensure that the initialized values are
> updated to match with C=0 (i.e. perform an in-place decryption). The
> .data..decrypted section is PMD aligned and sized so that we avoid the
> need for splitting the pages when mapping with C=0.

This should probably be broken into a few smaller patches: one that adds
the section and the attribute, one that re-arranges the mapping setup,
and one that does the in-place decryption and clears the encryption bit
for the area.

> 
> Signed-off-by: Brijesh Singh 
> Fixes: 368a540e0232 ("x86/kvmclock: Remove memblock dependency")
> Cc: sta...@vger.kernel.org
> Cc: Tom Lendacky 
> Cc: k...@vger.kernel.org
> Cc: Thomas Gleixner 
> Cc: Borislav Petkov 
> Cc: "H. Peter Anvin" 
> Cc: linux-kernel@vger.kernel.org
> Cc: Paolo Bonzini 
> Cc: Sean Christopherson 
> Cc: "Radim Krčmář" 
> ---
>  arch/x86/include/asm/mem_encrypt.h |   4 +
>  arch/x86/kernel/head64.c   |  12 ++
>  arch/x86/kernel/vmlinux.lds.S  |  18 +++
>  arch/x86/mm/mem_encrypt_identity.c | 220 
> +++--
>  4 files changed, 197 insertions(+), 57 deletions(-)
> 
> diff --git a/arch/x86/include/asm/mem_encrypt.h 
> b/arch/x86/include/asm/mem_encrypt.h
> index c064383..3f7d9d3 100644
> --- a/arch/x86/include/asm/mem_encrypt.h
> +++ b/arch/x86/include/asm/mem_encrypt.h
> @@ -52,6 +52,8 @@ void __init mem_encrypt_init(void);
>  bool sme_active(void);
>  bool sev_active(void);
>  
> +#define __decrypted __attribute__((__section__(".data..decrypted")))
> +
>  #else/* !CONFIG_AMD_MEM_ENCRYPT */
>  
>  #define sme_me_mask  0ULL
> @@ -77,6 +79,8 @@ early_set_memory_decrypted(unsigned long vaddr, unsigned 
> long size) { return 0;
>  static inline int __init
>  early_set_memory_encrypted(unsigned long vaddr, unsigned long size) { return 
> 0; }
>  
> +#define __decrypted
> +
>  #endif   /* CONFIG_AMD_MEM_ENCRYPT */
>  
>  /*
> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
> index 8047379..6a18297 100644
> --- a/arch/x86/kernel/head64.c
> +++ b/arch/x86/kernel/head64.c
> @@ -43,6 +43,9 @@ extern pmd_t 
> early_dynamic_pgts[EARLY_DYNAMIC_PAGE_TABLES][PTRS_PER_PMD];
>  static unsigned int __initdata next_early_pgt;
>  pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);
>  
> +/* To clear memory encryption mask from the decrypted section */
> +extern char __start_data_decrypted[], __end_data_decrypted[];
> +

We should find a header for these declarations rather than declaring them here.
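
As a loose userspace analogue (not kernel code), the section-plus-boundary-symbols mechanism can be sketched as below. It assumes an ELF target with GNU ld or lld, which provide __start_/__stop_ symbols automatically for sections whose names are valid C identifiers; the kernel instead defines its symbols in the linker script. SHARED_DATA, the section name shared_data and the variables are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace counterpart of the __decrypted attribute. */
#define SHARED_DATA __attribute__((section("shared_data"), used))

static int SHARED_DATA clock_a = 1;
static int SHARED_DATA clock_b = 2;

/* Provided by the linker; in a real project these belong in a shared header. */
extern char __start_shared_data[], __stop_shared_data[];

/* Size in bytes of everything placed in the section. */
static size_t shared_section_size(void)
{
	uintptr_t start = (uintptr_t)__start_shared_data;
	uintptr_t stop = (uintptr_t)__stop_shared_data;

	assert(stop >= start);
	return (size_t)(stop - start);
}

/* Nonzero if p points inside the section. */
static int in_shared_section(const void *p)
{
	uintptr_t addr = (uintptr_t)p;

	return addr >= (uintptr_t)__start_shared_data &&
	       addr < (uintptr_t)__stop_shared_data;
}
```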

>  #ifdef CONFIG_X86_5LEVEL
>  unsigned int __pgtable_l5_enabled __ro_after_init;
>  unsigned int pgdir_shift __ro_after_init = 39;
> @@ -112,6 +115,7 @@ static bool __head check_la57_support(unsigned long 
> physaddr)
>  unsigned long __head __startup_64(unsigned long physaddr,
> struct boot_params *bp)
>  {
> + unsigned long vaddr, vaddr_end;
>   unsigned long load_delta, *p;
>   unsigned long pgtable_flags;
>   pgdval_t *pgd;
> @@ -234,6 +238,14 @@ unsigned long __head __startup_64(unsigned long physaddr,
>   /* Encrypt the kernel and related (if SME is active) */
>   sme_encrypt_kernel(bp);
>  
> + /* Clear the memory encryption mask from the decrypted section */
> + vaddr = (unsigned long)__start_data_decrypted;
> + vaddr_end = (unsigned long)__end_data_decrypted;
> + for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
> + i = pmd_index(vaddr);
> + pmd[i] -= sme_get_me_mask();
> + }
> +
>   /*
>* Return the SME encryption mask (if SME is active) to be used as a
>* modifier for the initial pgdir entry programmed into CR3.
> diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
> index 8bde0a4..511b875 100644
> --- a/arch/x86/kernel/vmlinux.lds.S
> +++ b/arch/x86/kernel/vmlinux.lds.S
> @@ -89,6 +89,22 @@ PHDRS {
>   note PT_NOTE FLAGS(0);  /* ___ */
>  }
>  
> +/*
> + * This section contains data which will be mapped as decrypted. Memory
> + * encryption operates on a page basis. But we make this section PMD
> + * aligned to avoid splitting the pages while mapping the section early.
> + *
> + * Note: We use a separate section so that only this section gets
> + * decrypted to avoid exposing more than we wish.
> + */
> +#define DATA_DECRYPTED_SECTION   
> \
> + . = ALIGN(PMD_SIZE);\
> +

Re: [PATCH 1/3] ARM: dts: NSP: Enable SFP on bcm958625hr

2018-08-27 Thread Florian Fainelli

On 08/27/2018 03:26 PM, Russell King - ARM Linux wrote:
> On Mon, Aug 27, 2018 at 01:03:42PM -0700, Florian Fainelli wrote:
>> Enable the SFP connected to port 5 of the switch and wire up all GPIOs
>> to the SFP cage. Because of a hardware limitation of the i2c controller
>> on the iProc SoCs which prevents large i2c (> 256 bytes) transactions to
>> work, we use the i2c-gpio interface instead, which does not have that
>> limitation. This allows us to read the SFP module EEPROM, which would
>> not be possible otherwise since it exceeds that size during a single
>> read transfer.
> 
> We shouldn't exceed 256 bytes, since 256 bytes is the "page" size
> of the EEPROM.  The most that we read in one block is either
> ETH_MODULE_SFF_8079_LEN or (ETH_MODULE_SFF_8472_LEN - 
> ETH_MODULE_SFF_8079_LEN), both of which result in no more than 256
> byte reads.

You are right, I got things mixed up here: the controller's limitation
is actually 63 bytes per transfer. I will reword the commit message
accordingly.
-- 
Florian
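
To make the size arithmetic concrete, here is a small userspace sketch, assuming only the two numbers from this exchange (63 bytes per transfer, 256-byte EEPROM page). Everything else (xfer_fn, chunked_read(), the fake EEPROM) is hypothetical and unrelated to the actual iProc driver; it merely shows how a read larger than the limit would have to be split.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_XFER   63   /* per-transfer limit quoted in the reply above */
#define EEPROM_LEN 256  /* one EEPROM "page", per the thread */

/* Hypothetical bus callback: returns 0 on success, negative on error. */
typedef int (*xfer_fn)(void *ctx, size_t off, unsigned char *buf, size_t len);

/* Number of transfers needed to move len bytes. */
static size_t transfers_needed(size_t len)
{
	return (len + MAX_XFER - 1) / MAX_XFER;
}

/* Read len bytes starting at off, never asking for more than MAX_XFER at once. */
static int chunked_read(xfer_fn xfer, void *ctx, size_t off,
			unsigned char *buf, size_t len)
{
	size_t done = 0;

	assert(xfer != NULL && buf != NULL);
	while (done < len) {
		size_t n = len - done;
		int ret;

		if (n > MAX_XFER)
			n = MAX_XFER;
		ret = xfer(ctx, off + done, buf + done, n);
		if (ret)
			return ret;
		done += n;
	}
	return 0;
}

/* Fake controller backed by a byte array; rejects oversized transfers. */
struct fake_eeprom {
	unsigned char data[EEPROM_LEN];
	size_t xfers;
};

static int fake_xfer(void *ctx, size_t off, unsigned char *buf, size_t len)
{
	struct fake_eeprom *e = ctx;

	if (len > MAX_XFER || off > EEPROM_LEN || len > EEPROM_LEN - off)
		return -1;
	memcpy(buf, e->data + off, len);
	e->xfers++;
	return 0;
}

/* Self-check: returns the number of transfers used for a full read, or -1. */
static int demo_full_read(void)
{
	struct fake_eeprom e = { .xfers = 0 };
	unsigned char out[EEPROM_LEN];
	size_t i;

	for (i = 0; i < EEPROM_LEN; i++)
		e.data[i] = (unsigned char)i;
	if (chunked_read(fake_xfer, &e, 0, out, sizeof(out)))
		return -1;
	if (memcmp(out, e.data, EEPROM_LEN) != 0)
		return -1;
	return (int)e.xfers;
}
```

With these numbers a full 256-byte read takes four 63-byte transfers plus one 4-byte transfer.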

Re: [PATCH v2 1/5] drivers: pinctrl: qcom: add wakeup capability to GPIO

2018-08-27 Thread Matthias Kaehlcke

Hi Lina,

On Fri, Aug 24, 2018 at 02:01:53PM -0600, Lina Iyer wrote:
> QCOM SoC's that have Power Domain Controller (PDC) chip in the always-on
> domain can wakeup the SoC, when interrupts and GPIOs are routed to the
> its interrupt controller. Only select GPIOs that are deemed wakeup

wording nit: "are routed to the|its interrupt controller"

> capable are routed to specific PDC pins. During low power state, the
> pinmux interrupt controller may be non-functional but the PDC would be.
> The PDC can detect the wakeup GPIO is triggered and bring the TLMM to an
> operational state.
> 
> Interrupts that are level triggered will be detected at the TLMM when
> the controller becomes operational. Edge interrupts however need to be
> replayed again.
> 
> Request the corresponding PDC IRQ, when the GPIO is requested as an IRQ,
> but keep it disabled. During suspend, we can enable the PDC IRQ instead
> of the GPIO IRQ, which may or not be detected.
> 
> Signed-off-by: Lina Iyer 
> ---
> Changes in v2:
>   - Remove IRQF_NO_SUSPEND and IRQF_ONE_SHOT from PDC IRQ
> Changes in v1:
>   - Trigger GPIO in h/w from PDC IRQ handler
>   - Avoid big tables for GPIO-PDC map, pick from DT instead
>   - Use handler_data
> ---
>  drivers/pinctrl/qcom/pinctrl-msm.c | 96 ++
>  1 file changed, 96 insertions(+)
> 
> diff --git a/drivers/pinctrl/qcom/pinctrl-msm.c 
> b/drivers/pinctrl/qcom/pinctrl-msm.c
> index 0e22f52b2a19..b675ea56a4ff 100644
> --- a/drivers/pinctrl/qcom/pinctrl-msm.c
> +++ b/drivers/pinctrl/qcom/pinctrl-msm.c
> @@ -687,11 +687,15 @@ static int msm_gpio_irq_set_type(struct irq_data *d, 
> unsigned int type)
>   const struct msm_pingroup *g;
>   unsigned long flags;
>   u32 val;
> + struct irq_data *pdc_irqd = irq_get_handler_data(d->irq);
>  
>   g = &pctrl->soc->groups[d->hwirq];
>  
>   raw_spin_lock_irqsave(&pctrl->lock, flags);
>  
> + if (pdc_irqd)
> + irq_set_irq_type(pdc_irqd->irq, type);
> +
>   /*
>* For hw without possibility of detecting both edges
>*/
> @@ -779,9 +783,13 @@ static int msm_gpio_irq_set_wake(struct irq_data *d, 
> unsigned int on)
>   struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
>   struct msm_pinctrl *pctrl = gpiochip_get_data(gc);
>   unsigned long flags;
> + struct irq_data *pdc_irqd = irq_get_handler_data(d->irq);
>  
>   raw_spin_lock_irqsave(&pctrl->lock, flags);
>  
> + if (pdc_irqd)
> + irq_set_irq_wake(pdc_irqd->irq, on);
> +
>   irq_set_irq_wake(pctrl->irq, on);
>  
>   raw_spin_unlock_irqrestore(&pctrl->lock, flags);
> @@ -863,6 +871,92 @@ static bool msm_gpio_needs_valid_mask(struct msm_pinctrl 
> *pctrl)
>   return device_property_read_u16_array(pctrl->dev, "gpios", NULL, 0) > 0;
>  }
>  
> +static irqreturn_t wake_irq_gpio_handler(int irq, void *data)
> +{
> + struct irq_data *irqd = data;
> + struct gpio_chip *gc = irq_data_get_irq_chip_data(irqd);
> + struct msm_pinctrl *pctrl = gpiochip_get_data(gc);
> + const struct msm_pingroup *g;
> + unsigned long flags;
> + u32 val;
> +
> + if (!irqd_is_level_type(irqd)) {
> + g = &pctrl->soc->groups[irqd->hwirq];
> + raw_spin_lock_irqsave(&pctrl->lock, flags);
> + val = BIT(g->intr_status_bit);
> + writel(val, pctrl->regs + g->intr_status_reg);
> + raw_spin_unlock_irqrestore(&pctrl->lock, flags);
> + }
> +
> + return IRQ_HANDLED;
> +}
> +
> +static int msm_gpio_pdc_pin_request(struct irq_data *d)
> +{
> + struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
> + struct msm_pinctrl *pctrl = gpiochip_get_data(gc);
> + struct platform_device *pdev = to_platform_device(pctrl->dev);
> + const char *pin_name;
> + int irq;
> + int ret;
> +
> + pin_name = kasprintf(GFP_KERNEL, "gpio%lu", d->hwirq);
> + if (!pin_name)
> + return -ENOMEM;
> +
> + irq = platform_get_irq_byname(pdev, pin_name);
> + if (irq < 0) {
> + kfree(pin_name);
> + return 0;

Do I understand correctly that this is the case where the pin isn't
routed to the PDC?

> + }
> +
> + ret = request_irq(irq, wake_irq_gpio_handler, irqd_get_trigger_type(d),
> +   pin_name, d);
> + if (ret) {
> + pr_warn("GPIO-%lu could not be set up as wakeup", d->hwirq);

The trailing '\n' is missing from the message.

> + kfree(pin_name);
> + return ret;
> + }
> +
> + irq_set_handler_data(d->irq, irq_get_irq_data(irq));
> + disable_irq(irq);
> +
> + return 0;
> +}
> +
> +static int msm_gpio_pdc_pin_release(struct irq_data *d)
> +{
> + struct irq_data *pdc_irqd = irq_get_handler_data(d->irq);
> +
> + if (pdc_irqd) {
> + irq_set_handler_data(d->irq, NULL);
> + free_irq(pdc_irqd->irq, d);

You need to free 'pin_name' allocated in msm_gpio_pdc_pin_request().
IIUC it should be available in irq_desc->action->name.

Cheers

Matthias
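
A userspace sketch of the ownership rule behind the last comment: the name string is only borrowed by the IRQ layer, so whoever allocated it has to free it once the IRQ is released. All fake_* names and the pin_* helpers are hypothetical stand-ins. As far as I know, the kernel's free_irq() returns the devname pointer that was passed to request_irq(), so passing that return value to kfree() may be an option in the real driver; treat that as an assumption to verify.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an IRQ descriptor; not a kernel structure. */
struct fake_irq {
	const char *name;  /* borrowed from the requester, like devname */
	void *dev_id;
};

static int live_names;  /* allocations not yet freed, for the self-check */

/* Allocate "gpio<N>", mirroring the kasprintf() call in the patch. */
static char *alloc_pin_name(unsigned long hwirq)
{
	char *name = malloc(32);

	if (name) {
		snprintf(name, 32, "gpio%lu", hwirq);
		live_names++;
	}
	return name;
}

static void fake_request_irq(struct fake_irq *irq, const char *name,
			     void *dev_id)
{
	irq->name = name;
	irq->dev_id = dev_id;
}

/* Hands the name back to the caller, who still owns it. */
static const char *fake_free_irq(struct fake_irq *irq, void *dev_id)
{
	const char *name = irq->name;

	assert(irq->dev_id == dev_id);
	irq->name = NULL;
	irq->dev_id = NULL;
	return name;
}

static int pin_request(struct fake_irq *irq, unsigned long hwirq, void *dev_id)
{
	char *name = alloc_pin_name(hwirq);

	if (!name)
		return -1;
	fake_request_irq(irq, name, dev_id);
	return 0;
}

static void pin_release(struct fake_irq *irq, void *dev_id)
{
	/* Free the string only after the IRQ no longer references it. */
	const char *name = fake_free_irq(irq, dev_id);

	if (name) {
		free((void *)name);
		live_names--;
	}
}

/* Self-check: returns the number of leaked names (0 when balanced). */
static int demo_pin_lifecycle(void)
{
	struct fake_irq irq = { 0 };
	int cookie;

	if (pin_request(&irq, 42, &cookie))
		return -1;
	pin_release(&irq, &cookie);
	return live_names;
}
```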

Re: TLB flushes on fixmap changes

2018-08-27 Thread Andy Lutomirski

On Mon, Aug 27, 2018 at 2:55 PM, Nadav Amit  wrote:
> at 1:16 PM, Nadav Amit  wrote:
>
>> at 12:58 PM, Andy Lutomirski  wrote:
>>
>>> On Mon, Aug 27, 2018 at 12:43 PM, Nadav Amit  wrote:
 at 12:10 PM, Nadav Amit  wrote:

> at 11:58 AM, Andy Lutomirski  wrote:
>
>> On Mon, Aug 27, 2018 at 11:54 AM, Nadav Amit  
>> wrote:
 On Mon, Aug 27, 2018 at 10:34 AM, Nadav Amit  
 wrote:
 What do you all think?
>>>
>>> I agree in general. But I think that current->mm would need to be 
>>> loaded, as
>>> otherwise I am afraid it would break switch_mm_irqs_off().
>>
>> What breaks?
>
> Actually nothing. I just saw the IBPB stuff regarding tsk, but it should 
> not
> matter.

 So here is what I got. It certainly needs some cleanup, but it boots.

 Let me know how crappy you find it...


 diff --git a/arch/x86/include/asm/mmu_context.h 
 b/arch/x86/include/asm/mmu_context.h
 index bbc796eb0a3b..336779650a41 100644
 --- a/arch/x86/include/asm/mmu_context.h
 +++ b/arch/x86/include/asm/mmu_context.h
 @@ -343,4 +343,24 @@ static inline unsigned long 
 __get_current_cr3_fast(void)
   return cr3;
 }

 +typedef struct {
 +   struct mm_struct *prev;
 +} temporary_mm_state_t;
 +
 +static inline temporary_mm_state_t use_temporary_mm(struct mm_struct *mm)
 +{
 +   temporary_mm_state_t state;
 +
 +   lockdep_assert_irqs_disabled();
 +   state.prev = this_cpu_read(cpu_tlbstate.loaded_mm);
 +   switch_mm_irqs_off(NULL, mm, current);
 +   return state;
 +}
 +
 +static inline void unuse_temporary_mm(temporary_mm_state_t prev)
 +{
 +   lockdep_assert_irqs_disabled();
 +   switch_mm_irqs_off(NULL, prev.prev, current);
 +}
 +
 #endif /* _ASM_X86_MMU_CONTEXT_H */
 diff --git a/arch/x86/include/asm/pgtable.h 
 b/arch/x86/include/asm/pgtable.h
 index 5715647fc4fe..ef62af9a0ef7 100644
 --- a/arch/x86/include/asm/pgtable.h
 +++ b/arch/x86/include/asm/pgtable.h
 @@ -976,6 +976,10 @@ static inline void __meminit 
 init_trampoline_default(void)
   /* Default trampoline pgd value */
   trampoline_pgd_entry = init_top_pgt[pgd_index(__PAGE_OFFSET)];
 }
 +
 +void __init patching_mm_init(void);
 +#define patching_mm_init patching_mm_init
 +
 # ifdef CONFIG_RANDOMIZE_MEMORY
 void __meminit init_trampoline(void);
 # else
 diff --git a/arch/x86/include/asm/pgtable_64_types.h 
 b/arch/x86/include/asm/pgtable_64_types.h
 index 054765ab2da2..9f44262abde0 100644
 --- a/arch/x86/include/asm/pgtable_64_types.h
 +++ b/arch/x86/include/asm/pgtable_64_types.h
 @@ -116,6 +116,9 @@ extern unsigned int ptrs_per_p4d;
 #define LDT_PGD_ENTRY  (pgtable_l5_enabled() ? LDT_PGD_ENTRY_L5 : 
 LDT_PGD_ENTRY_L4)
 #define LDT_BASE_ADDR  (LDT_PGD_ENTRY << PGDIR_SHIFT)

 +#define TEXT_POKE_PGD_ENTRY -5UL
 +#define TEXT_POKE_ADDR (TEXT_POKE_PGD_ENTRY << PGDIR_SHIFT)
 +
 #define __VMALLOC_BASE_L4  0xc900UL
 #define __VMALLOC_BASE_L5  0xffa0UL

 diff --git a/arch/x86/include/asm/pgtable_types.h 
 b/arch/x86/include/asm/pgtable_types.h
 index 99fff853c944..840c72ec8c4f 100644
 --- a/arch/x86/include/asm/pgtable_types.h
 +++ b/arch/x86/include/asm/pgtable_types.h
 @@ -505,6 +505,9 @@ pgprot_t phys_mem_access_prot(struct file *file, 
 unsigned long pfn,
 /* Install a pte for a particular vaddr in kernel space. */
 void set_pte_vaddr(unsigned long vaddr, pte_t pte);

 +struct mm_struct;
 +void set_mm_pte_vaddr(struct mm_struct *mm, unsigned long vaddr, pte_t 
 pte);
 +
 #ifdef CONFIG_X86_32
 extern void native_pagetable_init(void);
 #else
 diff --git a/arch/x86/include/asm/text-patching.h 
 b/arch/x86/include/asm/text-patching.h
 index 2ecd34e2d46c..cb364ea5b19d 100644
 --- a/arch/x86/include/asm/text-patching.h
 +++ b/arch/x86/include/asm/text-patching.h
 @@ -38,4 +38,6 @@ extern void *text_poke(void *addr, const void *opcode, 
 size_t len);
 extern int poke_int3_handler(struct pt_regs *regs);
 extern void *text_poke_bp(void *addr, const void *opcode, size_t len, void 
 *handler);

 +extern struct mm_struct *patching_mm;
 +
 #endif /* _ASM_X86_TEXT_PATCHING_H */
 diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
 index a481763a3776..fd8a950b0d62 100644
 --- a/arch/x86/kernel/alternative.c
 +++ b/arch/x86/kernel/alternative.c
 @@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
 +#include 
 #include 
 #include 
 #include 
 @@ -701,8 +702,36 @@ void *text_poke(void *addr, const void *opcode, 
 size_t len)

Re: TLB flushes on fixmap changes

2018-08-27 Thread Andy Lutomirski

On Mon, Aug 27, 2018 at 2:55 PM, Nadav Amit  wrote:
> at 1:16 PM, Nadav Amit  wrote:
>
>> at 12:58 PM, Andy Lutomirski  wrote:
>>
>>> On Mon, Aug 27, 2018 at 12:43 PM, Nadav Amit  wrote:
 at 12:10 PM, Nadav Amit  wrote:

> at 11:58 AM, Andy Lutomirski  wrote:
>
>> On Mon, Aug 27, 2018 at 11:54 AM, Nadav Amit  
>> wrote:
 On Mon, Aug 27, 2018 at 10:34 AM, Nadav Amit  
 wrote:
 What do you all think?
>>>
>>> I agree in general. But I think that current->mm would need to be 
>>> loaded, as
>>> otherwise I am afraid it would break switch_mm_irqs_off().
>>
>> What breaks?
>
> Actually nothing. I just saw the IBPB stuff regarding tsk, but it should 
> not
> matter.

 So here is what I got. It certainly needs some cleanup, but it boots.

 Let me know how crappy you find it...


 diff --git a/arch/x86/include/asm/mmu_context.h 
 b/arch/x86/include/asm/mmu_context.h
 index bbc796eb0a3b..336779650a41 100644
 --- a/arch/x86/include/asm/mmu_context.h
 +++ b/arch/x86/include/asm/mmu_context.h
 @@ -343,4 +343,24 @@ static inline unsigned long 
 __get_current_cr3_fast(void)
   return cr3;
 }

 +typedef struct {
 +   struct mm_struct *prev;
 +} temporary_mm_state_t;
 +
 +static inline temporary_mm_state_t use_temporary_mm(struct mm_struct *mm)
 +{
 +   temporary_mm_state_t state;
 +
 +   lockdep_assert_irqs_disabled();
 +   state.prev = this_cpu_read(cpu_tlbstate.loaded_mm);
 +   switch_mm_irqs_off(NULL, mm, current);
 +   return state;
 +}
 +
 +static inline void unuse_temporary_mm(temporary_mm_state_t prev)
 +{
 +   lockdep_assert_irqs_disabled();
 +   switch_mm_irqs_off(NULL, prev.prev, current);
 +}
 +
 #endif /* _ASM_X86_MMU_CONTEXT_H */
 diff --git a/arch/x86/include/asm/pgtable.h 
 b/arch/x86/include/asm/pgtable.h
 index 5715647fc4fe..ef62af9a0ef7 100644
 --- a/arch/x86/include/asm/pgtable.h
 +++ b/arch/x86/include/asm/pgtable.h
 @@ -976,6 +976,10 @@ static inline void __meminit 
 init_trampoline_default(void)
   /* Default trampoline pgd value */
   trampoline_pgd_entry = init_top_pgt[pgd_index(__PAGE_OFFSET)];
 }
 +
 +void __init patching_mm_init(void);
 +#define patching_mm_init patching_mm_init
 +
 # ifdef CONFIG_RANDOMIZE_MEMORY
 void __meminit init_trampoline(void);
 # else
 diff --git a/arch/x86/include/asm/pgtable_64_types.h 
 b/arch/x86/include/asm/pgtable_64_types.h
 index 054765ab2da2..9f44262abde0 100644
 --- a/arch/x86/include/asm/pgtable_64_types.h
 +++ b/arch/x86/include/asm/pgtable_64_types.h
 @@ -116,6 +116,9 @@ extern unsigned int ptrs_per_p4d;
 #define LDT_PGD_ENTRY  (pgtable_l5_enabled() ? LDT_PGD_ENTRY_L5 : 
 LDT_PGD_ENTRY_L4)
 #define LDT_BASE_ADDR  (LDT_PGD_ENTRY << PGDIR_SHIFT)

 +#define TEXT_POKE_PGD_ENTRY -5UL
 +#define TEXT_POKE_ADDR (TEXT_POKE_PGD_ENTRY << PGDIR_SHIFT)
 +
 #define __VMALLOC_BASE_L4  0xc900UL
 #define __VMALLOC_BASE_L5  0xffa0UL

 diff --git a/arch/x86/include/asm/pgtable_types.h 
 b/arch/x86/include/asm/pgtable_types.h
 index 99fff853c944..840c72ec8c4f 100644
 --- a/arch/x86/include/asm/pgtable_types.h
 +++ b/arch/x86/include/asm/pgtable_types.h
 @@ -505,6 +505,9 @@ pgprot_t phys_mem_access_prot(struct file *file, 
 unsigned long pfn,
 /* Install a pte for a particular vaddr in kernel space. */
 void set_pte_vaddr(unsigned long vaddr, pte_t pte);

 +struct mm_struct;
 +void set_mm_pte_vaddr(struct mm_struct *mm, unsigned long vaddr, pte_t 
 pte);
 +
 #ifdef CONFIG_X86_32
 extern void native_pagetable_init(void);
 #else
 diff --git a/arch/x86/include/asm/text-patching.h 
 b/arch/x86/include/asm/text-patching.h
 index 2ecd34e2d46c..cb364ea5b19d 100644
 --- a/arch/x86/include/asm/text-patching.h
 +++ b/arch/x86/include/asm/text-patching.h
 @@ -38,4 +38,6 @@ extern void *text_poke(void *addr, const void *opcode, 
 size_t len);
 extern int poke_int3_handler(struct pt_regs *regs);
 extern void *text_poke_bp(void *addr, const void *opcode, size_t len, void 
 *handler);

 +extern struct mm_struct *patching_mm;
 +
 #endif /* _ASM_X86_TEXT_PATCHING_H */
 diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
 index a481763a3776..fd8a950b0d62 100644
 --- a/arch/x86/kernel/alternative.c
 +++ b/arch/x86/kernel/alternative.c
 @@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
 +#include 
 #include 
 #include 
 #include 
 @@ -701,8 +702,36 @@ void *text_poke(void *addr, const void *opcode, 
 size_t len)

Re: [PATCH 1/3] ARM: dts: NSP: Enable SFP on bcm958625hr

2018-08-27 Thread Russell King - ARM Linux

On Mon, Aug 27, 2018 at 01:03:42PM -0700, Florian Fainelli wrote:
> Enable the SFP connected to port 5 of the switch and wire up all GPIOs
> to the SFP cage. Because of a hardware limitation of the i2c controller
> on the iProc SoCs which prevents large i2c (> 256 bytes) transactions to
> work, we use the i2c-gpio interface instead, which does not have that
> limitation. This allows us to read the SFP module EEPROM, which would
> not be possible otherwise since it exceeds that size during a single
> read transfer.

We shouldn't exceed 256 bytes, since 256 bytes is the "page" size
of the EEPROM.  The most that we read in one block is either
ETH_MODULE_SFF_8079_LEN or (ETH_MODULE_SFF_8472_LEN - 
ETH_MODULE_SFF_8079_LEN), both of which result in no more than 256
byte reads.
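
As a rough check of that arithmetic, here is a minimal sketch. It assumes
the two lengths are 256 and 512 bytes, as defined in
include/uapi/linux/ethtool.h; they are redefined locally so the snippet
builds on its own. The helper name sfp_max_block_read() is made up for
illustration and is not part of the SFP code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumed to mirror include/uapi/linux/ethtool.h; redefined here only so
 * that this sketch compiles standalone.
 */
#define ETH_MODULE_SFF_8079_LEN 256
#define ETH_MODULE_SFF_8472_LEN 512

/*
 * Hypothetical helper (not a kernel function): returns the larger of the
 * two block sizes described above, i.e. the biggest single read.
 */
size_t sfp_max_block_read(void)
{
	size_t first = ETH_MODULE_SFF_8079_LEN;
	size_t second = ETH_MODULE_SFF_8472_LEN - ETH_MODULE_SFF_8079_LEN;

	return first > second ? first : second;
}

/* Compile-time check: neither block is larger than one 256-byte page. */
static_assert(ETH_MODULE_SFF_8079_LEN <= 256,
	      "first block fits in one page");
static_assert(ETH_MODULE_SFF_8472_LEN - ETH_MODULE_SFF_8079_LEN <= 256,
	      "second block fits in one page");
```

With those values both blocks come out at exactly 256 bytes, so a single
read sits at the 256-byte boundary rather than above it.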

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 13.8Mbps down 630kbps up
According to speedtest.net: 13Mbps down 490kbps up

