date:20200630

Re: [PATCH 8/9] soundwire: intel: add wake interrupt support

2020-06-30 Thread Vinod Koul

On 30-06-20, 12:18, Pierre-Louis Bossart wrote:
> > > + return 0;
> > > + }
> > > +
> > > + shim = sdw->link_res->shim;
> > > + wake_sts = intel_readw(shim, SDW_SHIM_WAKESTS);
> > > +
> > > + if (!(wake_sts & BIT(sdw->instance)))
> > > + return 0;
> > > +
> > > + /* disable WAKEEN interrupt ASAP to prevent interrupt flood */
> > > + intel_shim_wake(sdw, false);
> > 
> > when & where is this enabled?
> 
> in follow-up patches where the clock-stop mode is enabled.

ok

> > > +  * wake up master and slave so that slave can notify master
> > > +  * the wakeen event and let codec driver check codec status
> > > +  */
> > > + list_for_each_entry(slave, &bus->slaves, node) {
> > > + /*
> > > +  * discard devices that are defined in ACPI tables but
> > > +  * not physically present and devices that cannot
> > > +  * generate wakes
> > > +  */
> > > + if (slave->dev_num_sticky && slave->prop.wake_capable)
> > > + pm_request_resume(&slave->dev);
> > 
> > Hmmm, shouldn't slave do this? would it not make sense to notify the
> > slave thru callback and then slave decides to resume or not..?
> 
> In this mode, the bus is in clock-stop mode, and events are detected with a
> level detector tied to PCI events. The master and slave devices are all in
> pm_runtime suspended states. The codec cannot make any decisions on its own
> since the bus is stopped; it needs to resume first, which assumes that the
> master resumes first and the enumeration is redone before it can access any
> of its registers.
> 
> By looping through the list of devices that can generate events, you end up
> first forcing the master to resume, and then each slave resumes and can
> check who generated the event and what happened while suspended. If the
> codec didn't generate the event it will go back to suspended mode after the
> usual timeout.
> 
> We can add a callback but that callback would only be used for Intel
> solutions, but internally it would only do a pm_request_resume() since the
> codec cannot make any decisions before first resuming. In other words, it
> would be an Intel-specific callback that is implemented using generic resume
> operations. It's probably better to keep this in Intel-specific code, no?

I do not like the idea that a device would be woken up; that kind of
defeats the whole idea behind runtime PM. Waking up a device to
check the events is a generic sdw concept; I don't see that as an
Intel-specific one.

I would like to see a generic callback for the devices and let the devices
do the resume part; that is the standard operating principle when we deal
with suspended devices. If a device thinks it needs to resume, it
will do the runtime resume, check the status and sleep if not
applicable. Since we have set the parents correctly, any resume
operation for slaves would wake the master up as well...

I do not see a need for the Intel driver to resume slave devices here, or
did I miss something?

-- 
~Vinod
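
Editor's sketch of the two designs discussed above, as a userspace mock
rather than kernel code. Everything here is an assumption made for
illustration: struct mock_slave, mock_request_resume() and the wake_notify
callback are hypothetical names, not part of the SoundWire API. The first
helper mirrors the loop in the patch (the master filters and resumes
directly); the second mirrors the reviewer's suggestion (the master only
notifies, and each device decides for itself).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sdw_slave; not the kernel definition. */
struct mock_slave {
	bool dev_num_sticky;	/* set once the device has enumerated */
	bool wake_capable;	/* mirrors slave->prop.wake_capable */
	int resume_requests;	/* counts mocked pm_request_resume() calls */
	/* Hypothetical generic callback: the slave decides whether to resume. */
	void (*wake_notify)(struct mock_slave *slave);
};

/* Mock of pm_request_resume(): only records that a resume was requested. */
static void mock_request_resume(struct mock_slave *slave)
{
	slave->resume_requests++;
}

/* Approach in the patch: the master driver filters and resumes directly. */
static void master_resumes_slaves(struct mock_slave *slaves, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (slaves[i].dev_num_sticky && slaves[i].wake_capable)
			mock_request_resume(&slaves[i]);
}

/* Approach suggested in review: notify each slave and let it decide. */
static void master_notifies_slaves(struct mock_slave *slaves, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (slaves[i].wake_notify)
			slaves[i].wake_notify(&slaves[i]);
}

/* Example slave policy: resume only if this device can generate wakes. */
static void codec_wake_notify(struct mock_slave *slave)
{
	if (slave->wake_capable)
		mock_request_resume(slave);
}
```

In both variants the codec still has to resume before it can read its own
status, which is the point made in the quoted reply; the difference is only
in who issues the resume request.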

Re: [PATCH 6/6] mm: Add memalloc_nowait

2020-06-30 Thread Michal Hocko

On Wed 01-07-20 05:12:03, Matthew Wilcox wrote:
> On Tue, Jun 30, 2020 at 08:34:36AM +0200, Michal Hocko wrote:
> > On Mon 29-06-20 22:28:30, Matthew Wilcox wrote:
> > [...]
> > > The documentation is hard to add a new case to, so I rewrote it.  What
> > > do you think?  (Obviously I'll split this out differently for submission;
> > > this is just what I have in my tree right now).
> > 
> > I am fine with your changes. Few notes below.
> 
> Thanks!
> 
> > > -It turned out though that above approach has led to
> > > -abuses when the restricted gfp mask is used "just in case" without a
> > > -deeper consideration which leads to problems because an excessive use
> > > -of GFP_NOFS/GFP_NOIO can lead to memory over-reclaim or other memory
> > > -reclaim issues.
> > 
> > > I believe this is an important part because it shows that new people
> > > coming to the existing code shouldn't take it as correct and should
> > > rather question it. It also gives a clear indication that overuse is
> > > causing real problems that might not be immediately visible to
> > > subsystems outside of MM.
> 
> It seemed to say a lot of the same things as this paragraph:
> 
> +You may notice that quite a few allocations in the existing code specify
> +``GFP_NOIO`` or ``GFP_NOFS``. Historically, they were used to prevent
> +recursion deadlocks caused by direct memory reclaim calling back into
> +the FS or IO paths and blocking on already held resources. Since 4.12
> +the preferred way to address this issue is to use the new scope APIs
> +described below.
> 
> Since this is in core-api/ rather than vm/, I felt that discussion of
> the problems that it causes to the mm was a bit too much detail for the
> people who would be reading this document.  Maybe I could move that
> information into a new Documentation/vm/reclaim.rst file?

Hmm, my experience is that at least some users of NOFS/NOIO use this
flag just to be sure they do not do something wrong without realizing
that this might have a very negative effect on the whole system
operation. That was the main motivation to have an explicit note there.
I am not sure having that in MM internal documentation will make it
stand out for a general reader.

But I will not insist of course.

> Let's see if Our Grumpy Editor has time to give us his advice on this.
> 
> > > -FS/IO code then simply calls the appropriate save function before
> > > -any critical section with respect to the reclaim is started - e.g.
> > > -lock shared with the reclaim context or when a transaction context
> > > -nesting would be possible via reclaim.  
> > 
> > [...]
> > 
> > > +These functions should be called at the point where any memory allocation
> > > +would start to cause problems.  That is, do not simply wrap individual
> > > +memory allocation calls which currently use ``GFP_NOFS`` with a pair
> > > +of calls to memalloc_nofs_save() and memalloc_nofs_restore().  Instead,
> > > +find the lock which is taken that would cause problems if memory reclaim
> > > +reentered the filesystem, place a call to memalloc_nofs_save() before it
> > > +is acquired and a call to memalloc_nofs_restore() after it is released.
> > > +Ideally also add a comment explaining why this lock will be problematic.
> > 
> > > The above text has mentioned the transaction context nesting as well and
> > > that was a hint by Dave IIRC. It is imho good to have an example of
> > > reentrant points other than just locks. I believe another useful example
> > > would be something like the loop device, which is mixing IO and FS layers,
> > > but I am not familiar with all the details to give you a useful text.
> 
> I'll let Mikulas & Dave finish fighting about that before I write any
> text mentioning the loop driver.  How about this for mentioning the
> filesystem transaction possibility?
> 
> @@ -103,12 +103,16 @@ flags specified by any particular call to allocate memory.
>  
>  These functions should be called at the point where any memory allocation
>  would start to cause problems.  That is, do not simply wrap individual
> -memory allocation calls which currently use ``GFP_NOFS`` with a pair
> -of calls to memalloc_nofs_save() and memalloc_nofs_restore().  Instead,
> -find the lock which is taken that would cause problems if memory reclaim
> +memory allocation calls which currently use ``GFP_NOFS`` with a pair of
> +calls to memalloc_nofs_save() and memalloc_nofs_restore().  Instead, find
> +the resource which is acquired that would cause problems if memory reclaim
>  reentered the filesystem, place a call to memalloc_nofs_save() before it
>  is acquired and a call to memalloc_nofs_restore() after it is released.
>  Ideally also add a comment explaining why this lock will be problematic.
> +A resource might be a lock which would need to be acquired by an attempt
> +to reclaim memory, or it might be starting a transaction that should not
> +nest over a memory reclaim transaction.  Deep knowledge of the filesystem
> +or driver is often needed to place memory scoping calls
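
Editor's sketch of the scoping rule in the quoted documentation text, as a
userspace mock rather than kernel code. The names mock_nofs_save(),
mock_nofs_restore(), MOCK_PF_MEMALLOC_NOFS and task_flags are assumptions
standing in for memalloc_nofs_save(), memalloc_nofs_restore() and the
per-task flag. It shows the scope being opened before the problematic
resource is acquired and closed after it is released, and why returning the
previous state lets scopes nest.

```c
#include <stdbool.h>

#define MOCK_PF_MEMALLOC_NOFS 0x1u	/* hypothetical flag bit for this sketch */

static unsigned int task_flags;		/* stands in for the per-task flags */

/* Enter a NOFS scope; return the previous flag state so scopes can nest. */
static unsigned int mock_nofs_save(void)
{
	unsigned int old = task_flags & MOCK_PF_MEMALLOC_NOFS;

	task_flags |= MOCK_PF_MEMALLOC_NOFS;
	return old;
}

/* Leave a NOFS scope by restoring the state returned by the matching save. */
static void mock_nofs_restore(unsigned int old)
{
	task_flags = (task_flags & ~MOCK_PF_MEMALLOC_NOFS) | old;
}

/* An allocation may re-enter the filesystem only outside any NOFS scope. */
static bool alloc_may_enter_fs(void)
{
	return !(task_flags & MOCK_PF_MEMALLOC_NOFS);
}

static bool observed_inside;	/* what an allocation under the resource saw */

static void do_transaction(void)
{
	/* The scope starts before the problematic resource is acquired ... */
	unsigned int flags = mock_nofs_save();

	/* ... lock taken or transaction started here (elided) ... */
	observed_inside = alloc_may_enter_fs();

	/* ... and ends only after the resource has been released. */
	mock_nofs_restore(flags);
}
```

Every allocation made while the resource is held is covered without touching
the individual allocation calls, which is the behaviour the text asks for in
place of wrapping each GFP_NOFS call site.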

Re: [PATCH v2] dt-bindings: display: Convert connectors to DT schema

2020-06-30 Thread Laurent Pinchart

Hi Rob,

Thank you for the patch.

On Tue, Jun 30, 2020 at 02:02:16PM -0600, Rob Herring wrote:
> Convert the analog TV, DVI, HDMI, and VGA connector bindings to DT schema
> format.
> 
> Cc: Sam Ravnborg 
> Cc: Laurent Pinchart 
> Cc: Maxime Ripard 
> Signed-off-by: Rob Herring 

Reviewed-by: Laurent Pinchart 

> ---
> v2:
> - Make Laurent maintainer
> - Add missing port and compatible required
> - Drop copy-n-paste 'type' from dvi-connector
> - Use 4 space indent on examples
> ---
>  .../display/connector/analog-tv-connector.txt | 31 
>  .../connector/analog-tv-connector.yaml| 52 ++
>  .../display/connector/dvi-connector.txt   | 36 --
>  .../display/connector/dvi-connector.yaml  | 70 +++
>  .../display/connector/hdmi-connector.txt  | 31 
>  .../display/connector/hdmi-connector.yaml | 64 +
>  .../display/connector/vga-connector.txt   | 36 --
>  .../display/connector/vga-connector.yaml  | 46 
>  8 files changed, 232 insertions(+), 134 deletions(-)
>  delete mode 100644 Documentation/devicetree/bindings/display/connector/analog-tv-connector.txt
>  create mode 100644 Documentation/devicetree/bindings/display/connector/analog-tv-connector.yaml
>  delete mode 100644 Documentation/devicetree/bindings/display/connector/dvi-connector.txt
>  create mode 100644 Documentation/devicetree/bindings/display/connector/dvi-connector.yaml
>  delete mode 100644 Documentation/devicetree/bindings/display/connector/hdmi-connector.txt
>  create mode 100644 Documentation/devicetree/bindings/display/connector/hdmi-connector.yaml
>  delete mode 100644 Documentation/devicetree/bindings/display/connector/vga-connector.txt
>  create mode 100644 Documentation/devicetree/bindings/display/connector/vga-connector.yaml
> 
> diff --git a/Documentation/devicetree/bindings/display/connector/analog-tv-connector.txt b/Documentation/devicetree/bindings/display/connector/analog-tv-connector.txt
> deleted file mode 100644
> index 883bcb2604c7..
> --- a/Documentation/devicetree/bindings/display/connector/analog-tv-connector.txt
> +++ /dev/null
> @@ -1,31 +0,0 @@
> -Analog TV Connector
> -===
> -
> -Required properties:
> -- compatible: "composite-video-connector" or "svideo-connector"
> -
> -Optional properties:
> -- label: a symbolic name for the connector
> -- sdtv-standards: limit the supported TV standards on a connector to the given
> -  ones. If not specified all TV standards are allowed.
> -  Possible TV standards are defined in
> -  include/dt-bindings/display/sdtv-standards.h.
> -
> -Required nodes:
> -- Video port for TV input
> -
> -Example
> 
> -#include <dt-bindings/display/sdtv-standards.h>
> -
> -tv: connector {
> -	compatible = "composite-video-connector";
> -	label = "tv";
> -	sdtv-standards = <(SDTV_STD_PAL | SDTV_STD_NTSC)>;
> -
> -	port {
> -		tv_connector_in: endpoint {
> -			remote-endpoint = <_out>;
> -		};
> -	};
> -};
> diff --git a/Documentation/devicetree/bindings/display/connector/analog-tv-connector.yaml b/Documentation/devicetree/bindings/display/connector/analog-tv-connector.yaml
> new file mode 100644
> index ..eebe88fed999
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/display/connector/analog-tv-connector.yaml
> @@ -0,0 +1,52 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/display/connector/analog-tv-connector.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Analog TV Connector
> +
> +maintainers:
> +  - Laurent Pinchart 
> +
> +properties:
> +  compatible:
> +    enum:
> +      - composite-video-connector
> +      - svideo-connector
> +
> +  label: true
> +
> +  sdtv-standards:
> +    description:
> +      Limit the supported TV standards on a connector to the given ones. If
> +      not specified all TV standards are allowed. Possible TV standards are
> +      defined in include/dt-bindings/display/sdtv-standards.h.
> +    $ref: /schemas/types.yaml#/definitions/uint32
> +
> +  port:
> +    description: Connection to controller providing analog TV signals
> +
> +required:
> +  - compatible
> +  - port
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +    #include <dt-bindings/display/sdtv-standards.h>
> +
> +    connector {
> +        compatible = "composite-video-connector";
> +        label = "tv";
> +        sdtv-standards = <(SDTV_STD_PAL | SDTV_STD_NTSC)>;
> +
> +        port {
> +            tv_connector_in: endpoint {
> +                remote-endpoint = <_out>;
> +            };
> +        };
> +    };
> +
> +...
> diff --git a/Documentation/devicetree/bindings/display/connector/dvi-connector.txt b/Documentation/devicetree/bindings/display/connector/dvi-connector.txt
> deleted file mode 100644
> index 207e42e9eba0..
> ---

Re: linux-next: manual merge of the rcu tree with the kbuild tree

2020-06-30 Thread Marco Elver

On Wed, 1 Jul 2020 at 03:34, Stephen Rothwell  wrote:
>
> Hi all,
>
> Today's linux-next merge of the rcu tree got a conflict in:
>
>   kernel/kcsan/Makefile
>
> between commit:
>
>   f7c28e224da6 ("kbuild: remove cc-option test of -fno-stack-protector")

Is it possible that this patch drops the KCSAN portion? The patch
"kcsan: Simplify compiler flags" does the same, but is part of a
future pull request intended for 5.9.

The KCSAN changes had been in -next for well over a week. Also, I'm
sorry I hadn't seen your patch before, otherwise I would have noticed
this.

Please see: https://lkml.kernel.org/r/20200624190236.GA6603@paulmck-ThinkPad-P72

> from the kbuild tree and commits:
>
>   2839a232071f ("kcsan: Simplify compiler flags")
>   61d56d7aa5ec ("kcsan: Disable branch tracing in core runtime")
>
> from the rcu tree.
>
> I fixed it up (I just used the rcu tree version) and can carry the fix
> as necessary. This is now fixed as far as linux-next is concerned, but
> any non trivial conflicts should be mentioned to your upstream maintainer
> when your tree is submitted for merging.  You may also want to consider
> cooperating with the maintainer of the conflicting tree to minimise any
> particularly complex conflicts.

Thank you!

Thanks,
-- Marco

Re: [PATCH 2/2] watchdog: rti: tweak min_hw_heartbeat_ms to match initial allowed window

2020-06-30 Thread Tero Kristo


On 30/06/2020 23:23, Guenter Roeck wrote:

On Thu, Jun 25, 2020 at 08:04:50PM +0300, Tero Kristo wrote:

On 25/06/2020 16:35, Guenter Roeck wrote:

On 6/25/20 1:32 AM, Tero Kristo wrote:

On 24/06/2020 18:24, Jan Kiszka wrote:

On 24.06.20 13:45, Tero Kristo wrote:

If the RTI watchdog has been started by someone (like bootloader) when
the driver probes, we must adjust the initial ping timeout to match the
currently running watchdog window to avoid generating watchdog reset.

Signed-off-by: Tero Kristo 
---
    drivers/watchdog/rti_wdt.c | 25 +
    1 file changed, 25 insertions(+)

diff --git a/drivers/watchdog/rti_wdt.c b/drivers/watchdog/rti_wdt.c
index d456dd72d99a..02ea2b2435f5 100644
--- a/drivers/watchdog/rti_wdt.c
+++ b/drivers/watchdog/rti_wdt.c
@@ -55,11 +55,13 @@ static int heartbeat;
     * @base - base io address of WD device
     * @freq - source clock frequency of WDT
     * @wdd  - hold watchdog device as is in WDT core
+ * @min_hw_heartbeat_save - save of the min hw heartbeat value
     */
    struct rti_wdt_device {
    void __iomem    *base;
    unsigned long    freq;
    struct watchdog_device    wdd;
+    unsigned int    min_hw_heartbeat_save;
    };
    static int rti_wdt_start(struct watchdog_device *wdd)
@@ -107,6 +109,11 @@ static int rti_wdt_ping(struct watchdog_device *wdd)
    /* put watchdog in active state */
    writel_relaxed(WDKEY_SEQ1, wdt->base + RTIWDKEY);
+    if (wdt->min_hw_heartbeat_save) {
+    wdd->min_hw_heartbeat_ms = wdt->min_hw_heartbeat_save;
+    wdt->min_hw_heartbeat_save = 0;
+    }
+
    return 0;
    }
@@ -201,6 +208,24 @@ static int rti_wdt_probe(struct platform_device *pdev)
    goto err_iomap;
    }
+    if (readl(wdt->base + RTIDWDCTRL) == WDENABLE_KEY) {
+    u32 time_left;
+    u32 heartbeat;
+
+    set_bit(WDOG_HW_RUNNING, &wdd->status);
+    time_left = rti_wdt_get_timeleft(wdd);
+    heartbeat = readl(wdt->base + RTIDWDPRLD);
+    heartbeat <<= WDT_PRELOAD_SHIFT;
+    heartbeat /= wdt->freq;
+    if (time_left < heartbeat / 2)
+    wdd->min_hw_heartbeat_ms = 0;
+    else
+    wdd->min_hw_heartbeat_ms =
+    (time_left - heartbeat / 2 + 1) * 1000;
+
+    wdt->min_hw_heartbeat_save = 11 * heartbeat * 1000 / 20;
+    }
+
    ret = watchdog_register_device(wdd);
    if (ret) {
    dev_err(dev, "cannot register watchdog device\n");



This assumes that the bootloader also programmed a 50% window, right? The
pending U-Boot patch will do that, but what if that changes or someone uses
a different setup?


Yes, we assume 50%. I think based on the hw design, 50% is the only sane value 
to be used, otherwise you just shrink the open window too much and for no 
apparent reason.
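
Editor's sketch of the arithmetic in the probe hunk above, under the 50%
window assumption just discussed. The helper names are hypothetical; the
formulas are copied from the patch, with both inputs taken in seconds as in
the driver. The first helper gives the wait that applies only until the
first ping, the second gives the value the driver switches to afterwards.

```c
#include <stdint.h>

/*
 * Delay before the first ping is allowed when the bootloader already
 * started the watchdog. With a 50% window, the window opens once the
 * time left drops below half of the heartbeat.
 */
static uint32_t initial_min_hw_heartbeat_ms(uint32_t time_left_s,
					    uint32_t heartbeat_s)
{
	if (time_left_s < heartbeat_s / 2)
		return 0;	/* window already open, ping immediately */

	/* wait until the window opens, plus one second of margin */
	return (time_left_s - heartbeat_s / 2 + 1) * 1000;
}

/* Value restored after the first ping: 55% of the heartbeat, in ms. */
static uint32_t steady_min_hw_heartbeat_ms(uint32_t heartbeat_s)
{
	return 11 * heartbeat_s * 1000 / 20;
}
```

For a 60 second heartbeat with 50 seconds left, the first helper yields
21000 ms; with 20 seconds left it yields 0; the steady-state value is
33000 ms. A window other than 50% would need a different divisor in both
helpers, which is the concern raised in the question above.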



Not sure if that is a valid assumption. Someone who designs a watchdog
with such a narrow ping window might as well also use it. The question
is if you want to rely on that assumption, or check and change it if needed.


Right, if that is a blocker, I can modify the code. It should be just a
couple of added lines.


Also, I wonder if we should add an API function such as
"set_last_hw_keepalive()" to avoid all that complexity.


I can try adding that also if it is desirable.



But wait, the code doesn't really match what the description of this
patch claims, or at least the description is misleading. Per the
description, this is to prevent an early timeout. However, the problem
here is that the watchdog core does not generate a ping, even if
requested, because it believes that it just generated one right before
the watchdog timer was registered, and that it can not generate another
one because min_hw_heartbeat_ms has not elapsed.
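
Editor's sketch of the behaviour described in the paragraph above, as a
simplified userspace model and not the watchdog core's actual code. The
struct and function names are hypothetical. The model assumes the core
records the time of the last hardware keepalive and refuses to touch the
hardware again until min_hw_heartbeat_ms has elapsed since then.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the state the core keeps per watchdog. */
struct mock_wdt {
	uint64_t last_hw_keepalive_ms;	/* when the core thinks it last pinged */
	uint32_t min_hw_heartbeat_ms;	/* minimum spacing between hw pings */
	unsigned int hw_pings;		/* pings that reached the hardware */
};

/* Returns true if the ping reached the hardware, false if it was deferred. */
static bool mock_ping(struct mock_wdt *w, uint64_t now_ms)
{
	uint64_t earliest = w->last_hw_keepalive_ms + w->min_hw_heartbeat_ms;

	if (now_ms < earliest)
		return false;	/* too early: the core would defer the ping */

	w->hw_pings++;
	w->last_hw_keepalive_ms = now_ms;
	return true;
}
```

If registration sets last_hw_keepalive_ms to roughly "now", a ping requested
shortly afterwards is deferred even though the hardware window, measured
from the bootloader's start, may already be open or about to close.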


You are right. Maybe the patch description could use some more beef into it.



With that in mind, the problem is a bit more complex.

First, the driver doesn't really update the current timeout to the
value that is currently configured and enabled. Instead, it just
uses/assumes the default (DEFAULT_HEARTBEAT or whatever the heartbeat
module parameter is set to). This means that it is still possible for
an early timeout to occur if there is a mismatch between the bootloader
timeout and the timeout assumed by the driver. Worse, the timeout
is only updated in the start function - and the start function isn't
called if the watchdog is already running. Actually, the driver does
not support updating the timeout at all. This means that a mismatch
between the bootloader timeout and the timeout assumed by the driver
is not handled well.

To solve this, the driver would have to update the actual timeout to
whatever is programmed into the chip and ignore any module parameter
and default settings if the watchdog is already running. Alternatively,
it would have to support updating the timeout (if the hardware supports
that) after the watchdog was started.


Hardware supports changing the timeout value,

Re: [PATCH] cpuidle: change enter_s2idle() prototype

2020-06-30 Thread Daniel Lezcano

On 01/07/2020 04:39, Neal Liu wrote:
> On Mon, 2020-06-29 at 17:17 +0200, Rafael J. Wysocki wrote:
>> On Monday, June 29, 2020 11:05:40 AM CEST Neal Liu wrote:
>>> Control Flow Integrity (CFI) is a security mechanism that disallows
>>> changes to the original control flow graph of a compiled binary,
>>> making control-flow hijacking attacks significantly harder to perform.
>>>
>>> init_state_node() assigns the same function pointer to idle_state->enter
>>> and idle_state->enter_s2idle. This definitely causes a CFI failure
>>> when calling either enter() or enter_s2idle().
>>>
>>> Align enter_s2idle() with enter() function prototype to fix CFI
>>> failure.
>>
>> That needs to be documented somewhere close to the definition of the
>> callbacks in question.
>>
>> Otherwise it is completely unclear why this is a good idea.
>>
> 
> The problem is that init_state_node() assigns the same function callback
> to function pointers with different declarations.
> 
> static int init_state_node(struct cpuidle_state *idle_state,
>const struct of_device_id *matches,
>struct device_node *state_node)
> {
> ...
> idle_state->enter = match_id->data;
> ...
> idle_state->enter_s2idle = match_id->data;
> }
> 
> Function declarations:
> 
> struct cpuidle_state {
> ...
> int (*enter)(struct cpuidle_device *dev,
> struct cpuidle_driver *drv,
> int index);
> 
> void (*enter_s2idle) (struct cpuidle_device *dev,
>   struct cpuidle_driver *drv,
>   int index);
> };
> 
> In this case, calling either enter() or enter_s2idle() would cause a CFI
> check failure, since both point to the same callee.
> 
> We align enter_s2idle() with the function prototype of enter(), since
> enter() needs a return value for some use cases. The return value of
> enter_s2idle() is currently not needed.

Thanks for the clarification, you may add this description along with
the changelog.
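
Editor's sketch of the aligned prototypes, as a standalone mock and not the
kernel headers. The typedef enter_fn, struct mock_cpuidle_state and the
mock_* functions are hypothetical names; the two struct types are left
opaque. Once both members share one pointer type, a single callback can be
assigned to either without a type mismatch at the indirect call, which is
what a CFI check compares.

```c
struct cpuidle_device;	/* opaque in this sketch */
struct cpuidle_driver;	/* opaque in this sketch */

/* One prototype shared by both callbacks after the change. */
typedef int (*enter_fn)(struct cpuidle_device *dev,
			struct cpuidle_driver *drv, int index);

/* Hypothetical reduced version of struct cpuidle_state. */
struct mock_cpuidle_state {
	enter_fn enter;
	enter_fn enter_s2idle;	/* previously declared as returning void */
};

/* A single callback, as assigned twice in init_state_node(). */
static int mock_enter(struct cpuidle_device *dev,
		      struct cpuidle_driver *drv, int index)
{
	(void)dev;
	(void)drv;
	return index;
}

static void mock_init_state(struct mock_cpuidle_state *state)
{
	/* Both assignments now type-check against the same prototype. */
	state->enter = mock_enter;
	state->enter_s2idle = mock_enter;
}
```

With the old void-returning declaration, the second assignment would need a
cast, and the call through enter_s2idle would use a pointer type that does
not match the callee.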


>>> Signed-off-by: Neal Liu 
>>> ---
>>>  drivers/acpi/processor_idle.c   |6 --
>>>  drivers/cpuidle/cpuidle-tegra.c |8 +---
>>>  drivers/idle/intel_idle.c   |6 --
>>>  include/linux/cpuidle.h |6 +++---
>>>  4 files changed, 16 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
>>> index 75534c5..6ffb6c9 100644
>>> --- a/drivers/acpi/processor_idle.c
>>> +++ b/drivers/acpi/processor_idle.c
>>> @@ -655,8 +655,8 @@ static int acpi_idle_enter(struct cpuidle_device *dev,
>>> return index;
>>>  }
>>>  
>>> -static void acpi_idle_enter_s2idle(struct cpuidle_device *dev,
>>> -  struct cpuidle_driver *drv, int index)
>>> +static int acpi_idle_enter_s2idle(struct cpuidle_device *dev,
>>> + struct cpuidle_driver *drv, int index)
>>>  {
>>> struct acpi_processor_cx *cx = per_cpu(acpi_cstate[index], dev->cpu);
>>>  
>>> @@ -674,6 +674,8 @@ static void acpi_idle_enter_s2idle(struct cpuidle_device *dev,
>>> }
>>> }
>>> acpi_idle_do_entry(cx);
>>> +
>>> +   return 0;
>>>  }
>>>  
>>>  static int acpi_processor_setup_cpuidle_cx(struct acpi_processor *pr,
>>> diff --git a/drivers/cpuidle/cpuidle-tegra.c b/drivers/cpuidle/cpuidle-tegra.c
>>> index 1500458..a12fb14 100644
>>> --- a/drivers/cpuidle/cpuidle-tegra.c
>>> +++ b/drivers/cpuidle/cpuidle-tegra.c
>>> @@ -253,11 +253,13 @@ static int tegra_cpuidle_enter(struct cpuidle_device *dev,
>>> return err ? -1 : index;
>>>  }
>>>  
>>> -static void tegra114_enter_s2idle(struct cpuidle_device *dev,
>>> - struct cpuidle_driver *drv,
>>> - int index)
>>> +static int tegra114_enter_s2idle(struct cpuidle_device *dev,
>>> +struct cpuidle_driver *drv,
>>> +int index)
>>>  {
>>> tegra_cpuidle_enter(dev, drv, index);
>>> +
>>> +   return 0;
>>>  }
>>>  
>>>  /*
>>> diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
>>> index f449584..b178da3 100644
>>> --- a/drivers/idle/intel_idle.c
>>> +++ b/drivers/idle/intel_idle.c
>>> @@ -175,13 +175,15 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
>>>   * Invoked as a suspend-to-idle callback routine with frozen user space, frozen
>>>   * scheduler tick and suspended scheduler clock on the target CPU.
>>>   */
>>> -static __cpuidle void intel_idle_s2idle(struct cpuidle_device *dev,
>>> -   struct cpuidle_driver *drv, int index)
>>> +static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev,
>>> +  struct cpuidle_driver *drv, int index)
>>>  {
>>> unsigned long eax = flg2MWAIT(drv->states[index].flags);
>>> unsigned long ecx = 1; /* break on interrupt flag */
>>>  
>>> mwait_idle_with_hints(eax, ecx);
>>> +
>>> +   return 0;
>>>  }
>>>  
>>>  /*
>>> diff --git

Re: [PATCH v1] orinoco: use generic power management

2020-06-30 Thread Vaibhav Gupta

On Wed, 24 Jun 2020 at 23:14, Vaibhav Gupta  wrote:
>
> With the support of generic PM callbacks, drivers no longer need to use
> legacy .suspend() and .resume(), in which they had to maintain PCI state
> changes and the device's power state themselves. The required operations
> are done by the PCI core.
>
> PCI drivers are not expected to invoke PCI helper functions like
> pci_save/restore_state(), pci_enable/disable_device(),
> pci_set_power_state(), etc. Those tasks are completed by the PCI core itself.
>
> Compile-tested only.
>
> Signed-off-by: Vaibhav Gupta 
> ---
>  .../intersil/orinoco/orinoco_nortel.c |  3 +-
>  .../wireless/intersil/orinoco/orinoco_pci.c   |  3 +-
>  .../wireless/intersil/orinoco/orinoco_pci.h   | 32 ++-
>  .../wireless/intersil/orinoco/orinoco_plx.c   |  3 +-
>  .../wireless/intersil/orinoco/orinoco_tmd.c   |  3 +-
>  5 files changed, 13 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/net/wireless/intersil/orinoco/orinoco_nortel.c b/drivers/net/wireless/intersil/orinoco/orinoco_nortel.c
> index 048693b6c6c2..96a03d10a080 100644
> --- a/drivers/net/wireless/intersil/orinoco/orinoco_nortel.c
> +++ b/drivers/net/wireless/intersil/orinoco/orinoco_nortel.c
> @@ -290,8 +290,7 @@ static struct pci_driver orinoco_nortel_driver = {
> .id_table   = orinoco_nortel_id_table,
> .probe  = orinoco_nortel_init_one,
> .remove = orinoco_nortel_remove_one,
> -   .suspend= orinoco_pci_suspend,
> -   .resume = orinoco_pci_resume,
> +   .driver.pm  = &orinoco_pci_pm_ops,
>  };
>
>  static char version[] __initdata = DRIVER_NAME " " DRIVER_VERSION
> diff --git a/drivers/net/wireless/intersil/orinoco/orinoco_pci.c b/drivers/net/wireless/intersil/orinoco/orinoco_pci.c
> index 4938a2208a37..f3c86b07b1b9 100644
> --- a/drivers/net/wireless/intersil/orinoco/orinoco_pci.c
> +++ b/drivers/net/wireless/intersil/orinoco/orinoco_pci.c
> @@ -230,8 +230,7 @@ static struct pci_driver orinoco_pci_driver = {
> .id_table   = orinoco_pci_id_table,
> .probe  = orinoco_pci_init_one,
> .remove = orinoco_pci_remove_one,
> -   .suspend= orinoco_pci_suspend,
> -   .resume = orinoco_pci_resume,
> +   .driver.pm  = &orinoco_pci_pm_ops,
>  };
>
>  static char version[] __initdata = DRIVER_NAME " " DRIVER_VERSION
> diff --git a/drivers/net/wireless/intersil/orinoco/orinoco_pci.h b/drivers/net/wireless/intersil/orinoco/orinoco_pci.h
> index 43f5b9f5a0b0..d49d940864b4 100644
> --- a/drivers/net/wireless/intersil/orinoco/orinoco_pci.h
> +++ b/drivers/net/wireless/intersil/orinoco/orinoco_pci.h
> @@ -18,51 +18,37 @@ struct orinoco_pci_card {
> void __iomem *attr_io;
>  };
>
> -#ifdef CONFIG_PM
> -static int orinoco_pci_suspend(struct pci_dev *pdev, pm_message_t state)
> +static int __maybe_unused orinoco_pci_suspend(struct device *dev_d)
>  {
> +   struct pci_dev *pdev = to_pci_dev(dev_d);
> struct orinoco_private *priv = pci_get_drvdata(pdev);
>
> orinoco_down(priv);
> free_irq(pdev->irq, priv);
> -   pci_save_state(pdev);
> -   pci_disable_device(pdev);
> -   pci_set_power_state(pdev, PCI_D3hot);
>
> return 0;
>  }
>
> -static int orinoco_pci_resume(struct pci_dev *pdev)
> +static int __maybe_unused orinoco_pci_resume(struct device *dev_d)
>  {
> +   struct pci_dev *pdev = to_pci_dev(dev_d);
> struct orinoco_private *priv = pci_get_drvdata(pdev);
> struct net_device *dev = priv->ndev;
> int err;
>
> -   pci_set_power_state(pdev, PCI_D0);
> -   err = pci_enable_device(pdev);
> -   if (err) {
> -   printk(KERN_ERR "%s: pci_enable_device failed on resume\n",
> -  dev->name);
> -   return err;
> -   }
> -   pci_restore_state(pdev);
> -
> err = request_irq(pdev->irq, orinoco_interrupt, IRQF_SHARED,
>   dev->name, priv);
> if (err) {
> printk(KERN_ERR "%s: cannot re-allocate IRQ on resume\n",
>dev->name);
> -   pci_disable_device(pdev);
> return -EBUSY;
> }
>
> -   err = orinoco_up(priv);
> -
> -   return err;
> +   return orinoco_up(priv);
>  }
> -#else
> -#define orinoco_pci_suspend NULL
> -#define orinoco_pci_resume NULL
> -#endif
> +
> +static SIMPLE_DEV_PM_OPS(orinoco_pci_pm_ops,
> +orinoco_pci_suspend,
> +orinoco_pci_resume);
>
>  #endif /* _ORINOCO_PCI_H */
> diff --git a/drivers/net/wireless/intersil/orinoco/orinoco_plx.c b/drivers/net/wireless/intersil/orinoco/orinoco_plx.c
> index 221352027779..16dada94c774 100644
> --- a/drivers/net/wireless/intersil/orinoco/orinoco_plx.c
> +++ b/drivers/net/wireless/intersil/orinoco/orinoco_plx.c
> @@ -336,8 +336,7 @@ static struct pci_driver orinoco_plx_driver = {
> .id_table   =

linux-next: Tree for Jul 1

2020-06-30 Thread Stephen Rothwell

Hi all,

Changes since 20200630:

My fixes tree contains:

  dbf24e30ce2e ("device_cgroup: Fix RCU list debugging warning")
  b236d81d9e4f ("powerpc/boot/dts: Fix dtc "pciex" warnings")

The tip tree still had one build failure for which I reverted a commit.

The rcu tree gained a conflict against the kbuild tree.

Non-merge commits (relative to Linus' tree): 3919
 4475 files changed, 304606 insertions(+), 81540 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig and htmldocs. And finally, a simple boot test
of the powerpc pseries_le_defconfig kernel in qemu (with and without
kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 322 trees (counting Linus' and 83 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (7c30b859a947 Merge tag 'spi-fix-v5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi)
Merging fixes/master (b236d81d9e4f powerpc/boot/dts: Fix dtc "pciex" warnings)
Merging kbuild-current/fixes (758abb5a6024 docs: kbuild: fix ReST formatting)
Merging arc-current/for-curr (10011f7d95de ARCv2: support loop buffer (LPB) disabling)
Merging arm-current/fixes (3866f217aaa8 ARM: 8977/1: ptrace: Fix mask for thumb breakpoint hook)
Merging arm-soc-fixes/arm/fixes (42d3f7e8da1b Merge tag 'imx-fixes-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes)
Merging uniphier-fixes/fixes (0e698dfa2822 Linux 5.7-rc4)
Merging arm64-fixes/for-next/fixes (108447fd0d1a arm64: Add KRYO{3,4}XX silver CPU cores to SSB safelist)
Merging m68k-current/for-linus (3381df095419 m68k: tools: Replace zero-length array with flexible-array member)
Merging powerpc-fixes/fixes (19ab500edb5d powerpc/mm/pkeys: Make pkey access check work on execute_only_key)
Merging s390-fixes/fixes (95e61b1b5d63 s390/setup: init jump labels before command line parsing)
Merging sparc/master (5124b31c1e90 sparc: piggyback: handle invalid image)
Merging fscrypt-current/for-stable (2b4eae95c736 fscrypt: don't evict dirty inodes after removing key)
Merging net/master (0433c93dff14 Merge branch 'net-ipa-three-bug-fixes')
Merging bpf/master (d923021c2ce1 bpf: Add tests for PTR_TO_BTF_ID vs. null comparison)
Merging ipsec/master (4f47e8ab6ab7 xfrm: policy: match with both mark and mask on user interfaces)
Merging netfilter/master (4a21185cda0f Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net)
Merging ipvs/master (a41de0c215ff netfilter: ipset: fix unaligned atomic access)
Merging wireless-drivers/master (dc7bd30b97aa mt76: mt7615: fix EEPROM buffer size)
Merging mac80211/master (60a0121f8fa6 nl80211: fix memory leak when parsing NL80211_ATTR_HE_BSS_COLOR)
Merging rdma-fixes/for-rc (9ebcfadb0610 Linux 5.8-rc3)
Merging sound-current/for-linus (d02b10590953 Merge tag 'asoc-fix-v5.8-rc3' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus)
Merging sound-asoc-fixes/for-linus (bb22b4545b6c Merge remote-tracking branch 'asoc/for-5.8' into asoc-linus)
Merging regmap-fixes/for-linus (82228364de4a Merge remote-tracking branch 'regmap/for-5.8' into regmap-linus)
Merging regulator-fixes/for-linus (8edefa0e7f72 Merge remote-tracking branch 'regulator/for-5.8' into regulator-linus)
Merging spi-fixes/for-linus (f4cbc0282ae1 Merge remote-tracking branch 'spi/for-5.8' into spi-linus)
Merging pci-current/for-linus (b3a9e3b9622a Linux 5.8-rc1)
Merging driver-core.cu

Re: [PATCH v9 1/2] of_graph: add of_graph_is_present()

2020-06-30 Thread Laurent Pinchart

Hi Dmitry,

Thank you for the patch.

On Wed, Jul 01, 2020 at 05:16:16AM +0300, Dmitry Osipenko wrote:
> In some case, like a DRM display code for example, it's useful to silently
> check whether port node exists at all in a device-tree before proceeding
> with parsing of the graph.
> 
> This patch adds of_graph_is_present() which returns true if given
> device-tree node contains OF graph port.
> 
> Reviewed-by: Rob Herring 
> Signed-off-by: Dmitry Osipenko 
> ---
>  drivers/of/property.c| 52 +---
>  include/linux/of_graph.h |  6 +
>  2 files changed, 49 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/of/property.c b/drivers/of/property.c
> index 6a5760f0d6cd..e12b8b491837 100644
> --- a/drivers/of/property.c
> +++ b/drivers/of/property.c
> @@ -29,6 +29,48 @@
>  
>  #include "of_private.h"
>  
> +/**
> + * of_graph_get_first_local_port() - get first local port node
> + * @node: pointer to a local endpoint device_node

It's not an endpoint.

> + *
> + * Return: First local port node associated with local endpoint node linked
> + *  to @node. Use of_node_put() on it when done.
> + */
> +static struct device_node *
> +of_graph_get_first_local_port(const struct device_node *node)
> +{
> + struct device_node *ports, *port;
> +
> + ports = of_get_child_by_name(node, "ports");
> + if (ports)
> + node = ports;
> +
> + port = of_get_child_by_name(node, "port");
> + of_node_put(ports);
> +
> + return port;
> +}
> +
> +/**
> + * of_graph_is_present() - check graph's presence
> + * @node: pointer to a device_node checked for the graph's presence
> + *
> + * Return: True if @node has a port or ports sub-node, false otherwise.
> + */
> +bool of_graph_is_present(const struct device_node *node)
> +{
> + struct device_node *local;
> +
> + local = of_graph_get_first_local_port(node);
> + if (!local)
> + return false;
> +
> + of_node_put(local);
> +
> + return true;
> +}
> +EXPORT_SYMBOL(of_graph_is_present);
> +
>  /**
>   * of_property_count_elems_of_size - Count the number of elements in a property
>   *
> @@ -608,15 +650,7 @@ struct device_node *of_graph_get_next_endpoint(const struct device_node *parent,
>* parent port node.
>*/
>   if (!prev) {
> - struct device_node *node;
> -
> - node = of_get_child_by_name(parent, "ports");
> - if (node)
> - parent = node;
> -
> - port = of_get_child_by_name(parent, "port");
> - of_node_put(node);
> -
> + port = of_graph_get_first_local_port(parent);

I think this introduces a bug below in the function, where parent is
used and is expected to point to the ports node if available. I'd leave
this part of the change out, and inline of_graph_get_first_local_port()
in of_graph_is_present().
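
To illustrate the suggestion above, here is a minimal userspace sketch of
of_graph_is_present() with the port lookup inlined, so that
of_graph_get_next_endpoint() can keep its own handling of parent. The
demo_node type and the demo_* helpers are hypothetical stand-ins for
struct device_node and of_get_child_by_name(); this model compares names
exactly and does no reference counting, so the of_node_put() calls of the
real code are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct device_node: a name and a child list. */
struct demo_node {
	const char *name;
	struct demo_node *child;	/* first child, or NULL */
	struct demo_node *sibling;	/* next sibling, or NULL */
};

/* Stand-in for of_get_child_by_name(); no refcounting in this model. */
static struct demo_node *
demo_get_child_by_name(const struct demo_node *node, const char *name)
{
	struct demo_node *c;

	for (c = node ? node->child : NULL; c; c = c->sibling)
		if (strcmp(c->name, name) == 0)
			return c;
	return NULL;
}

/*
 * Inlined lookup: prefer a "ports" container if there is one, then look
 * for a "port" child. Only the local copy of the pointer is redirected,
 * so a caller's parent pointer is never touched.
 */
static bool demo_graph_is_present(const struct demo_node *node)
{
	const struct demo_node *ports;

	ports = demo_get_child_by_name(node, "ports");
	if (ports)
		node = ports;

	return demo_get_child_by_name(node, "port") != NULL;
}
```

With this shape, a node whose "ports" container is empty reports no graph,
which matches the behaviour of the quoted helper.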

>   if (!port) {
>   pr_err("graph: no port node found in %pOF\n", parent);
>   return NULL;
> diff --git a/include/linux/of_graph.h b/include/linux/of_graph.h
> index 01038a6aade0..4d7756087b6b 100644
> --- a/include/linux/of_graph.h
> +++ b/include/linux/of_graph.h
> @@ -38,6 +38,7 @@ struct of_endpoint {
>child = of_graph_get_next_endpoint(parent, child))
>  
>  #ifdef CONFIG_OF
> +bool of_graph_is_present(const struct device_node *node);
>  int of_graph_parse_endpoint(const struct device_node *node,
>   struct of_endpoint *endpoint);
>  int of_graph_get_endpoint_count(const struct device_node *np);
> @@ -56,6 +57,11 @@ struct device_node *of_graph_get_remote_node(const struct device_node *node,
>u32 port, u32 endpoint);
>  #else
>  
> +static inline bool of_graph_is_present(const struct device_node *node)
> +{
> + return false;
> +}
> +
>  static inline int of_graph_parse_endpoint(const struct device_node *node,
>   struct of_endpoint *endpoint)
>  {

-- 
Regards,

Laurent Pinchart

Re: [PATCH 7/9] soundwire: intel/cadence: merge Soundwire interrupt handlers/threads

2020-06-30 Thread Vinod Koul

On 30-06-20, 11:46, Pierre-Louis Bossart wrote:

> > Is this called from irq context or irq thread or something else?
> 
> from IRQ thread, hence the name, see pointers above.
> 
> The key part is that we could only make the hardware work as intended by
> using a single thread for all interrupt sources, and that patch is just the
> generalization of what was implemented for HDaudio in mid-2019 after months
> of lost interrupts and IPC errors. See below the code from
> sound/soc/sof/intel/hda.c for interrupt handling.

Sounds good. Now that you are already in the irq thread, does it make sense
to spawn a worker thread for this and handle it there? Why not do it in the
irq thread itself? Using a separate thread kind of defeats the whole point
of irq threads.

-- 
~Vinod

Re: [RFC][PATCH 3/8] mm/vmscan: Attempt to migrate page in lieu of discard

2020-06-30 Thread David Rientjes

On Tue, 30 Jun 2020, Yang Shi wrote:

> > > From: Dave Hansen 
> > > 
> > > If a memory node has a preferred migration path to demote cold pages,
> > > attempt to move those inactive pages to that migration node before
> > > reclaiming. This will better utilize available memory, provide a faster
> > > tier than swapping or discarding, and allow such pages to be reused
> > > immediately without IO to retrieve the data.
> > > 
> > > When handling anonymous pages, this will be considered before swap if
> > > enabled. Should the demotion fail for any reason, the page reclaim
> > > will proceed as if the demotion feature was not enabled.
> > > 
> > Thanks for sharing these patches and kick-starting the conversation, Dave.
> > 
> > Could this cause us to break a user's mbind() or allow a user to
> > circumvent their cpuset.mems?
> > 
> > Because we don't have a mapping of the page back to its allocation
> > context (or the process context in which it was allocated), it seems like
> > both are possible.
> 
> Yes, this could break the memory placement policy enforced by mbind and
> cpuset. I discussed this with Michal on mailing list and tried to find a way
> to solve it, but unfortunately it seems not easy as what you mentioned above.
> The memory policy and cpuset is stored in task_struct rather than mm_struct.
> It is not easy to trace back to task_struct from page (owner field of
> mm_struct might be helpful, but it depends on CONFIG_MEMCG and is not
> preferred way).
> 

Yeah, and Ying made a similar response to this message.

We can do this if we consider pmem not to be a separate memory tier from 
the system perspective, however, but rather the socket perspective.  In 
other words, a node can only demote to a series of exclusive pmem ranges 
and promote to the same series of ranges in reverse order.  So DRAM node 0 
can only demote to PMEM node 2 while DRAM node 1 can only demote to PMEM 
node 3 -- a pmem range cannot be demoted to, or promoted from, more than 
one DRAM node.

This naturally takes care of mbind() and cpuset.mems if we consider pmem 
just to be slower volatile memory and we don't need to deal with the 
latency concerns of cross socket migration.  A user page will never be 
demoted to a pmem range across the socket and will never be promoted to a 
different DRAM node that it doesn't have access to.
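
The pairing described above can be sketched as a small userspace model. This
is only an illustration under stated assumptions: the demo_* names, the
fixed-size table, and the restriction to a single pmem target per DRAM node
(rather than a series of ranges) are all hypothetical and not taken from the
patch series.

```c
#include <stdbool.h>

#define DEMO_MAX_NODES	8
#define DEMO_NO_NODE	(-1)

/*
 * demote_to[n] is the one pmem node that node n may demote to, or
 * DEMO_NO_NODE. The map is "exclusive" when no pmem node is the
 * demotion target of more than one node.
 */
static bool demo_demotion_map_is_exclusive(const int *demote_to, int nr_nodes)
{
	bool claimed[DEMO_MAX_NODES] = { false };
	int n;

	if (nr_nodes < 0 || nr_nodes > DEMO_MAX_NODES)
		return false;

	for (n = 0; n < nr_nodes; n++) {
		int target = demote_to[n];

		if (target == DEMO_NO_NODE)
			continue;
		if (target < 0 || target >= nr_nodes || target == n)
			return false;
		if (claimed[target])
			return false;	/* two nodes share one pmem node */
		claimed[target] = true;
	}
	return true;
}

/* Promotion walks the same pairing in reverse. */
static int demo_promotion_target(const int *demote_to, int nr_nodes,
				 int pmem_node)
{
	int n;

	for (n = 0; n < nr_nodes; n++)
		if (demote_to[n] == pmem_node)
			return n;
	return DEMO_NO_NODE;
}
```

For the example in the text, the table {2, 3, DEMO_NO_NODE, DEMO_NO_NODE}
passes the check, and pmem node 3 promotes back to DRAM node 1 only.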

That can work with the NUMA abstraction for pmem, but it could also 
theoretically be a new memory zone instead.  If all memory living on pmem 
is migratable (the natural way that memory hotplug is done, so we can 
offline), this zone would live above ZONE_MOVABLE.  Zonelist ordering 
would determine whether we can allocate directly from this memory based on 
system config or a new gfp flag that could be set for users of a mempolicy 
that allows allocations directly from pmem.  If abstracted as a NUMA node 
instead, interleave over nodes {0,2,3} or a cpuset.mems of {0,2,3} doesn't 
make much sense.

Kswapd would need to be enlightened for proper pgdat and pmem balancing 
but in theory it should be simpler because it only has its own node to 
manage.  Existing per-zone watermarks might be easy to use to fine tune 
the policy from userspace: the scale factor determines how much memory we 
try to keep free on DRAM for migration from pmem, for example.  We also 
wouldn't have to deal with node hotplug or updating of demotion/promotion 
node chains.

Maybe the strongest advantage of the node abstraction is the ability to 
use autonuma and migrate_pages()/move_pages() API for moving pages 
explicitly?  Mempolicies could be used for migration to "top-tier" memory, 
i.e. ZONE_NORMAL or ZONE_MOVABLE, instead.

Re: [PATCH 2/5] soundwire: stream: add helper to startup/shutdown streams

2020-06-30 Thread Vinod Koul

On 30-06-20, 11:58, Pierre-Louis Bossart wrote:
> > > +int sdw_startup_stream(void *sdw_substream)
> > 
> > Can we have kernel doc style Documentation for exported APIs?
> 
> yes, that's a miss indeed.
> 
> Though if we follow the existing examples it's not going to be very
> informative, e.g.

Yeah but...
> 
> /**
>  * sdw_disable_stream() - Disable SoundWire stream
>  *
>  * @stream: Soundwire stream
>  *
>  * Documentation/driver-api/soundwire/stream.rst explains this API in detail

it would help to have this pointer. I plan to include the kernel-doc
comments in Documentation for sdw, so it would be great to have these in
place.

-- 
~Vinod

Re: PCI: Replace lkml.org, spinics, gmane with lore.kernel.org

2020-06-30 Thread Joe Perches

On Tue, 2020-06-30 at 14:04 -0600, Jonathan Corbet wrote:
> On Tue, 30 Jun 2020 13:09:17 -0500
> Bjorn Helgaas  wrote:
> 
> > PCI: Replace lkml.org, spinics, gmane with lore.kernel.org
> > 
> > The lkml.org, spinics.net, and gmane.org archives are not very reliable
> > and, in some cases, not even easily accessible.  Replace links to them with
> > links to lore.kernel.org, the archives hosted by kernel.org.
> > 
> > I found the gmane items via the Wayback Machine archive at
> > https://web.archive.org/.
> 
> Heh...now *that* sounds like a project that could generate a lot of churn,
> and perhaps even be worth it.  Settling on a consistent (and working!)
> email archive would improve the docs quite a bit.  I'll add that to the
> list...

At least for the lkml.org/lkml links,
here's the current -next diff done by a
script that looks at the message-id of
each lkml.org link.
---
 CREDITS|  2 +-
 Documentation/PCI/pci.rst  |  2 +-
 Documentation/RCU/RTFP.txt | 94 +++---
 Documentation/accounting/cgroupstats.rst   |  4 +-
 Documentation/admin-guide/cgroup-v1/memory.rst | 14 ++--
 Documentation/admin-guide/cpu-load.rst |  2 +-
 .../admin-guide/kernel-per-CPU-kthreads.rst|  2 +-
 Documentation/driver-api/gpio/driver.rst   |  4 +-
 Documentation/gpu/todo.rst |  2 +-
 Documentation/power/freezing-of-tasks.rst  |  2 +-
 Documentation/process/adding-syscalls.rst  | 18 ++---
 Documentation/process/submitting-patches.rst   |  4 +-
 Documentation/scheduler/sched-deadline.rst |  2 +-
 Documentation/security/lsm-development.rst |  2 +-
 Documentation/timers/timers-howto.rst  |  2 +-
 .../translations/it_IT/process/adding-syscalls.rst | 18 ++---
 .../it_IT/process/submitting-patches.rst   |  4 +-
 Documentation/translations/ja_JP/SubmittingPatches |  4 +-
 .../zh_CN/process/submitting-patches.rst   |  4 +-
 arch/arc/include/asm/irqflags-compact.h|  4 +-
 arch/arc/mm/dma.c  |  2 +-
 arch/arc/plat-axs10x/axs10x.c  |  2 +-
 arch/arc/plat-hsdk/platform.c  |  2 +-
 arch/arm/kernel/hibernate.c|  2 +-
 arch/arm64/kernel/hibernate.c  |  2 +-
 arch/mips/include/asm/page.h   |  2 +-
 drivers/block/aoe/aoecmd.c |  2 +-
 drivers/pci/setup-res.c|  2 +-
 drivers/staging/clocking-wizard/TODO   |  2 +-
 drivers/staging/vc04_services/bcm2835-audio/TODO   |  2 +-
 drivers/usb/serial/ark3116.c   |  2 +-
 drivers/xen/xen-acpi-processor.c   |  2 +-
 tools/perf/Documentation/examples.txt  |  2 +-
 tools/perf/util/data-convert-bt.c  |  2 +-
 tools/scripts/Makefile.include |  2 +-
 35 files changed, 110 insertions(+), 110 deletions(-)

diff --git a/CREDITS b/CREDITS
index 0787b5872906..c4f051d2b35c 100644
--- a/CREDITS
+++ b/CREDITS
@@ -546,7 +546,7 @@ D: gadget layers, SPI subsystem, GPIO subsystem, and more than a few
 D: device drivers.  His encouragement also helped many engineers get
 D: started working on the Linux kernel.  David passed away in early
 D: 2011, and will be greatly missed.
-W: https://lkml.org/lkml/2011/4/5/36
+W: https://lore.kernel.org/r/20110405034819.ga7...@kroah.com
 
 N: Gary Brubaker
 E: xav...@ix.netcom.com
diff --git a/Documentation/PCI/pci.rst b/Documentation/PCI/pci.rst
index d10d3fe604c5..d8c3df6d21af 100644
--- a/Documentation/PCI/pci.rst
+++ b/Documentation/PCI/pci.rst
@@ -214,7 +214,7 @@ the PCI device by calling pci_enable_device(). This will:
problem and unlikely to get fixed soon.
 
This has been discussed before but not changed as of 2.6.19:
-   http://lkml.org/lkml/2006/3/2/194
+   https://lore.kernel.org/r/20060302180025.gc28...@flint.arm.linux.org.uk
 
 
 pci_set_master() will enable DMA by setting the bus master bit
diff --git a/Documentation/RCU/RTFP.txt b/Documentation/RCU/RTFP.txt
index 9bccf16736f7..3b0876c77355 100644
--- a/Documentation/RCU/RTFP.txt
+++ b/Documentation/RCU/RTFP.txt
@@ -683,7 +683,7 @@ Orran Krieger and Rusty Russell and Dipankar Sarma and Maneesh Soni"
 ,month="October"
 ,year="2001"
 ,note="Available:
-\url{http://lkml.org/lkml/2001/10/13/105}
+\url{https://lore.kernel.org/r/pine.lnx.4.33.0110131015410.8707-100...@penguin.transmeta.com}
 [Viewed August 21, 2004]"
 ,annotation={
 }
@@ -826,7 +826,7 @@ Symposium on Distributed Computing}
 ,month="October"
 ,year="2002"
 ,note="Available:
-\url{https://lkml.org/lkml/2002/10/24/262}
+\url{https://lore.kernel.org/r/3db86b05.447e7...@us.ibm.com}
 [Viewed February 15, 2014]"
 ,annotation={
Mingming Cao's patch to introduce RCU to SysV IPC.
@@ -839,7 +839,7 @@ Symposium on Distributed Computing}

Re: [RFC v2] fpga: dfl: RFC PCI config

2020-06-30 Thread Xu Yilun

On Tue, Jun 30, 2020 at 11:49:50AM -0700, t...@redhat.com wrote:
> From: Tom Rix 
> 
> Create some top level configs the map to dfl pci cards.
> 
> Autoselect the parts of fpga that are needed to run these cards
> as well as the defining the other subsystem dependencies.
> 
> Signed-off-by: Tom Rix 
> ---
>  v1 change subsystem selects to depends
> 
>  Documentation/fpga/dfl.rst | 30 ++
>  drivers/fpga/Kconfig   | 27 +++
>  2 files changed, 57 insertions(+)
> 
> diff --git a/Documentation/fpga/dfl.rst b/Documentation/fpga/dfl.rst
> index d7648d7c7eee..c1ae6b539f08 100644
> --- a/Documentation/fpga/dfl.rst
> +++ b/Documentation/fpga/dfl.rst
> @@ -500,6 +500,36 @@ Developer only needs to provide a sub feature driver with matched feature id.
>  FME Partial Reconfiguration Sub Feature driver (see drivers/fpga/dfl-fme-pr.c)
>  could be a reference.
>  
> +Kernel configuration
> +
> +
> +While it is possible to manually setup a configuration to match your device,
> +there are some top level configurations that collect configurations for
> +some reference PCI cards.  Below describes these configuration as well as
> +what other kernel configs are needed for proper configuration.
> +
> +FPGA_DFL_PAC10
> +Intel Arria 10 GX PCI card, PCI id 0X09C4
> +Depends on
> +  SPI_ALTERA
> +  MFD_INTEL_M10_BMC
> +  SENSORS_INTEL_M10_BMC_HWMON
> +
> +FPGA_DFL_D5005
> +Intel Stratix 10, D5005 PCI card, PCI id 0X0B2B
> +Depends on
> +  SPI_ALTERA
> +  MFD_INTEL_M10_BMC
> +  SENSORS_INTEL_M10_BMC_HWMON
> +  INTEL_S10_PHY
> +
> +FPGA_DFL_N3000
> +Intel Network Accelerator, N3000 PCI card, PCI id 0X0B30
> +Depends on
> +  SPI_ALTERA
> +  MFD_INTEL_M10_BMC
> +  SENSORS_INTEL_M10_BMC_HWMON
> +  INTEL_LL_10G_MAC
>  
>  Open discussion
>  ===
> diff --git a/drivers/fpga/Kconfig b/drivers/fpga/Kconfig
> index 9d53bd9094e2..96603b1f6ff5 100644
> --- a/drivers/fpga/Kconfig
> +++ b/drivers/fpga/Kconfig
> @@ -138,6 +138,33 @@ config OF_FPGA_REGION
> Support for loading FPGA images by applying a Device Tree
> overlay.
>  
> +config FPGA_DFL_PAC10
> + tristate "Intel Arria 10 GX PCI card"
> + depends on SPI_ALTERA
> + depends on SENSORS_INTEL_M10_BMC_HWMON
> + depends on MFD_INTEL_M10_BMC
> + select FPGA_DFL
> + select FPGA_DFL_FME

FPGA_DFL_FME depends on HWMON and PERF_EVENTS, so it seems we cannot select
it either.

> + select FPGA_DFL_FME_MGR
> + select FPGA_DFL_FME_BRIDGE
> + select FPGA_DFL_FME_REGION
> + select FPGA_DFL_AFU
> + select FPGA_DFL_SPI_ALTERA
> + select FPGA_DFL_PCI
> + select IFPGA_SEC_MGR
> +
> +config FPGA_DFL_D5005
> + tristate "Intel Stratix 10, D5005 PCI card"
> + depends on INTEL_S10_PHY
> + select FPGA_DFL_PAC10
> + select FPGA_DFl_HSSI
> +
> +config FPGA_DFL_N3000
> + tristate "Intel Network Accelerator, N3000 PCI card"
> + depends on INTEL_LL_10G_MAC
> + select FPGA_DFL_PAC10
> + select FPGA_DFL_N3000_NIOS
> +
>  config FPGA_DFL
>   tristate "FPGA Device Feature List (DFL) support"
>   select FPGA_BRIDGE
> -- 
> 2.18.1

[PATCH V2] arm64/cpufeature: Validate feature bits spacing in arm64_ftr_regs[]

2020-06-30 Thread Anshuman Khandual

arm64_feature_bits for a register in arm64_ftr_regs[] are in a descending
order as per their shift values. Validate that these features bits are
defined correctly and do not overlap with each other. This check protects
against any inadvertent erroneous changes to the register definitions.

Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Suzuki K Poulose 
Cc: Mark Brown 
Cc: Mark Rutland 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual 
---
Applies on 5.8-rc3.

Changes in V2:

- Replaced WARN_ON() with WARN() dropping the conditional block per Suzuki

Changes in V1: (https://patchwork.kernel.org/patch/11606285/)

 arch/arm64/kernel/cpufeature.c | 45 +++---
 1 file changed, 42 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 9f63053a63a9..7bd7e6f936a5 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -697,11 +697,50 @@ static s64 arm64_ftr_safe_value(const struct arm64_ftr_bits *ftrp, s64 new,
 
 static void __init sort_ftr_regs(void)
 {
-   int i;
+   const struct arm64_ftr_reg *ftr_reg;
+   const struct arm64_ftr_bits *ftr_bits;
+   unsigned int i, j, width, shift, prev_shift;
+
+   for (i = 0; i < ARRAY_SIZE(arm64_ftr_regs); i++) {
+   /*
+* Features here must be sorted in descending order with respect
+* to their shift values and should not overlap with each other.
+*/
+   ftr_reg = arm64_ftr_regs[i].reg;
+   for (ftr_bits = ftr_reg->ftr_bits, j = 0;
+   ftr_bits->width != 0; ftr_bits++, j++) {
+   WARN((ftr_bits->shift  + ftr_bits->width) > 64,
+   "%s has invalid feature at shift %d\n",
+   ftr_reg->name, ftr_bits->shift);
+
+   /*
+* Skip the first feature. There is nothing to
+* compare against for now.
+*/
+   if (j == 0)
+   continue;
+
+   prev_shift = ftr_reg->ftr_bits[j - 1].shift;
+   width = ftr_reg->ftr_bits[j].width;
+   shift = ftr_reg->ftr_bits[j].shift;
+   WARN(prev_shift < (shift + width),
+   "%s has feature overlap at shift %d\n",
+   ftr_reg->name, ftr_bits->shift);
+   }
 
-   /* Check that the array is sorted so that we can do the binary search */
-   for (i = 1; i < ARRAY_SIZE(arm64_ftr_regs); i++)
+   /*
+* Skip the first register. There is nothing to
+* compare against for now.
+*/
+   if (i == 0)
+   continue;
+   /*
+* Registers here must be sorted in ascending order with respect
+* to sys_id for subsequent binary search in get_arm64_ftr_reg()
+* to work correctly.
+*/
BUG_ON(arm64_ftr_regs[i].sys_id < arm64_ftr_regs[i - 1].sys_id);
+   }
 }
 
 /*
-- 
2.20.1
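
The per-register check added by this patch can be tried out in userspace
with a minimal sketch like the one below. The demo_ftr_bits type and
demo_ftr_bits_valid() are hypothetical names; the sketch returns false where
the patch emits a WARN(), and it assumes, as in the patch, that each table
ends with an entry whose width is zero.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduced form of struct arm64_ftr_bits. */
struct demo_ftr_bits {
	unsigned int shift;
	unsigned int width;
};

/*
 * Fields must fit in a 64-bit register, be sorted by descending shift,
 * and must not overlap. The table is terminated by width == 0.
 */
static bool demo_ftr_bits_valid(const struct demo_ftr_bits *bits)
{
	size_t j;

	for (j = 0; bits[j].width != 0; j++) {
		if (bits[j].shift + bits[j].width > 64)
			return false;	/* field runs past bit 63 */

		/* The first field has nothing to be compared against. */
		if (j == 0)
			continue;

		/* The previous field must start at or above this one's end. */
		if (bits[j - 1].shift < bits[j].shift + bits[j].width)
			return false;
	}
	return true;
}
```

A single comparison against the previous entry covers both the ordering and
the overlap requirement, since a field that starts below the end of its
successor is either out of order or overlapping.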

[PATCH] ALSA: hda/realtek: Serialize setting GPIO LED

2020-06-30 Thread Kai-Heng Feng

If a system has two GPIO controlled LED, one for mute and another one
for micmute, and both of them are on before system suspend, sometimes
one of them won't be turned off by system suspend.

The codec doesn't seem to be able to control multiple GPIO LEDs at the
same time, so introduce a new mutex to serialize setting the LED, to
prevent the issue from happening.

Signed-off-by: Kai-Heng Feng 
---
 include/sound/hda_codec.h | 1 +
 sound/pci/hda/hda_codec.c | 1 +
 sound/pci/hda/patch_realtek.c | 3 +++
 3 files changed, 5 insertions(+)

diff --git a/include/sound/hda_codec.h b/include/sound/hda_codec.h
index d16a4229209b..3a1792bbb7ac 100644
--- a/include/sound/hda_codec.h
+++ b/include/sound/hda_codec.h
@@ -206,6 +206,7 @@ struct hda_codec {
 
struct mutex spdif_mutex;
struct mutex control_mutex;
+   struct mutex led_mutex;
struct snd_array spdif_out;
unsigned int spdif_in_enable;   /* SPDIF input enable? */
const hda_nid_t *slave_dig_outs; /* optional digital out slave widgets */
diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c
index 7e3ae4534df9..ec477cd8afed 100644
--- a/sound/pci/hda/hda_codec.c
+++ b/sound/pci/hda/hda_codec.c
@@ -946,6 +946,7 @@ int snd_hda_codec_device_new(struct hda_bus *bus, struct snd_card *card,
codec->addr = codec_addr;
mutex_init(&codec->spdif_mutex);
mutex_init(&codec->control_mutex);
+   mutex_init(&codec->led_mutex);
snd_array_init(&codec->mixers, sizeof(struct hda_nid_item), 32);
snd_array_init(&codec->nids, sizeof(struct hda_nid_item), 32);
snd_array_init(&codec->init_pins, sizeof(struct hda_pincfg), 16);
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
index 53e0eef8b042..96dac365fbb8 100644
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -4101,7 +4101,10 @@ static void alc_update_gpio_led(struct hda_codec *codec, unsigned int mask,
 {
if (polarity)
enabled = !enabled;
+
+   mutex_lock(&codec->led_mutex);
alc_update_gpio_data(codec, mask, !enabled); /* muted -> LED on */
+   mutex_unlock(&codec->led_mutex);
 }
 
 /* turn on/off mute LED via GPIO per vmaster hook */
-- 
2.17.1
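
The serialization idea can be modelled in userspace with a minimal sketch. It
assumes that the GPIO update is a read-modify-write of one shared data word,
which is what makes concurrent mute and micmute updates risky; the
demo_codec structure, its gpio_data field, and the use of a pthread mutex in
place of the kernel's struct mutex are all hypothetical stand-ins.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the parts of struct hda_codec used here. */
struct demo_codec {
	pthread_mutex_t led_mutex;
	unsigned int gpio_data;		/* one bit per GPIO-driven LED */
};

static void demo_codec_init(struct demo_codec *codec)
{
	pthread_mutex_init(&codec->led_mutex, NULL);
	codec->gpio_data = 0;
}

/*
 * Mirrors the shape of alc_update_gpio_led(): apply the polarity, then
 * update the shared word under the lock so that two LED updates cannot
 * interleave their read-modify-write sequences.
 */
static void demo_update_gpio_led(struct demo_codec *codec, unsigned int mask,
				 bool polarity, bool enabled)
{
	if (polarity)
		enabled = !enabled;

	pthread_mutex_lock(&codec->led_mutex);
	if (!enabled)			/* muted -> LED on */
		codec->gpio_data |= mask;
	else
		codec->gpio_data &= ~mask;
	pthread_mutex_unlock(&codec->led_mutex);
}
```

Two callers that use different masks, for example one for mute and one for
micmute, then always observe each other's completed update.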

RE: [PATCH V3 02/10] init.h: Fix the __setup_param() macro for module build

2020-06-30 Thread Anson Huang

Hi, Arnd


> Subject: Re: [PATCH V3 02/10] init.h: Fix the __setup_param() macro for
> module build
> 
> On Mon, Jun 29, 2020 at 1:40 PM Anson Huang 
> wrote:
> > > Subject: Re: [PATCH V3 02/10] init.h: Fix the __setup_param() macro
> > > for module build
> > >
> > > On Mon, Jun 29, 2020 at 8:06 AM Anson Huang 
> > > wrote:
> > > >
> > > > Keep __setup_param() to use same parameters for both built in and
> > > > built as module, it can make the drivers which call it easier when
> > > > the drivers can be built in or built as module.
> > > >
> > > > Signed-off-by: Anson Huang 
> > >
> > > I wonder if we should instead drop the __setup() and __setup_param()
> > > definitions from the #else block here. This was clearly not used
> > > anywhere, and it sounds like any possible user is broken and should
> > > be changed to not use
> > > __setup() anyway.
> > >
> >
> >
> > It makes sense to drop the __setup() and __setup_param() in the #else
> > block, just use one definition for all cases, if no one objects, I will
> > remove them in next patch series.
> 
> Ok, sounds good. Note that there may be users of the plain __setup() that just
> get turned into nops right now. Usually those are already enclosed in "#ifndef
> MODULE", but if they are not, then removing the definition would cause a
> build error.
> 
> Have a look if you can find such instances, and either change the patch to add
> the missing "#ifndef MODULE" checks, or just drop the __setup_param() and
> leave the __setup() if it gets too complicated.

It looks like the __setup_param() defined under "#ifndef MODULE" cannot be
used for a module build at all, so sharing the same implementation is not
possible. If it is not that critical, I plan to keep the #else block in this
patch; let me know if you have further concerns or any other suggestion.
Below is the build error reported for a module build that uses the built-in
__setup_param() implementation.
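
The approach of keeping the #else block can be illustrated with a minimal
userspace sketch. It assumes, as the error below suggests, that the
parameter structure is only defined for built-in code. All DEMO_* and demo_*
names are hypothetical, and the sketch ignores the section placement that
the real macro performs.

```c
#ifndef DEMO_MODULE

/* Built-in case: the structure exists and the macro emits an entry. */
struct demo_kernel_param {
	const char *str;
	int (*setup_func)(char *);
	int early;
};

#define DEMO_SETUP_PARAM(str, unique_id, fn, early)			\
	static const char demo_setup_str_##unique_id[] = str;		\
	static struct demo_kernel_param demo_setup_##unique_id		\
		= { demo_setup_str_##unique_id, fn, early }

#else

/*
 * Module case: same four parameters, but no entry is emitted. The
 * forward declaration only absorbs the trailing semicolon.
 */
#define DEMO_SETUP_PARAM(str, unique_id, fn, early)			\
	struct demo_setup_unused_##unique_id

#endif

/* A caller that builds unchanged in both configurations. */
static int demo_keep_uart(char *arg)
{
	(void)arg;
	return 0;
}

DEMO_SETUP_PARAM("earlycon", earlycon, demo_keep_uart, 0);
```

Because both branches accept the same argument list, a driver that can be
built in or built as a module needs no #ifdef around the call site.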

thanks,
Anson


In file included from ./arch/arm64/include/asm/alternative.h:12,
 from ./arch/arm64/include/asm/lse.h:15,
 from ./arch/arm64/include/asm/cmpxchg.h:14,
 from ./arch/arm64/include/asm/atomic.h:16,
 from ./include/linux/atomic.h:7,
 from ./include/asm-generic/bitops/atomic.h:5,
 from ./arch/arm64/include/asm/bitops.h:26,
 from ./include/linux/bitops.h:29,
 from ./include/linux/kernel.h:12,
 from ./include/linux/clk.h:13,
 from drivers/clk/imx/clk.c:2:
./include/linux/init.h:177:16: error: variable ‘__setup_imx_keep_uart_earlycon’ 
has initializer but incomplete type
  177 |  static struct obs_kernel_param __setup_##unique_id  \
  |^~~~
drivers/clk/imx/clk.c:157:1: note: in expansion of macro ‘__setup_param’
  157 | __setup_param("earlycon", imx_keep_uart_earlycon,
  | ^
./include/linux/init.h:180:7: warning: excess elements in struct initializer
  180 |   = { __setup_str_##unique_id, fn, early }
  |   ^~~~
drivers/clk/imx/clk.c:157:1: note: in expansion of macro ‘__setup_param’
  157 | __setup_param("earlycon", imx_keep_uart_earlycon,
  | ^
./include/linux/init.h:180:7: note: (near initialization for 
‘__setup_imx_keep_uart_earlycon’)
  180 |   = { __setup_str_##unique_id, fn, early }
  |   ^~~~
drivers/clk/imx/clk.c:157:1: note: in expansion of macro ‘__setup_param’
  157 | __setup_param("earlycon", imx_keep_uart_earlycon,
  | ^
drivers/clk/imx/clk.c:158:8: warning: excess elements in struct initializer
  158 |imx_keep_uart_clocks_param, 0);
  |^~
./include/linux/init.h:180:32: note: in definition of macro ‘__setup_param’
  180 |   = { __setup_str_##unique_id, fn, early }
  |^~
drivers/clk/imx/clk.c:158:8: note: (near initialization for 
‘__setup_imx_keep_uart_earlycon’)
  158 |imx_keep_uart_clocks_param, 0);
  |^~
./include/linux/init.h:180:32: note: in definition of macro ‘__setup_param’
  180 |   = { __setup_str_##unique_id, fn, early }
  |^~
drivers/clk/imx/clk.c:158:36: warning: excess elements in struct initializer
  158 |imx_keep_uart_clocks_param, 0);
  |^
./include/linux/init.h:180:36: note: in definition of macro ‘__setup_param’
  180 |   = { __setup_str_##unique_id, fn, early }
  |^
drivers/clk/imx/clk.c:158:36: note: (near initialization for 
‘__setup_imx_keep_uart_earlycon’)
  158 |imx_keep_uart_clocks_param, 0);
  |^
./include/linux/init.h:180:36: note: in definition of macro ‘__setup_param’
  180 |   = { __setup_str_##unique_id, fn, early }
  |^
./include/linux/init.h:177:16:

Re: [RFC v2] fpga: dfl: RFC PCI config

2020-06-30 Thread Xu Yilun

On Tue, Jun 30, 2020 at 11:49:50AM -0700, t...@redhat.com wrote:
> From: Tom Rix 
> 
> Create some top level configs the map to dfl pci cards.
> 
> Autoselect the parts of fpga that are needed to run these cards
> as well as the defining the other subsystem dependencies.
> 
> Signed-off-by: Tom Rix 
> ---
>  v1 change subsystem selects to depends
> 
>  Documentation/fpga/dfl.rst | 30 ++
>  drivers/fpga/Kconfig   | 27 +++
>  2 files changed, 57 insertions(+)
> 
> diff --git a/Documentation/fpga/dfl.rst b/Documentation/fpga/dfl.rst
> index d7648d7c7eee..c1ae6b539f08 100644
> --- a/Documentation/fpga/dfl.rst
> +++ b/Documentation/fpga/dfl.rst
> @@ -500,6 +500,36 @@ Developer only needs to provide a sub feature driver with matched feature id.
>  FME Partial Reconfiguration Sub Feature driver (see drivers/fpga/dfl-fme-pr.c)
>  could be a reference.
>  
> +Kernel configuration
> +
> +
> +While it is possible to manually setup a configuration to match your device,
> +there are some top level configurations that collect configurations for
> +some reference PCI cards.  Below describes these configuration as well as
> +what other kernel configs are needed for proper configuration.
> +
> +FPGA_DFL_PAC10
> +Intel Arria 10 GX PCI card, PCI id 0X09C4
> +Depends on
> +  SPI_ALTERA
> +  MFD_INTEL_M10_BMC
> +  SENSORS_INTEL_M10_BMC_HWMON
> +
> +FPGA_DFL_D5005
> +Intel Stratix 10, D5005 PCI card, PCI id 0X0B2B
> +Depends on
> +  SPI_ALTERA
> +  MFD_INTEL_M10_BMC
> +  SENSORS_INTEL_M10_BMC_HWMON
> +  INTEL_S10_PHY
> +
> +FPGA_DFL_N3000
> +Intel Network Accelerator, N3000 PCI card, PCI id 0X0B30
> +Depends on
> +  SPI_ALTERA
> +  MFD_INTEL_M10_BMC
> +  SENSORS_INTEL_M10_BMC_HWMON
> +  INTEL_LL_10G_MAC
>  
>  Open discussion
>  ===
> diff --git a/drivers/fpga/Kconfig b/drivers/fpga/Kconfig
> index 9d53bd9094e2..96603b1f6ff5 100644
> --- a/drivers/fpga/Kconfig
> +++ b/drivers/fpga/Kconfig
> @@ -138,6 +138,33 @@ config OF_FPGA_REGION
> Support for loading FPGA images by applying a Device Tree
> overlay.
>  
> +config FPGA_DFL_PAC10
> + tristate "Intel Arria 10 GX PCI card"
> + depends on SPI_ALTERA
> + depends on SENSORS_INTEL_M10_BMC_HWMON
> + depends on MFD_INTEL_M10_BMC
> + select FPGA_DFL
> + select FPGA_DFL_FME
> + select FPGA_DFL_FME_MGR
> + select FPGA_DFL_FME_BRIDGE
> + select FPGA_DFL_FME_REGION
> + select FPGA_DFL_AFU
> + select FPGA_DFL_SPI_ALTERA
> + select FPGA_DFL_PCI

FPGA_DFL_PCI depends on PCI, so it seems we cannot select it either.

> + select IFPGA_SEC_MGR

Since there is a concern that we cannot select all the configs, we now have
a mix of "depends on" and "select" entries. That means people must manually
find and enable the "depends on" options first; only then does the helper
config appear and can be selected to finish the rest of the selection.
IMHO this config is not as valuable as expected ...

> +
> +config FPGA_DFL_D5005
> + tristate "Intel Stratix 10, D5005 PCI card"
> + depends on INTEL_S10_PHY
> + select FPGA_DFL_PAC10
> + select FPGA_DFl_HSSI
> +
> +config FPGA_DFL_N3000
> + tristate "Intel Network Accelerator, N3000 PCI card"
> + depends on INTEL_LL_10G_MAC
> + select FPGA_DFL_PAC10
> + select FPGA_DFL_N3000_NIOS
> +
>  config FPGA_DFL
>   tristate "FPGA Device Feature List (DFL) support"
>   select FPGA_BRIDGE
> -- 
> 2.18.1

mmotm 2020-06-30-21-52 uploaded

2020-06-30 Thread akpm

The mm-of-the-moment snapshot 2020-06-30-21-52 has been uploaded to

   http://www.ozlabs.org/~akpm/mmotm/

mmotm-readme.txt says

README for mm-of-the-moment:

http://www.ozlabs.org/~akpm/mmotm/

This is a snapshot of my -mm patch queue.  Uploaded at random hopefully
more than once a week.

You will need quilt to apply these patches to the latest Linus release (5.x
or 5.x-rcY).  The series file is in broken-out.tar.gz and is duplicated in
http://ozlabs.org/~akpm/mmotm/series

The file broken-out.tar.gz contains two datestamp files: .DATE and
.DATE-yyyy-mm-dd-hh-mm-ss.  Both contain the string yyyy-mm-dd-hh-mm-ss,
followed by the base kernel version against which this patch series is to
be applied.

This tree is partially included in linux-next.  To see which patches are
included in linux-next, consult the `series' file.  Only the patches
within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
linux-next.


A full copy of the full kernel tree with the linux-next and mmotm patches
already applied is available through git within an hour of the mmotm
release.  Individual mmotm releases are tagged.  The master branch always
points to the latest release, so it's constantly rebasing.

https://github.com/hnaz/linux-mm

The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
contains daily snapshots of the -mm tree.  It is updated more frequently
than mmotm, and is untested.

A git copy of this tree is also available at

https://github.com/hnaz/linux-mm



This mmotm tree contains the following patches against 5.8-rc3:
(patches marked "*" will be included in linux-next)

  origin.patch
* hugetlb-fix-pages-per-hugetlb-calculation.patch
* samples-vfs-avoid-warning-in-statx-override.patch
* mm-cmac-use-exact_nid-true-to-fix-possible-per-numa-cma-leak.patch
* vmalloc-fix-the-owner-argument-for-the-new-__vmalloc_node_range-callers.patch
* mm-shuffle-dont-move-pages-between-zones-and-dont-read-garbage-memmaps.patch
* mm-page_alloc-fix-documentation-error.patch
* proc-kpageflags-prevent-an-integer-overflow-in-stable_page_flags.patch
* proc-kpageflags-do-not-use-uninitialized-struct-pages.patch
* checkpatch-test-git_dir-changes.patch
* scripts-tagssh-collect-compiled-source-precisely.patch
* bloat-o-meter-support-comparing-library-archives.patch
* scripts-decode_stacktrace-skip-missing-symbols.patch
* scripts-decode_stacktrace-guess-basepath-if-not-specified.patch
* scripts-decode_stacktrace-guess-path-to-modules.patch
* scripts-decode_stacktrace-guess-path-to-vmlinux-by-release-name.patch
* ocfs2-clear-links-count-in-ocfs2_mknod-if-an-error-occurs.patch
* ocfs2-fix-ocfs2-corrupt-when-iputting-an-inode.patch
* ocfs2-change-slot-number-type-s16-to-u16.patch
* ramfs-support-o_tmpfile.patch
* kernel-watchdog-flush-all-printk-nmi-buffers-when-hardlockup-detected.patch
  mm.patch
* mm-treewide-rename-kzfree-to-kfree_sensitive.patch
* mm-ksize-should-silently-accept-a-null-pointer.patch
* mm-expand-config_slab_freelist_hardened-to-include-slab.patch
* slab-add-naive-detection-of-double-free.patch
* slab-add-naive-detection-of-double-free-fix.patch
* mm-slub-extend-slub_debug-syntax-for-multiple-blocks.patch
* mm-slub-make-some-slub_debug-related-attributes-read-only.patch
* mm-slub-remove-runtime-allocation-order-changes.patch
* mm-slub-make-remaining-slub_debug-related-attributes-read-only.patch
* mm-slub-make-reclaim_account-attribute-read-only.patch
* mm-slub-introduce-static-key-for-slub_debug.patch
* mm-slub-introduce-kmem_cache_debug_flags.patch
* mm-slub-introduce-kmem_cache_debug_flags-fix.patch
* mm-slub-extend-checks-guarded-by-slub_debug-static-key.patch
* mm-slab-slub-move-and-improve-cache_from_obj.patch
* mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj.patch
* mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj-fix.patch
* slub-drop-lockdep_assert_held-from-put_map.patch
* mm-kcsan-instrument-slab-slub-free-with-assert_exclusive_access.patch
* mm-filemap-clear-idle-flag-for-writes.patch
* mm-filemap-add-missing-fgp_-flags-in-kerneldoc-comment-for-pagecache_get_page.patch
* mm-memcg-factor-out-memcg-and-lruvec-level-changes-out-of-__mod_lruvec_state.patch
* mm-memcg-prepare-for-byte-sized-vmstat-items.patch
* mm-memcg-convert-vmstat-slab-counters-to-bytes.patch
* mm-slub-implement-slub-version-of-obj_to_index.patch
* mm-memcontrol-decouple-reference-counting-from-page-accounting.patch
* mm-memcg-slab-obj_cgroup-api.patch
* mm-memcg-slab-allocate-obj_cgroups-for-non-root-slab-pages.patch
* mm-memcg-slab-save-obj_cgroup-for-non-root-slab-objects.patch
* mm-memcg-slab-charge-individual-slab-objects-instead-of-pages.patch
* mm-memcg-slab-deprecate-memorykmemslabinfo.patch
* mm-memcg-slab-move-memcg_kmem_bypass-to-memcontrolh.patch
* mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
* mm-memcg-slab-simplify-memcg-cache-creation.patch
* mm-memcg-slab-remove-memcg_kmem_get_cache.patch
*

Re: [PATCH v3] cpufreq: CPPC: simply the code access 'highest_perf' value in cppc_perf_caps struct

2020-06-30 Thread Viresh Kumar

On 01-07-20, 12:20, Xin Hao wrote:
>  The 'caps' variable has been defined, so there is no need to get
>  'highest_perf' value through 'cpu->perf_caps.highest_perf', you can use
>  'caps->highest_perf' instead.
> 
> Signed-off-by: Xin Hao 
> ---
>  drivers/cpufreq/cppc_cpufreq.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
> index 257d726a4456..051d0e56c67a 100644
> --- a/drivers/cpufreq/cppc_cpufreq.c
> +++ b/drivers/cpufreq/cppc_cpufreq.c
> @@ -161,7 +161,7 @@ static unsigned int cppc_cpufreq_perf_to_khz(struct 
> cppc_cpudata *cpu,
>   if (!max_khz)
>   max_khz = cppc_get_dmi_max_khz();
>   mul = max_khz;
> - div = cpu->perf_caps.highest_perf;
> + div = caps->highest_perf;
>   }
>   return (u64)perf * mul / div;
>  }
> @@ -184,7 +184,7 @@ static unsigned int cppc_cpufreq_khz_to_perf(struct 
> cppc_cpudata *cpu,
>   } else {
>   if (!max_khz)
>   max_khz = cppc_get_dmi_max_khz();
> - mul = cpu->perf_caps.highest_perf;
> + mul = caps->highest_perf;
>   div = max_khz;
>   }

Applied. Thanks.

-- 
viresh

Re: [PATCH] pinctrl: initialise nsp-mux earlier.

2020-06-30 Thread Florian Fainelli




On 6/30/2020 9:37 PM, Mark Tomlinson wrote:
> On Tue, 2020-06-30 at 20:14 -0700, Florian Fainelli wrote:
>>> Sorry, it looks like I made a mistake in my testing (or I was lucky),
>>> and this patch doesn't fix the issue. What is happening is:
>>> 1) nsp-pinmux driver is registered (arch_initcall).
>>> 2) nsp-gpio-a driver is registered (arch_initcall_sync).
>>> 3) of_platform_default_populate_init() is called (also at level
>>> arch_initcall_sync), which scans the device tree, adds the nsp-gpio-a
>>> device, runs its probe, and this returns -EPROBE_DEFER with the error
>>> message.
>>> 4) Only now nsp-pinmux device is probed.
>>>
>>> Changing the 'arch_initcall_sync' to 'device_initcall' in nsp-gpio-a
>>> ensures that the pinmux is probed first since
>>> of_platform_default_populate_init() will be called between the two
>>> register calls, and the error goes away. Is this change acceptable as a
>>> solution?
>>
>> If probe deferral did not work, certainly but it sounds like this is
>> being done just for the sake of eliminating a round of probe deferral,
>> is there a functional problem this is fixing?
> 
> No, I'm just trying to prevent an "error" message appearing in syslog.
> 
>>> The actual error message in syslog is:
>>>
>>> kern.err kernel: gpiochip_add_data_with_key: GPIOs 480..511
>>> (1820.gpio) failed to register, -517
>>>
>>> So an end user sees "err" and "failed", and doesn't know what "-517"
>>> means.
>>
>> How about this instead:
>>
>> diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
>> index 4fa075d49fbc..10d9d0c17c9e 100644
>> --- a/drivers/gpio/gpiolib.c
>> +++ b/drivers/gpio/gpiolib.c
>> @@ -1818,9 +1818,10 @@ int gpiochip_add_data_with_key(struct gpio_chip
>> *gc, void *data,
>>     ida_simple_remove(&gpio_ida, gdev->id);
>>  err_free_gdev:
>> /* failures here can mean systems won't boot... */
>> -   pr_err("%s: GPIOs %d..%d (%s) failed to register, %d\n", __func__,
>> -  gdev->base, gdev->base + gdev->ngpio - 1,
>> -  gc->label ? : "generic", ret);
>> +   if (ret != -EPROBE_DEFER)
>> +   pr_err("%s: GPIOs %d..%d (%s) failed to register, %d\n",
>> +   __func__, gdev->base, gdev->base + gdev->ngpio - 1,
>> +   gc->label ? : "generic", ret);
>> kfree(gdev);
>> return ret;
>>  }
>>
> That was one of my thoughts too. I found someone had tried that
> earlier, but it was rejected:
> 
> 
> https://patchwork.ozlabs.org/project/linux-gpio/patch/1516566774-1786-1-git-send-email-da...@lechnology.com/

The clk and reset APIs do not complain loudly on EPROBE_DEFER, so it seems to
me that GPIO should follow suit here. Also, it does look like Linus was in
agreement in the end; I am not sure why it was not applied, though.
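
For anyone wondering about the "-517": EPROBE_DEFER is a kernel-internal
errno with the value 517. The filtering proposed in the diff above can be
tried out in userspace with a minimal sketch like the one below. The helper
name report_register_failure() is hypothetical, and EPROBE_DEFER is
redefined locally because userspace <errno.h> does not provide it.

```c
#include <stdio.h>

/*
 * Kernel-internal errno, redefined here for illustration because
 * userspace <errno.h> does not expose it.
 */
#define EPROBE_DEFER 517

/*
 * Hypothetical helper mirroring the err_free_gdev path: log the
 * failure unless it is a probe deferral. Returns 1 if a message
 * was printed, 0 if it was suppressed.
 */
static int report_register_failure(const char *label, int base,
				   int ngpio, int ret)
{
	if (ret == -EPROBE_DEFER)
		return 0;

	fprintf(stderr, "GPIOs %d..%d (%s) failed to register, %d\n",
		base, base + ngpio - 1, label ? label : "generic", ret);
	return 1;
}
```

With base 480 and 32 lines, a return value of -517 stays silent while any
other error still prints the familiar "GPIOs 480..511" line.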
-- 
Florian

[PATCH V3] arm64/hugetlb: Reserve CMA areas for gigantic pages on 16K and 64K configs

2020-06-30 Thread Anshuman Khandual

Currently the 'hugetlb_cma=' command line argument does not create a CMA
area on ARM64_16K_PAGES and ARM64_64K_PAGES based platforms. Instead, it
just ends up with the following warning message, because
hugetlb_cma_reserve() never gets called for these page sizes.

[   64.255669] hugetlb_cma: the option isn't supported by current arch

This enables CMA area reservation on ARM64_16K_PAGES and ARM64_64K_PAGES
configs by defining a unified arm64_hugetlb_cma_reserve() that is wrapped
in CONFIG_CMA. The call site for arm64_hugetlb_cma_reserve() is also
protected, as <asm/hugetlb.h> is conditionally included and hence cannot
contain a stub for the inverse config, i.e. !(CONFIG_HUGETLB_PAGE && CONFIG_CMA).

Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Mark Rutland 
Cc: Mike Kravetz 
Cc: Barry Song 
Cc: Andrew Morton 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual 
---
Applies on 5.8-rc3.

Changes in V3:

- Dropped the stub, protected call site, moved the declaration to a header

Changes in V2: (https://patchwork.kernel.org/patch/11630503/)

- Moved arm64_hugetlb_cma_reserve() stub and declaration near call site

Changes in V1: (https://patchwork.kernel.org/patch/11619839/)

 arch/arm64/include/asm/hugetlb.h |  2 ++
 arch/arm64/mm/hugetlbpage.c  | 38 
 arch/arm64/mm/init.c |  4 ++--
 3 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 94ba0c5bced2..5abf91e3494c 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -49,6 +49,8 @@ extern void set_huge_swap_pte_at(struct mm_struct *mm, 
unsigned long addr,
 pte_t *ptep, pte_t pte, unsigned long sz);
 #define set_huge_swap_pte_at set_huge_swap_pte_at
 
+void __init arm64_hugetlb_cma_reserve(void);
+
 #include 
 
 #endif /* __ASM_HUGETLB_H */
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 0a52ce46f020..ea7fb48b8617 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -19,6 +19,44 @@
 #include 
 #include 
 
+/*
+ * HugeTLB Support Matrix
+ *
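+ * ---------------------------------------------------
+ * | Page Size | CONT PTE |  PMD  | CONT PMD |  PUD  |
+ * ---------------------------------------------------
+ * |     4K    |   64K    |   2M  |    32M   |   1G  |
+ * |    16K    |    2M    |  32M  |     1G   |       |
+ * |    64K    |    2M    | 512M  |    16G   |       |
+ * ---------------------------------------------------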
+ */
+
+/*
+ * Reserve CMA areas for the largest supported gigantic
+ * huge page when requested. Any other smaller gigantic
+ * huge pages could still be served from those areas.
+ */
+#ifdef CONFIG_CMA
+void __init arm64_hugetlb_cma_reserve(void)
+{
+   int order;
+
+#ifdef CONFIG_ARM64_4K_PAGES
+   order = PUD_SHIFT - PAGE_SHIFT;
+#else
+   order = CONT_PMD_SHIFT + PMD_SHIFT - PAGE_SHIFT;
+#endif
+   /*
+* HugeTLB CMA reservation is required for gigantic
+* huge pages which could not be allocated via the
+* page allocator. Just warn if there is any change
+* breaking this assumption.
+*/
+   WARN_ON(order <= MAX_ORDER);
+   hugetlb_cma_reserve(order);
+}
+#endif /* CONFIG_CMA */
+
 #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
 bool arch_hugetlb_migration_supported(struct hstate *h)
 {
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 1e93cfc7c47a..5f5665b9b026 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -425,8 +425,8 @@ void __init bootmem_init(void)
 * initialize node_online_map that gets used in hugetlb_cma_reserve()
 * while allocating required CMA size across online nodes.
 */
-#ifdef CONFIG_ARM64_4K_PAGES
-   hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT);
+#if defined(CONFIG_HUGETLB_PAGE) && defined(CONFIG_CMA)
+   arm64_hugetlb_cma_reserve();
 #endif
 
/*
-- 
2.20.1
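
As a cross-check of the order computed in arm64_hugetlb_cma_reserve() above,
here is a small userspace sketch that derives the page order from the sizes
in the support matrix. It does not use the kernel macros; the SHIFT_*
constants and the helper largest_gigantic_order() are hypothetical names
introduced only for this illustration.

```c
/* Page and huge page sizes expressed as shifts (log2 of the size in bytes). */
#define SHIFT_4K	12
#define SHIFT_16K	14
#define SHIFT_64K	16
#define SHIFT_1G	30
#define SHIFT_16G	34

/*
 * Hypothetical helper: page order of the largest gigantic page for a
 * given base page shift, using the sizes from the support matrix
 * (PUD = 1G on 4K; CONT PMD = 1G on 16K and 16G on 64K).
 * Returns -1 for an unknown base page size.
 */
static int largest_gigantic_order(int page_shift)
{
	switch (page_shift) {
	case SHIFT_4K:
		return SHIFT_1G - page_shift;	/* PUD_SHIFT - PAGE_SHIFT */
	case SHIFT_16K:
		return SHIFT_1G - page_shift;	/* CONT PMD huge page */
	case SHIFT_64K:
		return SHIFT_16G - page_shift;	/* CONT PMD huge page */
	default:
		return -1;
	}
}
```

Under these assumptions the reservation order works out to 18 for 4K pages,
16 for 16K pages and 18 for 64K pages.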

Re: [PATCH v3 1/2] remoteproc: Add remoteproc character device interface

2020-06-30 Thread Bjorn Andersson

On Tue 30 Jun 00:43 PDT 2020, Arnaud POULIQUEN wrote:

> 
> 
> On 6/30/20 7:38 AM, Siddharth Gupta wrote:
> > 
> > On 6/17/2020 1:44 AM, Arnaud POULIQUEN wrote:
> >>
> >> On 6/16/20 9:56 PM, risha...@codeaurora.org wrote:
> >>> On 2020-04-30 01:30, Arnaud POULIQUEN wrote:
>  Hi Rishabh,
> 
> 
>  On 4/21/20 8:10 PM, Rishabh Bhatnagar wrote:
> > Add the character device interface into remoteproc framework.
> > This interface can be used in order to boot/shutdown remote
> > subsystems and provides a basic ioctl based interface to implement
> > supplementary functionality. An ioctl call is implemented to enable
> > the shutdown on release feature which will allow remote processors to
> > > be shutdown when the controlling userspace application crashes or
> > hangs.
> >
>  Thanks for introducing Ioctl, this will help for future evolutions.
> 
> > Signed-off-by: Rishabh Bhatnagar 
> > ---
> >   Documentation/userspace-api/ioctl/ioctl-number.rst |   1 +
> >   drivers/remoteproc/Kconfig |   9 ++
> >   drivers/remoteproc/Makefile|   1 +
> >   drivers/remoteproc/remoteproc_cdev.c   | 143
> > +
> >   drivers/remoteproc/remoteproc_internal.h   |  21 +++
> >   include/linux/remoteproc.h |   3 +
> >   include/uapi/linux/remoteproc_cdev.h   |  20 +++
> >   7 files changed, 198 insertions(+)
> >   create mode 100644 drivers/remoteproc/remoteproc_cdev.c
> >   create mode 100644 include/uapi/linux/remoteproc_cdev.h
> >
> > diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst
> > b/Documentation/userspace-api/ioctl/ioctl-number.rst
> > index 2e91370..412b2a0 100644
> > --- a/Documentation/userspace-api/ioctl/ioctl-number.rst
> > +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
> > @@ -337,6 +337,7 @@ Code  Seq#Include File
> >Comments
> >   0xB4  00-0F  linux/gpio.h
> > 
> >   0xB5  00-0F  uapi/linux/rpmsg.h
> > 
> >   0xB6  alllinux/fpga-dfl.h
> > +0xB7  alluapi/linux/remoteproc_cdev.h  
> > 
> >   0xC0  00-0F  linux/usb/iowarrior.h
> >   0xCA  00-0F  uapi/misc/cxl.h
> >   0xCA  10-2F  uapi/misc/ocxl.h
> > diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
> > index de3862c..6374b79 100644
> > --- a/drivers/remoteproc/Kconfig
> > +++ b/drivers/remoteproc/Kconfig
> > @@ -14,6 +14,15 @@ config REMOTEPROC
> >
> >   if REMOTEPROC
> >
> > +config REMOTEPROC_CDEV
> > +   bool "Remoteproc character device interface"
> > +   help
> > + Say y here to have a character device interface for Remoteproc
> > + framework. Userspace can boot/shutdown remote processors 
> > through
> > + this interface.
> > +
> > + It's safe to say N if you don't want to use this interface.
> > +
> >   config IMX_REMOTEPROC
> > tristate "IMX6/7 remoteproc support"
> > depends on ARCH_MXC
> > diff --git a/drivers/remoteproc/Makefile b/drivers/remoteproc/Makefile
> > index e30a1b1..b7d4f77 100644
> > --- a/drivers/remoteproc/Makefile
> > +++ b/drivers/remoteproc/Makefile
> > @@ -9,6 +9,7 @@ remoteproc-y+= 
> > remoteproc_debugfs.o
> >   remoteproc-y  += remoteproc_sysfs.o
> >   remoteproc-y  += remoteproc_virtio.o
> >   remoteproc-y  += remoteproc_elf_loader.o
> > +obj-$(CONFIG_REMOTEPROC_CDEV)  += remoteproc_cdev.o
> >   obj-$(CONFIG_IMX_REMOTEPROC)  += imx_rproc.o
> >   obj-$(CONFIG_MTK_SCP) += mtk_scp.o mtk_scp_ipi.o
> >   obj-$(CONFIG_OMAP_REMOTEPROC) += omap_remoteproc.o
> > diff --git a/drivers/remoteproc/remoteproc_cdev.c
> > b/drivers/remoteproc/remoteproc_cdev.c
> > new file mode 100644
> > index 000..65142ec
> > --- /dev/null
> > +++ b/drivers/remoteproc/remoteproc_cdev.c
> > @@ -0,0 +1,143 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * Character device interface driver for Remoteproc framework.
> > + *
> > + * Copyright (c) 2020, The Linux Foundation. All rights reserved.
> > + */
> > +
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#include "remoteproc_internal.h"
> > +
> > +#define NUM_RPROC_DEVICES  64
> > +static dev_t rproc_major;
> > +
> > +static ssize_t rproc_cdev_write(struct file *filp, const char __user
> >

Re: [git pull] drm for 5.8-rc1

2020-06-30 Thread James Jones

This implies something is trying to use one of the old 
DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK format modifiers with DRM-KMS without 
first checking whether it is supported by the kernel.  I had tried to 
force an Xorg+Mesa stack without my userspace patches to hit this error 
when testing, but must have missed some permutation.  If the stalled 
Mesa patches go in, this would stop happening of course, but those were 
held up for a long time in review, and are now waiting on me to make 
some modifications.


Are you using the modesetting driver in X?  If so, with glamor I 
presume?  What version of Mesa?  Any distro patches?  Any non-default 
xorg.conf options that would affect modesetting, your X driver if it 
isn't modesetting, or glamor?


Thanks,
-James

On 6/30/20 4:08 PM, Kirill A. Shutemov wrote:

On Tue, Jun 02, 2020 at 04:06:32PM +1000, Dave Airlie wrote:

James Jones (4):

...

   drm/nouveau/kms: Support NVIDIA format modifiers


This commit is the first one that breaks Xorg startup for my setup:
GTX 1080 + Dell UP2414Q (4K DP MST monitor).

I believe this is the crucial part of dmesg (full dmesg is attached):

[   29.997140] [drm:nouveau_framebuffer_new] Unsupported modifier: 
0x314
[   29.997143] [drm:drm_internal_framebuffer_create] could not create 
framebuffer
[   29.997145] [drm:drm_ioctl] pid=3393, ret = -22

Any suggestions?

Re: [PATCH] pinctrl: initialise nsp-mux earlier.

2020-06-30 Thread Mark Tomlinson

On Tue, 2020-06-30 at 20:14 -0700, Florian Fainelli wrote:
> > Sorry, it looks like I made a mistake in my testing (or I was lucky),
> > and this patch doesn't fix the issue. What is happening is:
> > 1) nsp-pinmux driver is registered (arch_initcall).
> > 2) nsp-gpio-a driver is registered (arch_initcall_sync).
> > 3) of_platform_default_populate_init() is called (also at level
> > arch_initcall_sync), which scans the device tree, adds the nsp-gpio-a
> > device, runs its probe, and this returns -EPROBE_DEFER with the error
> > message.
> > 4) Only now nsp-pinmux device is probed.
> > 
> > Changing the 'arch_initcall_sync' to 'device_initcall' in nsp-gpio-a
> > ensures that the pinmux is probed first since
> > of_platform_default_populate_init() will be called between the two
> > register calls, and the error goes away. Is this change acceptable as a
> > solution?
> 
> If probe deferral did not work, certainly but it sounds like this is
> being done just for the sake of eliminating a round of probe deferral,
> is there a functional problem this is fixing?

No, I'm just trying to prevent an "error" message appearing in syslog.

> > The actual error message in syslog is:
> > 
> > kern.err kernel: gpiochip_add_data_with_key: GPIOs 480..511
> > (1820.gpio) failed to register, -517
> > 
> > So an end user sees "err" and "failed", and doesn't know what "-517"
> > means.
> 
> How about this instead:
> 
> diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
> index 4fa075d49fbc..10d9d0c17c9e 100644
> --- a/drivers/gpio/gpiolib.c
> +++ b/drivers/gpio/gpiolib.c
> @@ -1818,9 +1818,10 @@ int gpiochip_add_data_with_key(struct gpio_chip
> *gc, void *data,
>     ida_simple_remove(&gpio_ida, gdev->id);
>  err_free_gdev:
> /* failures here can mean systems won't boot... */
> -   pr_err("%s: GPIOs %d..%d (%s) failed to register, %d\n", __func__,
> -  gdev->base, gdev->base + gdev->ngpio - 1,
> -  gc->label ? : "generic", ret);
> +   if (ret != -EPROBE_DEFER)
> +   pr_err("%s: GPIOs %d..%d (%s) failed to register, %d\n",
> +   __func__, gdev->base, gdev->base + gdev->ngpio - 1,
> +   gc->label ? : "generic", ret);
> kfree(gdev);
> return ret;
>  }
> 
That was one of my thoughts too. I found someone had tried that
earlier, but it was rejected:


https://patchwork.ozlabs.org/project/linux-gpio/patch/1516566774-1786-1-git-send-email-da...@lechnology.com/
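
The ordering problem in steps 1) to 4) above can be modelled in a few lines
of userspace C. This is only a rough sketch: struct sim_dev, sim_probe() and
sim_probe_all() are hypothetical names, and the retry loop is a simplification
of what the driver core does with its deferred probe list.

```c
#include <stdbool.h>
#include <stddef.h>

#define EPROBE_DEFER 517	/* kernel-internal errno, redefined for userspace */

/* Hypothetical model of a device that may depend on one other device. */
struct sim_dev {
	const char *name;
	const struct sim_dev *needs;	/* NULL if there is no dependency */
	bool bound;
};

/* Probe succeeds only once the dependency (if any) is bound. */
static int sim_probe(struct sim_dev *dev)
{
	if (dev->needs && !dev->needs->bound)
		return -EPROBE_DEFER;
	dev->bound = true;
	return 0;
}

/*
 * Probe devices in the given order, then keep retrying the ones that
 * deferred until a full pass makes no progress. Returns the number of
 * -EPROBE_DEFER results seen, i.e. the number of "error" lines that
 * would have been logged.
 */
static int sim_probe_all(struct sim_dev **order, size_t n)
{
	int deferrals = 0;
	bool progress = true;

	while (progress) {
		progress = false;
		for (size_t i = 0; i < n; i++) {
			if (order[i]->bound)
				continue;
			if (sim_probe(order[i]) == -EPROBE_DEFER)
				deferrals++;
			else
				progress = true;
		}
	}
	return deferrals;
}
```

Probing the gpio device before the pinmux yields one deferral (and one log
line), while the opposite order yields none. Both orders end with both
devices bound, which is why this is cosmetic rather than functional.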

Re: [PATCH] Revert "serial: 8250: Fix max baud limit in generic 8250 port"

2020-06-30 Thread Lukas Wunner

On Tue, Jun 30, 2020 at 04:42:11PM -0700, Daniel Winkler wrote:
> This reverts commit 0eeaf62981ecc79e8395ca8caa1570eaf3a12257.

That is not an upstream commit.  You probably mean:

commit 7b668c064ec33f3d687c3a413d05e355172e6c92
Author: Serge Semin 
Date:   Thu May 7 02:31:32 2020 +0300

serial: 8250: Fix max baud limit in generic 8250 port

And you didn't cc the commit author (hereby fixed).

Thanks,

Lukas

> 
> The change regresses the QCA6174A-3 bluetooth chip, preventing
> firmware from being properly loaded. We have verified that without
> this patch, the chip works as intended.
> 
> Signed-off-by: Daniel Winkler 
> ---
> 
>  drivers/tty/serial/8250/8250_port.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/tty/serial/8250/8250_port.c 
> b/drivers/tty/serial/8250/8250_port.c
> index 1632f7d25acca..e057c65ac1580 100644
> --- a/drivers/tty/serial/8250/8250_port.c
> +++ b/drivers/tty/serial/8250/8250_port.c
> @@ -2618,8 +2618,6 @@ static unsigned int serial8250_get_baud_rate(struct 
> uart_port *port,
>struct ktermios *termios,
>struct ktermios *old)
>  {
> - unsigned int tolerance = port->uartclk / 100;
> -
>   /*
>* Ask the core to calculate the divisor for us.
>* Allow 1% tolerance at the upper limit so uart clks marginally
> @@ -2628,7 +2626,7 @@ static unsigned int serial8250_get_baud_rate(struct 
> uart_port *port,
>*/
>   return uart_get_baud_rate(port, termios, old,
> port->uartclk / 16 / UART_DIV_MAX,
> -   (port->uartclk + tolerance) / 16);
> +   port->uartclk);
>  }
>  
>  void
> -- 
> 2.27.0.212.ge8ba1cc988-goog

[RFC PATCH] interconnect: qcom: add functions to query addr/cmds for a path

2020-06-30 Thread Jonathan Marek

The a6xx GMU can vote for ddr and cnoc bandwidth, but it needs to be able
to query the interconnect driver for bcm addresses and commands.

I'm not sure what the best way to implement this is; this is
what I came up with.

I included a quick example of how this can be used by the a6xx driver to
fill out the GMU bw_table (two ddr bandwidth levels in this example, note
this would be using the frequency table in dts and not hardcoded values).
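
For reference, the scaling done by bcm_query() in the diff below can be tried
in userspace with the sketch that follows. struct sim_bcm and sim_bcm_vote()
are hypothetical stand-ins, the width/unit/buswidth values used with it are
made up, and the result is the raw vote value rather than an encoded TCS
command. All divisors are assumed to be non-zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the qcom_icc_bcm fields used. */
struct sim_bcm {
	uint32_t width;			/* aux_data.width */
	uint32_t unit;			/* aux_data.unit */
	const uint32_t *buswidth;	/* buswidth of each node */
	size_t num_nodes;
};

/*
 * Scale a peak bandwidth request the way bcm_query() does, but
 * return the raw vote value instead of an encoded TCS command.
 */
static uint64_t sim_bcm_vote(const struct sim_bcm *bcm, uint64_t max_peak)
{
	uint64_t agg_peak = 0;

	for (size_t i = 0; i < bcm->num_nodes; i++) {
		uint64_t temp = max_peak * bcm->width / bcm->buswidth[i];

		if (temp > agg_peak)
			agg_peak = temp;
	}

	return agg_peak * 1000ULL / bcm->unit;
}
```

With a made-up BCM of width 4, unit 1000000 and node bus widths of 4 and 8, a
peak request of 7216000 gives a vote of 7216, and a request of 0 gives 0,
which corresponds to the "off" level.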

Signed-off-by: Jonathan Marek 
---
 drivers/gpu/drm/msm/adreno/a6xx_hfi.c | 20 ---
 drivers/interconnect/qcom/icc-rpmh.c  | 50 +++
 include/soc/qcom/icc.h| 11 ++
 3 files changed, 68 insertions(+), 13 deletions(-)
 create mode 100644 include/soc/qcom/icc.h

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c 
b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
index ccd44d0418f8..1fb8f0480be3 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
@@ -4,6 +4,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "a6xx_gmu.h"
 #include "a6xx_gmu.xml.h"
@@ -320,24 +321,18 @@ static void a640_build_bw_table(struct 
a6xx_hfi_msg_bw_table *msg)
msg->cnoc_cmds_data[1][2] =  0x6001;
 }
 
-static void a650_build_bw_table(struct a6xx_hfi_msg_bw_table *msg)
+static void a650_build_bw_table(struct a6xx_hfi_msg_bw_table *msg, struct 
icc_path *path)
 {
/*
 * Send a single "off" entry just to get things running
 * TODO: bus scaling
 */
-   msg->bw_level_num = 1;
-
-   msg->ddr_cmds_num = 3;
+   msg->bw_level_num = 2;
msg->ddr_wait_bitmask = 0x01;
 
-   msg->ddr_cmds_addrs[0] = 0x5;
-   msg->ddr_cmds_addrs[1] = 0x50004;
-   msg->ddr_cmds_addrs[2] = 0x5007c;
-
-   msg->ddr_cmds_data[0][0] =  0x4000;
-   msg->ddr_cmds_data[0][1] =  0x4000;
-   msg->ddr_cmds_data[0][2] =  0x4000;
+   msg->ddr_cmds_num = qcom_icc_query_addr(path, msg->ddr_cmds_addrs);
+   qcom_icc_query_cmd(path, msg->ddr_cmds_data[0], 0, 0);
+   qcom_icc_query_cmd(path, msg->ddr_cmds_data[1], 0, 7216000);
 
/*
 * These are the CX (CNOC) votes - these are used by the GMU but the
@@ -388,7 +383,6 @@ static void a6xx_build_bw_table(struct 
a6xx_hfi_msg_bw_table *msg)
msg->cnoc_cmds_data[1][2] =  0x6001;
 }
 
-
 static int a6xx_hfi_send_bw_table(struct a6xx_gmu *gmu)
 {
struct a6xx_hfi_msg_bw_table msg = { 0 };
@@ -400,7 +394,7 @@ static int a6xx_hfi_send_bw_table(struct a6xx_gmu *gmu)
else if (adreno_is_a640(adreno_gpu))
                a640_build_bw_table(&msg);
        else if (adreno_is_a650(adreno_gpu))
-   a650_build_bw_table(&msg);
+   a650_build_bw_table(&msg, adreno_gpu->base.icc_path);
        else
                a6xx_build_bw_table(&msg);
 
diff --git a/drivers/interconnect/qcom/icc-rpmh.c 
b/drivers/interconnect/qcom/icc-rpmh.c
index 3ac5182c9ab2..3ce2920330f9 100644
--- a/drivers/interconnect/qcom/icc-rpmh.c
+++ b/drivers/interconnect/qcom/icc-rpmh.c
@@ -9,6 +9,7 @@
 
 #include "bcm-voter.h"
 #include "icc-rpmh.h"
+#include "../internal.h"
 
 /**
  * qcom_icc_pre_aggregate - cleans up stale values from prior icc_set
@@ -92,6 +93,55 @@ int qcom_icc_set(struct icc_node *src, struct icc_node *dst)
 }
 EXPORT_SYMBOL_GPL(qcom_icc_set);
 
+static u32 bcm_query(struct qcom_icc_bcm *bcm, u64 sum_avg, u64 max_peak)
+{
+   u64 temp, agg_peak = 0;
+   int i;
+
+   for (i = 0; i < bcm->num_nodes; i++) {
+   temp = max_peak * bcm->aux_data.width;
+   do_div(temp, bcm->nodes[i]->buswidth);
+   agg_peak = max(agg_peak, temp);
+   }
+
+   temp = agg_peak * 1000ULL;
+   do_div(temp, bcm->aux_data.unit);
+
+   // TODO vote_x
+
+   return BCM_TCS_CMD(true, temp != 0, 0, temp);
+}
+
+int qcom_icc_query_addr(struct icc_path *path, u32 *addr)
+{
+   struct qcom_icc_node *qn;
+   int i, j, k = 0;
+
+   for (i = 0; i < path->num_nodes; i++) {
+   qn = path->reqs[i].node->data;
+   for (j = 0; j < qn->num_bcms; j++, k++)
+   addr[k] = qn->bcms[j]->addr;
+   }
+
+   return k;
+}
+EXPORT_SYMBOL_GPL(qcom_icc_query_addr);
+
+int qcom_icc_query_cmd(struct icc_path *path, u32 *cmd, u64 avg, u64 max)
+{
+   struct qcom_icc_node *qn;
+   int i, j, k = 0;
+
+   for (i = 0; i < path->num_nodes; i++) {
+   qn = path->reqs[i].node->data;
+   for (j = 0; j < qn->num_bcms; j++, k++)
+   cmd[k] = bcm_query(qn->bcms[j], avg, max);
+   }
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(qcom_icc_query_cmd);
+
 /**
  * qcom_icc_bcm_init - populates bcm aux data and connect qnodes
  * @bcm: bcm to be initialized
diff --git a/include/soc/qcom/icc.h b/include/soc/qcom/icc.h
new file mode 100644
index ..8d0ddde49739
--- /dev/null
+++ b/include/soc/qcom/icc.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier:

Re: [PATCH 1/2] f2fs: split f2fs_allocate_new_segments()

2020-06-30 Thread Jaegeuk Kim

On 07/01, Chao Yu wrote:
> Jaegeuk, could you please help to change __allocate_new_segment() to static
> in your tree?

Sure. :)

> 
> On 2020/6/30 4:19, Jaegeuk Kim wrote:
> > On 06/22, Chao Yu wrote:
> >> to two independent functions:
> >> - f2fs_allocate_new_segment() for specified type segment allocation
> >> - f2fs_allocate_new_segments() for all data type segments allocation
> >>
> >> Signed-off-by: Chao Yu 
> >> ---
> >>  fs/f2fs/f2fs.h |  3 ++-
> >>  fs/f2fs/file.c |  2 +-
> >>  fs/f2fs/recovery.c |  2 +-
> >>  fs/f2fs/segment.c  | 39 +++
> >>  4 files changed, 27 insertions(+), 19 deletions(-)
> >>
> >> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> >> index 70565d81320b..07290943e91d 100644
> >> --- a/fs/f2fs/f2fs.h
> >> +++ b/fs/f2fs/f2fs.h
> >> @@ -3327,7 +3327,8 @@ void f2fs_release_discard_addrs(struct f2fs_sb_info 
> >> *sbi);
> >>  int f2fs_npages_for_summary_flush(struct f2fs_sb_info *sbi, bool for_ra);
> >>  void f2fs_allocate_segment_for_resize(struct f2fs_sb_info *sbi, int type,
> >>unsigned int start, unsigned int end);
> >> -void f2fs_allocate_new_segments(struct f2fs_sb_info *sbi, int type);
> >> +void f2fs_allocate_new_segment(struct f2fs_sb_info *sbi, int type);
> >> +void f2fs_allocate_new_segments(struct f2fs_sb_info *sbi);
> >>  int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct fstrim_range *range);
> >>  bool f2fs_exist_trim_candidates(struct f2fs_sb_info *sbi,
> >>struct cp_control *cpc);
> >> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> >> index f196187159e9..67c65e40b22b 100644
> >> --- a/fs/f2fs/file.c
> >> +++ b/fs/f2fs/file.c
> >> @@ -1659,7 +1659,7 @@ static int expand_inode_data(struct inode *inode, 
> >> loff_t offset,
> >>map.m_seg_type = CURSEG_COLD_DATA_PINNED;
> >>  
> >>f2fs_lock_op(sbi);
> >> -  f2fs_allocate_new_segments(sbi, CURSEG_COLD_DATA);
> >> +  f2fs_allocate_new_segment(sbi, CURSEG_COLD_DATA_PINNED);
> > 
> > This should be CURSEG_COLD_DATA. Otherwise it causes the below kernel panic.
> > I fixed this in the -dev, so let me know, if you have other concern.
> > 
> >   259 Unable to handle kernel NULL pointer dereference at virtual address 
> > 0008
> >   259 task: 82b4de99 task.stack: c6b39dbf
> >   259 pc : f2fs_do_write_data_page+0x2b4/0x794
> >   259 lr : f2fs_do_write_data_page+0x290/0x794
> >   259 sp : ff800c83b5a0 pstate : 60c00145
> >   259 Call trace:
> >   259  f2fs_do_write_data_page+0x2b4/0x794
> >   259  f2fs_write_single_data_page+0x4a4/0x764
> >   259  f2fs_write_data_pages+0x4dc/0x968
> >   259  do_writepages+0x60/0x124
> >   259  __writeback_single_inode+0xd8/0x490
> >   259  writeback_sb_inodes+0x3a8/0x6e4
> >   259  __writeback_inodes_wb+0xa4/0x14c
> >   259  wb_writeback+0x218/0x434
> >   259  wb_workfn+0x2bc/0x57c
> >   259  process_one_work+0x25c/0x440
> >   259  worker_thread+0x24c/0x480
> >   259  kthread+0x11c/0x12c
> >   259  ret_from_fork+0x10/0x18
> > 
> >>f2fs_unlock_op(sbi);
> >>  
> >>    err = f2fs_map_blocks(inode, &map, 1, F2FS_GET_BLOCK_PRE_DIO);
> >> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> >> index ae5310f02e7f..af974ba273b3 100644
> >> --- a/fs/f2fs/recovery.c
> >> +++ b/fs/f2fs/recovery.c
> >> @@ -742,7 +742,7 @@ static int recover_data(struct f2fs_sb_info *sbi, 
> >> struct list_head *inode_list,
> >>f2fs_put_page(page, 1);
> >>}
> >>if (!err)
> >> -  f2fs_allocate_new_segments(sbi, NO_CHECK_TYPE);
> >> +  f2fs_allocate_new_segments(sbi);
> >>return err;
> >>  }
> >>  
> >> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> >> index 113114f98087..f15711e8ee5b 100644
> >> --- a/fs/f2fs/segment.c
> >> +++ b/fs/f2fs/segment.c
> >> @@ -2707,28 +2707,35 @@ void f2fs_allocate_segment_for_resize(struct 
> >> f2fs_sb_info *sbi, int type,
> >>    up_read(&SM_I(sbi)->curseg_lock);
> >>  }
> >>  
> >> -void f2fs_allocate_new_segments(struct f2fs_sb_info *sbi, int type)
> >> +void __allocate_new_segment(struct f2fs_sb_info *sbi, int type)
> >>  {
> >> -  struct curseg_info *curseg;
> >> +  struct curseg_info *curseg = CURSEG_I(sbi, type);
> >>unsigned int old_segno;
> >> -  int i;
> >>  
> >> -  down_write(&SIT_I(sbi)->sentry_lock);
> >> +  if (!curseg->next_blkoff &&
> >> +  !get_valid_blocks(sbi, curseg->segno, false) &&
> >> +  !get_ckpt_valid_blocks(sbi, curseg->segno))
> >> +  return;
> >>  
> >> -  for (i = CURSEG_HOT_DATA; i <= CURSEG_COLD_DATA; i++) {
> >> -  if (type != NO_CHECK_TYPE && i != type)
> >> -  continue;
> >> +  old_segno = curseg->segno;
> >> +  SIT_I(sbi)->s_ops->allocate_segment(sbi, type, true);
> >> +  locate_dirty_segment(sbi, old_segno);
> >> +}
> >>  
> >> -  curseg = CURSEG_I(sbi, i);
> >> -  if (type == NO_CHECK_TYPE || curseg->next_blkoff ||
> >> -

[PATCH v3] cpufreq: CPPC: simply the code access 'highest_perf' value in cppc_perf_caps struct

2020-06-30 Thread Xin Hao

 The 'caps' variable has been defined, so there is no need to get
 'highest_perf' value through 'cpu->perf_caps.highest_perf', you can use
 'caps->highest_perf' instead.

Signed-off-by: Xin Hao 
---
 drivers/cpufreq/cppc_cpufreq.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index 257d726a4456..051d0e56c67a 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -161,7 +161,7 @@ static unsigned int cppc_cpufreq_perf_to_khz(struct 
cppc_cpudata *cpu,
if (!max_khz)
max_khz = cppc_get_dmi_max_khz();
mul = max_khz;
-   div = cpu->perf_caps.highest_perf;
+   div = caps->highest_perf;
}
return (u64)perf * mul / div;
 }
@@ -184,7 +184,7 @@ static unsigned int cppc_cpufreq_khz_to_perf(struct 
cppc_cpudata *cpu,
} else {
if (!max_khz)
max_khz = cppc_get_dmi_max_khz();
-   mul = cpu->perf_caps.highest_perf;
+   mul = caps->highest_perf;
div = max_khz;
}
 
-- 
2.24.1
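
The arithmetic touched by the two hunks above can be checked in userspace
with a minimal sketch. It models only the fallback branch shown in the diff
(mul/div taken from max_khz and highest_perf) and assumes the second function
divides the same way as the first. The function names and the sample numbers
are hypothetical.

```c
#include <stdint.h>

/* perf -> kHz in the fallback branch: mul = max_khz, div = highest_perf. */
static unsigned int perf_to_khz(unsigned int perf, unsigned int max_khz,
				unsigned int highest_perf)
{
	return (uint64_t)perf * max_khz / highest_perf;
}

/* kHz -> perf is the inverse: mul = highest_perf, div = max_khz. */
static unsigned int khz_to_perf(unsigned int khz, unsigned int max_khz,
				unsigned int highest_perf)
{
	return (uint64_t)khz * highest_perf / max_khz;
}
```

With a made-up max_khz of 3000000 and highest_perf of 300, a perf value of
150 maps to 1500000 kHz and back again. The 64-bit cast keeps the
intermediate product from overflowing.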

Re: [EXTERNAL] Re: [PATCH v14 4/4] power: supply: bq25150 introduce the bq25150

2020-06-30 Thread Ricardo Rivera-Matos




On 6/30/20 6:33 PM, Sebastian Reichel wrote:

Hi,

On Tue, Jun 30, 2020 at 04:54:26PM -0500, Ricardo Rivera-Matos wrote:

Introduce the bq2515x family of chargers.

The BQ2515X family of devices are highly integrated battery management
ICs that integrate the most common functions for wearable devices
namely a charger, an output voltage rail, ADC for battery and system
monitoring, and a push-button controller.

Datasheets:
bq25150 - http://www.ti.com/lit/ds/symlink/bq25150.pdf
bq25155 - http://www.ti.com/lit/ds/symlink/bq25155.pdf

Signed-off-by: Ricardo Rivera-Matos 
---
  drivers/power/supply/Kconfig   |   13 +
  drivers/power/supply/Makefile  |1 +
  drivers/power/supply/bq2515x_charger.c | 1158 
  3 files changed, 1172 insertions(+)
  create mode 100644 drivers/power/supply/bq2515x_charger.c

diff --git a/drivers/power/supply/Kconfig b/drivers/power/supply/Kconfig
index 44d3c8512fb8..faf2830aa152 100644
--- a/drivers/power/supply/Kconfig
+++ b/drivers/power/supply/Kconfig
@@ -610,6 +610,19 @@ config CHARGER_BQ24735
help
  Say Y to enable support for the TI BQ24735 battery charger.
  
+config CHARGER_BQ2515X

+   tristate "TI BQ2515X battery charger family"
+   depends on I2C
+   depends on GPIOLIB || COMPILE_TEST
+   select REGMAP_I2C
+   help
+ Say Y to enable support for the TI BQ2515X family of battery
+ charging integrated circuits. The BQ2515X are highly integrated
+ battery charge management ICs that integrate the most common
+ functions for wearable devices, namely a charger, an output voltage
+ rail, ADC for battery and system monitoring, and push-button
+ controller.
+
  config CHARGER_BQ25890
tristate "TI BQ25890 battery charger driver"
depends on I2C
diff --git a/drivers/power/supply/Makefile b/drivers/power/supply/Makefile
index b9644663e435..b3c694a65114 100644
--- a/drivers/power/supply/Makefile
+++ b/drivers/power/supply/Makefile
@@ -82,6 +82,7 @@ obj-$(CONFIG_CHARGER_BQ2415X) += bq2415x_charger.o
  obj-$(CONFIG_CHARGER_BQ24190) += bq24190_charger.o
  obj-$(CONFIG_CHARGER_BQ24257) += bq24257_charger.o
  obj-$(CONFIG_CHARGER_BQ24735) += bq24735-charger.o
+obj-$(CONFIG_CHARGER_BQ2515X)  += bq2515x_charger.o
  obj-$(CONFIG_CHARGER_BQ25890) += bq25890_charger.o
  obj-$(CONFIG_CHARGER_SMB347)  += smb347-charger.o
  obj-$(CONFIG_CHARGER_TPS65090)+= tps65090-charger.o
diff --git a/drivers/power/supply/bq2515x_charger.c 
b/drivers/power/supply/bq2515x_charger.c
new file mode 100644
index ..f386484b5035
--- /dev/null
+++ b/drivers/power/supply/bq2515x_charger.c
@@ -0,0 +1,1158 @@
+// SPDX-License-Identifier: GPL-2.0
+// BQ2515X Battery Charger Driver
+// Copyright (C) 2020 Texas Instruments Incorporated - http://www.ti.com/
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define BQ2515X_MANUFACTURER "Texas Instruments"
+
+#define BQ2515X_STAT0  0x00
+#define BQ2515X_STAT1  0x01
+#define BQ2515X_STAT2  0x02
+#define BQ2515X_FLAG0  0x03
+#define BQ2515X_FLAG1  0x04
+#define BQ2515X_FLAG2  0x05
+#define BQ2515X_FLAG3  0x06
+#define BQ2515X_MASK0  0x07
+#define BQ2515X_MASK1  0x08
+#define BQ2515X_MASK2  0x09
+#define BQ2515X_MASK3  0x0a
+#define BQ2515X_VBAT_CTRL  0x12
+#define BQ2515X_ICHG_CTRL  0x13
+#define BQ2515X_PCHRGCTRL  0x14
+#define BQ2515X_TERMCTRL   0x15
+#define BQ2515X_BUVLO  0x16
+#define BQ2515X_CHARGERCTRL0   0x17
+#define BQ2515X_CHARGERCTRL1   0x18
+#define BQ2515X_ILIMCTRL   0x19
+#define BQ2515X_LDOCTRL  0x1d
+#define BQ2515X_MRCTRL 0x30
+#define BQ2515X_ICCTRL0  0x35
+#define BQ2515X_ICCTRL1  0x36
+#define BQ2515X_ICCTRL2  0x37
+#define BQ2515X_ADCCTRL0   0x40
+#define BQ2515X_ADCCTRL1   0x41
+#define BQ2515X_ADC_VBAT_M 0x42
+#define BQ2515X_ADC_VBAT_L 0x43
+#define BQ2515X_ADC_TS_M   0x44
+#define BQ2515X_ADC_TS_L   0x45
+#define BQ2515X_ADC_ICHG_M 0x46
+#define BQ2515X_ADC_ICHG_L 0x47
+#define BQ2515X_ADC_ADCIN_M  0x48
+#define BQ2515X_ADC_ADCIN_L  0x49
+#define BQ2515X_ADC_VIN_M  0x4a
+#define BQ2515X_ADC_VIN_L  0x4b
+#define BQ2515X_ADC_PMID_M 0x4c
+#define BQ2515X_ADC_PMID_L 0x4d
+#define BQ2515X_ADC_IIN_M  0x4e
+#define BQ2515X_ADC_IIN_L  0x4f
+#define BQ2515X_ADC_COMP1_M  0x52
+#define BQ2515X_ADC_COMP1_L  0x53
+#define BQ2515X_ADC_COMP2_M  0x54
+#define BQ2515X_ADC_COMP2_L  0x55
+#define BQ2515X_ADC_COMP3_M  0x56
+#define BQ2515X_ADC_COMP3_L  0x57
+#define BQ2515X_ADC_READ_EN  0x58
+#define BQ2515X_TS_FASTCHGCTRL 0x61
+#define BQ2515X_TS_COLD  0x62
+#define BQ2515X_TS_COOL  0x63
+#define BQ2515X_TS_WARM  0x64
+#define BQ2515X_TS_HOT 0x65

Re: [RFC PATCH 0/6] Support raw event and DT for perf on RISC-V

2020-06-30 Thread Anup Patel

On Wed, Jul 1, 2020 at 6:48 AM Alan Kao  wrote:
>
> On Mon, Jun 29, 2020 at 11:19:09AM +0800, Zong Li wrote:
> > This patch set adds raw event support on RISC-V. In addition, we
> > introduce the DT mechanism to make our perf more generic and common.
> >
> > Currently, we set the hardware events by writing the mhpmeventN CSRs, it
> > would raise an illegal instruction exception and trap into m-mode to
> > emulate event selector CSRs access. It doesn't make sense because we
> > shouldn't write the m-mode CSRs in s-mode. Ideally, we should set event
> > selector through standard SBI call or the shadow CSRs of s-mode. We have
> > prepared a proposal of a new SBI extension, called "PMU SBI extension",
> > but we are also discussing the feasibility of accessing these PMU CSRs on
> > s-mode at the same time, such as delegation mechanism, so I was
> > wondering if we could use SBI calls first and make the PMU SBI extension
> > as legacy when s-mode access mechanism is accepted by Foundation? or
> > keep the current situation to see what would happen in the future.
> >
> > This patch set also introduces the DT mechanism. We don't want to add too
> > much platform-dependent code to perf like other architectures, so we
> > put the mapping of generic hardware events in DT; then we can easily
> > map generic hardware events to a vendor's own hardware events without
> > any platform-dependent code in our perf.
> >
> > Zong Li (6):
> >   dt-bindings: riscv: Add YAML documentation for PMU
> >   riscv: dts: sifive: Add DT support for PMU
> >   riscv: add definition of hpmcounter CSRs
> >   riscv: perf: Add raw event support
> >   riscv: perf: introduce DT mechanism
> >   riscv: remove PMU menu of Kconfig
> >
>
> DT-based PMU registration looks good to me. Together with Anup's feedback,
> we can anticipate that the following items will be:
>
> - rewrite RISC-V PMU to a platform driver
> - propose SBI PMU extension
> - fixes: RV32 counter access, namings, etc.
>
> Yes, all are good directions towards better counting (`perf stat`) function.
> But as the original author of RISC-V perf port, please allow me to address
> the fundamental problems of RISC-V perf, again [0][1][2][3], that the sampling
> (`perf record`) function never earned enough respect.  Counting gives you a
> shallow view regarding an application, while sampling demystifies one for you.
>
> The problems are three-fold
> (1) Interrupt
> Sampling in perf requires that a HPM raises an interrupt when it overflows.
> Making RISC-V perf platform driver or not has nothing to do with this.  This
> requires more discussions in TGs.
> (2) S-mode access to PMU CSRs
> This is also addressed in this patch set, but to me it looks like an
> SBI-solves-them-all mindset.  Perf events are for performance monitoring,
> so we should eliminate any possible overhead if we can.  Setting event masks
> through SBI calls for counting may be OK, but if we really take sampling and
> interrupt handling into consideration, it is questionable whether it is
> still a viable way.

Yes, we should certainly not have any SBI call for reading the PMU counter.
The S-mode software should always have direct access to the actual counter
value (i.e. CSR for HW counters and memory location for SBI specific counters).

The SBI calls that we have been discussing here only deal with describing
counters and configuring them.

> (3) Registers, registers, registers
> There are just not enough CSRs/functions for perf sampling. The previous proposal
> explains why [2].
>
> Perf sampling is off-topic but somehow related, so I bring it up here just
> for your information.

I agree with limitations 1) and 2) mentioned above. We certainly need a
RISC-V PMU extension in the RISC-V privileged spec. Maybe you can propose
creating a working group for this?

My worry is that defining a RISC-V PMU extension will take time, and meanwhile
more HW will show up this year and next year which will have the same set of
basic HPMCOUNTER CSRs. We are trying to brainstorm the best thing we can
do when we have just HPMCOUNTER CSRs accessible to S-mode. The SBI
PMU extension discussed here only tries to complement existing HPMCOUNTER
CSRs so that SOC designers can at least provide implementation specific CSRs
for configuring HW counters. The SBI PMU extension won't be able to solve the
counter overflow detection so we will have to depend on software techniques to
detect overflow.
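
As an illustration of the kind of software technique meant here, the following is a minimal sketch and not something proposed in this thread: keep a 64-bit shadow total and fold in the modular difference between successive raw reads. It assumes a hypothetical 32-bit wide counter that is polled often enough to wrap at most once between reads; hpm_shadow and hpm_shadow_update are made-up names.

```c
#include <stdint.h>

/* Hypothetical software shadow of a narrow hardware counter. */
struct hpm_shadow {
	uint64_t total;     /* accumulated 64-bit event count */
	uint32_t last_raw;  /* raw counter value seen at the previous poll */
};

/*
 * Fold a fresh raw reading into the shadow. Unsigned subtraction is
 * modular, so a single wrap between two polls still yields the right
 * delta without any explicit overflow flag.
 */
static void hpm_shadow_update(struct hpm_shadow *s, uint32_t raw)
{
	uint32_t delta = raw - s->last_raw;

	s->total += delta;
	s->last_raw = raw;
}
```

The obvious limitation is that this only counts correctly; it cannot deliver an interrupt at the moment of overflow, so it does not help sampling.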

>
> As this patch set goes to v2, the PMU porting guide in [0] should be removed,
> since it no longer contains any useful information.

I agree. This guide should be either updated or removed.

>
> [0] Documentation/riscv/pmu.rst
> [1] https://www.youtube.com/watch?v=Onvlcl4e2IU
> [2] https://github.com/riscv/riscv-isa-manual/issues/402
> This proposal has been posted in Privileged Spec Task Group, in
> https://lists.riscv.org/g/tech-privileged-archive/message/488?p=,,,20,0,0,0::Created,,Proposal,20,2,40,32306071
> but never received any feedback.
> [3]

Re: [f2fs-dev] [PATCH] f2fs: add GC_URGENT_LOW mode in gc_urgent

2020-06-30 Thread Daeho Jeong

Yes, it's correct.

On Wed, Jul 1, 2020 at 12:35 PM, Chao Yu wrote:
>
> On 2020/6/30 8:54, Daeho Jeong wrote:
> > From: Daeho Jeong 
> >
> > Added a new gc_urgent mode, GC_URGENT_LOW, in which mode
> > F2FS will lower the bar of checking idle in order to
> > process outstanding discard commands and GC a little bit
> > aggressively.
> >
> > Signed-off-by: Daeho Jeong 
> > ---
> >  Documentation/ABI/testing/sysfs-fs-f2fs |  4 +++-
> >  fs/f2fs/f2fs.h  | 10 --
> >  fs/f2fs/gc.c|  6 +++---
> >  fs/f2fs/segment.c   |  4 ++--
> >  fs/f2fs/sysfs.c |  6 --
> >  5 files changed, 20 insertions(+), 10 deletions(-)
> >
> > diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs 
> > b/Documentation/ABI/testing/sysfs-fs-f2fs
> > index 4bb93a06d8ab..7f730c4c8df2 100644
> > --- a/Documentation/ABI/testing/sysfs-fs-f2fs
> > +++ b/Documentation/ABI/testing/sysfs-fs-f2fs
> > @@ -229,7 +229,9 @@ Date: August 2017
> >  Contact: "Jaegeuk Kim" 
> >  Description: Do background GC agressively when set. When gc_urgent = 1,
> >   background thread starts to do GC by given 
> > gc_urgent_sleep_time
> > - interval. It is set to 0 by default.
> > + interval. When gc_urgent = 2, F2FS will lower the bar of
> > + checking idle in order to process outstanding discard commands
> > + and GC a little bit aggressively. It is set to 0 by default.
> >
> >  What:/sys/fs/f2fs//gc_urgent_sleep_time
> >  Date:August 2017
> > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> > index e6e47618a357..4b28fd42fdbc 100644
> > --- a/fs/f2fs/f2fs.h
> > +++ b/fs/f2fs/f2fs.h
> > @@ -1283,7 +1283,8 @@ enum {
> >   GC_NORMAL,
> >   GC_IDLE_CB,
> >   GC_IDLE_GREEDY,
> > - GC_URGENT,
> > + GC_URGENT_HIGH,
> > + GC_URGENT_LOW,
> >  };
> >
> >  enum {
> > @@ -1540,6 +1541,7 @@ struct f2fs_sb_info {
> >   unsigned int cur_victim_sec;/* current victim section num 
> > */
> >   unsigned int gc_mode;   /* current GC state */
> >   unsigned int next_victim_seg[2];/* next segment in victim 
> > section */
> > +
> >   /* for skip statistic */
> >   unsigned int atomic_files;  /* # of opened atomic file */
> >   unsigned long long skipped_atomic_files[2]; /* FG_GC and BG_GC */
> > @@ -2480,7 +2482,7 @@ static inline void *f2fs_kmem_cache_alloc(struct 
> > kmem_cache *cachep,
> >
> >  static inline bool is_idle(struct f2fs_sb_info *sbi, int type)
> >  {
> > - if (sbi->gc_mode == GC_URGENT)
> > + if (sbi->gc_mode == GC_URGENT_HIGH)
> >   return true;
> >
> >   if (get_pages(sbi, F2FS_RD_DATA) || get_pages(sbi, F2FS_RD_NODE) ||
> > @@ -2498,6 +2500,10 @@ static inline bool is_idle(struct f2fs_sb_info *sbi, 
> > int type)
> >   atomic_read(_I(sbi)->fcc_info->queued_flush))
> >   return false;
> >
> > + if (sbi->gc_mode == GC_URGENT_LOW &&
> > + (type == DISCARD_TIME || type == GC_TIME))
> > + return true;
> > +
> >   return f2fs_time_over(sbi, type);
> >  }
> >
> > diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> > index 6eec3b2d606d..3b718da69910 100644
> > --- a/fs/f2fs/gc.c
> > +++ b/fs/f2fs/gc.c
> > @@ -82,7 +82,7 @@ static int gc_thread_func(void *data)
> >* invalidated soon after by user update or deletion.
> >* So, I'd like to wait some time to collect dirty segments.
> >*/
> > - if (sbi->gc_mode == GC_URGENT) {
> > + if (sbi->gc_mode == GC_URGENT_HIGH) {
> >   wait_ms = gc_th->urgent_sleep_time;
> >   down_write(>gc_lock);
> >   goto do_gc;
> > @@ -176,7 +176,7 @@ static int select_gc_type(struct f2fs_sb_info *sbi, int 
> > gc_type)
> >   gc_mode = GC_CB;
> >   break;
> >   case GC_IDLE_GREEDY:
> > - case GC_URGENT:
> > + case GC_URGENT_HIGH:
> >   gc_mode = GC_GREEDY;
> >   break;
> >   }
> > @@ -211,7 +211,7 @@ static void select_policy(struct f2fs_sb_info *sbi, int 
> > gc_type,
> >* foreground GC and urgent GC cases.
> >*/
> >   if (gc_type != FG_GC &&
> > - (sbi->gc_mode != GC_URGENT) &&
> > + (sbi->gc_mode != GC_URGENT_HIGH) &&
> >   p->max_search > sbi->max_victim_search)
> >   p->max_search = sbi->max_victim_search;
> >
> > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> > index b45e473508a9..5924b3965ae4 100644
> > --- a/fs/f2fs/segment.c
> > +++ b/fs/f2fs/segment.c
> > @@ -174,7 +174,7 @@ bool f2fs_need_SSR(struct f2fs_sb_info *sbi)
> >
> >   if (f2fs_lfs_mode(sbi))
> >   return false;
> > - if (sbi->gc_mode == GC_URGENT)
> > + if (sbi->gc_mode ==

[PATCH] drm/msm/dpu: fix wrong return value in dpu_encoder_init()

2020-06-30 Thread Tianjia Zhang

A positive value ENOMEM is returned here. I think this is a typo.
It is necessary to return a negative error value.

Signed-off-by: Tianjia Zhang 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
index 63976dcd2ac8..119c89659e71 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
@@ -2183,7 +2183,7 @@ struct drm_encoder *dpu_encoder_init(struct drm_device *dev,
 
dpu_enc = devm_kzalloc(dev->dev, sizeof(*dpu_enc), GFP_KERNEL);
if (!dpu_enc)
-   return ERR_PTR(ENOMEM);
+   return ERR_PTR(-ENOMEM);
 
rc = drm_encoder_init(dev, &dpu_enc->base, &dpu_encoder_funcs,
drm_enc_mode, NULL);
-- 
2.17.1
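
To see why the sign matters in the fix above, here is a small userspace sketch modelled on the kernel's error-pointer convention, in which only the top 4095 values of the address space are treated as encoded errors. The lowercase helpers err_ptr, ptr_err and is_err are simplified stand-ins written for this illustration, not the kernel macros themselves.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095

/* Encode a (negative) errno value in a pointer. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True only for pointers in the last MODEL_MAX_ERRNO bytes of the address space. */
static inline bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-MODEL_MAX_ERRNO);
}
```

In this model err_ptr(-ENOMEM) lands in the error range, while err_ptr(ENOMEM) is just a small non-NULL address that an is_err() check accepts as a valid pointer, so a caller would go on to dereference it.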

Re: [PATCH v2] cpufreq: CPPC: fix some unreasonable codes in cppc_cpufreq_perf_to_khz()

2020-06-30 Thread Viresh Kumar

On 01-07-20, 11:26, Xin Hao wrote:
>  The 'caps' variable has been defined, so there is no need to get
>  'highest_perf' value through 'cpu->perf_caps.highest_perf', you can use
>  'caps->highest_perf' instead.
> 
> Signed-off-by: Xin Hao 
> ---
>  drivers/cpufreq/cppc_cpufreq.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
> index 257d726a4456..444ee76a6bae 100644
> --- a/drivers/cpufreq/cppc_cpufreq.c
> +++ b/drivers/cpufreq/cppc_cpufreq.c
> @@ -161,7 +161,7 @@ static unsigned int cppc_cpufreq_perf_to_khz(struct cppc_cpudata *cpu,
>   if (!max_khz)
>   max_khz = cppc_get_dmi_max_khz();
>   mul = max_khz;
> - div = cpu->perf_caps.highest_perf;
> + div = caps->highest_perf;
>   }
>   return (u64)perf * mul / div;
>  }

Applied. Thanks.

-- 
viresh

Re: [PATCH 6/6] mm: Add memalloc_nowait

2020-06-30 Thread Matthew Wilcox

On Tue, Jun 30, 2020 at 08:34:36AM +0200, Michal Hocko wrote:
> On Mon 29-06-20 22:28:30, Matthew Wilcox wrote:
> [...]
> > The documentation is hard to add a new case to, so I rewrote it.  What
> > do you think?  (Obviously I'll split this out differently for submission;
> > this is just what I have in my tree right now).
> 
> I am fine with your changes. Few notes below.

Thanks!

> > -It turned out though that above approach has led to
> > -abuses when the restricted gfp mask is used "just in case" without a
> > -deeper consideration which leads to problems because an excessive use
> > -of GFP_NOFS/GFP_NOIO can lead to memory over-reclaim or other memory
> > -reclaim issues.
> 
> I believe this is an important part because it shows that new people
> coming to the existing code shouldn't take it as correct and rather
> question it. Also having a clear indication that overuse is causing real
> problems that might be not immediately visible to subsystems outside of
> MM.

It seemed to say a lot of the same things as this paragraph:

+You may notice that quite a few allocations in the existing code specify
+``GFP_NOIO`` or ``GFP_NOFS``. Historically, they were used to prevent
+recursion deadlocks caused by direct memory reclaim calling back into
+the FS or IO paths and blocking on already held resources. Since 4.12
+the preferred way to address this issue is to use the new scope APIs
+described below.

Since this is in core-api/ rather than vm/, I felt that discussion of
the problems that it causes to the mm was a bit too much detail for the
people who would be reading this document.  Maybe I could move that
information into a new Documentation/vm/reclaim.rst file?

Let's see if Our Grumpy Editor has time to give us his advice on this.

> > -FS/IO code then simply calls the appropriate save function before
> > -any critical section with respect to the reclaim is started - e.g.
> > -lock shared with the reclaim context or when a transaction context
> > -nesting would be possible via reclaim.  
> 
> [...]
> 
> > +These functions should be called at the point where any memory allocation
> > +would start to cause problems.  That is, do not simply wrap individual
> > +memory allocation calls which currently use ``GFP_NOFS`` with a pair
> > +of calls to memalloc_nofs_save() and memalloc_nofs_restore().  Instead,
> > +find the lock which is taken that would cause problems if memory reclaim
> > +reentered the filesystem, place a call to memalloc_nofs_save() before it
> > +is acquired and a call to memalloc_nofs_restore() after it is released.
> > +Ideally also add a comment explaining why this lock will be problematic.
> 
> The above text has mentioned the transaction context nesting as well and
> that was a hint by Dave IIRC. It is imho good to have an example of
> other reentrant points than just locks. I believe another useful example
> would be something like loop device which is mixing IO and FS layers but
> I am not familiar with all the details to give you a useful text.

I'll let Mikulas & Dave finish fighting about that before I write any
text mentioning the loop driver.  How about this for mentioning the
filesystem transaction possibility?

@@ -103,12 +103,16 @@ flags specified by any particular call to allocate memory.
 
 These functions should be called at the point where any memory allocation
 would start to cause problems.  That is, do not simply wrap individual
-memory allocation calls which currently use ``GFP_NOFS`` with a pair
-of calls to memalloc_nofs_save() and memalloc_nofs_restore().  Instead,
-find the lock which is taken that would cause problems if memory reclaim
+memory allocation calls which currently use ``GFP_NOFS`` with a pair of
+calls to memalloc_nofs_save() and memalloc_nofs_restore().  Instead, find
+the resource which is acquired that would cause problems if memory reclaim
 reentered the filesystem, place a call to memalloc_nofs_save() before it
 is acquired and a call to memalloc_nofs_restore() after it is released.
 Ideally also add a comment explaining why this lock will be problematic.
+A resource might be a lock which would need to be acquired by an attempt
+to reclaim memory, or it might be starting a transaction that should not
+nest over a memory reclaim transaction.  Deep knowledge of the filesystem
+or driver is often needed to place memory scoping calls correctly.
 
 Please note that the proper pairing of save/restore functions
 allows nesting so it is safe to call memalloc_noio_save() and
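
The save/restore pairing and nesting described in the text above can be modelled in userspace as follows. This is only a sketch of the flag bookkeeping, under the assumption that the scope is tracked as one per-task bit; model_task_flags, MODEL_PF_MEMALLOC_NOFS and the model_* functions are hypothetical names for this illustration, not the kernel API.

```c
#include <stdbool.h>

#define MODEL_PF_MEMALLOC_NOFS 0x1u    /* hypothetical flag bit for this model */

static unsigned int model_task_flags;  /* stands in for the per-task flags word */

/* Enter a NOFS scope and return the previous state of the bit. */
static unsigned int model_nofs_save(void)
{
	unsigned int old = model_task_flags & MODEL_PF_MEMALLOC_NOFS;

	model_task_flags |= MODEL_PF_MEMALLOC_NOFS;
	return old;
}

/* Leave the scope by putting back exactly the state that save returned. */
static void model_nofs_restore(unsigned int old)
{
	model_task_flags = (model_task_flags & ~MODEL_PF_MEMALLOC_NOFS) | old;
}

/* What an allocation inside the scope would consult. */
static bool model_alloc_may_enter_fs(void)
{
	return !(model_task_flags & MODEL_PF_MEMALLOC_NOFS);
}
```

Because restore writes back the saved bit rather than clearing it, an inner save/restore pair leaves an outer scope intact, which is the property that makes nesting safe.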

> > @@ -104,16 +134,19 @@ ARCH_KMALLOC_MINALIGN bytes.  For sizes which are a power of two, the
> >  alignment is also guaranteed to be at least the respective size.
> >  
> >  For large allocations you can use vmalloc() and vzalloc(), or directly
> > -request pages from the page allocator. The memory allocated by `vmalloc`
> > -and related functions is not physically contiguous.
> > +request pages from the page allocator.  The memory allocated by `vmalloc`
> > +and related

[PATCH] ARM: dts: imx6ul: Add ASRC device node

2020-06-30 Thread Shengjiu Wang

Add ASRC device node.

Signed-off-by: Shengjiu Wang 
---
 arch/arm/boot/dts/imx6ul.dtsi | 25 +
 1 file changed, 25 insertions(+)

diff --git a/arch/arm/boot/dts/imx6ul.dtsi b/arch/arm/boot/dts/imx6ul.dtsi
index 5379a03391bd..d10d5eb55a88 100644
--- a/arch/arm/boot/dts/imx6ul.dtsi
+++ b/arch/arm/boot/dts/imx6ul.dtsi
@@ -351,6 +351,31 @@
dma-names = "rx", "tx";
status = "disabled";
};
+
+   asrc: asrc@2034000 {
+   compatible = "fsl,imx6ul-asrc", 
"fsl,imx53-asrc";
+   reg = <0x2034000 0x4000>;
+   interrupts = ;
+   clocks = < IMX6UL_CLK_ASRC_IPG>,
+   < IMX6UL_CLK_ASRC_MEM>, 
< 0>,
+   < 0>, < 0>, < 
0>, < 0>,
+   < 0>, < 0>, < 
0>, < 0>,
+   < 0>, < 0>, < 
0>, < 0>,
+   < IMX6UL_CLK_SPDIF>, 
< 0>, < 0>,
+   < IMX6UL_CLK_SPBA>;
+   clock-names = "mem", "ipg", "asrck_0",
+   "asrck_1", "asrck_2", 
"asrck_3", "asrck_4",
+   "asrck_5", "asrck_6", 
"asrck_7", "asrck_8",
+   "asrck_9", "asrck_a", 
"asrck_b", "asrck_c",
+   "asrck_d", "asrck_e", 
"asrck_f", "spba";
+   dmas = < 17 23 1>, < 18 23 
1>, < 19 23 1>,
+   < 20 23 1>, < 21 23 
1>, < 22 23 1>;
+   dma-names = "rxa", "rxb", "rxc",
+   "txa", "txb", "txc";
+   fsl,asrc-rate  = <48000>;
+   fsl,asrc-width = <16>;
+   status = "okay";
+   };
};
 
tsc: tsc@204 {
-- 
2.21.0

Re: [regression] TCP_MD5SIG on established sockets

2020-06-30 Thread Herbert Xu

On Tue, Jun 30, 2020 at 08:36:51PM -0700, Eric Dumazet wrote:
>
> If I knew so many people were excited about TCP / MD5, I would have
> posted all my patches on lkml ;)
> 
> Without the smp_wmb() we would still need something to prevent KMSAN
> from detecting that we read uninitialized bytes,
> if key->keylen is increased.  (initial content of key->key[] is garbage)
> 
> Something like this :

LGTM.  Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [PATCH v2 12/18] media: mtk-vcodec: venc: set OUTPUT buffers field to V4L2_FIELD_NONE

2020-06-30 Thread Tiffany Lin

On Fri, 2020-06-26 at 17:04 +0900, Alexandre Courbot wrote:
> A default value of 0 means V4L2_FIELD_ANY, which is not correct.
> Reported by v4l2-compliance.
> 

Acked-by: Tiffany Lin 


> Signed-off-by: Alexandre Courbot 
> ---
>  drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c 
> b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
> index f833aee4a06f..1a981d842c19 100644
> --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
> +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
> @@ -893,8 +893,17 @@ static void vb2ops_venc_stop_streaming(struct vb2_queue 
> *q)
>   ctx->state = MTK_STATE_FREE;
>  }
>  
> +static int vb2ops_venc_buf_out_validate(struct vb2_buffer *vb)
> +{
> + struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
> +
> + vbuf->field = V4L2_FIELD_NONE;
> + return 0;
> +}
> +
>  static const struct vb2_ops mtk_venc_vb2_ops = {
>   .queue_setup= vb2ops_venc_queue_setup,
> + .buf_out_validate   = vb2ops_venc_buf_out_validate,
>   .buf_prepare= vb2ops_venc_buf_prepare,
>   .buf_queue  = vb2ops_venc_buf_queue,
>   .wait_prepare   = vb2_ops_wait_prepare,

Re: [EXT] Re: [PATCH v4 2/2] ARM: imx6plus: enable internal routing of clk_enet_ref where possible

2020-06-30 Thread Fabio Estevam

On Wed, Jul 1, 2020 at 12:42 AM Andy Duan  wrote:

> It doesn't break old dtbs, and doesn't break imx6q/dl/solo.

Well, it breaks imx6qp as I said multiple times.

It does not break in your case because you are using NXP U-Boot.

You cannot assume people are using NXP U-Boot.

RE: [EXT] Re: [PATCH v4 2/2] ARM: imx6plus: enable internal routing of clk_enet_ref where possible

2020-06-30 Thread Andy Duan

From: Fabio Estevam  Sent: Wednesday, July 1, 2020 11:39 AM 
> Hi Andy,
> 
> On Wed, Jul 1, 2020 at 12:18 AM Andy Duan  wrote:
> 
> > --- a/arch/arm/boot/dts/imx6qdl-sabresd.dtsi
> > +++ b/arch/arm/boot/dts/imx6qdl-sabresd.dtsi
> > @@ -202,6 +202,8 @@
> >   {
> > pinctrl-names = "default";
> > pinctrl-0 = <_enet>;
> > +   assigned-clocks = < IMX6QDL_CLK_ENET_REF>;
> > +   assigned-clock-rates = <12500>;
> 
> I don't think this is an acceptable solution as it breaks old dtb's.

It doesn't break old dtbs, and doesn't break imx6q/dl/solo.

Re: [EXT] Re: [PATCH v4 2/2] ARM: imx6plus: enable internal routing of clk_enet_ref where possible

2020-06-30 Thread Fabio Estevam

Hi Andy,

On Wed, Jul 1, 2020 at 12:18 AM Andy Duan  wrote:

> --- a/arch/arm/boot/dts/imx6qdl-sabresd.dtsi
> +++ b/arch/arm/boot/dts/imx6qdl-sabresd.dtsi
> @@ -202,6 +202,8 @@
>   {
> pinctrl-names = "default";
> pinctrl-0 = <_enet>;
> +   assigned-clocks = < IMX6QDL_CLK_ENET_REF>;
> +   assigned-clock-rates = <12500>;

I don't think this is an acceptable solution as it breaks old dtb's.

[tip:irq/urgent] BUILD SUCCESS 98817a84ff1c755c347ac633ff017a623a631fad

2020-06-30 Thread kernel test robot

tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/urgent
branch HEAD: 98817a84ff1c755c347ac633ff017a623a631fad  Merge tag 'irqchip-fixes-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent

elapsed time: 1005m

configs tested: 120
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

arm          defconfig
arm          allyesconfig
arm          allmodconfig
arm          allnoconfig
arm64        allyesconfig
arm64        defconfig
arm64        allmodconfig
arm64        allnoconfig
m68k         alldefconfig
arm          socfpga_defconfig
mips         allnoconfig
arm          s3c2410_defconfig
openrisc     simple_smp_defconfig
arm          vt8500_v6_v7_defconfig
arm          corgi_defconfig
m68k         m5307c3_defconfig
powerpc      g5_defconfig
arm          u8500_defconfig
sparc        sparc32_defconfig
i386         allnoconfig
i386         allyesconfig
i386         defconfig
i386         debian-10.3
ia64         allmodconfig
ia64         defconfig
ia64         allnoconfig
ia64         allyesconfig
m68k         allmodconfig
m68k         allnoconfig
m68k         sun3_defconfig
m68k         defconfig
m68k         allyesconfig
nios2        defconfig
nios2        allyesconfig
openrisc     defconfig
c6x          allyesconfig
c6x          allnoconfig
openrisc     allyesconfig
nds32        defconfig
nds32        allnoconfig
csky         allyesconfig
csky         defconfig
alpha        defconfig
alpha        allyesconfig
xtensa       allyesconfig
h8300        allyesconfig
h8300        allmodconfig
xtensa       defconfig
arc          defconfig
arc          allyesconfig
sh           allmodconfig
sh           allnoconfig
microblaze   allnoconfig
mips         allyesconfig
mips         allmodconfig
parisc       allnoconfig
parisc       defconfig
parisc       allyesconfig
parisc       allmodconfig
powerpc      defconfig
powerpc      allyesconfig
powerpc      rhel-kconfig
powerpc      allmodconfig
powerpc      allnoconfig
i386 randconfig-a001-20200630
i386 randconfig-a003-20200630
i386 randconfig-a002-20200630
i386 randconfig-a004-20200630
i386 randconfig-a005-20200630
i386 randconfig-a006-20200630
x86_64   randconfig-a011-20200630
x86_64   randconfig-a014-20200630
x86_64   randconfig-a013-20200630
x86_64   randconfig-a015-20200630
x86_64   randconfig-a016-20200630
x86_64   randconfig-a012-20200630
x86_64   randconfig-a012-20200701
x86_64   randconfig-a016-20200701
x86_64   randconfig-a014-20200701
x86_64   randconfig-a011-20200701
x86_64   randconfig-a015-20200701
x86_64   randconfig-a013-20200701
i386 randconfig-a011-20200630
i386 randconfig-a016-20200630
i386 randconfig-a015-20200630
i386 randconfig-a012-20200630
i386 randconfig-a014-20200630
i386 randconfig-a013-20200630
i386 randconfig-a011-20200701
i386 randconfig-a015-20200701
i386 randconfig-a014-20200701
i386 randconfig-a016-20200701
i386 randconfig-a012-20200701
i386 randconfig-a013-20200701
riscv        allyesconfig
riscv        allnoconfig
riscv        defconfig
riscv        allmodconfig
s390         allyesconfig
s390

Re: [regression] TCP_MD5SIG on established sockets

2020-06-30 Thread Eric Dumazet

On Tue, Jun 30, 2020 at 7:59 PM Herbert Xu  wrote:
>
> On Tue, Jun 30, 2020 at 07:30:43PM -0700, Eric Dumazet wrote:
> >
> > I made this clear in the changelog, do we want comments all over the place?
> > Do not get me wrong, we had this bug for years and suddenly this is a
> > big deal...
>
> I thought you were adding a new pair of smp_rmb/smp_wmb.  If they
> already exist in the code then I agree it's not a big deal.  But
> adding a new pair of bogus smp_Xmb's is bad for maintenance.
>

If I knew so many people were excited about TCP / MD5, I would have
posted all my patches on lkml ;)

Without the smp_wmb() we would still need something to prevent KMSAN
from detecting that we read uninitialized bytes,
if key->keylen is increased.  (initial content of key->key[] is garbage)

Something like this :

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index f111660453241692a17c881dd6dc2910a1236263..c3af8180c7049d5c4987bf5c67e4aff2ed6967c9 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4033,11 +4033,9 @@ EXPORT_SYMBOL(tcp_md5_hash_skb_data);

 int tcp_md5_hash_key(struct tcp_md5sig_pool *hp, const struct tcp_md5sig_key *key)
 {
-   u8 keylen = key->keylen;
+   u8 keylen = READ_ONCE(key->keylen); /* paired with WRITE_ONCE() in tcp_md5_do_add */
struct scatterlist sg;

-   smp_rmb(); /* paired with smp_wmb() in tcp_md5_do_add() */
-
sg_init_one(&sg, key->key, keylen);
ahash_request_set_crypt(hp->md5_req, &sg, NULL, keylen);
return crypto_ahash_update(hp->md5_req);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 99916fcc15ca0be12c2c133ff40516f79e6fdf7f..0d08e0134335a21d23702e6a5c24a0f2b3c61c6f 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1114,9 +1114,13 @@ int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
/* Pre-existing entry - just update that one. */
memcpy(key->key, newkey, newkeylen);

-   smp_wmb(); /* pairs with smp_rmb() in tcp_md5_hash_key() */
+   /* Pairs with READ_ONCE() in tcp_md5_hash_key().
+* Also note that a reader could catch new key->keylen value
+* but old key->key[], this is the reason we use __GFP_ZERO
+* at sock_kmalloc() time below these lines.
+*/
+   WRITE_ONCE(key->keylen, newkeylen);

-   key->keylen = newkeylen;
return 0;
}

@@ -1132,7 +1136,7 @@ int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
rcu_assign_pointer(tp->md5sig_info, md5sig);
}

-   key = sock_kmalloc(sk, sizeof(*key), gfp);
+   key = sock_kmalloc(sk, sizeof(*key), gfp | __GFP_ZERO);
if (!key)
return -ENOMEM;
if (!tcp_alloc_md5sig_pool()) {
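
The idea behind the patch above can be modelled in userspace roughly as follows. This is a sketch under stated assumptions and not the kernel code: C11 relaxed atomics stand in for READ_ONCE()/WRITE_ONCE(), calloc() stands in for __GFP_ZERO, and md5_key_model, MODEL_KEYLEN_MAX and the model_* helpers are hypothetical names. It shows the bookkeeping only; it is exercised single-threaded and says nothing about which key bytes a concurrent reader observes.

```c
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_KEYLEN_MAX 80  /* hypothetical capacity for this model */

struct md5_key_model {
	_Atomic unsigned char keylen;
	unsigned char key[MODEL_KEYLEN_MAX];
};

/* Zeroed allocation: no byte of key[] is ever uninitialized. */
static struct md5_key_model *model_key_alloc(void)
{
	return calloc(1, sizeof(struct md5_key_model));
}

/* Writer: copy the bytes, then publish the length with a single store. */
static void model_key_update(struct md5_key_model *k,
			     const unsigned char *newkey, unsigned char newlen)
{
	if (newlen > MODEL_KEYLEN_MAX)
		return;
	memcpy(k->key, newkey, newlen);
	atomic_store_explicit(&k->keylen, newlen, memory_order_relaxed);
}

/* Reader: load the length exactly once and use only that local copy. */
static size_t model_key_snapshot(struct md5_key_model *k, unsigned char *out)
{
	unsigned char len = atomic_load_explicit(&k->keylen,
						 memory_order_relaxed);

	memcpy(out, k->key, len);
	return len;
}
```

Whatever length the reader observes, the bytes it copies are either key material or zeros from the allocation, which is the property the __GFP_ZERO change is after.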

Re: [f2fs-dev] [PATCH] f2fs: add GC_URGENT_LOW mode in gc_urgent

2020-06-30 Thread Chao Yu

On 2020/6/30 8:54, Daeho Jeong wrote:
> From: Daeho Jeong 
> 
> Added a new gc_urgent mode, GC_URGENT_LOW, in which mode
> F2FS will lower the bar of checking idle in order to
> process outstanding discard commands and GC a little bit
> aggressively.
> 
> Signed-off-by: Daeho Jeong 
> ---
>  Documentation/ABI/testing/sysfs-fs-f2fs |  4 +++-
>  fs/f2fs/f2fs.h  | 10 --
>  fs/f2fs/gc.c|  6 +++---
>  fs/f2fs/segment.c   |  4 ++--
>  fs/f2fs/sysfs.c |  6 --
>  5 files changed, 20 insertions(+), 10 deletions(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs 
> b/Documentation/ABI/testing/sysfs-fs-f2fs
> index 4bb93a06d8ab..7f730c4c8df2 100644
> --- a/Documentation/ABI/testing/sysfs-fs-f2fs
> +++ b/Documentation/ABI/testing/sysfs-fs-f2fs
> @@ -229,7 +229,9 @@ Date: August 2017
>  Contact: "Jaegeuk Kim" 
>  Description: Do background GC agressively when set. When gc_urgent = 1,
>   background thread starts to do GC by given gc_urgent_sleep_time
> - interval. It is set to 0 by default.
> + interval. When gc_urgent = 2, F2FS will lower the bar of
> + checking idle in order to process outstanding discard commands
> + and GC a little bit aggressively. It is set to 0 by default.
>  
>  What:/sys/fs/f2fs//gc_urgent_sleep_time
>  Date:August 2017
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index e6e47618a357..4b28fd42fdbc 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -1283,7 +1283,8 @@ enum {
>   GC_NORMAL,
>   GC_IDLE_CB,
>   GC_IDLE_GREEDY,
> - GC_URGENT,
> + GC_URGENT_HIGH,
> + GC_URGENT_LOW,
>  };
>  
>  enum {
> @@ -1540,6 +1541,7 @@ struct f2fs_sb_info {
>   unsigned int cur_victim_sec;/* current victim section num */
>   unsigned int gc_mode;   /* current GC state */
>   unsigned int next_victim_seg[2];/* next segment in victim 
> section */
> +
>   /* for skip statistic */
>   unsigned int atomic_files;  /* # of opened atomic file */
>   unsigned long long skipped_atomic_files[2]; /* FG_GC and BG_GC */
> @@ -2480,7 +2482,7 @@ static inline void *f2fs_kmem_cache_alloc(struct 
> kmem_cache *cachep,
>  
>  static inline bool is_idle(struct f2fs_sb_info *sbi, int type)
>  {
> - if (sbi->gc_mode == GC_URGENT)
> + if (sbi->gc_mode == GC_URGENT_HIGH)
>   return true;
>  
>   if (get_pages(sbi, F2FS_RD_DATA) || get_pages(sbi, F2FS_RD_NODE) ||
> @@ -2498,6 +2500,10 @@ static inline bool is_idle(struct f2fs_sb_info *sbi, 
> int type)
>   atomic_read(&SM_I(sbi)->fcc_info->queued_flush))
>   return false;
>  
> + if (sbi->gc_mode == GC_URGENT_LOW &&
> + (type == DISCARD_TIME || type == GC_TIME))
> + return true;
> +
>   return f2fs_time_over(sbi, type);
>  }
>  
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index 6eec3b2d606d..3b718da69910 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -82,7 +82,7 @@ static int gc_thread_func(void *data)
>* invalidated soon after by user update or deletion.
>* So, I'd like to wait some time to collect dirty segments.
>*/
> - if (sbi->gc_mode == GC_URGENT) {
> + if (sbi->gc_mode == GC_URGENT_HIGH) {
>   wait_ms = gc_th->urgent_sleep_time;
>   down_write(&sbi->gc_lock);
>   goto do_gc;
> @@ -176,7 +176,7 @@ static int select_gc_type(struct f2fs_sb_info *sbi, int 
> gc_type)
>   gc_mode = GC_CB;
>   break;
>   case GC_IDLE_GREEDY:
> - case GC_URGENT:
> + case GC_URGENT_HIGH:
>   gc_mode = GC_GREEDY;
>   break;
>   }
> @@ -211,7 +211,7 @@ static void select_policy(struct f2fs_sb_info *sbi, int 
> gc_type,
>* foreground GC and urgent GC cases.
>*/
>   if (gc_type != FG_GC &&
> - (sbi->gc_mode != GC_URGENT) &&
> + (sbi->gc_mode != GC_URGENT_HIGH) &&
>   p->max_search > sbi->max_victim_search)
>   p->max_search = sbi->max_victim_search;
>  
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index b45e473508a9..5924b3965ae4 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -174,7 +174,7 @@ bool f2fs_need_SSR(struct f2fs_sb_info *sbi)
>  
>   if (f2fs_lfs_mode(sbi))
>   return false;
> - if (sbi->gc_mode == GC_URGENT)
> + if (sbi->gc_mode == GC_URGENT_HIGH)
>   return true;
>   if (unlikely(is_sbi_flag_set(sbi, SBI_CP_DISABLED)))
>   return true;
> @@ -1759,7 +1759,7 @@ static int issue_discard_thread(void *data)
>   continue;
>   }
>  
> - if (sbi->gc_mode == GC_URGENT)
> +
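
A standalone sketch of the idle decision after this patch, as far as it can be read from the is_idle() hunk quoted above. The enums are hypothetical stand-ins (only the names that appear in the patch are taken from it), and the in-flight I/O checks and f2fs_time_over() are collapsed into two boolean parameters for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel enums touched by the patch. */
enum gc_mode_model {
	GC_NORMAL, GC_IDLE_CB, GC_IDLE_GREEDY, GC_URGENT_HIGH, GC_URGENT_LOW
};
enum idle_type_model { CP_TIME, REQ_TIME, DISCARD_TIME, GC_TIME };

/*
 * io_in_flight models the get_pages()/queued-command checks,
 * time_over models the result of f2fs_time_over().
 */
static bool is_idle_model(enum gc_mode_model mode, enum idle_type_model type,
			  bool io_in_flight, bool time_over)
{
	/* GC_URGENT_HIGH: always treated as idle, even with I/O pending. */
	if (mode == GC_URGENT_HIGH)
		return true;

	if (io_in_flight)
		return false;

	/* GC_URGENT_LOW: skip the timeout only for discard and GC. */
	if (mode == GC_URGENT_LOW &&
	    (type == DISCARD_TIME || type == GC_TIME))
		return true;

	return time_over;
}
```

The ordering is the point of the sketch: the low mode still yields to in-flight I/O, while the high mode does not.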

Re: [PATCH v2 11/18] media: mtk-vcodec: venc support MIN_OUTPUT_BUFFERS control

2020-06-30 Thread Tiffany Lin

On Fri, 2020-06-26 at 17:04 +0900, Alexandre Courbot wrote:
> This control is required by v4l2-compliance for encoders. A value of 1
> should be suitable for all scenarios.
> 
Acked-by: Tiffany Lin 

> Signed-off-by: Alexandre Courbot 
> ---
>  drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c 
> b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
> index f2ba19c32400..f833aee4a06f 100644
> --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
> +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
> @@ -1206,6 +1206,8 @@ int mtk_vcodec_enc_ctrls_setup(struct mtk_vcodec_ctx 
> *ctx)
>  
>   v4l2_ctrl_handler_init(handler, MTK_MAX_CTRLS_HINT);
>  
> + v4l2_ctrl_new_std(handler, ops, V4L2_CID_MIN_BUFFERS_FOR_OUTPUT,
> +   1, 1, 1, 1);
>   v4l2_ctrl_new_std(handler, ops, V4L2_CID_MPEG_VIDEO_BITRATE,
> ctx->dev->venc_pdata->min_bitrate,
> ctx->dev->venc_pdata->max_bitrate, 1, 400);

[PATCH] kbuild: always create directories of targets

2020-06-30 Thread Masahiro Yamada

Currently, the directories of objects are automatically created
only for O= builds.

It should not hurt to cater to this for in-tree builds too.

Signed-off-by: Masahiro Yamada 
---

 scripts/Makefile.build | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index ca24c3077fef..98013fcde935 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -517,15 +517,13 @@ existing-targets := $(wildcard $(sort $(targets)))
 
 -include $(foreach f,$(existing-targets),$(dir $(f)).$(notdir $(f)).cmd)
 
-ifdef building_out_of_srctree
 # Create directories for object files if they do not exist
-obj-dirs := $(sort $(obj) $(patsubst %/,%, $(dir $(targets))))
+obj-dirs := $(sort $(patsubst %/,%, $(dir $(targets))))
 # If targets exist, their directories apparently exist. Skip mkdir.
 existing-dirs := $(sort $(patsubst %/,%, $(dir $(existing-targets))))
 obj-dirs := $(strip $(filter-out $(existing-dirs), $(obj-dirs)))
 ifneq ($(obj-dirs),)
 $(shell mkdir -p $(obj-dirs))
 endif
-endif
 
 .PHONY: $(PHONY)
-- 
2.25.1

Re: [PATCH v2 10/18] Revert "media: mtk-vcodec: Remove extra area allocation in an input buffer on encoding"

2020-06-30 Thread Tiffany Lin

On Fri, 2020-06-26 at 17:04 +0900, Alexandre Courbot wrote:
> This reverts commit 81735ecb62f882853a37a8c157407ec4aed44fd0.
> 
> The hardware needs data to follow the previous alignment, so this extra
> space was not superfluous after all. Besides, this also made
> v4l2-compliance's G_FMT and S_FMT tests regress.
> 
Acked-by: Tiffany Lin 

> Signed-off-by: Alexandre Courbot 
> ---
>  drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c 
> b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
> index 05743a745a11..f2ba19c32400 100644
> --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
> +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
> @@ -299,12 +299,14 @@ static int vidioc_try_fmt(struct v4l2_format *f,
>  
>   pix_fmt_mp->num_planes = fmt->num_planes;
>   pix_fmt_mp->plane_fmt[0].sizeimage =
> - pix_fmt_mp->width * pix_fmt_mp->height;
> + pix_fmt_mp->width * pix_fmt_mp->height +
> + ((ALIGN(pix_fmt_mp->width, 16) * 2) * 16);
>   pix_fmt_mp->plane_fmt[0].bytesperline = pix_fmt_mp->width;
>  
>   if (pix_fmt_mp->num_planes == 2) {
>   pix_fmt_mp->plane_fmt[1].sizeimage =
> - (pix_fmt_mp->width * pix_fmt_mp->height) / 2;
> + (pix_fmt_mp->width * pix_fmt_mp->height) / 2 +
> + (ALIGN(pix_fmt_mp->width, 16) * 16);
>   pix_fmt_mp->plane_fmt[2].sizeimage = 0;
>   pix_fmt_mp->plane_fmt[1].bytesperline =
>   pix_fmt_mp->width;
> @@ -312,7 +314,8 @@ static int vidioc_try_fmt(struct v4l2_format *f,
>   } else if (pix_fmt_mp->num_planes == 3) {
>   pix_fmt_mp->plane_fmt[1].sizeimage =
>   pix_fmt_mp->plane_fmt[2].sizeimage =
> - (pix_fmt_mp->width * pix_fmt_mp->height) / 4;
> + (pix_fmt_mp->width * pix_fmt_mp->height) / 4 +
> + ((ALIGN(pix_fmt_mp->width, 16) / 2) * 16);
>   pix_fmt_mp->plane_fmt[1].bytesperline =
>   pix_fmt_mp->plane_fmt[2].bytesperline =
>   pix_fmt_mp->width / 2;
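
A worked sketch of the restored plane-size arithmetic, written as a standalone helper. The function name and signature are hypothetical; the formulas are copied from the '+' lines of the hunks above, with ALIGN() replaced by a local round-up macro.

```c
#include <stdint.h>

/* Round x up to a multiple of a (local stand-in for the kernel's ALIGN()). */
#define ALIGN_UP(x, a) (((x) + (a) - 1) / (a) * (a))

/* Hypothetical helper: size in bytes of one plane, extra area included. */
static uint32_t plane_sizeimage(uint32_t width, uint32_t height,
				unsigned int num_planes, unsigned int plane)
{
	uint32_t luma = width * height;
	uint32_t aligned = ALIGN_UP(width, 16u);

	if (plane == 0)
		return luma + aligned * 2 * 16;
	if (num_planes == 2 && plane == 1)
		return luma / 2 + aligned * 16;
	if (num_planes == 3 && (plane == 1 || plane == 2))
		return luma / 4 + (aligned / 2) * 16;
	return 0; /* unused plane */
}
```

For a 1920x1080 frame this gives 2073600 + 61440 = 2135040 bytes for plane 0, 1067520 bytes for the second plane of a two-plane format, and 533760 bytes for each chroma plane of a three-plane format.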

[PATCH] imx: Provide correct number of resources when registering gpio devices

2020-06-30 Thread Guenter Roeck

Since commit a85a6c86c25be ("driver core: platform: Clarify that IRQ 0 is
invalid"), the kernel is a bit touchy when it encounters interrupt 0.
As a result, there are lots of warnings such as the following when booting
systems such as 'kzm'.

WARNING: CPU: 0 PID: 1 at drivers/base/platform.c:224 
platform_get_irq_optional+0x118/0x128
0 is an invalid IRQ number
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3 #1
Hardware name: Kyoto Microcomputer Co., Ltd. KZM-ARM11-01
[] (unwind_backtrace) from [] (show_stack+0x10/0x14)
[] (show_stack) from [] (dump_stack+0xe8/0x120)
[] (dump_stack) from [] (__warn+0xe4/0x108)
[] (__warn) from [] (warn_slowpath_fmt+0x74/0xbc)
[] (warn_slowpath_fmt) from [] 
(platform_get_irq_optional+0x118/0x128)
[] (platform_get_irq_optional) from [] 
(platform_irq_count+0x20/0x3c)
[] (platform_irq_count) from [] (mxc_gpio_probe+0x8c/0x494)
[] (mxc_gpio_probe) from [] (platform_drv_probe+0x48/0x98)
[] (platform_drv_probe) from [] (really_probe+0x214/0x344)
[] (really_probe) from [] (driver_probe_device+0x58/0xb4)
[] (driver_probe_device) from [] 
(device_driver_attach+0x58/0x60)
[] (device_driver_attach) from [] 
(__driver_attach+0x84/0xc0)
[] (__driver_attach) from [] (bus_for_each_dev+0x78/0xb8)
[] (bus_for_each_dev) from [] (bus_add_driver+0x154/0x1e0)
[] (bus_add_driver) from [] (driver_register+0x74/0x108)
[] (driver_register) from [] (do_one_initcall+0x80/0x3b4)
[] (do_one_initcall) from [] 
(kernel_init_freeable+0x170/0x208)
[] (kernel_init_freeable) from [] (kernel_init+0x8/0x11c)
[] (kernel_init) from [] (ret_from_fork+0x14/0x20)

As it turns out, mxc_register_gpio() is a bit lax when setting the
number of resources: it registers a resource with interrupt 0 when in
reality there is no such interrupt. Fix the problem by not declaring
the second interrupt resource if there is no second interrupt.

Fixes: a85a6c86c25be ("driver core: platform: Clarify that IRQ 0 is invalid")
Cc: Bjorn Helgaas 
Signed-off-by: Guenter Roeck 
---
 arch/arm/mach-imx/devices/platform-gpio-mxc.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm/mach-imx/devices/platform-gpio-mxc.c 
b/arch/arm/mach-imx/devices/platform-gpio-mxc.c
index 78628ef12672..355de845224c 100644
--- a/arch/arm/mach-imx/devices/platform-gpio-mxc.c
+++ b/arch/arm/mach-imx/devices/platform-gpio-mxc.c
@@ -24,7 +24,8 @@ struct platform_device *__init mxc_register_gpio(char *name, 
int id,
.flags = IORESOURCE_IRQ,
},
};
+   unsigned int nres;
 
-   return platform_device_register_resndata(&mxc_aips_bus,
-   name, id, res, ARRAY_SIZE(res), NULL, 0);
+   nres = irq_high ? ARRAY_SIZE(res) : ARRAY_SIZE(res) - 1;
+   return platform_device_register_resndata(&mxc_aips_bus, name, id, res, nres, NULL, 0);
 }
-- 
2.17.1
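
A small userspace sketch of the resource-count logic, assuming the table holds one memory entry followed by two interrupt entries with the optional high interrupt last. The struct, the helper name and the placeholder address are hypothetical; only the 'nres' expression comes from the patch.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, simplified stand-in for struct resource. */
struct res_model {
	unsigned long start;
	int is_irq;
};

/* Returns how many leading entries of the table should be registered. */
static size_t gpio_num_resources(int irq, int irq_high)
{
	const struct res_model res[] = {
		{ .start = 0x1000, .is_irq = 0 }, /* MMIO window, placeholder */
		{ .start = (unsigned long)irq, .is_irq = 1 },
		{ .start = (unsigned long)irq_high, .is_irq = 1 }, /* optional */
	};

	/* Drop the trailing entry when there is no second interrupt. */
	return irq_high ? ARRAY_SIZE(res) : ARRAY_SIZE(res) - 1;
}
```

This only works because the optional entry is the last one in the table; trimming the count cannot skip an entry in the middle.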

[PATCH v2] cpufreq: CPPC: fix some unreasonable codes in cppc_cpufreq_perf_to_khz()

2020-06-30 Thread Xin Hao

 The 'caps' variable has already been defined, so there is no need to get
 the 'highest_perf' value through 'cpu->perf_caps.highest_perf'; use
 'caps->highest_perf' instead.

Signed-off-by: Xin Hao 
---
 drivers/cpufreq/cppc_cpufreq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index 257d726a4456..444ee76a6bae 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -161,7 +161,7 @@ static unsigned int cppc_cpufreq_perf_to_khz(struct 
cppc_cpudata *cpu,
if (!max_khz)
max_khz = cppc_get_dmi_max_khz();
mul = max_khz;
-   div = cpu->perf_caps.highest_perf;
+   div = caps->highest_perf;
}
return (u64)perf * mul / div;
 }
--
2.24.1

Re: [PATCH] KVM: x86: drop erroneous mmu_check_root() from fast_pgd_switch()

2020-06-30 Thread Junaid Shahid


On 6/30/20 3:07 AM, Vitaly Kuznetsov wrote:

Undesired triple fault gets injected to L1 guest on SVM when L2 is
launched with certain CR3 values. It seems the mmu_check_root()
check in fast_pgd_switch() is wrong: first of all we don't know
if 'new_pgd' is a GPA or a nested GPA and, in case it is a nested
GPA, we can't check it with kvm_is_visible_gfn().

The problematic code path is:
nested_svm_vmrun()
   ...
   nested_prepare_vmcb_save()
 kvm_set_cr3(..., nested_vmcb->save.cr3)
   kvm_mmu_new_pgd()
 ...
 mmu_check_root() -> TRIPLE FAULT

The mmu_check_root() check in fast_pgd_switch() seems to be
superfluous even for non-nested case: when GPA is outside of the
visible range cached_root_available() will fail for non-direct
roots (as we can't have a matching one on the list) and we don't
seem to care for direct ones.

Also, raising #TF immediately when a non-existent GFN is written to CR3
doesn't seem to match architecture behavior.

Fixes: 7c390d350f8b ("kvm: x86: Add fast CR3 switch code path")
Signed-off-by: Vitaly Kuznetsov 
---
- The patch fixes the immediate issue and doesn't seem to break any
   tests even with shadow PT but I'm not sure I properly understood
   why the check was there in the first place. Please review!
---
  arch/x86/kvm/mmu/mmu.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 76817d13c86e..286c74d2ae8d 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4277,8 +4277,7 @@ static bool fast_pgd_switch(struct kvm_vcpu *vcpu, gpa_t 
new_pgd,
 */
if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL &&
mmu->root_level >= PT64_ROOT_4LEVEL)
-   return !mmu_check_root(vcpu, new_pgd >> PAGE_SHIFT) &&
-  cached_root_available(vcpu, new_pgd, new_role);
+   return cached_root_available(vcpu, new_pgd, new_role);
  
  	return false;

  }



The check does seem superfluous, so should be ok to remove. Though I think that 
fast_pgd_switch() really should be getting only L1 GPAs. Otherwise, there could 
be confusion between the same GPAs from two different L2s.

IIUC, at least on Intel, only L1 CR3s (including shadow L1 CR3s for L2) or L1 EPTPs should 
get to fast_pgd_switch(). But I am not familiar enough with SVM to see why an L2 GPA would 
end up there. From a cursory look, it seems that until "978ce5837c7e KVM: SVM: always 
update CR3 in VMCB", enter_svm_guest_mode() was calling kvm_set_cr3() only when using 
shadow paging, in which case I assume that nested_vmcb->save.cr3 would have been an L1 
CR3 shadowing the L2 CR3, correct? But now kvm_set_cr3() is called even when not using 
shadow paging, which I suppose is how we are getting the L2 CR3. Should we skip calling 
fast_pgd_switch() in that particular case?

Thanks,
Junaid

Re: [RFC PATCH 1/6] dt-bindings: riscv: Add YAML documentation for PMU

2020-06-30 Thread Zong Li

On Mon, Jun 29, 2020 at 4:31 PM Anup Patel  wrote:
>
> On Mon, Jun 29, 2020 at 12:06 PM Zong Li  wrote:
> >
> > On Mon, Jun 29, 2020 at 12:38 PM Anup Patel  wrote:
> > >
> > > On Mon, Jun 29, 2020 at 9:58 AM Zong Li  wrote:
> > > >
> > > > On Mon, Jun 29, 2020 at 12:09 PM Anup Patel  wrote:
> > > > >
> > > > > On Mon, Jun 29, 2020 at 8:49 AM Zong Li  wrote:
> > > > > >
> > > > > > Add device tree bindings for performance monitor unit. And it 
> > > > > > passes the
> > > > > > dt_binding_check verification.
> > > > > >
> > > > > > Signed-off-by: Zong Li 
> > > > > > ---
> > > > > >  .../devicetree/bindings/riscv/pmu.yaml| 59 
> > > > > > +++
> > > > > >  1 file changed, 59 insertions(+)
> > > > > >  create mode 100644 Documentation/devicetree/bindings/riscv/pmu.yaml
> > > > > >
> > > > > > diff --git a/Documentation/devicetree/bindings/riscv/pmu.yaml 
> > > > > > b/Documentation/devicetree/bindings/riscv/pmu.yaml
> > > > > > new file mode 100644
> > > > > > > index 000000000000..f55ccbc6c685
> > > > > > --- /dev/null
> > > > > > +++ b/Documentation/devicetree/bindings/riscv/pmu.yaml
> > > > > > @@ -0,0 +1,59 @@
> > > > > > +# SPDX-License-Identifier: GPL-2.0
> > > > > > +%YAML 1.2
> > > > > > +---
> > > > > > +$id: http://devicetree.org/schemas/riscv/pmu.yaml#
> > > > > > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > > > > > +
> > > > > > +title: RISC-V Performance Monitor Units
> > > > > > +
> > > > > > +maintainers:
> > > > > > +  - Zong Li 
> > > > > > +  - Paul Walmsley 
> > > > > > +  - Palmer Dabbelt 
> > > > > > +
> > > > > > +properties:
> > > > > > +  compatible:
> > > > > > +items:
> > > > > > +  - const: riscv,pmu
> > > > > > +
> > > > > > +  riscv,width-base-cntr:
> > > > > > +description: The width of cycle and instret CSRs.
> > > > > > +$ref: /schemas/types.yaml#/definitions/uint32
> > > > > > +
> > > > > > +  riscv,width-event-cntr:
> > > > > > +description: The width of hpmcounter CSRs.
> > > > > > +$ref: /schemas/types.yaml#/definitions/uint32
> > > > >
> > > > > > The terms "base" and "event" are confusing because
> > > > > we only have counters with no interrupt associated with it.
> > > > >
> > > > > The RISC-V spec defines 3 counters and rest are all
> > > > > implementation specific counters.
> > > >
> > > > > As far as I know, there are 2 counters of spec definition: cycle and instret.
> > > > What is the 3rd counter you mentioned?
> > >
> > > TIME is a counter CSR.
> > >
> > > >
> > > > >
> > > > > I suggest using the terms "spec counters" and "impl counters"
> > > > > instead of "base counters" and "event counters".
> > > >
> > > > OK, they are good to me. Let me change it.
> > > >
> > > >
> > > > >
> > > > > Further, "riscv,width" properties are redundant because
> > > > > RISC-V spec clearly tells that counters are 64bit for both
> > > > > RV32 and RV64.
> > > > >
> >
> > Sorry for replying to this late. The maximum width of the counters is 64
> > bits, but an implementation is not required to implement all of them. A
> > real case: the Unleashed board only implements 40 bits for mhpmcounters.
>
> The "3.1.11 Hardware Performance Monitor" clearly states that
> all counters are 64bit
>

Section 3.1.11 of the privileged spec says, "The mhpmcounters are
WARL registers that support up to 64 bits of precision on RV32 and
RV64".

It seems to me that WARL implies the width of the registers can
vary, with 64 bits as the maximum.
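
As a rough illustration of why the implemented width matters to software, here is a sketch (not code from this series; the helper names are hypothetical) of computing a counter delta when only 'width' bits are implemented. A width of 64 is treated as the default when no narrower width is known.

```c
#include <stdint.h>

/* Mask covering the implemented bits of a counter. */
static uint64_t counter_mask(unsigned int width)
{
	return width >= 64 ? UINT64_MAX : (UINT64_C(1) << width) - 1;
}

/*
 * Difference between two raw reads of a counter that wraps at 'width'
 * bits. Unsigned subtraction plus the mask handles a single wrap-around.
 */
static uint64_t counter_delta(uint64_t prev, uint64_t now, unsigned int width)
{
	return (now - prev) & counter_mask(width);
}
```

With a 40-bit counter, a read of 0x10 following a read of 0xFFFFFFFFF0 yields a delta of 0x20; assuming 64 bits for the same hardware would instead produce a huge bogus value.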

> To take care of the unleashed board, the "riscv,width-xyz" DT properties
> should be optional. Whenever these properties are not present, we
> should assume 64bit counter width.
>
> >
> > > > > > +
> > > > > > +  riscv,n-event-cntr:
> > > > > > +description: The number of hpmcounter CSRs.
> > > > > > +$ref: /schemas/types.yaml#/definitions/uint32
> > > > > > +
> > > > > > +  riscv,hw-event-map:
> > > > > > +description: The mapping of generic hardware events. Default 
> > > > > > is no mapping.
> > > > > > +$ref: /schemas/types.yaml#/definitions/uint32-array
> > > > > > +
> > > > > > +  riscv,hw-cache-event-map:
> > > > > > +description: The mapping of generic hardware cache events.
> > > > > > +  Default is no mapping.
> > > > > > +$ref: /schemas/types.yaml#/definitions/uint32-array
> > > > > > +
> > > > > > +required:
> > > > > > +  - compatible
> > > > > > +  - riscv,width-base-cntr
> > > > > > +  - riscv,width-event-cntr
> > > > > > +  - riscv,n-event-cntr
> > > > > > +
> > > > > > +additionalProperties: false
> > > > > > +
> > > > > > +examples:
> > > > > > +  - |
> > > > > > +pmu {
> > > > > > +  compatible = "riscv,pmu";
> > > > > > +  riscv,width-base-cntr = <64>;
> > > > > > +  riscv,width-event-cntr = <40>;
> > > > > > +  riscv,n-event-cntr = <2>;
> > > > > > +  riscv,hw-event-map = <0x0 0x0 0x1 0x1 0x3 0x0202 0x4 0x4000>;
> > > > > > +  riscv,hw-cache-event-map = <0x010201 0x0102 0x010204 0x0802>;
> > > > > > +};
> > > > > > +
> > > > > > +...
> > > >

Re: [PATCH 4.19 011/131] btrfs: make caching_thread use btrfs_find_next_key

2020-06-30 Thread Sasha Levin


On Tue, Jun 30, 2020 at 11:09:21PM +0200, Pavel Machek wrote:

On Mon 2020-06-29 11:33:02, Sasha Levin wrote:

From: Josef Bacik 

[ Upstream commit 6a9fb468f1152d6254f49fee6ac28c3cfa3367e5 ]

extent-tree.c has a find_next_key that just walks up the path to find
the next key, but it is used for both the caching stuff and the snapshot
delete stuff.  The snapshot deletion stuff is special so it can't really
use btrfs_find_next_key, but the caching thread stuff can.  We just need
to fix btrfs_find_next_key to deal with ->skip_locking and then it works
exactly the same as the private find_next_key helper.

Signed-off-by: Josef Bacik 
Signed-off-by: David Sterba 
Signed-off-by: Sasha Levin 


According to changelog, this is not known to fix a bug. Why is it
needed in stable?


Right. I've dropped it, thanks!


--
Thanks,
Sasha

Re: [PATCH v2 01/18] media: mtk-vcodec: abstract firmware interface

2020-06-30 Thread Tiffany Lin

On Fri, 2020-06-26 at 17:04 +0900, Alexandre Courbot wrote:
> From: Yunfei Dong 
> 
> MT8183's codec firmware is run by a different remote processor from
> MT8173. While the firmware interface is basically the same, the way to
> invoke it differs. Abstract all firmware calls under a layer that will
> allow us to handle both firmware types transparently.
> 

Acked-by: Tiffany Lin 

> Signed-off-by: Yunfei Dong 
> [acourbot: refactor, cleanup and split]
> Co-developed-by: Alexandre Courbot 
> Signed-off-by: Alexandre Courbot 
> [pihsun: fix error path and add mtk_vcodec_fw_release]
> Signed-off-by: Pi-Hsun Shih 
> Reviewed-by: Tiffany Lin 
> ---
>  drivers/media/platform/mtk-vcodec/Makefile|   4 +-
>  .../platform/mtk-vcodec/mtk_vcodec_dec_drv.c  |  50 ++---
>  .../platform/mtk-vcodec/mtk_vcodec_dec_pm.c   |   1 -
>  .../platform/mtk-vcodec/mtk_vcodec_drv.h  |   5 +-
>  .../platform/mtk-vcodec/mtk_vcodec_enc_drv.c  |  47 ++---
>  .../platform/mtk-vcodec/mtk_vcodec_enc_pm.c   |   2 -
>  .../media/platform/mtk-vcodec/mtk_vcodec_fw.c | 172 ++
>  .../media/platform/mtk-vcodec/mtk_vcodec_fw.h |  36 
>  .../platform/mtk-vcodec/mtk_vcodec_util.c |   1 -
>  .../platform/mtk-vcodec/vdec/vdec_h264_if.c   |   1 -
>  .../platform/mtk-vcodec/vdec/vdec_vp8_if.c|   1 -
>  .../platform/mtk-vcodec/vdec/vdec_vp9_if.c|   1 -
>  .../media/platform/mtk-vcodec/vdec_drv_base.h |   2 -
>  .../media/platform/mtk-vcodec/vdec_drv_if.c   |   1 -
>  .../media/platform/mtk-vcodec/vdec_vpu_if.c   |  12 +-
>  .../media/platform/mtk-vcodec/vdec_vpu_if.h   |  11 +-
>  .../platform/mtk-vcodec/venc/venc_h264_if.c   |  15 +-
>  .../platform/mtk-vcodec/venc/venc_vp8_if.c|   8 +-
>  .../media/platform/mtk-vcodec/venc_drv_if.c   |   1 -
>  .../media/platform/mtk-vcodec/venc_vpu_if.c   |  17 +-
>  .../media/platform/mtk-vcodec/venc_vpu_if.h   |   5 +-
>  21 files changed, 290 insertions(+), 103 deletions(-)
>  create mode 100644 drivers/media/platform/mtk-vcodec/mtk_vcodec_fw.c
>  create mode 100644 drivers/media/platform/mtk-vcodec/mtk_vcodec_fw.h
> 
> diff --git a/drivers/media/platform/mtk-vcodec/Makefile 
> b/drivers/media/platform/mtk-vcodec/Makefile
> index 37b94b555fa1..b8636119ed0a 100644
> --- a/drivers/media/platform/mtk-vcodec/Makefile
> +++ b/drivers/media/platform/mtk-vcodec/Makefile
> @@ -12,7 +12,7 @@ mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
>   vdec_vpu_if.o \
>   mtk_vcodec_dec.o \
>   mtk_vcodec_dec_pm.o \
> -
> + mtk_vcodec_fw.o
>  
>  mtk-vcodec-enc-y := venc/venc_vp8_if.o \
>   venc/venc_h264_if.o \
> @@ -25,5 +25,3 @@ mtk-vcodec-enc-y := venc/venc_vp8_if.o \
>  
>  mtk-vcodec-common-y := mtk_vcodec_intr.o \
>   mtk_vcodec_util.o\
> -
> -ccflags-y += -I$(srctree)/drivers/media/platform/mtk-vpu
> diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c 
> b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
> index 97a1b6664c20..4f07a5fcce7f 100644
> --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
> +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
> @@ -20,7 +20,7 @@
>  #include "mtk_vcodec_dec_pm.h"
>  #include "mtk_vcodec_intr.h"
>  #include "mtk_vcodec_util.h"
> -#include "mtk_vpu.h"
> +#include "mtk_vcodec_fw.h"
>  
>  #define VDEC_HW_ACTIVE   0x10
>  #define VDEC_IRQ_CFG 0x11
> @@ -77,22 +77,6 @@ static irqreturn_t mtk_vcodec_dec_irq_handler(int irq, 
> void *priv)
>   return IRQ_HANDLED;
>  }
>  
> -static void mtk_vcodec_dec_reset_handler(void *priv)
> -{
> - struct mtk_vcodec_dev *dev = priv;
> - struct mtk_vcodec_ctx *ctx;
> -
> - mtk_v4l2_err("Watchdog timeout!!");
> -
> - mutex_lock(&dev->dev_mutex);
> - list_for_each_entry(ctx, &dev->ctx_list, list) {
> - ctx->state = MTK_STATE_ABORT;
> - mtk_v4l2_debug(0, "[%d] Change to state MTK_STATE_ERROR",
> - ctx->id);
> - }
> - mutex_unlock(&dev->dev_mutex);
> -}
> -
>  static int fops_vcodec_open(struct file *file)
>  {
>   struct mtk_vcodec_dev *dev = video_drvdata(file);
> @@ -144,21 +128,20 @@ static int fops_vcodec_open(struct file *file)
>   if (v4l2_fh_is_singular(&ctx->fh)) {
>   mtk_vcodec_dec_pw_on(&dev->pm);
>   /*
> -  * vpu_load_firmware checks if it was loaded already and
> -  * does nothing in that case
> +  * Does nothing if firmware was already loaded.
>*/
> - ret = vpu_load_firmware(dev->vpu_plat_dev);
> + ret = mtk_vcodec_fw_load_firmware(dev->fw_handler);
>   if (ret < 0) {
>   /*
>* Return 0 if downloading firmware successfully,
>* otherwise it is failed
>*/
> - mtk_v4l2_err("vpu_load_firmware failed!");
> + mtk_v4l2_err("failed to load firmware!");
>   goto err_load_fw;
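
A minimal sketch of the layering idea described in the commit message, assuming a table of function pointers per firmware backend. Every name below is hypothetical and does not reproduce the contents of mtk_vcodec_fw.c; it only shows how callers can stay unaware of which remote processor runs the firmware.

```c
#include <stddef.h>

struct fw_model;

/* Hypothetical per-backend operations table. */
struct fw_ops {
	int (*load_firmware)(struct fw_model *fw);
};

/* Hypothetical handle the codec driver would hold instead of a VPU device. */
struct fw_model {
	const struct fw_ops *ops;
	int loaded;     /* set once a backend has loaded the firmware */
	int load_calls; /* how many times a backend was actually invoked */
};

static int vpu_backend_load(struct fw_model *fw)
{
	fw->load_calls++;
	fw->loaded = 1;
	return 0;
}

static int other_backend_load(struct fw_model *fw)
{
	fw->load_calls++;
	fw->loaded = 1;
	return 0;
}

static const struct fw_ops vpu_backend_ops = {
	.load_firmware = vpu_backend_load,
};
static const struct fw_ops other_backend_ops = {
	.load_firmware = other_backend_load,
};

/* Common entry point: does nothing if the firmware was already loaded. */
static int fw_load_firmware(struct fw_model *fw)
{
	if (fw->loaded)
		return 0;
	return fw->ops->load_firmware(fw);
}
```

Callers such as an open() handler would only ever call the common entry point, so adding a second firmware type does not touch them.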

RE: [EXT] Re: [PATCH v4 2/2] ARM: imx6plus: enable internal routing of clk_enet_ref where possible

2020-06-30 Thread Andy Duan

From: Sven Van Asbroeck  Sent: Tuesday, June 30, 2020 11:24 PM
> Andy, Fabio,
> 
> On Tue, Jun 30, 2020 at 2:36 AM Andy Duan  wrote:
> >
> > Sven, regardless of whether the PHY supplies a 125MHz clock to the pad,
> > GPR5[9] selects the RGMII gtx clock source:
> > - 0 Clock from pad
> > - 1 Clock from PLL
> >
> > Since i.MX6QP can internally supply the clock to the MAC, we can set the
> > GPR5[9] bit by default.
> 
> That's true. But on the sabresd I notice that the PHY's ref_clk output is from
> CLK_25M.
> The default ref_clk freq for that PHY is 25 MHz, and I don't see anyone change
> the default in the devicetree. I also see that a 25 MHz crystal is fitted, which
> also suggests 25 MHz output.
> 
> On the imx6, the default ref_clk frequency from ANATOP is 50Mhz. I don't see
> anyone change that default in the devicetree either.
> 
> So is it possible that, when we switch GPR5[9] on, the external 25MHz clock is
> replaced by the internal 50MHz clock? If so, I'm not sure it'll work...?

Fabio, the reason we cannot reproduce the issue on imx6qp sabresd is that
you did not update uboot.

Sven, the uboot board file sets the clock rate.
board/freescale/mx6sabresd/mx6sabresd.c:
if (is_mx6dqp()) {
int ret;

/* select ENET MAC0 TX clock from PLL */
imx_iomux_set_gpr_register(5, 9, 1, 1);
ret = enable_fec_anatop_clock(0, ENET_125MHZ);
if (ret)
printf("Error fec anatop clock settings!\n");
}

Sven, to avoid depending on the uboot setting, it is better to include the
dts change below in the patch (even on non-imx6qp, the ptp clock can be set
to 125MHz):

--- a/arch/arm/boot/dts/imx6qdl-sabresd.dtsi
+++ b/arch/arm/boot/dts/imx6qdl-sabresd.dtsi
@@ -202,6 +202,8 @@
 &fec {
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_enet>;
+   assigned-clocks = <&clks IMX6QDL_CLK_ENET_REF>;
+   assigned-clock-rates = <125000000>;
phy-mode = "rgmii-id";

[PATCH net v1] hinic: fix passing non negative value to ERR_PTR

2020-06-30 Thread Luo bin

The get_dev_cap and set_resources_state functions may return a positive
value on hardware failure, and a positive return value cannot be
passed to ERR_PTR directly.

Fixes: 7dd29ee12865 ("hinic: add sriov feature support")
Signed-off-by: Luo bin 
---
 drivers/net/ethernet/huawei/hinic/hinic_hw_dev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.c 
b/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.c
index 0245da02efbb..b735bc537508 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.c
+++ b/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.c
@@ -814,6 +814,8 @@ struct hinic_hwdev *hinic_init_hwdev(struct pci_dev *pdev)
 err_init_msix:
 err_pfhwdev_alloc:
hinic_free_hwif(hwif);
+   if (err > 0)
+   err = -EIO;
return ERR_PTR(err);
 }
 
-- 
2.17.1

Re: [RFC PATCH 0/6] Support raw event and DT for perf on RISC-V

2020-06-30 Thread Zong Li

On Wed, Jul 1, 2020 at 8:52 AM Alan Kao  wrote:
>
> On Mon, Jun 29, 2020 at 11:19:09AM +0800, Zong Li wrote:
> > This patch set adds raw event support on RISC-V. In addition, we
> > introduce the DT mechanism to make our perf more generic and common.
> >
> > Currently, we set the hardware events by writing the mhpmeventN CSRs, it
> > would raise an illegal instruction exception and trap into m-mode to
> > emulate event selector CSRs access. It doesn't make sense because we
> > shouldn't write the m-mode CSRs in s-mode. Ideally, we should set event
> > selector through standard SBI call or the shadow CSRs of s-mode. We have
> > prepared a proposal of a new SBI extension, called "PMU SBI extension",
> > but we are also discussing the feasibility of accessing these PMU CSRs on
> > s-mode at the same time, such as delegation mechanism, so I was
> > wondering if we could use SBI calls first and make the PMU SBI extension
> > as legacy when s-mode access mechanism is accepted by Foundation? or
> > keep the current situation to see what would happen in the future.
> >
> > This patch set also introduces the DT mechanism, we don't want to add too
> > much platform-dependency code in perf like other architectures, so we
> > put the mapping of generic hardware events into DT, so that we can easily
> > translate generic hardware events to vendor's own hardware events without
> > any platform-dependency stuff in our perf.
> >
> > Zong Li (6):
> >   dt-bindings: riscv: Add YAML documentation for PMU
> >   riscv: dts: sifive: Add DT support for PMU
> >   riscv: add definition of hpmcounter CSRs
> >   riscv: perf: Add raw event support
> >   riscv: perf: introduce DT mechanism
> >   riscv: remove PMU menu of Kconfig
> >
>
> DT-based PMU registration looks good to me. Together with Anup's feedback,
> we can anticipate that the following items will be:
>
> - rewrite RISC-V PMU to a platform driver
> - propose SBI PMU extension
> - fixes: RV32 counter access, namings, etc.
>
> Yes, all are good directions towards better counting (`perf stat`) function.
> But as the original author of RISC-V perf port, please allow me to address
> the fundamental problems of RISC-V perf, again [0][1][2][3], that the sampling
> (`perf record`) function never earned enough respect.  Counting gives you a
> shallow view regarding an application, while sampling demystifies one for you.
>
> The problems are three-fold
> (1) Interrupt
> Sampling in perf requires that a HPM raises an interrupt when it overflows.
> Making RISC-V perf platform driver or not has nothing to do with this.  This
> requires more discussions in TGs.
> (2) S-mode access to PMU CSRs
> This is also addressed in this patch set, but it is kind of like an
> SBI-solves-them-all mindset to me.  Perf event is for performance monitoring,
> thus we should eliminate any possible overhead if we can.  Setting event masks
> through SBI calls for counting may be OK, but if we really take sampling and
> interrupt handling into consideration, it is questionable whether it is still a
> viable way.
> (3) Registers, registers, registers
> There are just not enough CSRs/functions for perf sampling. The previous proposal
> explains why [2].
>
> Perf sampling is off-topic but somehow related, so I bring it up here just
> for your information.
>

Agreed, sampling is an important measurement for perf. We should integrate it
into perf as soon as possible after the overflow interrupt mechanism is standardized.

> As this patch set goes v2, the PMU porting guide in [0] should be removed since
> it contains no useful information anymore.
>

The document does describe some hook functions, so I would prefer to keep
it; maybe we could update it instead. I will check that. Thanks

> [0] Documentation/riscv/pmu.rst
> [1] https://www.youtube.com/watch?v=Onvlcl4e2IU
> [2] https://github.com/riscv/riscv-isa-manual/issues/402
> This proposal has been posted in Privileged Spec Task Group, in
> https://lists.riscv.org/g/tech-privileged-archive/message/488?p=,,,20,0,0,0::Created,,Proposal,20,2,40,32306071
> but never received any feedback.
> [3] https://lists.riscv.org/g/tech-unixplatformspec/message/84
> I intended to discuss [2] in the Unixplatform Spec Task Group at the
> online meeting, but obviously people were too busy knowing who the new
> RISC-V CTO is and what he has done to even follow the agenda.
>

Re: [PATCH v2 06/18] media: mtk-vcodec: venc: specify supported formats per-chip

2020-06-30 Thread Tiffany Lin

On Fri, 2020-06-26 at 17:04 +0900, Alexandre Courbot wrote:
> Different chips have different supported bitrate ranges. Move the list
> of supported formats to the platform data, and split the output and
> capture formats into two lists to make it easier to find the default
> format for each queue.
> 
Acked-by: Tiffany Lin 
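
The per-queue lookup that the quoted patch moves to can be sketched in userspace roughly as follows. This is only an illustration under assumptions: `struct video_fmt`, `FOURCC()`, `enum_fmt()` and the two example lists are hypothetical stand-ins, not the driver's definitions. The point is that once output and capture formats live in separate lists, enumeration is a bounds check plus an index.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct mtk_video_fmt; only fourcc matters here. */
struct video_fmt {
	uint32_t fourcc;
	unsigned int num_planes;
};

/* Pack four characters the way V4L2 fourcc codes are packed. */
#define FOURCC(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Illustrative per-chip lists: raw frames on output, bitstream on capture. */
static const struct video_fmt output_formats[] = {
	{ FOURCC('N', 'M', '1', '2'), 2 },
	{ FOURCC('N', 'M', '2', '1'), 2 },
};

static const struct video_fmt capture_formats[] = {
	{ FOURCC('H', '2', '6', '4'), 1 },
};

/*
 * Index straight into the list for one queue. Returns 0 and stores the
 * fourcc, or -EINVAL once the index runs past the end of the list.
 */
static int enum_fmt(size_t index, const struct video_fmt *formats,
		    size_t num_formats, uint32_t *fourcc)
{
	assert(formats != NULL && fourcc != NULL);

	if (index >= num_formats)
		return -EINVAL;

	*fourcc = formats[index].fourcc;
	return 0;
}
```

A caller would pass `capture_formats` or `output_formats` depending on the queue being enumerated, which also makes entry 0 of each list a natural default format.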


> Signed-off-by: Alexandre Courbot 
> ---
>  .../platform/mtk-vcodec/mtk_vcodec_drv.h  |   8 ++
>  .../platform/mtk-vcodec/mtk_vcodec_enc.c  | 122 +++---
>  .../platform/mtk-vcodec/mtk_vcodec_enc_drv.c  |  40 ++
>  3 files changed, 95 insertions(+), 75 deletions(-)
> 
> diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> index b8f913de8d80..59b4b750666b 100644
> --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> @@ -313,6 +313,10 @@ enum mtk_chip {
>   * @has_lt_irq: whether the encoder uses the LT irq
>   * @min_birate: minimum supported encoding bitrate
>   * @max_bitrate: maximum supported encoding bitrate
> + * @capture_formats: array of supported capture formats
> + * @num_capture_formats: number of entries in capture_formats
> + * @output_formats: array of supported output formats
> + * @num_output_formats: number of entries in output_formats
>   */
>  struct mtk_vcodec_enc_pdata {
>   enum mtk_chip chip;
> @@ -321,6 +325,10 @@ struct mtk_vcodec_enc_pdata {
>   bool has_lt_irq;
>   unsigned long min_bitrate;
>   unsigned long max_bitrate;
> + const struct mtk_video_fmt *capture_formats;
> + size_t num_capture_formats;
> + const struct mtk_video_fmt *output_formats;
> + size_t num_output_formats;
>  };
>  
>  /**
> diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
> index 50ba9da59153..05743a745a11 100644
> --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
> +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
> @@ -23,47 +23,9 @@
>  #define DFT_CFG_WIDTH    MTK_VENC_MIN_W
>  #define DFT_CFG_HEIGHT   MTK_VENC_MIN_H
>  #define MTK_MAX_CTRLS_HINT   20
> -#define OUT_FMT_IDX  0
> -#define CAP_FMT_IDX  4
> -
>  
>  static void mtk_venc_worker(struct work_struct *work);
>  
> -static const struct mtk_video_fmt mtk_video_formats[] = {
> - {
> - .fourcc = V4L2_PIX_FMT_NV12M,
> - .type = MTK_FMT_FRAME,
> - .num_planes = 2,
> - },
> - {
> - .fourcc = V4L2_PIX_FMT_NV21M,
> - .type = MTK_FMT_FRAME,
> - .num_planes = 2,
> - },
> - {
> - .fourcc = V4L2_PIX_FMT_YUV420M,
> - .type = MTK_FMT_FRAME,
> - .num_planes = 3,
> - },
> - {
> - .fourcc = V4L2_PIX_FMT_YVU420M,
> - .type = MTK_FMT_FRAME,
> - .num_planes = 3,
> - },
> - {
> - .fourcc = V4L2_PIX_FMT_H264,
> - .type = MTK_FMT_ENC,
> - .num_planes = 1,
> - },
> - {
> - .fourcc = V4L2_PIX_FMT_VP8,
> - .type = MTK_FMT_ENC,
> - .num_planes = 1,
> - },
> -};
> -
> -#define NUM_FORMATS ARRAY_SIZE(mtk_video_formats)
> -
>  static const struct mtk_codec_framesizes mtk_venc_framesizes[] = {
>   {
>   .fourcc = V4L2_PIX_FMT_H264,
> @@ -156,27 +118,17 @@ static const struct v4l2_ctrl_ops mtk_vcodec_enc_ctrl_ops = {
>   .s_ctrl = vidioc_venc_s_ctrl,
>  };
>  
> -static int vidioc_enum_fmt(struct v4l2_fmtdesc *f, bool output_queue)
> +static int vidioc_enum_fmt(struct v4l2_fmtdesc *f,
> +const struct mtk_video_fmt *formats,
> +size_t num_formats)
>  {
> - const struct mtk_video_fmt *fmt;
> - int i, j = 0;
> + if (f->index >= num_formats)
> + return -EINVAL;
>  
> - for (i = 0; i < NUM_FORMATS; ++i) {
> - if (output_queue && mtk_video_formats[i].type != MTK_FMT_FRAME)
> - continue;
> - if (!output_queue && mtk_video_formats[i].type != MTK_FMT_ENC)
> - continue;
> + f->pixelformat = formats[f->index].fourcc;
> + memset(f->reserved, 0, sizeof(f->reserved));
>  
> - if (j == f->index) {
> - fmt = &mtk_video_formats[i];
> - f->pixelformat = fmt->fourcc;
> - memset(f->reserved, 0, sizeof(f->reserved));
> - return 0;
> - }
> - ++j;
> - }
> -
> - return -EINVAL;
> + return 0;
>  }
>  
>  static int vidioc_enum_framesizes(struct file *file, void *fh,
> @@ -202,13 +154,21 @@ static int vidioc_enum_framesizes(struct file *file, void *fh,
>  static int vidioc_enum_fmt_vid_cap(struct file *file, void *priv,
>  struct v4l2_fmtdesc *f)
>  {
> - return vidioc_enum_fmt(f, false);
> + const

Re: [PATCH] pinctrl: initialise nsp-mux earlier.

2020-06-30 Thread Florian Fainelli




On 6/30/2020 7:23 PM, Mark Tomlinson wrote:
> On Tue, 2020-06-30 at 15:08 -0700, Ray Jui wrote:
>> May I know which GPIO driver you are referring to on NSP? Both the iProc
>> GPIO driver and the NSP GPIO driver are initialized at the level of
>> 'arch_initcall_sync', which is supposed to be after 'arch_initcall' used
>> here in the pinmux driver
> 
> Sorry, it looks like I made a mistake in my testing (or I was lucky),
> and this patch doesn't fix the issue. What is happening is:
> 1) nsp-pinmux driver is registered (arch_initcall).
> 2) nsp-gpio-a driver is registered (arch_initcall_sync).
> 3) of_platform_default_populate_init() is called (also at level
> arch_initcall_sync), which scans the device tree, adds the nsp-gpio-a
> device, runs its probe, and this returns -EPROBE_DEFER with the error
> message.
> 4) Only now nsp-pinmux device is probed.
> 
> Changing the 'arch_initcall_sync' to 'device_initcall' in nsp-gpio-a
> ensures that the pinmux is probed first since
> of_platform_default_populate_init() will be called between the two
> register calls, and the error goes away. Is this change acceptable as a
> solution?

If probe deferral did not work, certainly. But it sounds like this is
being done just to eliminate a round of probe deferral. Is there a
functional problem this is fixing?
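
The ordering Mark describes in steps 1) to 4) can be sketched as a small userspace simulation. Everything here is an assumption made for illustration: `struct boot_state`, `simulate_boot()` and the helpers are hypothetical, and the model assumes the gpio node is visited before the pinmux node when the device tree is populated, as the quoted steps suggest. It only shows why moving the gpio driver registration later removes the deferral, not how the driver core is implemented.

```c
#include <assert.h>
#include <stdbool.h>

#define EPROBE_DEFER 517 /* numeric value matches the -517 in the log above */

/* Hypothetical model of the two drivers and their devices. */
struct boot_state {
	bool pinmux_driver, pinmux_device, pinmux_probed;
	bool gpio_driver, gpio_device, gpio_probed;
	bool gpio_deferred;
	int deferrals;
};

/* The gpio probe can only succeed once the pinmux has been probed. */
static int gpio_probe(struct boot_state *s)
{
	if (!s->pinmux_probed)
		return -EPROBE_DEFER;
	s->gpio_probed = true;
	return 0;
}

/* A probe runs as soon as both the driver and the device exist. */
static void try_bind_gpio(struct boot_state *s)
{
	if (!s->gpio_driver || !s->gpio_device || s->gpio_probed)
		return;
	if (gpio_probe(s) == -EPROBE_DEFER) {
		s->gpio_deferred = true;
		s->deferrals++;
	}
}

static void try_bind_pinmux(struct boot_state *s)
{
	if (s->pinmux_driver && s->pinmux_device)
		s->pinmux_probed = true;
}

/* Returns how many times the gpio probe was deferred during "boot". */
static int simulate_boot(bool gpio_at_device_initcall)
{
	struct boot_state s = { 0 };

	/* arch_initcall: pinmux driver registers, no devices exist yet */
	s.pinmux_driver = true;
	try_bind_pinmux(&s);

	/* arch_initcall_sync: gpio driver registers (current placement) */
	if (!gpio_at_device_initcall) {
		s.gpio_driver = true;
		try_bind_gpio(&s);
	}

	/* arch_initcall_sync: devices are created, gpio node first (assumed) */
	s.gpio_device = true;
	try_bind_gpio(&s);
	s.pinmux_device = true;
	try_bind_pinmux(&s);

	/* device_initcall: gpio driver registers (proposed placement) */
	if (gpio_at_device_initcall) {
		s.gpio_driver = true;
		try_bind_gpio(&s);
	}

	/* deferred probes are retried later */
	if (s.gpio_deferred && !s.gpio_probed)
		try_bind_gpio(&s);

	assert(s.pinmux_probed && s.gpio_probed);
	return s.deferrals;
}
```

In this model both placements end with the gpio device probed; the only difference is one deferral (and its log line) with the current placement and none with the proposed one.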

> 
>>> though the probe will succeed when the driver is re-initialised, the
>>> error can be scary to end users. To fix this, change the time the
>>
>> Scary to end users? I don't know about that. -EPROBE_DEFER was
>> introduced exactly for this purpose. Perhaps users need to learn what
>> -EPROBE_DEFER errno means?
> 
> The actual error message in syslog is:
> 
> kern.err kernel: gpiochip_add_data_with_key: GPIOs 480..511
> (1820.gpio) failed to register, -517
> 
> So an end user sees "err" and "failed", and doesn't know what "-517"
> means.

How about this instead:

diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index 4fa075d49fbc..10d9d0c17c9e 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -1818,9 +1818,10 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data,
ida_simple_remove(&gpio_ida, gdev->id);
 err_free_gdev:
/* failures here can mean systems won't boot... */
-   pr_err("%s: GPIOs %d..%d (%s) failed to register, %d\n", __func__,
-  gdev->base, gdev->base + gdev->ngpio - 1,
-  gc->label ? : "generic", ret);
+   if (ret != -EPROBE_DEFER)
+   pr_err("%s: GPIOs %d..%d (%s) failed to register, %d\n",
+   __func__, gdev->base, gdev->base + gdev->ngpio - 1,
+   gc->label ? : "generic", ret);
kfree(gdev);
return ret;
 }

-- 
Florian

[PATCH] drm/msm/a6xx: add build_bw_table for A640/A650

2020-06-30 Thread Jonathan Marek

This sets up bw tables for A640/A650 similar to A618/A630, 0 DDR bandwidth
vote, and the CNOC vote. A640 has the same CNOC addresses as A630 and was
working, but this is required for A650 to work.

Eventually the bw table should be filled by querying the interconnect
driver for each BW in the dts, but use these dummy tables for now.

Signed-off-by: Jonathan Marek 
---
 drivers/gpu/drm/msm/adreno/a6xx_hfi.c | 74 +++
 1 file changed, 74 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
index 9921e632f1ca..ccd44d0418f8 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
@@ -281,6 +281,76 @@ static void a618_build_bw_table(struct a6xx_hfi_msg_bw_table *msg)
msg->cnoc_cmds_data[1][0] =  0x6001;
 }
 
+static void a640_build_bw_table(struct a6xx_hfi_msg_bw_table *msg)
+{
+   /*
+* Send a single "off" entry just to get things running
+* TODO: bus scaling
+*/
+   msg->bw_level_num = 1;
+
+   msg->ddr_cmds_num = 3;
+   msg->ddr_wait_bitmask = 0x01;
+
+   msg->ddr_cmds_addrs[0] = 0x5;
+   msg->ddr_cmds_addrs[1] = 0x5003c;
+   msg->ddr_cmds_addrs[2] = 0x5000c;
+
+   msg->ddr_cmds_data[0][0] =  0x4000;
+   msg->ddr_cmds_data[0][1] =  0x4000;
+   msg->ddr_cmds_data[0][2] =  0x4000;
+
+   /*
+* These are the CX (CNOC) votes - these are used by the GMU but the
+* votes are known and fixed for the target
+*/
+   msg->cnoc_cmds_num = 3;
+   msg->cnoc_wait_bitmask = 0x01;
+
+   msg->cnoc_cmds_addrs[0] = 0x50034;
+   msg->cnoc_cmds_addrs[1] = 0x5007c;
+   msg->cnoc_cmds_addrs[2] = 0x5004c;
+
+   msg->cnoc_cmds_data[0][0] =  0x4000;
+   msg->cnoc_cmds_data[0][1] =  0x00000000;
+   msg->cnoc_cmds_data[0][2] =  0x4000;
+
+   msg->cnoc_cmds_data[1][0] =  0x6001;
+   msg->cnoc_cmds_data[1][1] =  0x2001;
+   msg->cnoc_cmds_data[1][2] =  0x6001;
+}
+
+static void a650_build_bw_table(struct a6xx_hfi_msg_bw_table *msg)
+{
+   /*
+* Send a single "off" entry just to get things running
+* TODO: bus scaling
+*/
+   msg->bw_level_num = 1;
+
+   msg->ddr_cmds_num = 3;
+   msg->ddr_wait_bitmask = 0x01;
+
+   msg->ddr_cmds_addrs[0] = 0x5;
+   msg->ddr_cmds_addrs[1] = 0x50004;
+   msg->ddr_cmds_addrs[2] = 0x5007c;
+
+   msg->ddr_cmds_data[0][0] =  0x4000;
+   msg->ddr_cmds_data[0][1] =  0x4000;
+   msg->ddr_cmds_data[0][2] =  0x4000;
+
+   /*
+* These are the CX (CNOC) votes - these are used by the GMU but the
+* votes are known and fixed for the target
+*/
+   msg->cnoc_cmds_num = 1;
+   msg->cnoc_wait_bitmask = 0x01;
+
+   msg->cnoc_cmds_addrs[0] = 0x500a4;
+   msg->cnoc_cmds_data[0][0] =  0x4000;
+   msg->cnoc_cmds_data[1][0] =  0x6001;
+}
+
 static void a6xx_build_bw_table(struct a6xx_hfi_msg_bw_table *msg)
 {
/* Send a single "off" entry since the 630 GMU doesn't do bus scaling */
@@ -327,6 +397,10 @@ static int a6xx_hfi_send_bw_table(struct a6xx_gmu *gmu)
 
if (adreno_is_a618(adreno_gpu))
a618_build_bw_table(&msg);
+   else if (adreno_is_a640(adreno_gpu))
+   a640_build_bw_table(&msg);
+   else if (adreno_is_a650(adreno_gpu))
+   a650_build_bw_table(&msg);
else
a6xx_build_bw_table(&msg);
 
-- 
2.26.1

Re: [PATCH v2] dm crypt: add flags to optionally bypass dm-crypt workqueues

2020-06-30 Thread Damien Le Moal

On 2020/06/30 18:35, Ignat Korchagin wrote:
[...]
>>> diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
>>> index 000ddfab5ba0..6924eb49b1df 100644
>>> --- a/drivers/md/dm-crypt.c
>>> +++ b/drivers/md/dm-crypt.c
>>> @@ -69,6 +69,7 @@ struct dm_crypt_io {
>>>   u8 *integrity_metadata;
>>>   bool integrity_metadata_from_pool;
>>>   struct work_struct work;
>>> + struct tasklet_struct tasklet;
>>>
>>>   struct convert_context ctx;
>>>
>>> @@ -127,7 +128,8 @@ struct iv_elephant_private {
>>>   * and encrypts / decrypts at the same time.
>>>   */
>>>  enum flags { DM_CRYPT_SUSPENDED, DM_CRYPT_KEY_VALID,
>>> -  DM_CRYPT_SAME_CPU, DM_CRYPT_NO_OFFLOAD };
>>> +  DM_CRYPT_SAME_CPU, DM_CRYPT_NO_OFFLOAD,
>>> +  DM_CRYPT_NO_READ_WORKQUEUE, DM_CRYPT_NO_WRITE_WORKQUEUE };
>>
>> I liked the "INLINE" naming. What about DM_CRYPT_READ_INLINE and
>> DM_CRYPT_WRITE_INLINE ? Shorter too :)
>>
>> But from the changes below, it looks like your change is now less about being
>> purely inline or synchronous but about bypassing the workqueue.
>> Is this correct ?
> 
> Yes, from the test with the NULL cipher it is clearly visible that
> workqueues are the main cause of the performance degradation. The
> previous patch actually did the same thing with the addition of a
> custom xts-proxy synchronous module, which achieved "full inline"
> processing. But it is clear now, that inline/non-inline Crypto API
> does not change much from a performance point of view.

OK. Understood. So the name DM_CRYPT_NO_READ_WORKQUEUE and
DM_CRYPT_NO_WRITE_WORKQUEUE make sense. They indeed are very descriptive.
I was just wondering how to avoid confusion with the DM_CRYPT_NO_OFFLOAD flag
for writes with better names. But I do not have better ideas :)
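
A minimal sketch of how the two new flags could select the bypass per I/O direction, and why they are distinct from DM_CRYPT_NO_OFFLOAD. The enum is copied from the quoted hunk; `enum io_dir`, `flag_set()` and `bypass_workqueue()` are hypothetical helpers written for this illustration and are not the dm-crypt code.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit numbers in the order the quoted enum lists them. */
enum flags { DM_CRYPT_SUSPENDED, DM_CRYPT_KEY_VALID,
	     DM_CRYPT_SAME_CPU, DM_CRYPT_NO_OFFLOAD,
	     DM_CRYPT_NO_READ_WORKQUEUE, DM_CRYPT_NO_WRITE_WORKQUEUE };

/* Hypothetical direction type standing in for the bio data direction. */
enum io_dir { IO_READ, IO_WRITE };

/* Userspace stand-in for a test_bit() style check on a flags word. */
static bool flag_set(unsigned long flags, enum flags bit)
{
	return (flags >> bit) & 1UL;
}

/*
 * Decide whether an I/O skips the dm-crypt workqueues. Only the two new
 * flags are consulted: DM_CRYPT_NO_OFFLOAD is a separate bit and does not
 * imply a bypass in this sketch.
 */
static bool bypass_workqueue(unsigned long flags, enum io_dir dir)
{
	assert(dir == IO_READ || dir == IO_WRITE);

	if (dir == IO_READ)
		return flag_set(flags, DM_CRYPT_NO_READ_WORKQUEUE);
	return flag_set(flags, DM_CRYPT_NO_WRITE_WORKQUEUE);
}
```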

> 
>>>
>>>  enum cipher_flags {
>>>   CRYPT_MODE_INTEGRITY_AEAD,  /* Use authenticated mode for cihper */
>>> @@ -1449,7 +1451,7 @@ static void kcryptd_async_done(struct crypto_async_request *async_req,
>>>  int error);
>>>
>>>  static void crypt_alloc_req_skcipher(struct crypt_config *cc,
>>> -  struct convert_context *ctx)
>>> +  struct convert_context *ctx, bool nobacklog)
>>>  {
>>>   unsigned key_index = ctx->cc_sector & (cc->tfms_count - 1);
>>>
>>> @@ -1463,12 +1465,12 @@ static void crypt_alloc_req_skcipher(struct crypt_config *cc,
>>>* requests if driver request queue is full.
>>>*/
>>>   skcipher_request_set_callback(ctx->r.req,
>>> - CRYPTO_TFM_REQ_MAY_BACKLOG,
>>> + nobacklog ? 0 : CRYPTO_TFM_REQ_MAY_BACKLOG,
>>>   kcryptd_async_done, dmreq_of_req(cc, ctx->r.req));
>>
>> Will not specifying CRYPTO_TFM_REQ_MAY_BACKLOG always cause the crypto API to
>> return -EBUSY ? From the comment above the skcipher_request_set_callback(),
>> it seems that this will be the case only if the skcipher driver queue is
>> full. So in other words, keeping the kcryptd_async_done() callback and
>> executing the skcipher request through crypt_convert() and
>> crypt_convert_block_skcipher() may still end up being an asynchronous
>> operation. Can you confirm this and is it what you intended to implement ?
> 
> Yes, so far these flags should bypass dm-crypt workqueues only. I had
> a quick look around CRYPTO_TFM_REQ_MAY_BACKLOG and it seems that both
> generic xts as well as aesni implementations (and other crypto
> involved in disk encryption) do not have any logic related to the
> flag, so we may as well leave it as is.

OK. Sounds good. Less changes :)

>> From my understanding of the crypto API, and from what Eric commented, a
>> truly synchronous/inline execution of the skcipher needs a call like:
>>
>> crypto_wait_req(crypto_skcipher_encrypt(req), &wait);
>>
>> For the SMR use case, where we must absolutely preserve the write request
>> order, the above change will probably be needed. Will check again.
> 
> I think this is not an "inline" execution, rather blocking the current
> thread and waiting for the potential asynchronous crypto thread to
> finish its operation.

Well, if we block waiting for the crypto execution, crypto use becomes "inline"
in the context of the BIO submitter, so the write request order is preserved.
More a serialization than pure inlining, sure. But in the end, exactly what is
needed for zoned block device writes.
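
The trade-off between the two behaviours can be sketched with a toy model. This is an illustration under assumptions, not dm-crypt code: `issue_async()` and `issue_blocking()` are hypothetical, each request is reduced to a crypto latency, and all requests are assumed to be submitted at time 0. When the submitter does not wait, requests reach the device in completion order; when it blocks on each request, submit order is preserved but the latencies add up.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Asynchronous model: every request starts at time 0 and is issued to the
 * device when its crypto finishes. out[] receives request indices in issue
 * order (ties keep submit order). Returns the time the last one finishes.
 */
static unsigned int issue_async(const unsigned int *latency, size_t n,
				size_t *out)
{
	unsigned int finish = 0;
	size_t i, j;

	assert(latency != NULL && out != NULL);

	for (i = 0; i < n; i++) {
		/* stable insertion of request i into the sorted prefix */
		for (j = i; j > 0 && latency[out[j - 1]] > latency[i]; j--)
			out[j] = out[j - 1];
		out[j] = i;
		if (latency[i] > finish)
			finish = latency[i];
	}
	return finish;
}

/*
 * Blocking model: the submitter waits for each request before sending the
 * next, so issue order equals submit order and the latencies accumulate.
 */
static unsigned int issue_blocking(const unsigned int *latency, size_t n,
				   size_t *out)
{
	unsigned int finish = 0;
	size_t i;

	assert(latency != NULL && out != NULL);

	for (i = 0; i < n; i++) {
		finish += latency[i];
		out[i] = i;
	}
	return finish;
}
```

With latencies {5, 1, 3}, the asynchronous model issues requests 1, 2, 0 and finishes at time 5, while the blocking model issues 0, 1, 2 and finishes at time 9. That is the ordering guarantee zoned writes need, paid for with the throughput cost mentioned in the thread.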

> It seems we have different use-cases here. By bypassing workqueues we
> just want to improve performance, but otherwise do not really care
> about the order of requests.

Yes. Understood. Not using the current workqueue mechanism for writes to zoned
devices is necessary because of write ordering. The performance aspect of that
is the cherry on top of the SMR support cake :)

> Waiting for crypto to complete synchronously may actually decrease
> performance, but required to preserve the order in some cases. Should
> this be a yet another flag?

Yes,

[PATCH] drm/msm: handle for EPROBE_DEFER for of_icc_get

2020-06-30 Thread Jonathan Marek

Check for EPROBE_DEFER instead of silently not using icc if the msm driver
probes before the interconnect driver.

Only check for EPROBE_DEFER because of_icc_get can return other errors that
we want to ignore (ENODATA).

Remove the WARN_ON in msm_gpu_cleanup because INIT_LIST_HEAD won't have
been called on the list yet when going through the defer error path.

Signed-off-by: Jonathan Marek 
---
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 17 ++---
 drivers/gpu/drm/msm/msm_gpu.c   |  2 --
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 89673c7ed473..393c00425d68 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -940,12 +940,20 @@ static int adreno_get_pwrlevels(struct device *dev,
 */
gpu->icc_path = of_icc_get(dev, NULL);
}
-   if (IS_ERR(gpu->icc_path))
+   if (IS_ERR(gpu->icc_path)) {
+   ret = PTR_ERR(gpu->icc_path);
gpu->icc_path = NULL;
+   if (ret == -EPROBE_DEFER)
+   return ret;
+   }
 
gpu->ocmem_icc_path = of_icc_get(dev, "ocmem");
-   if (IS_ERR(gpu->ocmem_icc_path))
+   if (IS_ERR(gpu->ocmem_icc_path)) {
+   ret = PTR_ERR(gpu->ocmem_icc_path);
gpu->ocmem_icc_path = NULL;
+   if (ret == -EPROBE_DEFER)
+   return ret;
+   }
 
return 0;
 }
@@ -996,6 +1004,7 @@ int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev,
struct adreno_platform_config *config = pdev->dev.platform_data;
struct msm_gpu_config adreno_gpu_config  = { 0 };
struct msm_gpu *gpu = &adreno_gpu->base;
+   int ret;
 
adreno_gpu->funcs = funcs;
adreno_gpu->info = adreno_info(config->rev);
@@ -1007,7 +1016,9 @@ int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev,
 
adreno_gpu_config.nr_rings = nr_rings;
 
-   adreno_get_pwrlevels(&pdev->dev, gpu);
+   ret = adreno_get_pwrlevels(&pdev->dev, gpu);
+   if (ret)
+   return ret;
 
pm_runtime_set_autosuspend_delay(&pdev->dev,
adreno_gpu->info->inactive_period);
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index a22d30622306..ccf9a0dd9706 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -959,8 +959,6 @@ void msm_gpu_cleanup(struct msm_gpu *gpu)
 
DBG("%s", gpu->name);
 
-   WARN_ON(!list_empty(&gpu->active_list));
-
for (i = 0; i < ARRAY_SIZE(gpu->rb); i++) {
msm_ringbuffer_destroy(gpu->rb[i]);
gpu->rb[i] = NULL;
-- 
2.26.1
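
The error handling in the patch above follows the usual error-pointer pattern: drop the path on any error, but propagate only -EPROBE_DEFER. A userspace sketch of that pattern, under assumptions: the `ERR_PTR()`/`PTR_ERR()`/`IS_ERR()` helpers below are simplified re-implementations for illustration, and `struct icc_path` and `take_icc_path()` are hypothetical stand-ins rather than the interconnect or msm API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO	4095
#define EPROBE_DEFER	517 /* not in userspace errno.h, so defined here */

/* Simplified userspace versions of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for an interconnect path handle. */
struct icc_path {
	int id;
};

/*
 * Store the result of a lookup. Any error leaves the slot NULL so the
 * caller runs without that path, but only -EPROBE_DEFER is reported back
 * so that the probe can be retried later.
 */
static int take_icc_path(struct icc_path **slot, struct icc_path *path)
{
	assert(slot != NULL);

	if (IS_ERR(path)) {
		*slot = NULL;
		if (PTR_ERR(path) == -EPROBE_DEFER)
			return -EPROBE_DEFER;
		return 0;
	}

	*slot = path;
	return 0;
}
```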

[PATCH] pinctrl: aspeed: Describe the heartbeat function on ball Y23

2020-06-30 Thread Joel Stanley

From: Andrew Jeffery 

The default pinmux configuration for Y23 is to route a heartbeat to
drive a LED. Previous revisions of the AST2600 datasheet did not include
a description of this function.

Fixes: 2eda1cdec49f ("pinctrl: aspeed: Add AST2600 pinmux support")
Signed-off-by: Andrew Jeffery 
Signed-off-by: Joel Stanley 
---
 drivers/pinctrl/aspeed/pinctrl-aspeed-g6.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/pinctrl/aspeed/pinctrl-aspeed-g6.c b/drivers/pinctrl/aspeed/pinctrl-aspeed-g6.c
index fa32c3e9c9d1..7efe6dbe4398 100644
--- a/drivers/pinctrl/aspeed/pinctrl-aspeed-g6.c
+++ b/drivers/pinctrl/aspeed/pinctrl-aspeed-g6.c
@@ -46,6 +46,7 @@
 #define SCU634 0x634 /* Disable GPIO Internal Pull-Down #5 */
 #define SCU638 0x638 /* Disable GPIO Internal Pull-Down #6 */
 #define SCU694 0x694 /* Multi-function Pin Control #25 */
+#define SCU69C 0x69C /* Multi-function Pin Control #27 */
 #define SCUC20 0xC20 /* PCIE configuration Setting Control */
 
 #define ASPEED_G6_NR_PINS 256
@@ -819,11 +820,13 @@ FUNC_DECL_2(PWM14, PWM14G0, PWM14G1);
 #define Y23 127
 SIG_EXPR_LIST_DECL_SEMG(Y23, PWM15, PWM15G1, PWM15, SIG_DESC_SET(SCU41C, 31));
 SIG_EXPR_LIST_DECL_SESG(Y23, THRUOUT3, THRU3, SIG_DESC_SET(SCU4BC, 31));
-PIN_DECL_2(Y23, GPIOP7, PWM15, THRUOUT3);
+SIG_EXPR_LIST_DECL_SESG(Y23, HEARTBEAT, HEARTBEAT, SIG_DESC_SET(SCU69C, 31));
+PIN_DECL_3(Y23, GPIOP7, PWM15, THRUOUT3, HEARTBEAT);
 GROUP_DECL(PWM15G1, Y23);
 FUNC_DECL_2(PWM15, PWM15G0, PWM15G1);
 
 FUNC_GROUP_DECL(THRU3, AB24, Y23);
+FUNC_GROUP_DECL(HEARTBEAT, Y23);
 
 #define AA25 128
 SSSF_PIN_DECL(AA25, GPIOQ0, TACH0, SIG_DESC_SET(SCU430, 0));
@@ -1920,6 +1923,7 @@ static const struct aspeed_pin_group aspeed_g6_groups[] = {
ASPEED_PINCTRL_GROUP(GPIU5),
ASPEED_PINCTRL_GROUP(GPIU6),
ASPEED_PINCTRL_GROUP(GPIU7),
+   ASPEED_PINCTRL_GROUP(HEARTBEAT),
ASPEED_PINCTRL_GROUP(HVI3C3),
ASPEED_PINCTRL_GROUP(HVI3C4),
ASPEED_PINCTRL_GROUP(I2C1),
@@ -2158,6 +2162,7 @@ static const struct aspeed_pin_function aspeed_g6_functions[] = {
ASPEED_PINCTRL_FUNC(GPIU5),
ASPEED_PINCTRL_FUNC(GPIU6),
ASPEED_PINCTRL_FUNC(GPIU7),
+   ASPEED_PINCTRL_FUNC(HEARTBEAT),
ASPEED_PINCTRL_FUNC(I2C1),
ASPEED_PINCTRL_FUNC(I2C10),
ASPEED_PINCTRL_FUNC(I2C11),
-- 
2.27.0

Re: [PATCH 1/2] workqueue: don't always set __WQ_ORDERED implicitly

2020-06-30 Thread Bob Liu

On 6/29/20 8:37 AM, Lai Jiangshan wrote:
> On Mon, Jun 29, 2020 at 8:13 AM Bob Liu  wrote:
>>
>> On 6/28/20 11:54 PM, Lai Jiangshan wrote:
>>> On Thu, Jun 11, 2020 at 6:29 PM Bob Liu  wrote:

>>>> Current code always set 'Unbound && max_active == 1' workqueues to ordered
>>>> implicitly, while this may be not an expected behaviour for some use cases.
>>>>
>>>> E.g some scsi and iscsi workqueues(unbound && max_active = 1) want to be
>>>> bind to different cpu so as to get better isolation, but their cpumask can't be
>>>> changed because WQ_ORDERED is set implicitly.
>>>
>>> Hello
>>>
>>> If I read the code correctly, the reason why their cpumask can't
>>> be changed is because __WQ_ORDERED_EXPLICIT, not __WQ_ORDERED.
>>>

>>>> This patch adds a flag __WQ_ORDERED_DISABLE and also
>>>> create_singlethread_workqueue_noorder() to offer an new option.
>>>>
>>>> Signed-off-by: Bob Liu 
>>>> ---
>>>>  include/linux/workqueue.h | 4 
>>>>  kernel/workqueue.c| 4 +++-
>>>>  2 files changed, 7 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
>>>> index e48554e..4c86913 100644
>>>> --- a/include/linux/workqueue.h
>>>> +++ b/include/linux/workqueue.h
>>>> @@ -344,6 +344,7 @@ enum {
>>>> __WQ_ORDERED= 1 << 17, /* internal: workqueue is ordered */
>>>> __WQ_LEGACY = 1 << 18, /* internal: create*_workqueue() */
>>>> __WQ_ORDERED_EXPLICIT   = 1 << 19, /* internal: alloc_ordered_workqueue() */
>>>> +   __WQ_ORDERED_DISABLE= 1 << 20, /* internal: don't set __WQ_ORDERED implicitly */
>>>>
>>>> WQ_MAX_ACTIVE   = 512,/* I like 512, better ideas? */
>>>> WQ_MAX_UNBOUND_PER_CPU  = 4,  /* 4 * #cpus for unbound wq */
>>>> @@ -433,6 +434,9 @@ struct workqueue_struct *alloc_workqueue(const char *fmt,
>>>>  #define create_singlethread_workqueue(name)\
>>>> alloc_ordered_workqueue("%s", __WQ_LEGACY | WQ_MEM_RECLAIM, name)
>>>>
>>>> +#define create_singlethread_workqueue_noorder(name)\
>>>> +   alloc_workqueue("%s", WQ_SYSFS | __WQ_LEGACY | WQ_MEM_RECLAIM | \
>>>> +   WQ_UNBOUND | __WQ_ORDERED_DISABLE, 1, (name))
>>>
>>> I think using __WQ_ORDERED without __WQ_ORDERED_EXPLICIT is what you
>>> need, in which case cpumask is allowed to be changed.
>>>
>>
>> I don't think so, see function workqueue_apply_unbound_cpumask():
>>
>> wq_unbound_cpumask_store()
>>  -> workqueue_set_unbound_cpumask()
>>   -> workqueue_apply_unbound_cpumask() {
>>  ...
>> 5276 /* creating multiple pwqs breaks ordering guarantee */
>> 5277 if (wq->flags & __WQ_ORDERED)
>> 5278 continue;
>>
>>   Here will skip apply cpumask if only __WQ_ORDERED is set.
> 
> wq_unbound_cpumask_store() is for changing the cpumask of
> *all* workqueues. I don't think it can be used to make
> scsi and iscsi workqueues bound to different cpu.
> 
> apply_workqueue_attrs() is for changing the cpumask of the specific
> workqueue, which can change the cpumask of __WQ_ORDERED workqueue
> (but without __WQ_ORDERED_EXPLICIT).
> 

Yes, you are right. I made a mistake.
Sorry for the noise.

Regards,
Bob

>>
>> 5280 ctx = apply_wqattrs_prepare(wq, wq->unbound_attrs);
>>
>>  }

Re: [PATCH] iio: adc: Specify IOMEM dependency for adi-axi-adc driver

2020-06-30 Thread David Gow

On Tue, Jun 30, 2020 at 6:07 PM Jonathan Cameron
 wrote:
>
> On Tue, 30 Jun 2020 00:05:52 -0700
> David Gow  wrote:
>
> > The Analog Devices AXI ADC driver uses the devm_ioremap_resource
> > function, but does not specify a dependency on IOMEM in Kconfig. This
> > causes a build failure on architectures without IOMEM, for example, UML
> > (notably with make allyesconfig).
> >
> > Fix this by making CONFIG_ADI_AXI_ADC depend on CONFIG_IOMEM.
> >
> > Signed-off-by: David Gow 
> Hi David,
>
> Could you confirm what the build error is?  I thought the stubs added in
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1bcbfbfdeb
> were meant to allow us to avoid having lots of depends on IOMEM lines for the
> few architectures who don't support it.

No worries:
/usr/bin/ld: drivers/iio/adc/adi-axi-adc.o: in function `adi_axi_adc_probe':
torvalds-linux/drivers/iio/adc/adi-axi-adc.c:415: undefined reference
to `devm_platform_ioremap_resource'

Alas, the devm_platform_ioremap_resource function isn't handled by the
UML stubs: it all seems to be in drivers/base/platform.c and
lib/devres.c, behind #ifdef CONFIG_HAS_IOMEM.

In any case, improving IOMEM support for UML (at least for the KUnit
test case, which is my use case) is something I'd like to do. There
are only three drivers[1,2] upstream at the moment which fail to build
as-is, though, so it seemed worth trying to fix them in the meantime.
That being said, I tried just getting rid of the few #ifdef
CONFIG_HAS_IOMEM guards around the various devm_*_ioremap functions, and
everything seems to be working... So maybe that's a false dependency
given the various stubs (at least on UML). I used this (hideously hacky)
patch:

diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index c0d0a5490ac6..b6f08c88e2b6 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -61,7 +61,7 @@ struct resource *platform_get_resource(struct platform_device *dev,
}
EXPORT_SYMBOL_GPL(platform_get_resource);

-#ifdef CONFIG_HAS_IOMEM
+#if 1//def CONFIG_HAS_IOMEM
/**
 * devm_platform_get_and_ioremap_resource - call devm_ioremap_resource() for a
 * platform device and get resource
diff --git a/lib/Makefile b/lib/Makefile
index b1c42c10073b..35c21af33b93 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -115,7 +115,7 @@ obj-y += math/ crypto/

obj-$(CONFIG_GENERIC_IOMAP) += iomap.o
obj-$(CONFIG_GENERIC_PCI_IOMAP) += pci_iomap.o
-obj-$(CONFIG_HAS_IOMEM) += iomap_copy.o devres.o
+obj-y += iomap_copy.o devres.o
obj-$(CONFIG_CHECK_SIGNATURE) += check_signature.o
obj-$(CONFIG_DEBUG_LOCKING_API_SELFTESTS) += locking-selftest.o

---

If this seems to work more broadly, I may try to clean it up and post
it for broader review.

Cheers,
-- David

[PATCH] pinctrl: aspeed: Improve debug output

2020-06-30 Thread Joel Stanley

From: Andrew Jeffery 

We need to iterate over each pin in a group for a function and
disable higher priority mux configurations on the pin before finally
muxing the relevant function's signal. With the current debug output it
is hard to track what register output is relevant to which operation, so
break up the actions in the debug output by providing some more context.

Before:

[5.446656] aspeed-g6-pinctrl 1e6e2000.syscon:pinctrl: request pin 37 (B26) for 1e78.gpio:341
[5.447377] Want SCU414[0x0020]=0x1, got 0x0 from 0x
[5.447854] Want SCU4B4[0x0020]=0x1, got 0x0 from 0x
[5.448340] Want SCU4B4[0x0020]=0x1, got 0x0 from 0x

After:

[5.298053] Muxing pin 37 for GPIO
[5.298294] Disabling signal NRI4 for NRI4
[5.298593] Want SCU414[0x0020]=0x1, got 0x0 from 0x
[5.298983] Disabling signal RGMII4RXD1 for RGMII4
[5.299309] Want SCU4B4[0x0020]=0x1, got 0x0 from 0x
[5.299694] Disabling signal RMII4RXD1 for RMII4
[5.300014] Want SCU4B4[0x0020]=0x1, got 0x0 from 0x
[5.300396] Enabling signal GPIOE5 for GPIOE5
[5.300687] Muxed pin 37 as GPIOE5

Signed-off-by: Andrew Jeffery 
Signed-off-by: Joel Stanley 
---
 drivers/pinctrl/aspeed/pinctrl-aspeed.c | 25 ++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/pinctrl/aspeed/pinctrl-aspeed.c b/drivers/pinctrl/aspeed/pinctrl-aspeed.c
index b625a657171e..53f3f8aec695 100644
--- a/drivers/pinctrl/aspeed/pinctrl-aspeed.c
+++ b/drivers/pinctrl/aspeed/pinctrl-aspeed.c
@@ -76,6 +76,9 @@ static int aspeed_sig_expr_enable(struct aspeed_pinmux_data *ctx,
 {
int ret;
 
+   pr_debug("Enabling signal %s for %s\n", expr->signal,
+expr->function);
+
ret = aspeed_sig_expr_eval(ctx, expr, true);
if (ret < 0)
return ret;
@@ -91,6 +94,9 @@ static int aspeed_sig_expr_disable(struct aspeed_pinmux_data *ctx,
 {
int ret;
 
+   pr_debug("Disabling signal %s for %s\n", expr->signal,
+expr->function);
+
ret = aspeed_sig_expr_eval(ctx, expr, true);
if (ret < 0)
return ret;
@@ -229,7 +235,7 @@ int aspeed_pinmux_set_mux(struct pinctrl_dev *pctldev, unsigned int function,
const struct aspeed_sig_expr **funcs;
const struct aspeed_sig_expr ***prios;
 
-   pr_debug("Muxing pin %d for %s\n", pin, pfunc->name);
+   pr_debug("Muxing pin %s for %s\n", pdesc->name, pfunc->name);
 
if (!pdesc)
return -EINVAL;
@@ -269,6 +275,9 @@ int aspeed_pinmux_set_mux(struct pinctrl_dev *pctldev, unsigned int function,
ret = aspeed_sig_expr_enable(&pdata->pinmux, expr);
if (ret)
return ret;
+
+   pr_debug("Muxed pin %s as %s for %s\n", pdesc->name, expr->signal,
+expr->function);
}
 
return 0;
@@ -317,6 +326,8 @@ int aspeed_gpio_request_enable(struct pinctrl_dev *pctldev,
if (!prios)
return -ENXIO;
 
+   pr_debug("Muxing pin %s for GPIO\n", pdesc->name);
+
/* Disable any functions of higher priority than GPIO */
while ((funcs = *prios)) {
if (aspeed_gpio_in_exprs(funcs))
@@ -346,14 +357,22 @@ int aspeed_gpio_request_enable(struct pinctrl_dev *pctldev,
 * lowest-priority signal type. As such it has no associated
 * expression.
 */
-   if (!expr)
+   if (!expr) {
+   pr_debug("Muxed pin %s as GPIO\n", pdesc->name);
return 0;
+   }
 
/*
 * If GPIO is not the lowest priority signal type, assume there is only
 * one expression defined to enable the GPIO function
 */
-   return aspeed_sig_expr_enable(&pdata->pinmux, expr);
+   ret = aspeed_sig_expr_enable(&pdata->pinmux, expr);
+   if (ret)
+   return ret;
+
+   pr_debug("Muxed pin %s as %s\n", pdesc->name, expr->signal);
+
+   return 0;
 }
 
 int aspeed_pinctrl_probe(struct platform_device *pdev,
-- 
2.27.0
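
The sequence the commit message describes (disable every higher-priority signal on the pin, then enable the requested one) can be sketched as follows. This is a deliberately simplified model under assumptions: `struct signal`, `mux_pin()` and the one-bit-per-signal register layout are hypothetical and do not reflect the driver's signal expressions; the signal names in the usage note are borrowed from the log above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical: each signal on a pin is enabled by one bit in a register. */
struct signal {
	const char *name;
	unsigned int bit;
};

/*
 * prio[] lists the signals of one pin, highest priority first. Clear the
 * enable bit of every signal that outranks the target, then set the
 * target's bit, logging each step so the register writes can be matched
 * to an operation. Returns 0 on success, -1 if target is out of range.
 */
static int mux_pin(unsigned int *reg, const struct signal *prio, size_t n,
		   size_t target)
{
	size_t i;

	assert(reg != NULL && prio != NULL);

	if (target >= n)
		return -1;

	for (i = 0; i < target; i++) {
		printf("Disabling signal %s\n", prio[i].name);
		*reg &= ~(1u << prio[i].bit);
	}

	printf("Enabling signal %s\n", prio[target].name);
	*reg |= 1u << prio[target].bit;

	return 0;
}
```

Called with the list { NRI4, RGMII4RXD1, GPIOE5 } and target index 2, the sketch prints two "Disabling" lines followed by one "Enabling" line, which mirrors the grouping the new debug output aims for.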

Re: [regression] TCP_MD5SIG on established sockets

2020-06-30 Thread Herbert Xu

On Tue, Jun 30, 2020 at 07:30:43PM -0700, Eric Dumazet wrote:
>
> I made this clear in the changelog, do we want comments all over the places ?
> Do not get me wrong, we had this bug for years and suddenly this is a
> big deal...

I thought you were adding a new pair of smp_rmb/smp_wmb.  If they
already exist in the code then I agree it's not a big deal.  But
adding a new pair of bogus smp_Xmb's is bad for maintenance.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [PATCH 1/3] mm: set page fault address for update_mmu_cache_pmd

2020-06-30 Thread maobibo




On 06/30/2020 06:42 PM, maobibo wrote:
> 
> 
> On 06/30/2020 06:09 PM, Kirill A. Shutemov wrote:
>> On Wed, Jun 24, 2020 at 05:26:30PM +0800, Bibo Mao wrote:
>>> update_mmu_cache_pmd is used to update tlb for the pmd entry by
>>> software. On MIPS system, the tlb entry indexed by page fault
>>> address maybe exists already, only that tlb entry may be small
>>> page, also it may be huge page. Before updating pmd entry with
>>> huge page size, older tlb entry need to be invalidated.
>>>
>>> Here page fault address is passed to function update_mmu_cache_pmd,
>>> rather than pmd huge page start address. The page fault address
>>> can be used for invalidating older tlb entry.
>>>
>>> Signed-off-by: Bibo Mao 
>>> ---
>>>  arch/mips/include/asm/pgtable.h | 9 +
>>>  mm/huge_memory.c| 7 ---
>>>  mm/memory.c | 2 +-
>>>  3 files changed, 14 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
>>> index dd7a0f5..bd81661 100644
>>> --- a/arch/mips/include/asm/pgtable.h
>>> +++ b/arch/mips/include/asm/pgtable.h
>>> @@ -554,11 +554,20 @@ static inline void update_mmu_cache(struct vm_area_struct *vma,
>>>  #define__HAVE_ARCH_UPDATE_MMU_TLB
>>>  #define update_mmu_tlb update_mmu_cache
>>>  
>>> +extern void local_flush_tlb_page(struct vm_area_struct *vma,
>>> +   unsigned long page);
>>>  static inline void update_mmu_cache_pmd(struct vm_area_struct *vma,
>>> unsigned long address, pmd_t *pmdp)
>>>  {
>>> pte_t pte = *(pte_t *)pmdp;
>>>  
>>> +   /*
>>> +* If pmd_none is true, older tlb entry will be normal page.
>>> +* here to invalidate older tlb entry indexed by address
>>> +* parameter address must be page fault address rather than
>>> +* start address of pmd huge page
>>> +*/
>>> +   local_flush_tlb_page(vma, address);
>>
>> Can't say I follow what is going on.
>>
>> Why local? What happens on SMP?
>>
>> And don't you want to flush PMD_SIZE range around the address?
> There are two cases:
> 1. The address is accessed for the first time. There will be one tlb
>    entry with normal page size, and the privilege for that tlb entry is
>    none. If a new tlb entry is to be added with huge page size, the older
>    tlb entry needs to be removed. Local flushing is enough: if there are
>    smp threads running, they will take a page fault since the privilege
>    level is none. During page fault handling, the other threads will do
>    the same work: flush the local entry, then update the new entry with
>    huge page size.
> 
> 2. The address is not accessed for the first time. There exists an old
>    tlb entry with huge page size, such as in the COW scenario.
>    local_flush_tlb_page is not necessary here; the old tlb entry with
>    huge page size will be replaced with the new one in function
>    __update_tlb.
> 
> For the PMD_SIZE range around the address, there exists either one tlb
> entry with huge page size, or one tlb entry with normal page size and
> zero privilege. It is impossible for two or more tlb entries with normal
> page size to exist within the PMD_SIZE range, so we do not need to flush
> the whole pmd range; flushing one tlb entry is enough.
Sorry for the noise, please discard the patch.

Actually, two or more TLB entries with normal page size can exist within
the PMD_SIZE range. Suppose multiple threads run on UP (or on one CPU) and
access the same huge page but different normal pages. A page fault happens
on thread1, and thread1 is scheduled out during page fault handling. Then
thread2 is scheduled in and faults as well, so there will be two TLB
entries with normal page size. This problem exists even without the patch.
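
To make the address arithmetic in this thread concrete, here is a minimal user-space sketch. It assumes 4 KiB base pages and a 2 MiB PMD huge page; these shift values are illustrative only, since on MIPS they depend on the kernel configuration. The helper names (pmd_start, page_index_in_pmd, pages_per_pmd) are hypothetical and are not kernel functions. It shows that the fault address and the huge page start address generally differ, and how many base pages (and thus, in principle, normal-page TLB entries) fit inside one PMD range.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only: a real MIPS kernel derives these from its config. */
#define PAGE_SHIFT 12u
#define PMD_SHIFT  21u
#define PAGE_SIZE  (UINT64_C(1) << PAGE_SHIFT)
#define PMD_SIZE   (UINT64_C(1) << PMD_SHIFT)
#define PMD_MASK   (~(PMD_SIZE - 1))

/* Start address of the huge page that contains the faulting address. */
static uint64_t pmd_start(uint64_t fault_addr)
{
	return fault_addr & PMD_MASK;
}

/* Index of the base page, within its PMD range, that the address falls in. */
static uint64_t page_index_in_pmd(uint64_t fault_addr)
{
	return (fault_addr & ~PMD_MASK) >> PAGE_SHIFT;
}

/* Upper bound on distinct base pages inside one PMD range. */
static uint64_t pages_per_pmd(void)
{
	assert(PMD_SHIFT > PAGE_SHIFT);
	return PMD_SIZE / PAGE_SIZE;
}
```

With these assumed sizes, two threads faulting at different offsets inside the same huge page hit different base pages of the same PMD range, which is the situation described above.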


> 
> regards
> bibo,mao
> 
>>
>>> __update_tlb(vma, address, pte);
>>>  }
>>>  
>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>> index 78c84be..0f9187b 100644
>>> --- a/mm/huge_memory.c
>>> +++ b/mm/huge_memory.c
>>> @@ -780,6 +780,7 @@ static void insert_pfn_pmd(struct vm_area_struct *vma, 
>>> unsigned long addr,
>>> pgtable_t pgtable)
>>>  {
>>> struct mm_struct *mm = vma->vm_mm;
>>> +   unsigned long start = addr & PMD_MASK;
>>> pmd_t entry;
>>> spinlock_t *ptl;
>>>  
>>> @@ -792,7 +793,7 @@ static void insert_pfn_pmd(struct vm_area_struct *vma, 
>>> unsigned long addr,
>>> }
>>> entry = pmd_mkyoung(*pmd);
>>> entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
>>> -   if (pmdp_set_access_flags(vma, addr, pmd, entry, 1))
>>> +   if (pmdp_set_access_flags(vma, start, pmd, entry, 1))
>>> update_mmu_cache_pmd(vma, addr, pmd);
>>> }
>>>  
>>> @@ -813,7 +814,7 @@ static void insert_pfn_pmd(struct vm_area_struct *vma, 
>>> unsigned long addr,
>>> pgtable = NULL;
>>> }
>>>  
>>> -   set_pmd_at(mm, addr, pmd, entry);
>>> +   set_pmd_at(mm, start, pmd, entry);
>>> update_mmu_cache_pmd(vma, addr, pmd);
>>>

[PATCH] cpufreq: CPPC: fix some unreasonable codes in cppc_cpufreq_perf_to_khz()

2020-06-30 Thread Xin Hao

Signed-off-by: Xin Hao 
---
 drivers/cpufreq/cppc_cpufreq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index 257d726a4456..444ee76a6bae 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -161,7 +161,7 @@ static unsigned int cppc_cpufreq_perf_to_khz(struct 
cppc_cpudata *cpu,
if (!max_khz)
max_khz = cppc_get_dmi_max_khz();
mul = max_khz;
-   div = cpu->perf_caps.highest_perf;
+   div = caps->highest_perf;
}
return (u64)perf * mul / div;
 }
--
2.24.1
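
For readers following the arithmetic in the last line of that hunk, here is a minimal user-space sketch of the same linear scaling, assuming highest_perf maps directly to max_khz. The function name perf_to_khz_linear is hypothetical; the real driver function takes a struct cppc_cpudata and has other branches that are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Scale an abstract CPPC performance value to kHz, assuming a purely
 * linear mapping in which highest_perf corresponds to max_khz.
 */
static unsigned int perf_to_khz_linear(unsigned int perf,
				       unsigned int highest_perf,
				       unsigned int max_khz)
{
	assert(highest_perf != 0);	/* guard the division */

	/* Widen before multiplying so perf * max_khz cannot overflow 32 bits. */
	return (unsigned int)((uint64_t)perf * max_khz / highest_perf);
}
```

The 64-bit widening mirrors the (u64) cast in the quoted return statement.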

Re: [PATCH net] hinic: fix passing non negative value to ERR_PTR

2020-06-30 Thread luobin (L)

On 2020/7/1 0:20, Jakub Kicinski wrote:
> On Tue, 30 Jun 2020 14:35:54 +0800 Luo bin wrote:
>> get_dev_cap and set_resources_state functions may return a positive
>> value because of hardware failure, and the positive return value
>> can not be passed to ERR_PTR directly.
>>
>> Fixes: 7dd29ee12865 ("net-next/hinic: add sriov feature support")
>> Signed-off-by: Luo bin 
> 
> Fixes tag: Fixes: 7dd29ee12865 ("net-next/hinic: add sriov feature support")
> Has these problem(s):
>   - Subject does not match target commit subject
> Just use
>   git log -1 --format='Fixes: %h ("%s")'
> .
> 
Will fix. Thanks.
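
As background on why a positive value must not reach ERR_PTR(), here is a user-space sketch with stand-ins for the kernel helpers. The demo_ names are hypothetical, and mapping a positive hardware status to -EIO is an assumption made for illustration, not necessarily what the hinic patch does. An encoded pointer only counts as an error when it lies in the top 4095 values of the address space, which a small positive number never does.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* User-space stand-ins for the kernel's ERR_PTR() and IS_ERR() helpers. */
static void *demo_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static int demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/*
 * Hypothetical normalisation: turn any non-zero status, including a
 * positive hardware status code, into a negative errno before encoding.
 */
static void *demo_status_to_ptr(int status)
{
	assert(status != 0);	/* only failures are encoded */
	return demo_err_ptr(status < 0 ? status : -EIO);
}
```

Without the normalisation step, a caller checking with the IS_ERR() equivalent would treat the positive status as a valid pointer.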

Re: [PATCH] usb: chipidea: fix ci_irq for role-switching use-case

2020-06-30 Thread Peter Chen

On 20-06-30 11:59:49, Philippe Schenker wrote:
> On Tue, 2020-06-30 at 00:43 +, Peter Chen wrote:
> > On 20-06-29 10:04:13, Philippe Schenker wrote:
> > > On Mon, 2020-06-29 at 07:26 +, Peter Chen wrote:
> > > > On 20-06-26 13:03:11, Philippe Schenker wrote:
> > > > > If the hardware is in low-power-mode and one plugs in device or
> > > > > host
> > > > > it did not switch the mode due to the early exit out of the
> > > > > interrupt.
> > > > 
> > > > Do you mean there is no coming call for role-switch? Could you
> > > > please
> > > > share
> > > > your dts changes? Try below patch:
> > > 
> > > Here are my DTS changes:
> > > 
> > > diff --git a/arch/arm/boot/dts/imx7-colibri-eval-v3.dtsi
> > > b/arch/arm/boot/dts/imx7-colibri-eval-v3.dtsi
> > > index 97601375f2640..c424f707a1afa 100644
> > > --- a/arch/arm/boot/dts/imx7-colibri-eval-v3.dtsi
> > > +++ b/arch/arm/boot/dts/imx7-colibri-eval-v3.dtsi
> > > @@ -13,6 +13,13 @@
> > > stdout-path = "serial0:115200n8";
> > > };
> > >  
> > > +   extcon_usbc_det: usbc_det {
> > > +   compatible = "linux,extcon-usb-gpio";
> > > +   id-gpio = < 14 GPIO_ACTIVE_HIGH>;
> > > +   pinctrl-names = "default";
> > > +   pinctrl-0 = <_usbc_det>;
> > > +   };
> > > +
> > > /* fixed crystal dedicated to mpc258x */
> > > clk16m: clk16m {
> > > compatible = "fixed-clock";
> > > @@ -174,6 +181,7 @@
> > >  };
> > >  
> > >   {
> > > +   extcon = <_usbc_det>, <_usbc_det>;
> > 
> > If you have only ID extcon, but no VBUS extcon, you only need to
> > add only phandle, see dt-binding for detail please.
> 
> You were right again! Thanks, this actually solves the RNDIS issue for
> our colibri-imx7 board:
> 
> +   extcon = <0>, <_usbc_det>;
> 
> However, on this iMX7 board we have VBUS hooked up to the SoC, that's
> why it works only with ID.
> 
> On Colibri-iMX6ULL we do not have VBUS hooked up.

So, there are no events at all for connecting to a host? If so, the
workaround for this board is to disable runtime PM. And you only need to
write one extcon phandle, for ID, since you have an external event for ID
but no event for VBUS. The ID event is not the same as the VBUS event: for
example, if you plug the cable into a host, you will not get an ID event;
you will only get a VBUS event, and only if something (e.g. a GPIO) reports it.

Peter

> So device/host
> switching works only with 'extcon = <_usbc_det>,
> <_usbc_det>;' but then RNDIS and also a normal thumb-drive does
> not work. How could I work around this fact? A dummy-gpio that would
> always read "high" for vbus would be a solution for me.
> 
> Philippe
> 

-- 

Thanks,
Peter Chen

Re: [PATCH 4/4] iommu/vt-d: Add page response ops support

2020-06-30 Thread Lu Baolu


Hi Kevin,

On 6/30/20 2:19 PM, Tian, Kevin wrote:

From: Lu Baolu 
Sent: Sunday, June 28, 2020 8:34 AM

After a page request is handled, software must respond to the device which
raised the page request with the handling result. This is done through
the iommu ops.page_response if the request was reported to outside of
vendor iommu driver through iommu_report_device_fault(). This adds the
VT-d implementation of page_response ops.

Co-developed-by: Jacob Pan 
Signed-off-by: Jacob Pan 
Co-developed-by: Liu Yi L 
Signed-off-by: Liu Yi L 
Signed-off-by: Lu Baolu 
---
  drivers/iommu/intel/iommu.c |  1 +
  drivers/iommu/intel/svm.c   | 73
+
  include/linux/intel-iommu.h |  3 ++
  3 files changed, 77 insertions(+)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index de17952ed133..7eb29167e8f9 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -6057,6 +6057,7 @@ const struct iommu_ops intel_iommu_ops = {
.sva_bind   = intel_svm_bind,
.sva_unbind = intel_svm_unbind,
.sva_get_pasid  = intel_svm_get_pasid,
+   .page_response  = intel_svm_page_response,
  #endif
  };

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 4800bb6f8794..003ea9579632 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -1092,3 +1092,76 @@ int intel_svm_get_pasid(struct iommu_sva *sva)

return pasid;
  }
+
+int intel_svm_page_response(struct device *dev,
+   struct iommu_fault_event *evt,
+   struct iommu_page_response *msg)
+{
+   struct iommu_fault_page_request *prm;
+   struct intel_svm_dev *sdev;
+   struct intel_iommu *iommu;
+   struct intel_svm *svm;
+   bool private_present;
+   bool pasid_present;
+   bool last_page;
+   u8 bus, devfn;
+   int ret = 0;
+   u16 sid;
+
+   if (!dev || !dev_is_pci(dev))
+   return -ENODEV;
+
+   iommu = device_to_iommu(dev, &bus, &devfn);
+   if (!iommu)
+   return -ENODEV;


move to the place when iommu is referenced. This place is too early.


I took this as a sanity check. If the device has no iommu backed, we
should consider it as an invalid input.




+
+   if (!msg || !evt)
+   return -EINVAL;
+
+   mutex_lock(_mutex);
+
+   prm = &evt->fault.prm;
+   sid = PCI_DEVID(bus, devfn);
+   pasid_present = prm->flags &
IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
+   private_present = prm->flags &
IOMMU_FAULT_PAGE_REQUEST_PRIV_DATA;
+   last_page = prm->flags &
IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE;
+
+   if (pasid_present) {
+   /* VT-d supports devices with full 20 bit PASIDs only */
+   if (pci_max_pasids(to_pci_dev(dev)) != PASID_MAX) {
+   ret = -EINVAL;
+   goto out;
+   }


shouldn't we check prm->pasid here? Above is more reasonable to be
checked when page request is reported.


Yes. I will check the pasid in both places.




+
+   ret = pasid_to_svm_sdev(dev, prm->pasid, , );
+   if (ret || !sdev)


if sdev==NULL, suppose an error (-ENODEV) should be returned here?


Yes. Good catch. I should return an error if sdev==NULL.




+   goto out;
+   }
+
+   /*
+* Per VT-d spec. v3.0 ch7.7, system software must respond
+* with page group response if private data is present (PDP)
+* or last page in group (LPIG) bit is set. This is an
+* additional VT-d feature beyond PCI ATS spec.


feature->requirement


Agreed.



Thanks
Kevin


Best regards,
baolu




+*/
+   if (last_page || private_present) {
+   struct qi_desc desc;
+
+   desc.qw0 = QI_PGRP_PASID(prm->pasid) | QI_PGRP_DID(sid)
|
+   QI_PGRP_PASID_P(pasid_present) |
+   QI_PGRP_PDP(private_present) |
+   QI_PGRP_RESP_CODE(msg->code) |
+   QI_PGRP_RESP_TYPE;
+   desc.qw1 = QI_PGRP_IDX(prm->grpid) |
QI_PGRP_LPIG(last_page);
+   desc.qw2 = 0;
+   desc.qw3 = 0;
+   if (private_present)
+   memcpy(, prm->private_data,
+  sizeof(prm->private_data));
+
+   qi_submit_sync(iommu, &desc, 1, 0);
+   }
+out:
+   mutex_unlock(_mutex);
+   return ret;
+}
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index fc2cfc3db6e1..bf6009a344f5 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -741,6 +741,9 @@ struct iommu_sva *intel_svm_bind(struct device
*dev, struct mm_struct *mm,
 void *drvdata);
  void intel_svm_unbind(struct iommu_sva *handle);
  int intel_svm_get_pasid(struct iommu_sva *handle);
+int

Re: [PATCH] mm/vmscan: restore zone_reclaim_mode ABI

2020-06-30 Thread Andrew Morton

On Mon, 29 Jun 2020 16:37:37 -0700 Dave Hansen  wrote:

> On 6/29/20 4:30 PM, Baoquan He wrote:
> >> The only way I can plausibly think of "cleaning up" the RECLAIM_ZONE bit
> >> would be to raise our confidence that it is truly unused.  That takes
> >> time, and probably a warning if we see it being set.  If we don't run
> >> into anybody setting it or depending on it being set in a few years, we
> >> can remove it.
> > So adding the old bit back for compatibility looks good, thanks.
> > 
> > Then we have to be very careful when adding and reviewing new
> > interface introducing, should not leave one which might be used
> > in the future.
> > 
> > In fact, RECLAIM_ZONE is not completely useless. At least, when the old
> > bit 0 is set, it may enter into node_reclaim() in get_page_from_freelist(),
> > that makes it like a switch.
> > 
> > get_page_from_freelist {
> > 
> > ...
> > if (node_reclaim_mode == 0 ||
> >     !zone_allows_reclaim(ac->preferred_zoneref->zone, zone))
> > continue;
> > ...
> > }
> 
> Oh, that's a very good point.  There are a couple of those around.  Let
> me circle back and update the documentation and the variable name.  I'll
> send out another version.

Was the omission of cc:stable deliberate?
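
The "switch" behaviour Baoquan describes can be sketched in user space as follows. The bit values match the documented zone_reclaim_mode ABI (1, 2 and 4); the demo_ helpers are hypothetical and only illustrate the point that any non-zero mode, including bit 0 alone, passes the node_reclaim_mode == 0 check quoted above.

```c
#include <assert.h>
#include <stdbool.h>

#define RECLAIM_ZONE  (1u << 0)	/* the restored ABI bit */
#define RECLAIM_WRITE (1u << 1)	/* write out dirty pages during reclaim */
#define RECLAIM_UNMAP (1u << 2)	/* unmap (swap) pages during reclaim */
#define RECLAIM_ALL   (RECLAIM_ZONE | RECLAIM_WRITE | RECLAIM_UNMAP)

/* Reject values with bits outside the documented set. */
static unsigned int demo_checked_mode(unsigned int mode)
{
	assert((mode & ~RECLAIM_ALL) == 0);
	return mode;
}

/* Mirrors the "mode == 0 means skip reclaim" test in the quoted snippet. */
static bool demo_reclaim_enabled(unsigned int mode)
{
	return demo_checked_mode(mode) != 0;
}

static bool demo_may_writepage(unsigned int mode)
{
	return (demo_checked_mode(mode) & RECLAIM_WRITE) != 0;
}

static bool demo_may_unmap(unsigned int mode)
{
	return (demo_checked_mode(mode) & RECLAIM_UNMAP) != 0;
}
```

Setting only RECLAIM_ZONE therefore enables reclaim without enabling writeback or unmapping, which is why existing users of value 1 would notice if the bit went away.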

Re: [PATCH v9 0/8] KVM: Add virtualization support of split lock detection

2020-06-30 Thread Xiaoyao Li


Ping for comments.

On 5/9/2020 7:05 PM, Xiaoyao Li wrote:

This series aims to add the virtualization of split lock detection in
KVM.

Due to the fact that split lock detection is tightly coupled with CPU
model and CPU model is configurable by host VMM, we elect to use
paravirt method to expose and enumerate it for guest.

Changes in v9
  - rebase to v5.7-rc4
  - Add one patch to rename TIF_SLD to TIF_SLD_DISABLED;
  - Add one patch to remove bogus case in handle_guest_split_lock;
  - Introduce flag X86_FEATURE_SPLIT_LOCK_DETECT_FATAL and thus drop
sld_state;
  - Use X86_FEATURE_SPLIT_LOCK_DETECT and X86_FEATURE_SPLIT_LOCK_DETECT_FATAL
to determine the SLD state of host;
  - Introduce split_lock_virt_switch() and two wrappers for KVM instead
of sld_update_to();
  - Use paravirt to expose and enumerate split lock detection for guest;
  - Split lock detection can be exposed to guest when host is sld_fatal,
even though host is SMT available.

Changes in v8:
https://lkml.kernel.org/r/20200414063129.133630-1-xiaoyao...@intel.com
  - rebase to v5.7-rc1.
  - basic enabling of split lock detection already merged.
  - When host is sld_warn and nosmt, load guest's sld bit when in KVM
context, i.e., between vmx_prepare_switch_to_guest() and before
vmx_prepare_switch_to_host(), KVM uses guest sld setting.

Changes in v7:
https://lkml.kernel.org/r/20200325030924.132881-1-xiaoyao...@intel.com
  - only pick patch 1 and patch 2, and hold all the left.
  - Update SLD bit on each processor based on sld_state.

Changes in v6:
https://lkml.kernel.org/r/20200324151859.31068-1-xiaoyao...@intel.com
  - Drop the sld_not_exist flag and use X86_FEATURE_SPLIT_LOCK_DETECT to
check whether need to init split lock detection. [tglx]
  - Use tglx's method to verify the existence of split lock detection.
  - small optimization of sld_update_msr() that the default value of
msr_test_ctrl_cache has split_lock_detect bit cleared.
  - Drop the patch3 in v5 that introducing kvm_only option. [tglx]
  - Rebase patch4-8 to kvm/queue.
  - use the new kvm-cpu-cap to expose X86_FEATURE_CORE_CAPABILITIES in
Patch 6.

Changes in v5:
https://lkml.kernel.org/r/20200315050517.127446-1-xiaoyao...@intel.com
  - Use X86_FEATURE_SPLIT_LOCK_DETECT flag in kvm to ensure split lock
detection is really supported.
  - Add and export sld related helper functions in their related usecase
kvm patches.

Xiaoyao Li (8):
   x86/split_lock: Rename TIF_SLD to TIF_SLD_DISABLED
   x86/split_lock: Remove bogus case in handle_guest_split_lock()
   x86/split_lock: Introduce flag X86_FEATURE_SLD_FATAL and drop
 sld_state
   x86/split_lock: Introduce split_lock_virt_switch() and two wrappers
   x86/kvm: Introduce paravirt split lock detection enumeration
   KVM: VMX: Enable MSR TEST_CTRL for guest
   KVM: VMX: virtualize split lock detection
   x86/split_lock: Enable split lock detection initialization when
 running as an guest on KVM

  Documentation/virt/kvm/cpuid.rst | 29 +++
  arch/x86/include/asm/cpu.h   | 35 ++
  arch/x86/include/asm/cpufeatures.h   |  1 +
  arch/x86/include/asm/thread_info.h   |  6 +--
  arch/x86/include/uapi/asm/kvm_para.h |  8 ++--
  arch/x86/kernel/cpu/intel.c  | 59 ---
  arch/x86/kernel/kvm.c|  3 ++
  arch/x86/kernel/process.c|  2 +-
  arch/x86/kvm/cpuid.c |  6 +++
  arch/x86/kvm/vmx/vmx.c   | 72 +---
  arch/x86/kvm/vmx/vmx.h   |  3 ++
  arch/x86/kvm/x86.c   |  6 ++-
  arch/x86/kvm/x86.h   |  7 +++
  13 files changed, 196 insertions(+), 41 deletions(-)

How do you investigate the cause of total hang? Is there some information that I should pay attention to in order to get some hint?

2020-06-30 Thread 孙世龙 sunshilong

Hi, list

My x86 machine (Linux 4.19) sometimes hangs, suddenly not responding in
any way to the mouse or the keyboard.

How can I investigate why it hung? Is there extra information I can
look at for a clue? Is there anything less drastic than powering off
that would get some kind of reaction, even just a limited shell or
beeps, and might give a clue?

Thank you for your attention to this matter.

Re: [RFC PATCH 0/6] Support raw event and DT for perf on RISC-V

2020-06-30 Thread Alan Kao

Tue, Jun 30, 2020 at 06:02:43PM -0700, Atish Patra wrote:
> On Tue, Jun 30, 2020 at 5:52 PM Alan Kao  wrote:
> >
> > On Mon, Jun 29, 2020 at 11:19:09AM +0800, Zong Li wrote:
> > > This patch set adds raw event support on RISC-V. In addition, we
> > > introduce the DT mechanism to make our perf more generic and common.
> > >
> > > Currently, we set the hardware events by writing the mhpmeventN CSRs, it
> > > would raise an illegal instruction exception and trap into m-mode to
> > > emulate event selector CSRs access. It doesn't make sense because we
> > > shouldn't write the m-mode CSRs in s-mode. Ideally, we should set event
> > > selector through standard SBI call or the shadow CSRs of s-mode. We have
> > > prepared a proposal of a new SBI extension, called "PMU SBI extension",
> > > but we also discussing the feasibility of accessing these PMU CSRs on
> > > s-mode at the same time, such as delegation mechanism, so I was
> > > wondering if we could use SBI calls first and make the PMU SBI extension
> > > as legacy when s-mode access mechanism is accepted by Foundation? or
> > > keep the current situation to see what would happen in the future.
> > >
> > > This patch set also introduces the DT mechanism, we don't want to add too
> > > much platform-dependency code in perf like other architectures, so we
> > > put the mapping of generic hardware events to DT, then we can easy to
> > > transfer generic hardware events to vendor's own hardware events without
> > > any platform-dependency stuff in our perf.
> > >
> > > Zong Li (6):
> > >   dt-bindings: riscv: Add YAML documentation for PMU
> > >   riscv: dts: sifive: Add DT support for PMU
> > >   riscv: add definition of hpmcounter CSRs
> > >   riscv: perf: Add raw event support
> > >   riscv: perf: introduce DT mechanism
> > >   riscv: remove PMU menu of Kconfig
> > >
> >
> > DT-based PMU registration looks good to me. Together with Anup's feedback,
> > we can anticipate that the following items will be:
> >
> > - rewrite RISC-V PMU to a platform driver
> > - propose SBI PMU extension
> > - fixes: RV32 counter access, namings, etc.
> >
> > Yes, all are good directions towards better counting (`perf stat`) function.
> > But as the original author of RISC-V perf port, please allow me to address
> > the fundamental problems of RISC-V perf, again [0][1][2][3], that the 
> > sampling
> > (`perf record`) function never earned enough respect.  Counting gives you a
> > shallow view regarding an application, while sampling demystifies one for 
> > you.
> >
> > The problems are three-fold
> > (1) Interrupt
> > Sampling in perf requires that a HPM raises an interrupt when it overflows.
> > Making RISC-V perf platform driver or not has nothing to do with this.  This
> > requires more discussions in TGs.
> > (2) S-mode access to PMU CSRs
> > This is also addressed in this patch set but to me, it is kind of like a
> > SBI-solves-them-all mindset to me.  Perf event is for performance monitoring
> > thus we should eliminate any possible overhead if we can.  Setting event 
> > masks
> > through SBI calls for counting maybe OK, but if we really take sampling and
> > interrupt handling into consideration, it is questionable if it is still a
> > viable way.
> > (3) Registers, registers, registers
> > There is just no enough CSR/function for perf sampling. The previous 
> > proposal
> > explains why [2].
> >
> > Perf sampling is off-topic but somehow related, so I bring it up here just
> > for your information.
> >
> > As this patch set goes v2, the PMU porting guide in [0] should be removed 
> > since
> > it contains no useful information anymore.
> >
> > [0] Documentation/riscv/pmu.rst
> > [1] https://www.youtube.com/watch?v=Onvlcl4e2IU
> > [2] https://github.com/riscv/riscv-isa-manual/issues/402
> > This proposal has been posted in Privileged Spec Task Group, in
> > https://lists.riscv.org/g/tech-privileged-archive/message/488?p=,,,20,0,0,0::Created,,Proposal,20,2,40,32306071
> > but never receive any feedback.
> > [3] https://lists.riscv.org/g/tech-unixplatformspec/message/84
> > I intended to discuss [2] in the Unixplatform Spec Task Group at the
> > online meeting, but obviously people were too busy knowing who the new
> > RISC-V CTO is and what he has done to even follow the agenda.
> >
> 
> Sorry. The last meeting's agenda was derailed for numerous reasons.
> Are you okay with discussing this during the next meeting ?
> I have not scheduled one yet but will probably schedule it on next
> Wednesday (8th July) if there is no objection.
> I can check with Anup if he can present the SBI PMU extension as well.

Thanks for the opportunity.
But I don't think there is enough time to cover every important topic.
What I provided in the previous citation [2] is a proposal, which needs
experts to judge and critique it after a thorough reading.

The TG Chair should decide the priority of the items.  If there is any chance
for our proposal, I can give brief

Re: [PATCH] cpuidle: change enter_s2idle() prototype

2020-06-30 Thread Neal Liu

On Mon, 2020-06-29 at 17:17 +0200, Rafael J. Wysocki wrote:
> On Monday, June 29, 2020 11:05:40 AM CEST Neal Liu wrote:
> > Control Flow Integrity(CFI) is a security mechanism that disallows
> > changes to the original control flow graph of a compiled binary,
> > making it significantly harder to perform such attacks.
> > 
> > init_state_node() assigns same function pointer to idle_state->enter
> > and idle_state->enter_s2idle. This definitely causes CFI failure
> > when calling either enter() or enter_s2idle().
> > 
> > Align enter_s2idle() with enter() function prototype to fix CFI
> > failure.
> 
> That needs to be documented somewhere close to the definition of the
> callbacks in question.
> 
> Otherwise it is completely unclear why this is a good idea.
> 

The problem is that init_state_node() assigns the same callback to
function pointers with different declarations.

static int init_state_node(struct cpuidle_state *idle_state,
   const struct of_device_id *matches,
   struct device_node *state_node)
{
...
idle_state->enter = match_id->data;
...
idle_state->enter_s2idle = match_id->data;
}

Function declarations:

struct cpuidle_state {
...
int (*enter)(struct cpuidle_device *dev,
struct cpuidle_driver *drv,
int index);

void (*enter_s2idle) (struct cpuidle_device *dev,
  struct cpuidle_driver *drv,
  int index);
};

In this case, calling either enter() or enter_s2idle() would fail the CFI
check, since both point to the same callee.

We align enter_s2idle() with the prototype of enter(), since enter() needs
its return value for some use cases. The return value of enter_s2idle() is
currently not needed.
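
The shape of the fix can be sketched in plain C. This is an illustration under assumptions: all demo_ types and functions are hypothetical stand-ins for the cpuidle structures, and the sketch only shows the type relationship, not an actual CFI check. Once both members share one function pointer type, a single callback can be stored in both and called through either without a prototype mismatch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct cpuidle_device / struct cpuidle_driver. */
struct demo_device { int cpu; };
struct demo_driver { int state_count; };

typedef int (*demo_enter_fn)(struct demo_device *dev,
			     struct demo_driver *drv, int index);

struct demo_state {
	demo_enter_fn enter;
	demo_enter_fn enter_s2idle;	/* same prototype as enter */
};

/* One callback that is valid for both members. */
static int demo_enter(struct demo_device *dev, struct demo_driver *drv,
		      int index)
{
	assert(dev != NULL && drv != NULL);
	assert(index >= 0 && index < drv->state_count);
	return index;
}

/* Mirrors the double assignment quoted from init_state_node(). */
static void demo_init_state(struct demo_state *state, demo_enter_fn fn)
{
	state->enter = fn;
	state->enter_s2idle = fn;
}
```

If enter_s2idle were still declared as returning void, the second assignment would need a cast, and an indirect call through it is the kind of mismatch a CFI scheme that checks function types would flag.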


> > Signed-off-by: Neal Liu 
> > ---
> >  drivers/acpi/processor_idle.c   |6 --
> >  drivers/cpuidle/cpuidle-tegra.c |8 +---
> >  drivers/idle/intel_idle.c   |6 --
> >  include/linux/cpuidle.h |6 +++---
> >  4 files changed, 16 insertions(+), 10 deletions(-)
> > 
> > diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> > index 75534c5..6ffb6c9 100644
> > --- a/drivers/acpi/processor_idle.c
> > +++ b/drivers/acpi/processor_idle.c
> > @@ -655,8 +655,8 @@ static int acpi_idle_enter(struct cpuidle_device *dev,
> > return index;
> >  }
> >  
> > -static void acpi_idle_enter_s2idle(struct cpuidle_device *dev,
> > -  struct cpuidle_driver *drv, int index)
> > +static int acpi_idle_enter_s2idle(struct cpuidle_device *dev,
> > + struct cpuidle_driver *drv, int index)
> >  {
> > struct acpi_processor_cx *cx = per_cpu(acpi_cstate[index], dev->cpu);
> >  
> > @@ -674,6 +674,8 @@ static void acpi_idle_enter_s2idle(struct 
> > cpuidle_device *dev,
> > }
> > }
> > acpi_idle_do_entry(cx);
> > +
> > +   return 0;
> >  }
> >  
> >  static int acpi_processor_setup_cpuidle_cx(struct acpi_processor *pr,
> > diff --git a/drivers/cpuidle/cpuidle-tegra.c 
> > b/drivers/cpuidle/cpuidle-tegra.c
> > index 1500458..a12fb14 100644
> > --- a/drivers/cpuidle/cpuidle-tegra.c
> > +++ b/drivers/cpuidle/cpuidle-tegra.c
> > @@ -253,11 +253,13 @@ static int tegra_cpuidle_enter(struct cpuidle_device 
> > *dev,
> > return err ? -1 : index;
> >  }
> >  
> > -static void tegra114_enter_s2idle(struct cpuidle_device *dev,
> > - struct cpuidle_driver *drv,
> > - int index)
> > +static int tegra114_enter_s2idle(struct cpuidle_device *dev,
> > +struct cpuidle_driver *drv,
> > +int index)
> >  {
> > tegra_cpuidle_enter(dev, drv, index);
> > +
> > +   return 0;
> >  }
> >  
> >  /*
> > diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
> > index f449584..b178da3 100644
> > --- a/drivers/idle/intel_idle.c
> > +++ b/drivers/idle/intel_idle.c
> > @@ -175,13 +175,15 @@ static __cpuidle int intel_idle(struct cpuidle_device 
> > *dev,
> >   * Invoked as a suspend-to-idle callback routine with frozen user space, 
> > frozen
> >   * scheduler tick and suspended scheduler clock on the target CPU.
> >   */
> > -static __cpuidle void intel_idle_s2idle(struct cpuidle_device *dev,
> > -   struct cpuidle_driver *drv, int index)
> > +static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev,
> > +  struct cpuidle_driver *drv, int index)
> >  {
> > unsigned long eax = flg2MWAIT(drv->states[index].flags);
> > unsigned long ecx = 1; /* break on interrupt flag */
> >  
> > mwait_idle_with_hints(eax, ecx);
> > +
> > +   return 0;
> >  }
> >  
> >  /*
> > diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
> > index ec2ef63..bee10c0 100644
> > --- a/include/linux/cpuidle.h
> > +++ b/include/linux/cpuidle.h
> > @@ -66,9 +66,9 @@ struct cpuidle_state {

Re: [regression] TCP_MD5SIG on established sockets

2020-06-30 Thread Joe Perches

On Tue, 2020-06-30 at 19:30 -0700, Eric Dumazet wrote:
> On Tue, Jun 30, 2020 at 7:23 PM Herbert Xu  
> wrote:
> > On Tue, Jun 30, 2020 at 07:17:46PM -0700, Eric Dumazet wrote:
> > > The main issue of the prior code was the double read of key->keylen in
> > > tcp_md5_hash_key(), not that few bytes could change under us.
> > > 
> > > I used smp_rmb() to ease backports, since old kernels had no
> > > READ_ONCE()/WRITE_ONCE(), but ACCESS_ONCE() instead.
> > 
> > If it's the double-read that you're protecting against, you should
> > just use barrier() and the comment should say so too.
> 
> I made this clear in the changelog, do we want comments all over the places ?

Having to run git for every line of code isn't great.

Comments in code are better than comments in changelogs.
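
For context, the double-read problem Eric mentions can be sketched in user space like this. It is a single-threaded illustration under assumptions: the demo_ names and the 80-byte bound are hypothetical, and the volatile cast merely stands in for READ_ONCE(); no memory barrier is modelled. The point is that the length is loaded exactly once, so the bounds check and the copy use the same value even if another writer changes keylen in between.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_KEYLEN 80

struct demo_md5_key {
	unsigned char keylen;
	unsigned char key[DEMO_MAX_KEYLEN];
};

/*
 * Copy the key using a single snapshot of keylen. Reading k->keylen
 * twice (once for the check, once for the copy) is the pattern that
 * this avoids.
 */
static size_t demo_copy_key(const struct demo_md5_key *k,
			    unsigned char *out, size_t out_size)
{
	/* The volatile access stands in for READ_ONCE(k->keylen). */
	size_t len = *(const volatile unsigned char *)&k->keylen;

	assert(out_size >= DEMO_MAX_KEYLEN);
	if (len > DEMO_MAX_KEYLEN)
		len = DEMO_MAX_KEYLEN;
	memcpy(out, k->key, len);
	return len;
}
```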

Re: [PATCH v12 00/11] Guest Last Branch Recording Enabling

2020-06-30 Thread Like Xu


Friendly ping.

If there is room for improvement, please let me know.

On 2020/6/23 21:13, Like Xu wrote:

On 2020/6/13 16:09, Like Xu wrote:

Hi all,

Please help review this new version for the Kernel 5.9 release.

Now, you may apply the last two qemu-devel patches to the upstream
qemu and try the guest LBR feature with '-cpu host' command line.

v11->v12 Changelog:
- apply "Signed-off-by" from PeterZ and his code for the perf subsystem;
- add validity checks before exposing LBR via MSR_IA32_PERF_CAPABILITIES;
- refactor MSR_IA32_DEBUGCTLMSR emulation with validity check;
- reorder "perf_event_attr" fields according to how they're declared;
- replace event_is_oncpu() with "event->state" check;
- make LBR emulation specific to vmx rather than x86 generic;
- move pass-through LBR code to vmx.c instead of pmu_intel.c;
- add vmx_lbr_en/disable_passthrough layer to make code readable;
- rewrite pmu availability check with vmx_passthrough_lbr_msrs();

You may check more details in each commit.

Previous:
https://lore.kernel.org/kvm/20200514083054.62538-1-like...@linux.intel.com/

---

...


Wei Wang (1):
  perf/x86: Fix variable types for LBR registers

Like Xu (10):
   perf/x86/core: Refactor hw->idx checks and cleanup
   perf/x86/lbr: Add interface to get LBR information
   perf/x86: Add constraint to create guest LBR event without hw counter
   perf/x86: Keep LBR records unchanged in host context for guest usage


Hi Peter,
Would you like to add "Acked-by" to the first three perf patches ?


   KVM: vmx/pmu: Expose LBR to guest via MSR_IA32_PERF_CAPABILITIES
   KVM: vmx/pmu: Unmask LBR fields in the MSR_IA32_DEBUGCTLMSR emualtion
   KVM: vmx/pmu: Pass-through LBR msrs when guest LBR event is scheduled
   KVM: vmx/pmu: Emulate legacy freezing LBRs on virtual PMI
   KVM: vmx/pmu: Reduce the overhead of LBR pass-through or cancellation
   KVM: vmx/pmu: Release guest LBR event via lazy release mechanism



Hi Paolo,
Would you like to take a moment to review the KVM part for this feature ?

Thanks,
Like Xu



Qemu-devel:
   target/i386: add -cpu,lbr=true support to enable guest LBR

  arch/x86/events/core.c    |  26 +--
  arch/x86/events/intel/core.c  | 109 -
  arch/x86/events/intel/lbr.c   |  51 +-
  arch/x86/events/perf_event.h  |   8 +-
  arch/x86/include/asm/perf_event.h |  34 +++-
  arch/x86/kvm/pmu.c    |  12 +-
  arch/x86/kvm/pmu.h    |   5 +
  arch/x86/kvm/vmx/capabilities.h   |  23 ++-
  arch/x86/kvm/vmx/pmu_intel.c  | 253 +-
  arch/x86/kvm/vmx/vmx.c    |  86 +-
  arch/x86/kvm/vmx/vmx.h    |  17 ++
  arch/x86/kvm/x86.c    |  13 --
  12 files changed, 559 insertions(+), 78 deletions(-)

[PATCH v2] 9p: retrieve fid from file when file instance exist.

2020-06-30 Thread Jianyong Wu

In the current setattr implementation in 9p, the fid is always retrieved
from the dentry, whether or not a file instance exists. Information tied
to the opened file instance may be dropped as a result, so it is better
to retrieve the fid from the file instance when one is passed to setattr.

For example:
fd=open("tmp", O_RDWR);
ftruncate(fd, 10);

The file context associated with fd is lost because the fid is always
retrieved from the dentry, so the backend cannot get the file context.
This goes against the user's original intention and may lead to bugs.

Signed-off-by: Jianyong Wu 
---
 fs/9p/vfs_inode.c  | 6 +-
 fs/9p/vfs_inode_dotl.c | 6 +-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index c9255d399917..b33574d347fa 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -1100,7 +1100,11 @@ static int v9fs_vfs_setattr(struct dentry *dentry, 
struct iattr *iattr)
 
retval = -EPERM;
v9ses = v9fs_dentry2v9ses(dentry);
-   fid = v9fs_fid_lookup(dentry);
+   if (iattr->ia_valid & ATTR_FILE) {
+   fid = iattr->ia_file->private_data;
+   WARN_ON(!fid);
+   } else
+   fid = v9fs_fid_lookup(dentry);
if(IS_ERR(fid))
return PTR_ERR(fid);
 
diff --git a/fs/9p/vfs_inode_dotl.c b/fs/9p/vfs_inode_dotl.c
index 60328b21c5fb..ae714f1a630f 100644
--- a/fs/9p/vfs_inode_dotl.c
+++ b/fs/9p/vfs_inode_dotl.c
@@ -560,7 +560,11 @@ int v9fs_vfs_setattr_dotl(struct dentry *dentry, struct 
iattr *iattr)
p9attr.mtime_sec = iattr->ia_mtime.tv_sec;
p9attr.mtime_nsec = iattr->ia_mtime.tv_nsec;
 
-   fid = v9fs_fid_lookup(dentry);
+   if (iattr->ia_valid & ATTR_FILE) {
+   fid = iattr->ia_file->private_data;
+   WARN_ON(!fid);
+   } else
+   fid = v9fs_fid_lookup(dentry);
if (IS_ERR(fid))
return PTR_ERR(fid);
 
-- 
2.17.1

Re: [PATCH] lib: Extend kstrtobool() to accept "true"/"false"

2020-06-30 Thread Andrew Morton

On Mon, 29 Jun 2020 14:09:38 +0200 Pavel Machek  wrote:

> > Extend the strings recognised by kstrtobool() to cover:
> > 
> >   - 1/0
> >   - y/n
> >   - yes/no  (new)
> >   - t/f (new)
> >   - true/false  (new)
> >   - on/off
> 
> Is it a good idea to add more values there? It is easy to do, but... we
> don't want people to use this by hand, and ideally everyone would just
> use 1/0...
> 
> I also see potential for confusion... as in echo off > enable_off_mode
> (ok, this is with existing code, but...)
> 
> Plus, if programs learn to do "echo true > ..." they will stop working
> on older kernels.

I'm inclined to agree with this. It is indeed an invitation to write
non-back-compatible userspace and it simply makes the kernel interface
more complex.
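
For readers following the thread, the set of strings under discussion can be illustrated with a userspace sketch. demo_parse_bool() is a hypothetical name and is not the kernel's kstrtobool(); the strict, case-insensitive whole-string matching is an assumption made for the example, and a trailing newline (as written by echo) is not handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>

/*
 * Userspace sketch only: accept exactly the strings listed in the
 * proposal. Returns 0 and stores the value in *res, or -1 if the input
 * is not recognised.
 */
static int demo_parse_bool(const char *s, bool *res)
{
	static const char *const truthy[] = { "1", "y", "yes", "t", "true", "on" };
	static const char *const falsy[] = { "0", "n", "no", "f", "false", "off" };
	size_t i;

	if (!s || !res)
		return -1;

	for (i = 0; i < sizeof(truthy) / sizeof(truthy[0]); i++) {
		if (strcasecmp(s, truthy[i]) == 0) {
			*res = true;
			return 0;
		}
	}

	for (i = 0; i < sizeof(falsy) / sizeof(falsy[0]); i++) {
		if (strcasecmp(s, falsy[i]) == 0) {
			*res = false;
			return 0;
		}
	}

	return -1;
}
```

The compatibility concern raised above shows up directly here: a program that writes "true" only works where the longer list is accepted, while "1" works with either list.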

Re: [PATCH] mm/sparse: only sub-section aligned range would be populated

2020-06-30 Thread Wei Yang

On Tue, Jun 30, 2020 at 02:52:35PM +0200, David Hildenbrand wrote:
>On 30.06.20 04:14, Wei Yang wrote:
>> There are two code paths that invoke __populate_section_memmap():
>> 
>>   * sparse_init_nid()
>>   * sparse_add_section()
>> 
>> In both cases, we are sure the memory range is sub-section aligned:
>> 
>>   * we pass PAGES_PER_SECTION to sparse_init_nid()
>>   * we check the range with check_pfn_span() before calling
>> sparse_add_section()
>> 
>> Also, in the counterpart of __populate_section_memmap() we do no such
>> calculation or check, since the range is checked by check_pfn_span() in
>> __remove_pages().
>> 
>> Remove the calculation and check to keep it simple and consistent with
>> its counterpart.
>> 
>> Signed-off-by: Wei Yang 
>> ---
>>  mm/sparse-vmemmap.c | 16 ++--
>>  1 file changed, 2 insertions(+), 14 deletions(-)
>> 
>> diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
>> index 0db7738d76e9..24b01ebae111 100644
>> --- a/mm/sparse-vmemmap.c
>> +++ b/mm/sparse-vmemmap.c
>> @@ -247,20 +247,8 @@ int __meminit vmemmap_populate_basepages(unsigned long start,
>>  struct page * __meminit __populate_section_memmap(unsigned long pfn,
>>  unsigned long nr_pages, int nid, struct vmem_altmap *altmap)
>>  {
>> -unsigned long start;
>> -unsigned long end;
>> -
>> -/*
>> - * The minimum granularity of memmap extensions is
>> - * PAGES_PER_SUBSECTION as allocations are tracked in the
>> - * 'subsection_map' bitmap of the section.
>> - */
>> -end = ALIGN(pfn + nr_pages, PAGES_PER_SUBSECTION);
>> -pfn &= PAGE_SUBSECTION_MASK;
>> -nr_pages = end - pfn;
>> -
>> -start = (unsigned long) pfn_to_page(pfn);
>> -end = start + nr_pages * sizeof(struct page);
>> +unsigned long start = (unsigned long) pfn_to_page(pfn);
>> +unsigned long end = start + nr_pages * sizeof(struct page);
>>  
>>  if (vmemmap_populate(start, end, nid, altmap))
>>  return NULL;
>> 
>
>Can we add a WARN_ON_ONCE to catch mis-use in the future?
>
>if (WARN_ON_ONCE(!IS_ALIGNED(pfn, PAGES_PER_SUBSECTION) ||
> !IS_ALIGNED(nr_pages, PAGES_PER_SUBSECTION)))
>   return NULL;

How about adding this to both population and depopulation?

>
>-- 
>Thanks,
>
>David / dhildenb

-- 
Wei Yang
Help you, Help me
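
The alignment check suggested in the review can be sketched in userspace C. The DEMO_* names are hypothetical, and the value 512 assumes 2 MiB subsections with 4 KiB pages; the real PAGES_PER_SUBSECTION depends on the architecture's page size.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed for illustration: 2 MiB subsection / 4 KiB page = 512 pages. */
#define DEMO_PAGES_PER_SUBSECTION 512UL

/* Power-of-two alignment test, in the spirit of the kernel's IS_ALIGNED(). */
#define DEMO_IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

/* True only if both the start pfn and the length are subsection aligned. */
static bool demo_range_is_subsection_aligned(unsigned long pfn,
					     unsigned long nr_pages)
{
	return DEMO_IS_ALIGNED(pfn, DEMO_PAGES_PER_SUBSECTION) &&
	       DEMO_IS_ALIGNED(nr_pages, DEMO_PAGES_PER_SUBSECTION);
}
```

A caller in the position of __populate_section_memmap() would bail out when this returns false instead of silently rounding the range.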

Re: [PATCH] drm: fix double free for gbo in drm_gem_vram_init and drm_gem_vram_create

2020-06-30 Thread Jia Yang

Ping...

On 2020/6/20 14:21, Jia Yang wrote:
> I got a use-after-free report when doing some fuzz test:
> 
> If ttm_bo_init() fails, the "gbo" and "gbo->bo.base" will be
> freed by ttm_buffer_object_destroy() in ttm_bo_init(). But
> then drm_gem_vram_create() and drm_gem_vram_init() will free
> "gbo" and "gbo->bo.base" again.
> 
> BUG: KMSAN: use-after-free in drm_vma_offset_remove+0xb3/0x150
> CPU: 0 PID: 24282 Comm: syz-executor.1 Tainted: GB   W 5.7.0-rc4-msan #2
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
> Call Trace:
>  __dump_stack
>  dump_stack+0x1c9/0x220
>  kmsan_report+0xf7/0x1e0
>  __msan_warning+0x58/0xa0
>  drm_vma_offset_remove+0xb3/0x150
>  drm_gem_free_mmap_offset
>  drm_gem_object_release+0x159/0x180
>  drm_gem_vram_init
>  drm_gem_vram_create+0x7c5/0x990
>  drm_gem_vram_fill_create_dumb
>  drm_gem_vram_driver_dumb_create+0x238/0x590
>  drm_mode_create_dumb
>  drm_mode_create_dumb_ioctl+0x41d/0x450
>  drm_ioctl_kernel+0x5a4/0x710
>  drm_ioctl+0xc6f/0x1240
>  vfs_ioctl
>  ksys_ioctl
>  __do_sys_ioctl
>  __se_sys_ioctl+0x2e9/0x410
>  __x64_sys_ioctl+0x4a/0x70
>  do_syscall_64+0xb8/0x160
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x4689b9
> Code: fd e0 fa ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb e0 fa ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:7f368fa4dc98 EFLAGS: 0246 ORIG_RAX: 0010
> RAX: ffda RBX: 0076bf00 RCX: 004689b9
> RDX: 2240 RSI: c02064b2 RDI: 0003
> RBP: 0004 R08:  R09: 
> R10:  R11: 0246 R12: 
> R13: 004d17e0 R14: 7f368fa4e6d4 R15: 0076bf0c
> 
> Uninit was created at:
>  kmsan_save_stack_with_flags
>  kmsan_internal_poison_shadow+0x66/0xd0
>  kmsan_slab_free+0x6e/0xb0
>  slab_free_freelist_hook
>  slab_free
>  kfree+0x571/0x30a0
>  drm_gem_vram_destroy
>  ttm_buffer_object_destroy+0xc8/0x130
>  ttm_bo_release
>  kref_put
>  ttm_bo_put+0x117d/0x23e0
>  ttm_bo_init_reserved+0x11c0/0x11d0
>  ttm_bo_init+0x289/0x3f0
>  drm_gem_vram_init
>  drm_gem_vram_create+0x775/0x990
>  drm_gem_vram_fill_create_dumb
>  drm_gem_vram_driver_dumb_create+0x238/0x590
>  drm_mode_create_dumb
>  drm_mode_create_dumb_ioctl+0x41d/0x450
>  drm_ioctl_kernel+0x5a4/0x710
>  drm_ioctl+0xc6f/0x1240
>  vfs_ioctl
>  ksys_ioctl
>  __do_sys_ioctl
>  __se_sys_ioctl+0x2e9/0x410
>  __x64_sys_ioctl+0x4a/0x70
>  do_syscall_64+0xb8/0x160
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> If ttm_bo_init() fails, the "gbo" will be freed by
> ttm_buffer_object_destroy() in ttm_bo_init(). But then
> drm_gem_vram_create() and drm_gem_vram_init() will free
> "gbo" again.
> 
> Reported-by: Hulk Robot 
> Signed-off-by: Jia Yang 
> ---
>  drivers/gpu/drm/drm_gem_vram_helper.c | 28 +++
>  1 file changed, 16 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gem_vram_helper.c b/drivers/gpu/drm/drm_gem_vram_helper.c
> index 8b2d5c945c95..1d85af9a481a 100644
> --- a/drivers/gpu/drm/drm_gem_vram_helper.c
> +++ b/drivers/gpu/drm/drm_gem_vram_helper.c
> @@ -175,6 +175,10 @@ static void drm_gem_vram_placement(struct drm_gem_vram_object *gbo,
>   }
>  }
>  
> +/*
> + * Note that on error, drm_gem_vram_init will free the buffer object.
> + */
> +
>  static int drm_gem_vram_init(struct drm_device *dev,
>struct drm_gem_vram_object *gbo,
>size_t size, unsigned long pg_align)
> @@ -184,15 +188,19 @@ static int drm_gem_vram_init(struct drm_device *dev,
>   int ret;
>   size_t acc_size;
>  
> - if (WARN_ONCE(!vmm, "VRAM MM not initialized"))
> + if (WARN_ONCE(!vmm, "VRAM MM not initialized")) {
> + kfree(gbo);
>   return -EINVAL;
> + }
>   bdev = &vmm->bdev;
>  
>   gbo->bo.base.funcs = &drm_gem_vram_object_funcs;
>  
>   ret = drm_gem_object_init(dev, &gbo->bo.base, size);
> - if (ret)
> + if (ret) {
> + kfree(gbo);
>   return ret;
> + }
>  
>   acc_size = ttm_bo_dma_acc_size(bdev, size, sizeof(*gbo));
>  
> @@ -203,13 +211,13 @@ static int drm_gem_vram_init(struct drm_device *dev,
> &gbo->placement, pg_align, false, acc_size,
> NULL, NULL, ttm_buffer_object_destroy);
>   if (ret)
> - goto err_drm_gem_object_release;
> + /*
> +  * A failing ttm_bo_init will call ttm_buffer_object_destroy
> +  * to release gbo->bo.base and kfree gbo.
> +  */
> + return ret;
>  
>   return 0;
> -
> -err_drm_gem_object_release:
> - drm_gem_object_release(&gbo->bo.base);
> - return ret;
>  }
>  
>  /**
> @@ -243,13 +251,9 @@ struct drm_gem_vram_object *drm_gem_vram_create(struct
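
The ownership rule stated in the patch comment (the init helper frees the object on error, so the caller must not free it again) can be sketched in plain userspace C. All demo_* names are hypothetical and this is not the DRM/TTM code; it only shows why a second free in the caller would be a double free.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_obj {
	int payload;
};

/* Counts how many times an object has been released. */
static int demo_free_count;

static void demo_obj_free(struct demo_obj *obj)
{
	demo_free_count++;
	free(obj);
}

/*
 * Takes ownership of obj: on failure the object is freed here, so the
 * caller must not touch it afterwards.
 */
static int demo_obj_init(struct demo_obj *obj, int fail)
{
	if (fail) {
		demo_obj_free(obj);
		return -1;
	}

	obj->payload = 1;
	return 0;
}

static struct demo_obj *demo_obj_create(int fail)
{
	struct demo_obj *obj = calloc(1, sizeof(*obj));

	if (!obj)
		return NULL;

	/* No free on this error path: init already released obj. */
	if (demo_obj_init(obj, fail))
		return NULL;

	return obj;
}
```

The design choice is that exactly one function owns the error-path free; which one matters less than documenting it, as the patch does with its comment above drm_gem_vram_init().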

Re: [PATCH 3/4] iommu/vt-d: Report page request faults for guest SVA

2020-06-30 Thread Lu Baolu


Hi Kevin,

Thanks a lot for reviewing my patches.

On 6/30/20 2:01 PM, Tian, Kevin wrote:

From: Lu Baolu 
Sent: Sunday, June 28, 2020 8:34 AM

A pasid might be bound to a page table from a VM guest via the iommu
ops.sva_bind_gpasid. In this case, when a DMA page fault is detected
on the physical IOMMU, we need to inject the page fault request into
the guest. After the guest completes handling the page fault, a page
response needs to be sent back via the iommu ops.page_response().

This adds support to report a page request fault. Any external module
which is interested in handling this fault should register a notifier
callback.

Co-developed-by: Jacob Pan 
Signed-off-by: Jacob Pan 
Co-developed-by: Liu Yi L 
Signed-off-by: Liu Yi L 
Signed-off-by: Lu Baolu 
---
  drivers/iommu/intel/svm.c | 83 +--
  1 file changed, 80 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index c23167877b2b..4800bb6f8794 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -815,6 +815,69 @@ static void intel_svm_drain_prq(struct device *dev, int pasid)
}
  }

+static int prq_to_iommu_prot(struct page_req_dsc *req)
+{
+   int prot = 0;
+
+   if (req->rd_req)
+   prot |= IOMMU_FAULT_PERM_READ;
+   if (req->wr_req)
+   prot |= IOMMU_FAULT_PERM_WRITE;
+   if (req->exe_req)
+   prot |= IOMMU_FAULT_PERM_EXEC;
+   if (req->pm_req)
+   prot |= IOMMU_FAULT_PERM_PRIV;
+
+   return prot;
+}
+
+static int
+intel_svm_prq_report(struct intel_iommu *iommu, struct page_req_dsc *desc)
+{
+   struct iommu_fault_event event;
+   struct pci_dev *pdev;
+   u8 bus, devfn;
+   int ret = 0;
+
+   memset(&event, 0, sizeof(struct iommu_fault_event));
+   bus = PCI_BUS_NUM(desc->rid);
+   devfn = desc->rid & 0xff;
+   pdev = pci_get_domain_bus_and_slot(iommu->segment, bus, devfn);


Is this step necessary? dev can be passed in (based on sdev), and more
importantly iommu_report_device_fault already handles the ref counting
e.g. get_device(dev) when fault handler is valid...


Yes, agreed. I will pass device in instead.




+
+   if (!pdev) {
+   pr_err("No PCI device found for PRQ [%02x:%02x.%d]\n",
+  bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
+   return -ENODEV;
+   }
+
+   /* Fill in event data for device specific processing */
+   event.fault.type = IOMMU_FAULT_PAGE_REQ;
+   event.fault.prm.addr = desc->addr;
+   event.fault.prm.pasid = desc->pasid;
+   event.fault.prm.grpid = desc->prg_index;
+   event.fault.prm.perm = prq_to_iommu_prot(desc);
+
+   /*
+* Set last page in group bit if private data is present,
+* page response is required as it does for LPIG.
+*/
+   if (desc->lpig)
+   event.fault.prm.flags |= IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE;
+   if (desc->pasid_present)
+   event.fault.prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
+   if (desc->priv_data_present) {
+   event.fault.prm.flags |= IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE;


why setting lpig under this condition?


/*
 * Per VT-d spec. v3.0 ch7.7, system software must
 * respond with page group response if private data
 * is present (PDP) or last page in group (LPIG) bit
 * is set. This is an additional VT-d feature beyond
 * PCI ATS spec.
 */




+   event.fault.prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PRIV_DATA;
+   memcpy(event.fault.prm.private_data, desc->priv_data,
+  sizeof(desc->priv_data));
+   }
+
+   ret = iommu_report_device_fault(&pdev->dev, &event);
+   pci_dev_put(pdev);
+
+   return ret;
+}
+
  static irqreturn_t prq_event_thread(int irq, void *d)
  {
struct intel_iommu *iommu = d;
@@ -874,6 +937,19 @@ static irqreturn_t prq_event_thread(int irq, void *d)
if (!is_canonical_address(address))
goto bad_req;

+   /*
+* If prq is to be handled outside iommu driver via receiver of
+* the fault notifiers, we skip the page response here.
+*/
+   if (svm->flags & SVM_FLAG_GUEST_MODE) {
+   int res = intel_svm_prq_report(iommu, req);
+
+   if (!res)
+   goto prq_advance;
+   else
+   goto bad_req;
+   }
+


I noted in bad_req there is another reporting logic:

 if (sdev && sdev->ops && sdev->ops->fault_cb) {
 int rwxp = (req->rd_req << 3) | (req->wr_req << 2) |
 (req->exe_req << 1) | (req->pm_req);
 sdev->ops->fault_cb(sdev->dev, req->pasid, req->addr,
 req->priv_data, rwxp, result);
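
For reference, the requester-ID split used by intel_svm_prq_report() in the patch above (bus from the high byte, devfn from the low byte, then slot and function for the error message) can be sketched in userspace C. The demo_* names are hypothetical; the bit layout follows the usual PCI convention of a 5-bit device number and a 3-bit function number inside devfn.

```c
#include <assert.h>
#include <stdint.h>

struct demo_bdf {
	uint8_t bus;
	uint8_t slot;
	uint8_t func;
};

/* Split a 16-bit PCI requester ID into bus, slot (device) and function. */
static struct demo_bdf demo_decode_rid(uint16_t rid)
{
	struct demo_bdf bdf;
	uint8_t devfn = rid & 0xff;		/* low byte: device/function */

	bdf.bus = (rid >> 8) & 0xff;		/* high byte: bus number */
	bdf.slot = (devfn >> 3) & 0x1f;		/* upper 5 bits of devfn */
	bdf.func = devfn & 0x07;		/* lower 3 bits of devfn */

	return bdf;
}
```

With the review comment applied (passing the device in directly), this lookup step would no longer be needed in the report path.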

Re: [regression] TCP_MD5SIG on established sockets

2020-06-30 Thread Eric Dumazet

On Tue, Jun 30, 2020 at 7:23 PM Herbert Xu  wrote:
>
> On Tue, Jun 30, 2020 at 07:17:46PM -0700, Eric Dumazet wrote:
> >
> > The main issue of the prior code was the double read of key->keylen in
> > tcp_md5_hash_key(), not that few bytes could change under us.
> >
> > I used smp_rmb() to ease backports, since old kernels had no
> > READ_ONCE()/WRITE_ONCE(), but ACCESS_ONCE() instead.
>
> If it's the double-read that you're protecting against, you should
> just use barrier() and the comment should say so too.

I made this clear in the changelog; do we want comments all over the place?
Do not get me wrong, we had this bug for years and suddenly this is a
big deal...

MD5 keys are read with RCU protection, and tcp_md5_do_add()
might update in-place a prior key.

Normally, typical RCU updates would allocate a new piece
of memory. In this case only key->key and key->keylen might
be updated, and we do not care if an incoming packet could
see the old key, the new one, or some intermediate value,
since changing the key on a live flow is known to be problematic
anyway.

We only want to make sure that in the case key->keylen
is changed, cpus in tcp_md5_hash_key() won't try to use
uninitialized data, or crash because key->keylen was
read twice to feed sg_init_one() and ahash_request_set_crypt()
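
The double-read hazard described above can be illustrated with a userspace sketch that takes one snapshot of the length and uses it everywhere. This is not the fix from the thread (which, per the message above, used smp_rmb()); the demo_* names and the 80-byte limit are assumptions for the example, and the volatile access only loosely mimics READ_ONCE().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_KEY_MAX 80	/* assumed maximum key length for the sketch */

struct demo_key {
	unsigned char keylen;
	unsigned char key[DEMO_KEY_MAX];
};

/*
 * Copy the key using a single snapshot of keylen, so a concurrent update
 * cannot make the bounds check and the copy disagree about the length.
 */
static size_t demo_copy_key(const struct demo_key *k, unsigned char *out,
			    size_t out_size)
{
	/* Read keylen exactly once; the compiler may not re-load it. */
	size_t len = *(const volatile unsigned char *)&k->keylen;

	if (len > DEMO_KEY_MAX)
		len = DEMO_KEY_MAX;
	if (len > out_size)
		len = out_size;

	/* The copy and the return value both use the same snapshot. */
	memcpy(out, k->key, len);
	return len;
}
```

The bytes copied may still be a mix of old and new key material, which matches the point made above that an intermediate value is acceptable as long as the length is consistent.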

[PATCH] f2fs: fix return value of move_data_block()

2020-06-30 Thread Chao Yu

If f2fs_grab_cache_page() fails, it needs to return -ENOMEM.

Signed-off-by: Chao Yu 
---
 fs/f2fs/gc.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 3b718da69910..11b4adde9baf 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -849,8 +849,10 @@ static int move_data_block(struct inode *inode, block_t bidx,
 
mpage = f2fs_grab_cache_page(META_MAPPING(fio.sbi),
fio.old_blkaddr, false);
-   if (!mpage)
+   if (!mpage) {
+   err = -ENOMEM;
goto up_out;
+   }
 
fio.encrypted_page = mpage;
 
-- 
2.26.2

Re: [Bug, sched, 5.8-rc2]: PREEMPT kernels crashing in check_preempt_wakeup() running fsx on XFS

2020-06-30 Thread Dave Chinner

On Tue, Jun 30, 2020 at 10:57:32AM +0200, Peter Zijlstra wrote:
> On Tue, Jun 30, 2020 at 09:55:33AM +1000, Dave Chinner wrote:
> > Sure, but that misses the point I was making.
> > 
> > I regularly have to look deep into other subsystems to work out what
> > problem the filesystem is tripping over. I'm regularly
> > looking into parts of the IO stack, memory management, page
> > allocators, locking and atomics, workqueues, the scheduler, etc
> > because XFS makes extensive (and complex) use of the infrastructure
> > they provide. That means to debug filesystem issues, I have to be
> > able to understand what that infrastructure is trying to do and make
> > judgements as to whether that code is behaving correctly or not.
> > 
> > And so when I find a reproducer for a bug that takes 20s to
> > reproduce and it points me at code that I honestly have no hope of
> 
> 20s would've been nice to have a week and a half ago, the reproducer I
> debugged this with took days to trigger.. ah well, such is life.
> 
> > understanding well enough to determine if it is working correctly or
> > not, then we have a problem.  A lot of my time is spent doing root
> > cause analysis proving that such issues are -not- filesystem
> > problems (they just had "xfs" in the stack trace), hence being able
> > to read and understand the code in related core subsystems is
> > extremely important to performing my day job.
> > 
> > If more kernel code falls off the memory barrier cliff like this,
> > then the ability of people like me to find the root cause of complex
> > issues is going to be massively reduced. Writing code so smart that
> > almost no-one else can understand has always been a bad thing, and
> > memory barriers only make this problem worse... :(
> 
> How about you try and give me a hint about where you gave up and I'll
> try and write better comments?

Hard to explain. Background: we (XFS developers) got badly burnt a
few years back by memory barrier bugs in rwsems that we could not
prove were memory barrier bugs in rwsems as instrumentation made
them go away. But they were most definitely bugs in rwsems that the
maintainers said could not exist. While I could read the rwsem
code and understand how it was supposed to work, identifying the
missing memory barrier in the code was beyond my capability. It was
also beyond the capability of the people who wrote the code, too.

As such, I'm extremely sceptical that maintainers actually
understand their complex ordering constructs as well as they think
they do, and your comments about "can't explain how it can happen on
x86-64" do nothing but re-inforce my scepticism.

To your specific question: I gave up looking at the code when I
realised I had no idea what the relationships between objects and
logic that the memory barriers were ordering against, nor what
fields within objects were protected by locks, acquire/release
dependencies, explicit memory barriers, some combination of all
three or even something subtle enough that I hadn't noticed yet.

Yes, the code explains the ordering constraints between
object A and object B, but there's nothing to actually explain
what, say, p->on_cpu means, what its valid states are, what the
different values mean, and how it relates to, say, p->on_rq...

e.g. take this comment in ttwu:

	/*
	 * Ensure we load p->on_rq _after_ p->state, otherwise it would
	 * be possible to, falsely, observe p->on_rq == 0 and get stuck
	 * in smp_cond_load_acquire() below.
	 *
	 * sched_ttwu_pending()                 try_to_wake_up()
	 *   STORE p->on_rq = 1                   LOAD p->state
	 *   UNLOCK rq->lock
	 *
	 * __schedule() (switch to task 'p')
	 *   LOCK rq->lock                        smp_rmb();
	 *   smp_mb__after_spinlock();
	 *   UNLOCK rq->lock
	 *
	 * [task p]
	 *   STORE p->state = UNINTERRUPTIBLE     LOAD p->on_rq
	 *
	 * Pairs with the LOCK+smp_mb__after_spinlock() on rq->lock in
	 * __schedule().  See the comment for smp_mb__after_spinlock().
	 *
	 * A similar smb_rmb() lives in try_invoke_on_locked_down_task().
	 */
	smp_rmb();

The comment explains -what the code is ordering against-. I
understand that this is a modified message passing pattern that
requires explicit memory barriers because there isn't a direct
release/acquire relationship between store and load of the different
objects.
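
For contrast with the modified pattern described above, the plain message-passing case, where a direct release/acquire pair does exist, can be sketched with C11 atomics in userspace. This is not scheduler code; the demo_* names are hypothetical and C11 memory orders stand in for the kernel's smp_store_release()/smp_load_acquire().

```c
#include <assert.h>
#include <stdatomic.h>

static int demo_payload;	/* plain data published by the producer */
static atomic_int demo_ready;	/* flag that orders access to the payload */

static void demo_publish(int value)
{
	demo_payload = value;
	/* Release: the payload store is ordered before the flag store. */
	atomic_store_explicit(&demo_ready, 1, memory_order_release);
}

/* Returns 1 and fills *out if a value has been published, 0 otherwise. */
static int demo_try_consume(int *out)
{
	/* Acquire: pairs with the release store in demo_publish(). */
	if (!atomic_load_explicit(&demo_ready, memory_order_acquire))
		return 0;

	*out = demo_payload;
	return 1;
}
```

In real use demo_publish() and demo_try_consume() would run on different threads. The ttwu comment needs explicit barriers precisely because its loads and stores hit different objects with no single flag like demo_ready tying them together.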

I then spent an hour learning about smp_cond_load_acquire() because this
is the first time I'd seen that construct and I had no idea what it did.
But then I understood what that specific set of ordering constraints
actually does.

And then I completely didn't understand this code, because the code
the comment references is this:

smp_cond_load_acquire(&p->on_cpu, !VAL);

that's *on_cpu* that the load is being done from, but the comment
that references the load_acquire is talking about

Re: [PATCH bpf-next] selftests/bpf: Switch test_vmlinux to use hrtimer_range_start_ns.

2020-06-30 Thread Yonghong Song

On 6/30/20 5:10 PM, Hao Luo wrote:

Ok, with the help of my colleague Ian Rogers, I think we solved the
mystery. Clang actually inlined hrtimer_nanosleep() inside
SyS_nanosleep(), so there is no call to that function throughout the
path of the nanosleep syscall. I've been looking at the function body
of hrtimer_nanosleep for quite some time, but clearly overlooked the
caller of hrtimer_nanosleep. hrtimer_nanosleep is pretty short and
there are many constants, inlining would not be too surprising.

Oh, thanks for the explanation. Inlining makes sense. We have seen many
other instances like this in the past where kprobe won't work properly.

Could you reword your commit message then?

> causing fentry and kprobe to not hook on this function properly on a
> Clang build kernel.

The above is a little vague on what happens. What really happens is
fentry/kprobe does hook on this function but has no effect since
its caller has inlined the function.

Sigh...

Hao

On Tue, Jun 30, 2020 at 3:48 PM Hao Luo  wrote:

On Tue, Jun 30, 2020 at 1:37 PM Yonghong Song  wrote:

On 6/30/20 11:49 AM, Hao Luo wrote:

The test_vmlinux test uses hrtimer_nanosleep as hook to test tracing
programs. But it seems Clang may have done an aggressive optimization,
causing fentry and kprobe to not hook on this function properly on a
Clang build kernel.

Could you explain why it does not on clang built kernel? How did you
build the kernel? Did you use [thin]lto?

hrtimer_nanosleep is a global function that is called in several
different files. I am curious how clang optimization can make
function disappear, or make its function signature change, or
rename the function?

Yonghong,

We didn't enable LTO. It also puzzled me. But I can confirm those
fentry/kprobe test failures via many different experiments I've done.
After talking to my colleague on kernel compiling tools (Bill, cc'ed),
we suspected this could be because of clang's aggressive inlining. We
also noticed that all the callsites of hrtimer_nanosleep() are tail
calls.

For a better explanation, I can reach out to the people who are more
familiar with clang in the compiler team to see if they have any
insights. This may not be of high priority for them though.

Hao

Re: [PATCH] mm/cma.c: use exact_nid true to fix possible per-numa cma leak

2020-06-30 Thread Roman Gushchin

On Tue, Jun 30, 2020 at 07:09:31PM -0700, Andrew Morton wrote:
> On Tue, 30 Jun 2020 12:08:25 -0700 Roman Gushchin  wrote:
> 
> > On Sun, Jun 28, 2020 at 07:43:45PM +1200, Barry Song wrote:
> > > Calling cma_declare_contiguous_nid() with false exact_nid for per-numa
> > > reservation can easily cause cma leak and various confusion.
> > > For example, mm/hugetlb.c is trying to reserve per-numa cma for gigantic
> > > pages. But it can easily leak cma and make users confused when system has
> > > memoryless nodes.
> > > 
> > > In case the system has 4 numa nodes, and only numa node0 has memory.
> > > if we set hugetlb_cma=4G in bootargs, mm/hugetlb.c will get 4 cma areas
> > > for 4 different numa nodes. since exact_nid=false in current code, all
> > > 4 numa nodes will get cma successfully from node0, but hugetlb_cma[1 to 3]
> > > will never be available to hugepage as mm/hugetlb.c will only allocate
> > > memory from hugetlb_cma[0].
> > > 
> > > In case the system has 4 numa nodes, both numa node0&2 has memory, other
> > > nodes have no memory.
> > > if we set hugetlb_cma=4G in bootargs, mm/hugetlb.c will get 4 cma areas
> > > for 4 different numa nodes. since exact_nid=false in current code, all
> > > 4 numa nodes will get cma successfully from node0 or 2, but hugetlb_cma[1]
> > > and [3] will never be available to hugepage as mm/hugetlb.c will only
> > > allocate memory from hugetlb_cma[0] and hugetlb_cma[2].
> > > This causes permanent leak of the cma areas which are supposed to be
> > > used by memoryless node.
> > > 
> > > Of course we can work around the issue by letting mm/hugetlb.c scan all
> > > cma areas in alloc_gigantic_page() even node_mask includes node0 only.
> > > that means when node_mask includes node0 only, we can get page from
> > > hugetlb_cma[1] to hugetlb_cma[3]. But this will cause kernel crash in
> > > free_gigantic_page() while it wants to free page by:
> > > cma_release(hugetlb_cma[page_to_nid(page)], page, 1 << order)
> > > 
> > > On the other hand, exact_nid=false won't consider numa distance, it
> > > might be not that useful to leverage cma areas on remote nodes.
> > > I feel it is much simpler to make exact_nid true to make everything
> > > clear. After that, memoryless nodes won't be able to reserve per-numa
> > > CMA from other nodes which have memory.
> > 
> > Totally agree.
> > 
> > Acked-by: Roman Gushchin 
> > 
> 
> Do we feel this merits a cc:stable?

It would be nice.

Thanks!
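
The effect of exact_nid on memoryless nodes, as described in the quoted changelog, can be modelled with a toy userspace function. This is a simplified model, not memblock or CMA code: demo_reserve_on() is a hypothetical name and the fallback policy (first node with memory) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_NR_NODES 4

/*
 * Return the node that would back a reservation requested for `nid`, or
 * -1 if no reservation is made. Without exact_nid the request may fall
 * back to any node that has memory.
 */
static int demo_reserve_on(const bool has_memory[DEMO_NR_NODES], int nid,
			   bool exact_nid)
{
	int i;

	if (nid < 0 || nid >= DEMO_NR_NODES)
		return -1;
	if (has_memory[nid])
		return nid;
	if (exact_nid)
		return -1;	/* memoryless node: reserve nothing */

	for (i = 0; i < DEMO_NR_NODES; i++) {
		if (has_memory[i])
			return i;	/* area lands on a different node */
	}

	return -1;
}
```

With only node0 populated, a non-exact request for node1 is satisfied from node0, which is the area the changelog describes as leaked because nothing ever allocates from hugetlb_cma[1].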

[PATCH v9 2/4] drm/tegra: output: Support DRM bridges

2020-06-30 Thread Dmitry Osipenko

Newer Tegra device-trees will specify a video output graph which involves
a bridge. This patch adds initial support for the DRM bridges to the Tegra
DRM output.

Acked-by: Sam Ravnborg 
Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/tegra/drm.h|  2 ++
 drivers/gpu/drm/tegra/output.c | 12 
 2 files changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
index b25443255be6..f38de08e0c95 100644
--- a/drivers/gpu/drm/tegra/drm.h
+++ b/drivers/gpu/drm/tegra/drm.h
@@ -12,6 +12,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -116,6 +117,7 @@ struct tegra_output {
struct device_node *of_node;
struct device *dev;
 
+   struct drm_bridge *bridge;
struct drm_panel *panel;
struct i2c_adapter *ddc;
const struct edid *edid;
diff --git a/drivers/gpu/drm/tegra/output.c b/drivers/gpu/drm/tegra/output.c
index a6a711d54e88..ccd1421f1b24 100644
--- a/drivers/gpu/drm/tegra/output.c
+++ b/drivers/gpu/drm/tegra/output.c
@@ -5,6 +5,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 
@@ -99,8 +100,19 @@ int tegra_output_probe(struct tegra_output *output)
if (!output->of_node)
output->of_node = output->dev->of_node;
 
+   err = drm_of_find_panel_or_bridge(output->of_node, -1, -1,
+ &output->panel, &output->bridge);
+   if (err && err != -ENODEV)
+   return err;
+
panel = of_parse_phandle(output->of_node, "nvidia,panel", 0);
if (panel) {
+   /*
+* Don't mix nvidia,panel phandle with the graph in a
+* device-tree.
+*/
+   WARN_ON(output->panel || output->bridge);
+
output->panel = of_drm_find_panel(panel);
of_node_put(panel);
 
-- 
2.26.0

[tip:x86/fpu] BUILD SUCCESS 4185b3b92792eaec5869266e594338343421ffb0

2020-06-30 Thread kernel test robot

  defconfig
parisc   allyesconfig
parisc   allmodconfig
powerpc defconfig
powerpc  rhel-kconfig
powerpc  allmodconfig
powerpc   allnoconfig
i386 randconfig-a001-20200630
i386 randconfig-a003-20200630
i386 randconfig-a002-20200630
i386 randconfig-a004-20200630
i386 randconfig-a005-20200630
i386 randconfig-a006-20200630
i386 randconfig-a002-20200701
i386 randconfig-a001-20200701
i386 randconfig-a006-20200701
i386 randconfig-a005-20200701
i386 randconfig-a004-20200701
i386 randconfig-a003-20200701
i386 randconfig-a006-20200629
i386 randconfig-a002-20200629
i386 randconfig-a003-20200629
i386 randconfig-a001-20200629
i386 randconfig-a005-20200629
i386 randconfig-a004-20200629
x86_64   randconfig-a011-20200629
x86_64   randconfig-a012-20200629
x86_64   randconfig-a013-20200629
x86_64   randconfig-a014-20200629
x86_64   randconfig-a015-20200629
x86_64   randconfig-a016-20200629
x86_64   randconfig-a011-20200630
x86_64   randconfig-a014-20200630
x86_64   randconfig-a013-20200630
x86_64   randconfig-a015-20200630
x86_64   randconfig-a016-20200630
x86_64   randconfig-a012-20200630
x86_64   randconfig-a012-20200701
x86_64   randconfig-a016-20200701
x86_64   randconfig-a014-20200701
x86_64   randconfig-a011-20200701
x86_64   randconfig-a015-20200701
x86_64   randconfig-a013-20200701
i386 randconfig-a013-20200629
i386 randconfig-a016-20200629
i386 randconfig-a014-20200629
i386 randconfig-a012-20200629
i386 randconfig-a015-20200629
i386 randconfig-a011-20200629
i386 randconfig-a011-20200630
i386 randconfig-a016-20200630
i386 randconfig-a015-20200630
i386 randconfig-a014-20200630
i386 randconfig-a013-20200630
i386 randconfig-a012-20200630
i386 randconfig-a011-20200701
i386 randconfig-a015-20200701
i386 randconfig-a014-20200701
i386 randconfig-a016-20200701
i386 randconfig-a012-20200701
i386 randconfig-a013-20200701
riscv                            allyesconfig
riscv                             allnoconfig
riscv                               defconfig
riscv                            allmodconfig
s390                             allyesconfig
s390                             allmodconfig
s390                                defconfig
sparc                            allyesconfig
sparc                               defconfig
sparc64                             defconfig
sparc64                           allnoconfig
sparc64                          allmodconfig
um                               allmodconfig
um                                allnoconfig
um                               allyesconfig
um                                  defconfig
x86_64                               rhel-7.6
x86_64                    rhel-7.6-kselftests
x86_64                               rhel-8.3
x86_64                                  kexec
x86_64                                   rhel
x86_64                         rhel-7.2-clear
x86_64                                    lkp
x86_64                              fedora-25

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org

[PATCH v9 4/4] drm/tegra: output: rgb: Wrap directly-connected panel into DRM bridge

2020-06-30 Thread Dmitry Osipenko

Currently the Tegra DRM driver manages the display panel manually, but
this management can be moved into the DRM core if we wrap the panel into
a DRM bridge. This patch wraps the RGB panel into a DRM bridge and removes
the manual panel handling from the RGB output code.

Suggested-by: Laurent Pinchart 
Acked-by: Sam Ravnborg 
Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/tegra/rgb.c | 70 ++---
 1 file changed, 18 insertions(+), 52 deletions(-)

diff --git a/drivers/gpu/drm/tegra/rgb.c b/drivers/gpu/drm/tegra/rgb.c
index 9a7024ec96bc..4142a56ca764 100644
--- a/drivers/gpu/drm/tegra/rgb.c
+++ b/drivers/gpu/drm/tegra/rgb.c
@@ -8,7 +8,6 @@
 
 #include 
 #include 
-#include 
 #include 
 
 #include "drm.h"
@@ -86,45 +85,13 @@ static void tegra_dc_write_regs(struct tegra_dc *dc,
tegra_dc_writel(dc, table[i].value, table[i].offset);
 }
 
-static const struct drm_connector_funcs tegra_rgb_connector_funcs = {
-   .reset = drm_atomic_helper_connector_reset,
-   .detect = tegra_output_connector_detect,
-   .fill_modes = drm_helper_probe_single_connector_modes,
-   .destroy = tegra_output_connector_destroy,
-   .atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
-   .atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
-};
-
-static enum drm_mode_status
-tegra_rgb_connector_mode_valid(struct drm_connector *connector,
-  struct drm_display_mode *mode)
-{
-   /*
-* FIXME: For now, always assume that the mode is okay. There are
-* unresolved issues with clk_round_rate(), which doesn't always
-* reliably report whether a frequency can be set or not.
-*/
-   return MODE_OK;
-}
-
-static const struct drm_connector_helper_funcs tegra_rgb_connector_helper_funcs = {
-   .get_modes = tegra_output_connector_get_modes,
-   .mode_valid = tegra_rgb_connector_mode_valid,
-};
-
 static void tegra_rgb_encoder_disable(struct drm_encoder *encoder)
 {
struct tegra_output *output = encoder_to_output(encoder);
struct tegra_rgb *rgb = to_rgb(output);
 
-   if (output->panel)
-   drm_panel_disable(output->panel);
-
tegra_dc_write_regs(rgb->dc, rgb_disable, ARRAY_SIZE(rgb_disable));
tegra_dc_commit(rgb->dc);
-
-   if (output->panel)
-   drm_panel_unprepare(output->panel);
 }
 
 static void tegra_rgb_encoder_enable(struct drm_encoder *encoder)
@@ -133,9 +100,6 @@ static void tegra_rgb_encoder_enable(struct drm_encoder *encoder)
struct tegra_rgb *rgb = to_rgb(output);
u32 value;
 
-   if (output->panel)
-   drm_panel_prepare(output->panel);
-
tegra_dc_write_regs(rgb->dc, rgb_enable, ARRAY_SIZE(rgb_enable));
 
value = DE_SELECT_ACTIVE | DE_CONTROL_NORMAL;
@@ -157,9 +121,6 @@ static void tegra_rgb_encoder_enable(struct drm_encoder *encoder)
tegra_dc_writel(rgb->dc, value, DC_DISP_SHIFT_CLOCK_OPTIONS);
 
tegra_dc_commit(rgb->dc);
-
-   if (output->panel)
-   drm_panel_enable(output->panel);
 }
 
 static int
@@ -278,6 +239,23 @@ int tegra_dc_rgb_init(struct drm_device *drm, struct tegra_dc *dc)
drm_encoder_helper_add(&output->encoder,
   &tegra_rgb_encoder_helper_funcs);
 
+   /*
+* Wrap directly-connected panel into DRM bridge in order to let
+* DRM core to handle panel for us.
+*/
+   if (output->panel) {
+   output->bridge = devm_drm_panel_bridge_add(output->dev,
+  output->panel);
+   if (IS_ERR(output->bridge)) {
+   dev_err(output->dev,
+   "failed to wrap panel into bridge: %pe\n",
+   output->bridge);
+   return PTR_ERR(output->bridge);
+   }
+
+   output->panel = NULL;
+   }
+
/*
 * Tegra devices that have LVDS panel utilize LVDS encoder bridge
 * for converting up to 28 LCD LVTTL lanes into 5/4 LVDS lanes that
@@ -292,8 +270,7 @@ int tegra_dc_rgb_init(struct drm_device *drm, struct tegra_dc *dc)
 * Newer device-trees utilize LVDS encoder bridge, which provides
 * us with a connector and handles the display panel.
 *
-* For older device-trees we fall back to our own connector and use
-* nvidia,panel phandle.
+* For older device-trees we wrapped panel into the panel-bridge.
 */
if (output->bridge) {
err = drm_bridge_attach(&output->encoder, output->bridge,
@@ -313,17 +290,6 @@ int tegra_dc_rgb_init(struct drm_device *drm, struct tegra_dc *dc)
}
 
drm_connector_attach_encoder(connector, &output->encoder);
-   } else {
-   drm_connector_init(drm, &output->connector,
-  &tegra_rgb_connector_funcs,
-

[PATCH v9 1/4] drm/tegra: output: Don't leak OF node on error

2020-06-30 Thread Dmitry Osipenko

The OF node should be put before returning error in tegra_output_probe(),
otherwise node's refcount will be leaked.

Reviewed-by: Laurent Pinchart 
Reviewed-by: Sam Ravnborg 
Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/tegra/output.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/tegra/output.c b/drivers/gpu/drm/tegra/output.c
index e36e5e7c2f69..a6a711d54e88 100644
--- a/drivers/gpu/drm/tegra/output.c
+++ b/drivers/gpu/drm/tegra/output.c
@@ -102,10 +102,10 @@ int tegra_output_probe(struct tegra_output *output)
panel = of_parse_phandle(output->of_node, "nvidia,panel", 0);
if (panel) {
output->panel = of_drm_find_panel(panel);
+   of_node_put(panel);
+
if (IS_ERR(output->panel))
return PTR_ERR(output->panel);
-
-   of_node_put(panel);
}
 
output->edid = of_get_property(output->of_node, "nvidia,edid", &size);
@@ -113,13 +113,12 @@ int tegra_output_probe(struct tegra_output *output)
ddc = of_parse_phandle(output->of_node, "nvidia,ddc-i2c-bus", 0);
if (ddc) {
output->ddc = of_find_i2c_adapter_by_node(ddc);
+   of_node_put(ddc);
+
if (!output->ddc) {
err = -EPROBE_DEFER;
-   of_node_put(ddc);
return err;
}
-
-   of_node_put(ddc);
}
 
output->hpd_gpio = devm_gpiod_get_from_of_node(output->dev,
-- 
2.26.0

[PATCH v9 3/4] drm/tegra: output: rgb: Support LVDS encoder bridge

2020-06-30 Thread Dmitry Osipenko

Newer Tegra device-trees will specify a video output graph, which involves
LVDS encoder bridge. This patch adds support for the LVDS encoder bridge
to the RGB output, allowing us to model the display hardware properly.

Reviewed-by: Laurent Pinchart 
Acked-by: Sam Ravnborg 
Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/tegra/rgb.c | 58 +++--
 1 file changed, 49 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/tegra/rgb.c b/drivers/gpu/drm/tegra/rgb.c
index 0562a7eb793f..9a7024ec96bc 100644
--- a/drivers/gpu/drm/tegra/rgb.c
+++ b/drivers/gpu/drm/tegra/rgb.c
@@ -7,6 +7,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 
@@ -267,24 +268,63 @@ int tegra_dc_rgb_remove(struct tegra_dc *dc)
 int tegra_dc_rgb_init(struct drm_device *drm, struct tegra_dc *dc)
 {
struct tegra_output *output = dc->rgb;
+   struct drm_connector *connector;
int err;
 
if (!dc->rgb)
return -ENODEV;
 
-   drm_connector_init(drm, &output->connector, &tegra_rgb_connector_funcs,
-  DRM_MODE_CONNECTOR_LVDS);
-   drm_connector_helper_add(&output->connector,
-&tegra_rgb_connector_helper_funcs);
-   output->connector.dpms = DRM_MODE_DPMS_OFF;
-
drm_simple_encoder_init(drm, &output->encoder, DRM_MODE_ENCODER_LVDS);
drm_encoder_helper_add(&output->encoder,
   &tegra_rgb_encoder_helper_funcs);
 
-   drm_connector_attach_encoder(&output->connector,
- &output->encoder);
-   drm_connector_register(&output->connector);
+   /*
+* Tegra devices that have LVDS panel utilize LVDS encoder bridge
+* for converting up to 28 LCD LVTTL lanes into 5/4 LVDS lanes that
+* go to display panel's receiver.
+*
+* Encoder usually have a power-down control which needs to be enabled
+* in order to transmit data to the panel.  Historically devices that
+* use an older device-tree version didn't model the bridge, assuming
+* that encoder is turned ON by default, while today's DRM allows us
+* to model LVDS encoder properly.
+*
+* Newer device-trees utilize LVDS encoder bridge, which provides
+* us with a connector and handles the display panel.
+*
+* For older device-trees we fall back to our own connector and use
+* nvidia,panel phandle.
+*/
+   if (output->bridge) {
+   err = drm_bridge_attach(&output->encoder, output->bridge,
+   NULL, DRM_BRIDGE_ATTACH_NO_CONNECTOR);
+   if (err) {
+   dev_err(output->dev, "failed to attach bridge: %d\n",
+   err);
+   return err;
+   }
+
+   connector = drm_bridge_connector_init(drm, &output->encoder);
+   if (IS_ERR(connector)) {
+   dev_err(output->dev,
+   "failed to initialize bridge connector: %pe\n",
+   connector);
+   return PTR_ERR(connector);
+   }
+
+   drm_connector_attach_encoder(connector, &output->encoder);
+   } else {
+   drm_connector_init(drm, &output->connector,
+  &tegra_rgb_connector_funcs,
+  DRM_MODE_CONNECTOR_LVDS);
+   drm_connector_helper_add(&output->connector,
+&tegra_rgb_connector_helper_funcs);
+   output->connector.dpms = DRM_MODE_DPMS_OFF;
+
+   drm_connector_attach_encoder(&output->connector,
+&output->encoder);
+   drm_connector_register(&output->connector);
+   }
 
err = tegra_output_init(drm, output);
if (err < 0) {
-- 
2.26.0

Re: [PATCH] pinctrl: initialise nsp-mux earlier.

2020-06-30 Thread Mark Tomlinson

On Tue, 2020-06-30 at 15:08 -0700, Ray Jui wrote:
> May I know which GPIO driver you are referring to on NSP? Both the iProc
> GPIO driver and the NSP GPIO driver are initialized at the level of
> 'arch_initcall_sync', which is supposed to be after 'arch_initcall' used
> here in the pinmux driver

Sorry, it looks like I made a mistake in my testing (or I was lucky),
and this patch doesn't fix the issue. What is happening is:
1) nsp-pinmux driver is registered (arch_initcall).
2) nsp-gpio-a driver is registered (arch_initcall_sync).
3) of_platform_default_populate_init() is called (also at level
arch_initcall_sync), which scans the device tree, adds the nsp-gpio-a
device, runs its probe, and this returns -EPROBE_DEFER with the error
message.
4) Only now nsp-pinmux device is probed.

Changing the 'arch_initcall_sync' to 'device_initcall' in nsp-gpio-a
ensures that the pinmux is probed first since
of_platform_default_populate_init() will be called between the two
register calls, and the error goes away. Is this change acceptable as a
solution?

> > though the probe will succeed when the driver is re-initialised, the
> > error can be scary to end users. To fix this, change the time the
> 
> Scary to end users? I don't know about that. -EPROBE_DEFER was
> introduced exactly for this purpose. Perhaps users need to learn what
> -EPROBE_DEFER errno means?

The actual error message in syslog is:

kern.err kernel: gpiochip_add_data_with_key: GPIOs 480..511
(1820.gpio) failed to register, -517

So an end user sees "err" and "failed", and doesn't know what "-517"
means.

[PATCH v9 0/4] Support DRM bridges on NVIDIA Tegra

2020-06-30 Thread Dmitry Osipenko

Hello,

This series adds initial support for the DRM bridges to NVIDIA Tegra DRM
driver. This is required by newer device-trees where we model the LVDS
encoder bridge properly.

Changelog:

v9: - Dropped the of-graph/drm-of patches from this series because they
  are now factored out into a standalone series [1].

  [1] https://patchwork.ozlabs.org/project/linux-tegra/list/?series=186813

- The "drm/panel-simple: Add missing connector type for some panels"
  patch of v8 was already applied.

v8: - The new of_graph_get_local_port() helper is replaced with the
  of_graph_presents(), which simply checks the graph presence in a
  given DT node. Thanks to Laurent Pinchart for the suggestion!

- The of_graph_get_local_port() is still there, but now it isn't a public
  function anymore. In the review to v7 Laurent Pinchart suggested that
  the function's doc-comments and name could be improved and I implemented
  these suggestions in v8.

- A day ago I discovered that devm_drm_panel_bridge_add() requires
  the panel's connector type to be set properly, otherwise the function
  rejects panels with an incomplete description. So, I checked which
  LVDS panels are used on Tegra and fixed the missing connector types
  in this new patch:

drm/panel-simple: Add missing connector type for some panels

v7: - Removed the obscure unused structs (which GCC doesn't detect, but CLANG
  does) in the patch "Wrap directly-connected panel into DRM bridge",
  which was reported by kernel test robot for v6.

v6: - Added r-b and acks from Rob Herring and Sam Ravnborg.

- Rebased on a recent linux-next, patches now apply without fuzz.

v5: - Added new patches that make drm_of_find_panel_or_bridge() more usable
  if graph isn't defined in a device-tree:

of_graph: add of_graph_get_local_port()
drm/of: Make drm_of_find_panel_or_bridge() to check graph's presence

- Updated "Support DRM bridges" patch to use drm_of_find_panel_or_bridge()
  directly and added WARN_ON(output->panel || output->bridge) sanity-check.

- Added new "Wrap directly-connected panel into DRM bridge" patch, as
  was suggested by Laurent Pinchart.

v4: - Following review comments that were made by Laurent Pinchart to the v3,
  we now create and use the "bridge connector".

v3: - Following recommendation from Sam Ravnborg, the new bridge attachment
  model is now being used, i.e. we ask bridge to *not* create a connector
  using the DRM_BRIDGE_ATTACH_NO_CONNECTOR flag.

- The bridge is now created only for the RGB (LVDS) output, and only
  when necessary. For now we don't need bridges for HDMI or DSI outputs.

- I noticed that we're leaking OF node in the panel's error code path,
  this is fixed now by the new patch "Don't leak OF node on error".

v2: - Added the new "rgb: Don't register connector if bridge is used"
  patch, which hides the unused connector provided by the Tegra DRM
  driver when bridge is used, since bridge provides its own connector
  to us.

Dmitry Osipenko (4):
  drm/tegra: output: Don't leak OF node on error
  drm/tegra: output: Support DRM bridges
  drm/tegra: output: rgb: Support LVDS encoder bridge
  drm/tegra: output: rgb: Wrap directly-connected panel into DRM bridge

 drivers/gpu/drm/tegra/drm.h|   2 +
 drivers/gpu/drm/tegra/output.c |  21 +--
 drivers/gpu/drm/tegra/rgb.c| 102 +
 3 files changed, 72 insertions(+), 53 deletions(-)

-- 
2.26.0

[PATCH v4 13/14] irqchip/s3c24xx: Fix potential resource leaks

2020-06-30 Thread Tiezhu Yang

In the function s3c_init_intc_of(), system resource "reg_base", "domain"
and "intc" were not released in a few error cases. Thus add jump targets
for the completion of the desired exception handling.

Fixes: f0774d41da0e ("irqchip: s3c24xx: add devicetree support")
Signed-off-by: Tiezhu Yang 
---
 drivers/irqchip/irq-s3c24xx.c | 23 +--
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/irqchip/irq-s3c24xx.c b/drivers/irqchip/irq-s3c24xx.c
index d2031fe..166c27b 100644
--- a/drivers/irqchip/irq-s3c24xx.c
+++ b/drivers/irqchip/irq-s3c24xx.c
@@ -1227,7 +1227,7 @@ static int __init s3c_init_intc_of(struct device_node *np,
struct s3c24xx_irq_of_ctrl *ctrl;
struct irq_domain *domain;
void __iomem *reg_base;
-   int i;
+   int i, ret;
 
reg_base = of_iomap(np, 0);
if (!reg_base) {
@@ -1239,7 +1239,8 @@ static int __init s3c_init_intc_of(struct device_node *np,
 &s3c24xx_irq_ops_of, NULL);
if (!domain) {
pr_err("irq: could not create irq-domain\n");
-   return -EINVAL;
+   ret = -EINVAL;
+   goto out_iounmap;
}
 
for (i = 0; i < num_ctrl; i++) {
@@ -1248,15 +1249,17 @@ static int __init s3c_init_intc_of(struct device_node 
*np,
pr_debug("irq: found controller %s\n", ctrl->name);
 
intc = kzalloc(sizeof(struct s3c_irq_intc), GFP_KERNEL);
-   if (!intc)
-   return -ENOMEM;
+   if (!intc) {
+   ret = -ENOMEM;
+   goto out_domain_remove;
+   }
 
intc->domain = domain;
intc->irqs = kcalloc(32, sizeof(struct s3c_irq_data),
 GFP_KERNEL);
if (!intc->irqs) {
-   kfree(intc);
-   return -ENOMEM;
+   ret = -ENOMEM;
+   goto out_free;
}
 
if (ctrl->parent) {
@@ -1285,6 +1288,14 @@ static int __init s3c_init_intc_of(struct device_node 
*np,
set_handle_irq(s3c24xx_handle_irq);
 
return 0;
+
+out_free:
+   kfree(intc);
+out_domain_remove:
+   irq_domain_remove(domain);
+out_iounmap:
+   iounmap(reg_base);
+   return ret;
 }
 
 static struct s3c24xx_irq_of_ctrl s3c2410_ctrl[] = {
-- 
2.1.0

Re: [regression] TCP_MD5SIG on established sockets

2020-06-30 Thread Herbert Xu

On Tue, Jun 30, 2020 at 07:17:46PM -0700, Eric Dumazet wrote:
>
> The main issue of the prior code was the double read of key->keylen in
> tcp_md5_hash_key(), not that few bytes could change under us.
>
> I used smp_rmb() to ease backports, since old kernels had no
> READ_ONCE()/WRITE_ONCE(), but ACCESS_ONCE() instead.

If it's the double-read that you're protecting against, you should
just use barrier() and the comment should say so too.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

[PATCH v4 03/14] irqchip/csky-mpintc: Fix potential resource leaks

2020-06-30 Thread Tiezhu Yang

In the function csky_mpintc_init(), system resources "__trigger",
"INTCG_base" and "root_domain" were not released in a few error
cases. Thus add jump targets for the completion of the desired
exception handling. By the way, do some coding-style cleanups
suggested by Markus.

Fixes: d8a5f5f79122 ("irqchip: add C-SKY SMP interrupt controller")
Signed-off-by: Tiezhu Yang 
---
 drivers/irqchip/irq-csky-mpintc.c | 34 +-
 1 file changed, 25 insertions(+), 9 deletions(-)

diff --git a/drivers/irqchip/irq-csky-mpintc.c 
b/drivers/irqchip/irq-csky-mpintc.c
index a1534ed..d7edc28 100644
--- a/drivers/irqchip/irq-csky-mpintc.c
+++ b/drivers/irqchip/irq-csky-mpintc.c
@@ -241,14 +241,16 @@ csky_mpintc_init(struct device_node *node, struct 
device_node *parent)
nr_irq = INTC_IRQS;
 
__trigger  = kcalloc(nr_irq, sizeof(unsigned long), GFP_KERNEL);
-   if (__trigger == NULL)
+   if (!__trigger)
return -ENXIO;
 
-   if (INTCG_base == NULL) {
+   if (!INTCG_base) {
INTCG_base = ioremap(mfcr("cr<31, 14>"),
-INTCL_SIZE*nr_cpu_ids + INTCG_SIZE);
-   if (INTCG_base == NULL)
-   return -EIO;
+INTCL_SIZE * nr_cpu_ids + INTCG_SIZE);
+   if (!INTCG_base) {
+   ret = -EIO;
+   goto err_free;
+   }
 
INTCL_base = INTCG_base + INTCG_SIZE;
 
@@ -257,8 +259,10 @@ csky_mpintc_init(struct device_node *node, struct 
device_node *parent)
 
root_domain = irq_domain_add_linear(node, nr_irq, &csky_irqdomain_ops,
NULL);
-   if (!root_domain)
-   return -ENXIO;
+   if (!root_domain) {
+   ret = -ENXIO;
+   goto err_iounmap;
+   }
 
/* for every cpu */
for_each_present_cpu(cpu) {
@@ -270,12 +274,24 @@ csky_mpintc_init(struct device_node *node, struct 
device_node *parent)
 
 #ifdef CONFIG_SMP
ipi_irq = irq_create_mapping(root_domain, IPI_IRQ);
-   if (!ipi_irq)
-   return -EIO;
+   if (!ipi_irq) {
+   ret = -EIO;
+   goto err_domain_remove;
+   }
 
set_send_ipi(&csky_mpintc_send_ipi, ipi_irq);
 #endif
 
return 0;
+
+#ifdef CONFIG_SMP
+err_domain_remove:
+   irq_domain_remove(root_domain);
+#endif
+err_iounmap:
+   iounmap(INTCG_base);
+err_free:
+   kfree(__trigger);
+   return ret;
 }
 IRQCHIP_DECLARE(csky_mpintc, "csky,mpintc", csky_mpintc_init);
-- 
2.1.0

[PATCH v4 02/14] irqchip/csky-apb-intc: Fix potential resource leaks

2020-06-30 Thread Tiezhu Yang

In the function ck_intc_init_comm(), system resources "reg_base" and
"root_domain" were not released in a few error cases. Thus add jump
targets for the completion of the desired exception handling.

Fixes: edff1b4835b7 ("irqchip: add C-SKY APB bus interrupt controller")
Signed-off-by: Tiezhu Yang 
---
 drivers/irqchip/irq-csky-apb-intc.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/irqchip/irq-csky-apb-intc.c 
b/drivers/irqchip/irq-csky-apb-intc.c
index 5a2ec43..11a35eb 100644
--- a/drivers/irqchip/irq-csky-apb-intc.c
+++ b/drivers/irqchip/irq-csky-apb-intc.c
@@ -118,7 +118,8 @@ ck_intc_init_comm(struct device_node *node, struct 
device_node *parent)
&irq_generic_chip_ops, NULL);
if (!root_domain) {
pr_err("C-SKY Intc irq_domain_add failed.\n");
-   return -ENOMEM;
+   ret = -ENOMEM;
+   goto err_iounmap;
}
 
ret = irq_alloc_domain_generic_chips(root_domain, 32, 1,
@@ -126,10 +127,17 @@ ck_intc_init_comm(struct device_node *node, struct 
device_node *parent)
IRQ_NOREQUEST | IRQ_NOPROBE | IRQ_NOAUTOEN, 0, 0);
if (ret) {
pr_err("C-SKY Intc irq_alloc_gc failed.\n");
-   return -ENOMEM;
+   ret = -ENOMEM;
+   goto err_domain_remove;
}
 
return 0;
+
+err_domain_remove:
+   irq_domain_remove(root_domain);
+err_iounmap:
+   iounmap(reg_base);
+   return ret;
 }
 
 static inline bool handle_irq_perbit(struct pt_regs *regs, u32 hwirq,
-- 
2.1.0

Re: [PATCH v5 2/6] PCI: uniphier: Add misc interrupt handler to invoke PME and AER

2020-06-30 Thread Kunihiko Hayashi


Hi Marc,

On 2020/06/30 22:23, Marc Zyngier wrote:

On 2020-06-29 10:49, Kunihiko Hayashi wrote:

Hi Marc,

On 2020/06/27 18:48, Marc Zyngier wrote:

On Thu, 18 Jun 2020 09:38:09 +0100,
Kunihiko Hayashi  wrote:


The misc interrupts, consisting of PME, AER, and Link events, are handled
by the INTx handler; however, these interrupts should also be handled by
the MSI handler.

This adds the function uniphier_pcie_misc_isr() that handles misc
interrupts and is called from both the INTx and MSI handlers.
This function detects PME and AER interrupts from the status register
and invokes the PME and AER drivers related to MSI.

It also sets the mask for misc interrupts from INTx if MSI is enabled,
and sets the mask for misc interrupts from MSI if MSI is disabled.

Cc: Marc Zyngier 
Cc: Jingoo Han 
Cc: Gustavo Pimentel 
Signed-off-by: Kunihiko Hayashi 
---
  drivers/pci/controller/dwc/pcie-uniphier.c | 57 --
  1 file changed, 46 insertions(+), 11 deletions(-)

diff --git a/drivers/pci/controller/dwc/pcie-uniphier.c 
b/drivers/pci/controller/dwc/pcie-uniphier.c
index a5401a0..5ce2479 100644
--- a/drivers/pci/controller/dwc/pcie-uniphier.c
+++ b/drivers/pci/controller/dwc/pcie-uniphier.c
@@ -44,7 +44,9 @@
  #define PCL_SYS_AUX_PWR_DET    BIT(8)
    #define PCL_RCV_INT    0x8108
+#define PCL_RCV_INT_ALL_INT_MASK    GENMASK(28, 25)
  #define PCL_RCV_INT_ALL_ENABLE    GENMASK(20, 17)
+#define PCL_RCV_INT_ALL_MSI_MASK    GENMASK(12, 9)
  #define PCL_CFG_BW_MGT_STATUS    BIT(4)
  #define PCL_CFG_LINK_AUTO_BW_STATUS    BIT(3)
  #define PCL_CFG_AER_RC_ERR_MSI_STATUS    BIT(2)
@@ -167,7 +169,15 @@ static void uniphier_pcie_stop_link(struct dw_pcie *pci)
    static void uniphier_pcie_irq_enable(struct uniphier_pcie_priv *priv)
  {
-    writel(PCL_RCV_INT_ALL_ENABLE, priv->base + PCL_RCV_INT);
+    u32 val;
+
+    val = PCL_RCV_INT_ALL_ENABLE;
+    if (pci_msi_enabled())
+    val |= PCL_RCV_INT_ALL_INT_MASK;
+    else
+    val |= PCL_RCV_INT_ALL_MSI_MASK;


Does this affect endpoints? Or just the RC itself?


These interrupts are asserted by RC itself, so this part affects only RC.


+
+    writel(val, priv->base + PCL_RCV_INT);
  writel(PCL_RCV_INTX_ALL_ENABLE, priv->base + PCL_RCV_INTX);
  }
  @@ -231,32 +241,56 @@ static const struct irq_domain_ops 
uniphier_intx_domain_ops = {
  .map = uniphier_pcie_intx_map,
  };
  -static void uniphier_pcie_irq_handler(struct irq_desc *desc)
+static void uniphier_pcie_misc_isr(struct pcie_port *pp, bool is_msi)
  {
-    struct pcie_port *pp = irq_desc_get_handler_data(desc);
  struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
  struct uniphier_pcie_priv *priv = to_uniphier_pcie(pci);
-    struct irq_chip *chip = irq_desc_get_chip(desc);
-    unsigned long reg;
-    u32 val, bit, virq;
+    u32 val, virq;
  -    /* INT for debug */
  val = readl(priv->base + PCL_RCV_INT);
    if (val & PCL_CFG_BW_MGT_STATUS)
  dev_dbg(pci->dev, "Link Bandwidth Management Event\n");
+
  if (val & PCL_CFG_LINK_AUTO_BW_STATUS)
  dev_dbg(pci->dev, "Link Autonomous Bandwidth Event\n");
-    if (val & PCL_CFG_AER_RC_ERR_MSI_STATUS)
-    dev_dbg(pci->dev, "Root Error\n");
-    if (val & PCL_CFG_PME_MSI_STATUS)
-    dev_dbg(pci->dev, "PME Interrupt\n");
+
+    if (is_msi) {
+    if (val & PCL_CFG_AER_RC_ERR_MSI_STATUS)
+    dev_dbg(pci->dev, "Root Error Status\n");
+
+    if (val & PCL_CFG_PME_MSI_STATUS)
+    dev_dbg(pci->dev, "PME Interrupt\n");
+
+    if (val & (PCL_CFG_AER_RC_ERR_MSI_STATUS |
+   PCL_CFG_PME_MSI_STATUS)) {
+    virq = irq_linear_revmap(pp->irq_domain, 0);
+    generic_handle_irq(virq);
+    }
+    }


Please have two handlers: one for interrupts that are from the RC,
another for interrupts coming from the endpoints.

I assume that this handler treats interrupts from the RC only and
this is set on the member ".msi_host_isr" added in the patch 1/6.
I think that the handler for interrupts coming from endpoints should be
treated as a normal case (after calling .msi_host_isr in
dw_handle_msi_irq()).


It looks pretty odd that you end-up dealing with both from the
same "parent" interrupt. I guess this is in keeping with the
rest of the DW PCIe hacks... :-/


It might look odd, but on the UniPhier SoC both MSI interrupts from
endpoints and PME/AER interrupts from the RC are asserted by the same
"parent" interrupt. In other words, PME/AER interrupts are notified
using the parent interrupt for MSI.

MSI interrupts are treated as child interrupts with reference to
the status register in DW core. This is handled in a for-loop in
dw_handle_msi_irq().

PME/AER interrupts are treated with reference to the status register
in the UniPhier glue layer, so they cannot be handled in the same way
directly.

So I'm trying to add .msi_host_isr function to handle this
with reference to the SoC-specific registers.

This exported function asserts MSI-0 as a shared
