date:20210223

Re: [PATCH] memory: gpmc: fix out of bounds read and dereference on gpmc_cs[]

2021-02-23 Thread Dan Carpenter

On Tue, Feb 23, 2021 at 07:38:21PM +, Colin King wrote:
> From: Colin Ian King 
> 
> Currently the array gpmc_cs is indexed by cs before cs is range checked
> and the pointer read from this out-of-index read is dereferenced. Fix this
> by performing the range check on cs before the read and the following
> pointer dereference.
> 
> Addresses-Coverity: ("Negative array index read")
> Fixes: 186401937927 ("memory: gpmc: Move omap gpmc code to live under drivers")
> Signed-off-by: Colin Ian King 
> ---
>  drivers/memory/omap-gpmc.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/memory/omap-gpmc.c b/drivers/memory/omap-gpmc.c
> index cfa730cfd145..f80c2ea39ca4 100644
> --- a/drivers/memory/omap-gpmc.c
> +++ b/drivers/memory/omap-gpmc.c
> @@ -1009,8 +1009,8 @@ EXPORT_SYMBOL(gpmc_cs_request);
>  
>  void gpmc_cs_free(int cs)
>  {
> - struct gpmc_cs_data *gpmc = &gpmc_cs[cs];
> - struct resource *res = &gpmc->mem;

There is no actual dereferencing going on here, it is just taking the
addresses.  But the patch is also harmless and improves readability.

regards,
dan carpenter
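
The ordering the patch establishes (validate the index first, then form the pointer) can be sketched in plain userspace C. This is only an illustration under stated assumptions: `CS_NUM`, `struct cs_data`, `cs_table` and `cs_lookup()` are hypothetical names, not identifiers from drivers/memory/omap-gpmc.c.

```c
#include <stddef.h>

#define CS_NUM 8 /* hypothetical table size, not taken from the driver */

struct cs_data {
    int flags;
};

static struct cs_data cs_table[CS_NUM];

/*
 * Return a pointer to the entry for cs, or NULL when cs is out of range.
 * The range check runs before cs is used as an index, so no address
 * outside the table is ever computed.
 */
static struct cs_data *cs_lookup(int cs)
{
    if (cs < 0 || cs >= CS_NUM)
        return NULL;
    return &cs_table[cs];
}
```

A caller would test the result for NULL before touching any member, which keeps the check and the first use next to each other.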

Re: [PATCH v21 10/10] fs/ntfs3: Add MAINTAINERS

2021-02-23 Thread Sebastian Gottschall




On 12.02.2021 at 17:24, Konstantin Komarov wrote:

This adds MAINTAINERS

Signed-off-by: Konstantin Komarov 


just for your info with latest ntfs3 driver

kern.err kernel: ntfs3: sda1: ntfs_evict_inode r=fe1 failed, -22.

[PATCH] arch: powerpc: kernel: Change droping to dropping in the file traps.c

2021-02-23 Thread Bhaskar Chowdhury



s/droping/dropping/

Signed-off-by: Bhaskar Chowdhury 
---
 arch/powerpc/kernel/traps.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 1583fd1c6010..83a53b67412a 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -405,7 +405,7 @@ void hv_nmi_check_nonrecoverable(struct pt_regs *regs)
 * Now test if the interrupt has hit a range that may be using
 * HSPRG1 without having RI=0 (i.e., an HSRR interrupt). The
 * problem ranges all run un-relocated. Test real and virt modes
-* at the same time by droping the high bit of the nip (virt mode
+* at the same time by dropping the high bit of the nip (virt mode
 * entry points still have the +0x4000 offset).
 */
nip &= ~0xc000ULL;
--
2.30.1

[PATCH] scsi: remove unneeded semicolon

2021-02-23 Thread Jiapeng Chong

Fix the following coccicheck warnings:

./drivers/scsi/aha1542.c:300:2-3: Unneeded semicolon.

Reported-by: Abaci Robot 
Signed-off-by: Jiapeng Chong 
---
 drivers/scsi/aha1542.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/aha1542.c b/drivers/scsi/aha1542.c
index 21aab9f..5e68982 100644
--- a/drivers/scsi/aha1542.c
+++ b/drivers/scsi/aha1542.c
@@ -297,7 +297,7 @@ static irqreturn_t aha1542_interrupt(int irq, void *dev_id)
if (flag & SCRD)
printk("SCRD ");
printk("status %02x\n", inb(STATUS(sh->io_port)));
-   };
+   }
 #endif
number_serviced = 0;
 
@@ -339,7 +339,7 @@ static irqreturn_t aha1542_interrupt(int irq, void *dev_id)
if (!number_serviced)
shost_printk(KERN_WARNING, sh, "interrupt received, but no mail.\n");
return IRQ_HANDLED;
-   };
+   }
 
mbo = (scsi2int(mb[mbi].ccbptr) - (unsigned long)aha1542->ccb_handle) / sizeof(struct ccb);
mbistatus = mb[mbi].status;
-- 
1.8.3.1

Re: [GIT PULL] Modules updates for v5.12

2021-02-23 Thread Christoph Hellwig

On Tue, Feb 23, 2021 at 12:07:39PM -0800, Linus Torvalds wrote:
> On Tue, Feb 23, 2021 at 12:03 PM Linus Torvalds
>  wrote:
> >
> > This is unacceptably slow. If that symbol trimming takes 30% of the
> > whole kernel build time, it needs to be fixed or removed.
> 
> I think I'm going to mark TRIM_UNUSED_KSYMS as "depends on BROKEN".
> There's no way I can accept that horrible overhead, and the rationale
> for that config option is questionable at best,

I think it is pretty useful for embedded setups.

BROKEN seems pretty strong for something that absolutely works as
intended.  I guess to make you (and possibly others) not grumpy
we just need to ensure it doesn't get pulled in by allmodconfig.

So maybe just invert the symbol, default the KEEP_UNUSED_SYMBOL, and
add a message to the helptext explaining the slowdown?

Re: [PATCH v24 11/14] Documentation: Add documents for DAMON

2021-02-23 Thread SeongJae Park

On Thu, 4 Feb 2021 16:31:47 +0100 SeongJae Park  wrote:

> From: SeongJae Park 
> 
> This commit adds documents for DAMON under
> `Documentation/admin-guide/mm/damon/` and `Documentation/vm/damon/`.
> 
> Signed-off-by: SeongJae Park 
> ---
>  Documentation/admin-guide/mm/damon/guide.rst | 159 ++
>  Documentation/admin-guide/mm/damon/index.rst |  15 +
>  Documentation/admin-guide/mm/damon/plans.rst |  29 ++
>  Documentation/admin-guide/mm/damon/start.rst |  97 ++
>  Documentation/admin-guide/mm/damon/usage.rst | 304 +++
>  Documentation/admin-guide/mm/index.rst   |   1 +
>  Documentation/vm/damon/api.rst   |  20 ++
>  Documentation/vm/damon/design.rst| 166 ++
>  Documentation/vm/damon/eval.rst  | 232 ++
>  Documentation/vm/damon/faq.rst   |  58 
>  Documentation/vm/damon/index.rst |  31 ++
>  Documentation/vm/index.rst   |   1 +
>  12 files changed, 1113 insertions(+)
>  create mode 100644 Documentation/admin-guide/mm/damon/guide.rst
>  create mode 100644 Documentation/admin-guide/mm/damon/index.rst
>  create mode 100644 Documentation/admin-guide/mm/damon/plans.rst
>  create mode 100644 Documentation/admin-guide/mm/damon/start.rst
>  create mode 100644 Documentation/admin-guide/mm/damon/usage.rst
>  create mode 100644 Documentation/vm/damon/api.rst
>  create mode 100644 Documentation/vm/damon/design.rst
>  create mode 100644 Documentation/vm/damon/eval.rst
>  create mode 100644 Documentation/vm/damon/faq.rst
>  create mode 100644 Documentation/vm/damon/index.rst
> 
[...]
> diff --git a/Documentation/admin-guide/mm/damon/usage.rst b/Documentation/admin-guide/mm/damon/usage.rst
> new file mode 100644
> index ..32436cf853c7
> --- /dev/null
> +++ b/Documentation/admin-guide/mm/damon/usage.rst
> @@ -0,0 +1,304 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +===============
> +Detailed Usages
> +===============
> +
> +DAMON provides below three interfaces for different users.
> +
> +- *DAMON user space tool.*
> +  This is for privileged people such as system administrators who want a
> +  just-working human-friendly interface.  Using this, users can use the DAMON’s
> +  major features in a human-friendly way.  It may not be highly tuned for
> +  special cases, though.  It supports only virtual address spaces monitoring.
> +- *debugfs interface.*
> +  This is for privileged user space programmers who want more optimized use of
> +  DAMON.  Using this, users can use DAMON’s major features by reading
> +  from and writing to special debugfs files.  Therefore, you can write and use
> +  your personalized DAMON debugfs wrapper programs that reads/writes the
> +  debugfs files instead of you.  The DAMON user space tool is also a reference
> +  implementation of such programs.  It supports only virtual address spaces
> +  monitoring.
> +- *Kernel Space Programming Interface.*
> +  This is for kernel space programmers.  Using this, users can utilize every
> +  feature of DAMON most flexibly and efficiently by writing kernel space
> +  DAMON application programs for you.  You can even extend DAMON for various
> +  address spaces.
> +
> +This document does not describe the kernel space programming interface in
> +detail.  For that, please refer to the :doc:`/vm/damon/api`.
> +
> +
> +DAMON User Space Tool
> +=====================

This version of the patchset doesn't introduce the user space tool source code,
so putting the detailed usage here might make no sense.  I will remove this
section in the next version.  If you will review this patch, please skip this
section.
[...]
> +
> +debugfs Interface
> +=================

But, this section will not be removed.  Please review.

[...]


Thanks,
SeongJae Park

Re: [PATCH 5.11 00/12] 5.11.1-rc1 review

2021-02-23 Thread Greg Kroah-Hartman

On Tue, Feb 23, 2021 at 05:12:56PM -0700, Shuah Khan wrote:
> On 2/23/21 2:05 PM, Shuah Khan wrote:
> > On 2/22/21 5:12 AM, Greg Kroah-Hartman wrote:
> > > This is the start of the stable review cycle for the 5.11.1 release.
> > > There are 12 patches in this series, all will be posted as a response
> > > to this one.  If anyone has any issues with these being applied, please
> > > let me know.
> > > 
> > > Responses should be made by Wed, 24 Feb 2021 12:07:46 +.
> > > Anything received after that time might be too late.
> > > 
> > > The whole patch series can be found in one patch at:
> > > 
> > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.11.1-rc1.gz
> > > 
> > > or in the git tree and branch at:
> > > 
> > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
> > > linux-5.11.y
> > > and the diffstat can be found below.
> > > 
> > > thanks,
> > > 
> > > greg k-h
> > > 
> > 
> > Compiled and booted on my test system. No dmesg regressions.
> > 
> > I made some progress on the drm/amdgpu display and keyboard
> > problem.
> > 
> > My system has
> >   amdgpu: ATOM BIOS: 113-RENOIR-026
> > 
> > I narrowed it down to the following as a possible lead to
> > start looking:
> > amdgpu :0b:00.0: [drm] Cannot find any crtc or sizes
> > 
> 
> It is resolved now. I hot-unplugged/plugged the HDMI cable, which
> triggered a reset sequence. There might be a link between the AMD_DC_HDCP
> support and amdgpu_dm_atomic_commit changes that went into 5.10 and
> this behavior.
> 
> I am basing this on not seeing the problem on Linux 5.4 and up until
> Linux 5.10. In any case, I wish I knew more, but life is back to
> normal now.

Great, thanks for testing and tracking this down.

greg k-h

[GIT PULL] dma-mapping updates for 5.12

2021-02-23 Thread Christoph Hellwig

The following changes since commit 9f5f8ec50165630cfc49897410b30997d4d677b5:

  dma-mapping: benchmark: use u8 for reserved field in uAPI structure 
(2021-02-05 12:48:46 +0100)

are available in the Git repository at:

  git://git.infradead.org/users/hch/dma-mapping.git tags/dma-mapping-5.12

for you to fetch changes up to 81d88ce55092edf1a1f928efb373f289c6b90efd:

  dma-mapping: remove the {alloc,free}_noncoherent methods (2021-02-09 18:01:38 
+0100)


dma-mapping updates for 5.12:

 - add support to emulate processing delays in the DMA API benchmark
   selftest (Barry Song)
 - remove support for non-contiguous noncoherent allocations,
   which aren't used and will be replaced by a different API


Barry Song (1):
  dma-mapping: benchmark: pretend DMA is transmitting

Christoph Hellwig (1):
  dma-mapping: remove the {alloc,free}_noncoherent methods

 Documentation/core-api/dma-api.rst  | 64 +
 drivers/iommu/dma-iommu.c   | 30 
 include/linux/dma-map-ops.h |  5 --
 include/linux/dma-mapping.h | 17 +--
 kernel/dma/map_benchmark.c  | 12 -
 kernel/dma/mapping.c| 40 
 tools/testing/selftests/dma/dma_map_benchmark.c | 21 ++--
 7 files changed, 64 insertions(+), 125 deletions(-)

Re: [RFC][PATCH 1/3] PM /devfreq: add user frequency limits into devfreq struct

2021-02-23 Thread Chanwoo Choi

On 2/16/21 7:41 PM, Lukasz Luba wrote:
> Hi Chanwoo,
> 
> On 2/15/21 3:00 PM, Chanwoo Choi wrote:
>> Hi Lukasz,
>>
>> On Fri, Feb 12, 2021 at 7:28 AM Lukasz Luba  wrote:
>>>
>>>
>>>
>>> On 2/11/21 11:07 AM, Lukasz Luba wrote:
>>>> Hi Chanwoo,
>>>>
>>>> On 2/3/21 10:21 AM, Lukasz Luba wrote:
>>>>> Hi Chanwoo,
>>>>>
>>>>> Thank you for looking at this.
>>>>>
>>>>> On 2/3/21 10:11 AM, Chanwoo Choi wrote:
>>>>>> Hi Lukasz,
>>>>>>
>>>>>> When accessing the max_freq and min_freq at devfreq-cooling.c,
>>>>>> even if it can access 'user_max_freq' and 'lock' by using the 'devfreq'
>>>>>> instance,
>>>>>> I think that the direct access of variables
>>>>>> (lock/user_max_freq/user_min_freq)
>>>>>> of struct devfreq is not good.
>>>>>>
>>>>>> Instead, how about using the 'DEVFREQ_TRANSITION_NOTIFIER'
>>>>>> notification with following changes of 'struct devfreq_freq'?
>>>>>
>>>>> I like the idea with devfreq notification. I will have to go through the
>>>>> code to check that possibility.
>>>>>
>>>>>> Also, need to add codes into devfreq_set_target() for initializing
>>>>>> 'new_max_freq' and 'new_min_freq' before sending the DEVFREQ_POSTCHANGE
>>>>>> notification.
>>>>>>
>>>>>> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
>>>>>> index 147a229056d2..d5726592d362 100644
>>>>>> --- a/include/linux/devfreq.h
>>>>>> +++ b/include/linux/devfreq.h
>>>>>> @@ -207,6 +207,8 @@ struct devfreq {
>>>>>>    struct devfreq_freqs {
>>>>>>   unsigned long old;
>>>>>>   unsigned long new;
>>>>>> +   unsigned long new_max_freq;
>>>>>> +   unsigned long new_min_freq;
>>>>>>    };
>>>>>>
>>>>>>
>>>>>> And I think that new 'user_min_freq'/'user_max_freq' are not necessary.
>>>>>> You can get the current max_freq/min_freq by using the following steps:
>>>>>>
>>>>>>  get_freq_range(devfreq, &min_freq, &max_freq);
>>>>>>  dev_pm_opp_find_freq_floor(pdev, &min_freq);
>>>>>>  dev_pm_opp_find_freq_floor(pdev, &max_freq);
>>>>>>
>>>>>> So that you can get the 'max_freq/min_freq' and then
>>>>>> initialize the 'freqs.new_max_freq and freqs.new_min_freq'
>>>>>> with them as following:
>>>>>>
>>>>>> in devfreq_set_target()
>>>>>>  get_freq_range(devfreq, &min_freq, &max_freq);
>>>>>>  dev_pm_opp_find_freq_floor(pdev, &min_freq);
>>>>>>  dev_pm_opp_find_freq_floor(pdev, &max_freq);
>>>>>>  freqs.new_min_freq = min_freq;
>>>>>>  freqs.new_max_freq = max_freq;
>>>>>>  devfreq_notify_transition(devfreq, &freqs, DEVFREQ_POSTCHANGE);
>>>>>
>>>>> I will plumb it in and check that option. My concern is that function
>>>>> get_freq_range() would give me the max_freq value from PM QoS, which
>>>>> might be my thermal limit - lower than user_max_freq. Then I still
>>>>> need
>>>>>
>>>>> I've been playing with PM QoS notifications because I thought it would
>>>>> be possible to be notified in thermal for all new set values - even from
>>>>> devfreq sysfs user max_freq write, which has value higher than the
>>>>> current limit set by thermal governor. Unfortunately PM QoS doesn't
>>>>> send that information by design. PM QoS also by design won't allow
>>>>> me to check first two limits in the plist - which would be thermal
>>>>> and user sysfs max_freq.
>>>>>
>>>>> I will experiment with these notifications and share the results.
>>>>> Thank you for your comments.
>>>>
>>>> I have experimented with your proposal. Unfortunately, the value stored
>>>> in the pm_qos which is read by get_freq_range() is not the user max
>>>> freq. It's the value from thermal devfreq cooling when that one is
>>>> lower. Which is OK in the overall design, but not for my IPA use case.
>>>>
>>>> What comes to my mind is two options:
>>>> 1) this patch proposal, with simple solution of two new variables
>>>> protected by mutex, which would maintain user stored values
>>>> 2) add a new notification chain in devfreq to notify about new
>>>> user written value, to which devfreq cooling would register; that
>>>> would allow devfreq cooling to get that value instantly and store
>>>> locally
>>>
>>> 3) How about new define for existing notification chain:
>>> #define DEVFREQ_USER_CHANGE    (2)
>>
>> I think that if we add the notification with specific actor like user change
>> or OPP change or others, it is not proper. But, we can add the notification
>> for min or max frequency change timing. Because the devfreq already has
>> the notification for current frequency like DEVFREQ_PRECHANGE,
>> DEVFREQ_POSTCHANGE.
>>
>> Maybe, we can add the following notification for min/max_freq.
>> The following min_freq and max_freq values will be calculated by
>> get_freq_range().
>> DEVFREQ_MIN_FREQ_PRECHANGE
>> DEVFREQ_MIN_FREQ_POSTCHANGE
>> DEVFREQ_MAX_FREQ_PRECHANGE
>> DEVFREQ_MAX_FREQ_POSTCHANGE
> 
> Would it be then possible to pass the user max freq value written via
> sysfs? Something like in the example below, when writing into max sysfs:
> 
> 1) starting in max_freq_store()
>     freqs.new_max_freq = max_freq;
>
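
The concern raised earlier in this thread, that the aggregated maximum hides the user's request whenever the thermal limit is lower, can be illustrated with a small userspace sketch. It assumes that the effective maximum is simply the smallest of all outstanding maximum-frequency requests; `effective_max_freq()` and its parameters are hypothetical names, not devfreq or PM QoS functions.

```c
#include <stddef.h>

/*
 * Aggregate several "maximum frequency" requests: the effective limit is
 * the smallest request, capped by the hardware maximum hw_max. Once the
 * requests are folded into one value, the caller cannot tell which
 * requester supplied it.
 */
static unsigned long effective_max_freq(const unsigned long *requests,
                                        size_t n, unsigned long hw_max)
{
    unsigned long limit = hw_max;
    size_t i;

    for (i = 0; i < n; i++) {
        if (requests[i] < limit)
            limit = requests[i];
    }
    return limit;
}
```

With a user request of 800 MHz and a thermal request of 600 MHz the result is 600 MHz, so a consumer that needs the user's value has to receive it through some other channel, which is what the options discussed in the thread try to provide.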

Re: [PATCH 2/2] nds32: use get_kernel_nofault in dump_mem

2021-02-23 Thread Christoph Hellwig

On Thu, Jan 28, 2021 at 11:16:33AM +0100, Christoph Hellwig wrote:
> On Tue, Jul 21, 2020 at 01:28:00PM +0200, Christoph Hellwig wrote:
> > Can you pick the patches up in the nds32 tree for Linus?  There are
> > no short-term dependencies on them.
> 
> It seems like these patches are still sitting in linux-next and never
> made it to Linus.

ping?

Re: [PATCH] perf-probe: dso: Add symbols in .text.* subsections to text symbol map in kenrel modules

2021-02-23 Thread Evgenii Shatokhin


On 23.02.2021 23:11, Arnaldo Carvalho de Melo wrote:

On Tue, Feb 23, 2021 at 06:02:58PM +0300, Evgenii Shatokhin wrote:

On 23.02.2021 10:37, Masami Hiramatsu wrote:

The kernel modules have .text.* subsections such as .text.unlikely.
Since dso__process_kernel_symbol() only identifies the symbols in the ".text"
section as the text symbols and inserts them in the default dso in the map,
the symbols in such subsections can not be found by map__find_symbol().

This adds the symbols in those subsections to the default dso in the map so
that map__find_symbol() can find them. This solves the perf-probe issue on
probing online module.

Without this fix, probing on a symbol in .text.unlikely fails.

# perf probe -m nf_conntrack nf_l4proto_log_invalid
Probe point 'nf_l4proto_log_invalid' not found.
  Error: Failed to add events.


With this fix, it works because map__find_symbol() can find the symbol
correctly.

# perf probe -m nf_conntrack nf_l4proto_log_invalid
Added new event:
  probe:nf_l4proto_log_invalid (on nf_l4proto_log_invalid in nf_conntrack)

You can now use it in all perf tools, such as:

perf record -e probe:nf_l4proto_log_invalid -aR sleep 1



Reported-by: Evgenii Shatokhin 
Signed-off-by: Masami Hiramatsu 


Thanks for the fix!

It looks like it helps, at least with nf_conntrack in kernel 5.11:


So I'm taking this as you providing a:

Tested-by: Evgenii Shatokhin 

ok?


Sure, thanks!



- Arnaldo


-
# ./perf probe -v -m nf_conntrack nf_ct_resolve_clash
probe-definition(0): nf_ct_resolve_clash
symbol:nf_ct_resolve_clash file:(null) line:0 offset:0 return:0 lazy:(null)
0 arguments
Failed to get build-id from nf_conntrack.
Cache open error: -1
Open Debuginfo file:
/lib/modules/5.11.0-test01/kernel/net/netfilter/nf_conntrack.ko
Try to find probe point from debuginfo.
Matched function: nf_ct_resolve_clash [33616]
Probe point found: nf_ct_resolve_clash+0
Found 1 probe_trace_events.
Opening /sys/kernel/tracing//kprobe_events write=1
Opening /sys/kernel/tracing//README write=0
Writing event: p:probe/nf_ct_resolve_clash
nf_conntrack:nf_ct_resolve_clash+0
Added new event:
   probe:nf_ct_resolve_clash (on nf_ct_resolve_clash in nf_conntrack)

You can now use it in all perf tools, such as:

 perf record -e probe:nf_ct_resolve_clash -aR sleep 1
-

I guess, the patch is suitable for stable kernel branches as well.

Without the patch, the workaround you suggested earlier (using the full path
to nf_conntrack.ko) works too.


---
   tools/perf/util/symbol-elf.c |4 +++-
   1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 6dff843fd883..0c1113236913 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -985,7 +985,9 @@ static int dso__process_kernel_symbol(struct dso *dso, struct map *map,
if (strcmp(section_name, (curr_dso->short_name + dso->short_name_len)) == 0)
return 0;
-   if (strcmp(section_name, ".text") == 0) {
+   /* .text and .text.* are included in the text dso */
+   if (strncmp(section_name, ".text", 5) == 0 &&
+   (section_name[5] == '\0' || section_name[5] == '.')) {
/*
 * The initial kernel mapping is based on
 * kallsyms and identity maps.  Overwrite it to

.



Regards,
Evgenii
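
The section-name test introduced by the patch can be tried in isolation. The sketch below lifts the same `strncmp()` comparison into a standalone helper; `is_text_section()` is a hypothetical name chosen for illustration and does not exist in tools/perf.

```c
#include <string.h>

/*
 * Return 1 when name is ".text" or a ".text.*" subsection such as
 * ".text.unlikely", 0 otherwise. A name like ".textual" does not match,
 * because the character after the 5-byte prefix must be '\0' or '.'.
 */
static int is_text_section(const char *name)
{
    if (strncmp(name, ".text", 5) != 0)
        return 0;
    return name[5] == '\0' || name[5] == '.';
}
```

Checking the sixth character is what separates a real subsection from an unrelated section whose name merely starts with the same letters.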

Re: [PATCH stable-rc queue/4.9 1/1] futex: Provide distinct return value when owner is exiting

2021-02-23 Thread Greg KH

On Wed, Feb 24, 2021 at 09:41:01AM +0800, Xiaoming Ni wrote:
> On 2021/2/23 21:00, Greg KH wrote:
> > On Mon, Feb 22, 2021 at 10:11:37PM +0800, Xiaoming Ni wrote:
> > > On 2021/2/22 20:09, Greg KH wrote:
> > > > On Mon, Feb 22, 2021 at 06:54:06PM +0800, Xiaoming Ni wrote:
> > > > > On 2021/2/22 18:16, Greg KH wrote:
> > > > > > On Mon, Feb 22, 2021 at 03:03:28PM +0800, Xiaoming Ni wrote:
> > > > > > > From: Thomas Gleixner
> > > > > > > 
> > > > > > > commit ac31c7ff8624409ba3c4901df9237a616c187a5d upstream.
> > > > > > This commit is already in the 4.9 tree.  If the backport was 
> > > > > > incorrect,
> > > > > > say that here, and describe what went wrong and why this commit 
> > > > > > fixes
> > > > > > it.
> > > > > > 
> > > > > > Also state what commit this fixes as well, otherwise this changelog 
> > > > > > just
> > > > > > looks like it is being applied again to the tree, which doesn't make
> > > > > > much sense.
> > > > > > 
> > > > > > thanks,
> > > > > > 
> > > > > > greg k-h
> > > > > > .
> > > > > 
> > > > > I wrote a cover for it. but forgot to adjust the title of the cover:
> > > > > 
> > > > > https://lore.kernel.org/lkml/20210222070328.102384-1-nixiaom...@huawei.com/
> > > > > 
> > > > > 
> > > > > I found a dead code in the queue/4.9 branch of the stable-rc 
> > > > > repository.
> > > > > 
> > > > > 2021-02-03:
> > > > > commit c27f392040e2f6 ("futex: Provide distinct return value when
> > > > >owner is exiting")
> > > > >   The function handle_exit_race does not exist. Therefore, the
> > > > >   change in handle_exit_race() is ignored in the patch round.
> > > > > 
> > > > > 2021-02-22:
> > > > > commit e55cb811e612 ("futex: Cure exit race")
> > > > >   Define the handle_exit_race() function,
> > > > >   but no branch in the function returns EBUSY.
> > > > >   As a result, dead code occurs in the attach_to_pi_owner():
> > > > > 
> > > > >   int ret = handle_exit_race(uaddr, uval, p);
> > > > >   ...
> > > > >   if (ret == -EBUSY)
> > > > >   *exiting = p; /* dead code */
> > > > > 
> > > > > To fix the dead code, modify the commit e55cb811e612 ("futex: Cure 
> > > > > exit
> > > > > race"),
> > > > > or install a patch to incorporate the changes in handle_exit_race().
> > > > > 
> > > > > I am unfamiliar with the processing of the stable-rc queue branch,
> > > > > and I cannot find the patch mail of the current branch in
> > > > >   https://lore.kernel.org/lkml/?q=%22futex%3A+Cure+exit+race%22
> > > > > Therefore, I re-integrated commit ac31c7ff8624 ("futex: Provide 
> > > > > distinct
> > > > >return value when owner is exiting").
> > > > >And wrote a cover (but forgot to adjust the title of the cover):
> > > > > 
> > > > > https://lore.kernel.org/lkml/20210222070328.102384-1-nixiaom...@huawei.com/
> > > > 
> > > > So this is a "fixup" patch, right?
> > > > 
> > > > Please clearly label it as such in your patch description and resend
> > > > this as what is here I can not apply at all.
> > > > 
> > > > thanks,
> > > > 
> > > > greg k-h
> > > > .
> > > > 
> > > Thank you for your guidance.
> > > I have updated the patch description and resent the patch based on
> > > v4.9.258-rc1
> > > https://lore.kernel.org/lkml/20210222125352.110124-1-nixiaom...@huawei.com/
> > 
> > Can you please try 4.9.258 and let me know if this is still needed or
> > not?
> > 
> > thanks,
> > 
> > greg k-h
> > .
> > 
> The dead code problem still exists in V4.9.258. No conflict occurs during my
> patch integration. Do I need to correct the version number marked in the cc
> table in the patch and resend the patch?

Please do.

thanks,

greg k-h

[PATCH v2 1/2] pstore: Add mem_type property DT parsing support

2021-02-23 Thread Mukesh Ojha

From: Mukesh Ojha 

There could be a scenario where we define some region
in normal memory and use it to store logs, which are later
retrieved by the bootloader during a warm reset.

In this scenario, we want to treat this memory as normal
cacheable memory instead of the default behaviour, which
is an overhead. Making it cacheable could improve
performance.

This commit gives control to change mem_type from Device
tree, and also documents the value for normal memory.

Signed-off-by: Mukesh Ojha 
---
Changes in v2:
 - if-else converted to switch case
 - updated MODULE_PARM_DESC with new memory type.
 - default setting is still intact.

 Documentation/admin-guide/ramoops.rst |  4 +++-
 fs/pstore/ram.c   |  3 ++-
 fs/pstore/ram_core.c  | 18 --
 3 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/ramoops.rst 
b/Documentation/admin-guide/ramoops.rst
index b0a1ae7..8f107d8 100644
--- a/Documentation/admin-guide/ramoops.rst
+++ b/Documentation/admin-guide/ramoops.rst
@@ -3,7 +3,7 @@ Ramoops oops/panic logger
 
 Sergiu Iordache 
 
-Updated: 17 November 2011
+Updated: 10 Feb 2021
 
 Introduction
 
@@ -30,6 +30,8 @@ mapping to pgprot_writecombine. Setting ``mem_type=1`` 
attempts to use
 depends on atomic operations. At least on ARM, pgprot_noncached causes the
 memory to be mapped strongly ordered, and atomic operations on strongly ordered
 memory are implementation defined, and won't work on many ARMs such as omaps.
+Setting ``mem_type=2`` attempts to treat the memory region as normal memory,
+which enables full cache on it. This can improve the performance.
 
 The memory area is divided into ``record_size`` chunks (also rounded down to
 power of two) and each kmesg dump writes a ``record_size`` chunk of
diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c
index ca6d8a8..af4ca6a4 100644
--- a/fs/pstore/ram.c
+++ b/fs/pstore/ram.c
@@ -56,7 +56,7 @@ MODULE_PARM_DESC(mem_size,
 static unsigned int mem_type;
 module_param(mem_type, uint, 0400);
 MODULE_PARM_DESC(mem_type,
-   "set to 1 to try to use unbuffered memory (default 0)");
+   "set to 1 to use unbuffered memory, 2 for cached memory (default 0)");
 
 static int ramoops_max_reason = -1;
 module_param_named(max_reason, ramoops_max_reason, int, 0400);
@@ -666,6 +666,7 @@ static int ramoops_parse_dt(struct platform_device *pdev,
field = value;  \
}
 
+   parse_u32("mem-type", pdata->mem_type, pdata->mem_type);
parse_u32("record-size", pdata->record_size, 0);
parse_u32("console-size", pdata->console_size, 0);
parse_u32("ftrace-size", pdata->ftrace_size, 0);
diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c
index aa8e0b6..0da012f 100644
--- a/fs/pstore/ram_core.c
+++ b/fs/pstore/ram_core.c
@@ -396,6 +396,10 @@ void persistent_ram_zap(struct persistent_ram_zone *prz)
persistent_ram_update_header_ecc(prz);
 }
 
+#define MEM_TYPE_WCOMBINE  0
+#define MEM_TYPE_NONCACHED 1
+#define MEM_TYPE_NORMAL2
+
 static void *persistent_ram_vmap(phys_addr_t start, size_t size,
unsigned int memtype)
 {
@@ -409,10 +413,20 @@ static void *persistent_ram_vmap(phys_addr_t start, 
size_t size,
page_start = start - offset_in_page(start);
page_count = DIV_ROUND_UP(size + offset_in_page(start), PAGE_SIZE);
 
-   if (memtype)
+   switch (memtype) {
+   case MEM_TYPE_NORMAL:
+   prot = PAGE_KERNEL;
+   break;
+   case MEM_TYPE_NONCACHED:
prot = pgprot_noncached(PAGE_KERNEL);
-   else
+   break;
+   case MEM_TYPE_WCOMBINE:
prot = pgprot_writecombine(PAGE_KERNEL);
+   break;
+   default:
+   pr_err("invalid memory type\n");
+   return NULL;
+   }
 
pages = kmalloc_array(page_count, sizeof(struct page *), GFP_KERNEL);
if (!pages) {
-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center,
Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative 
Project
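
The three-way selection that the patch adds to persistent_ram_vmap() can be sketched outside the kernel. The numeric values below follow the MEM_TYPE_* defines in the patch; everything else is an assumption for illustration: `mem_type_name()` is a hypothetical helper that returns a descriptive string where the kernel code picks a pgprot value.

```c
#include <stddef.h>

enum mem_type {
    MEM_TYPE_WCOMBINE = 0,  /* write-combined mapping (the default) */
    MEM_TYPE_NONCACHED = 1, /* uncached mapping */
    MEM_TYPE_NORMAL = 2     /* normal cacheable mapping */
};

/*
 * Map a mem_type value to a description of the mapping, or NULL for an
 * unknown value, mirroring the patch's default case that rejects it.
 */
static const char *mem_type_name(unsigned int memtype)
{
    switch (memtype) {
    case MEM_TYPE_NORMAL:
        return "cached";
    case MEM_TYPE_NONCACHED:
        return "noncached";
    case MEM_TYPE_WCOMBINE:
        return "writecombine";
    default:
        return NULL;
    }
}
```

Compared with the earlier if/else, the switch makes an out-of-range value an explicit error instead of silently treating every non-zero value as noncached.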

[PATCH 2/2] pstore: Add buffer start check during init

2021-02-23 Thread Mukesh Ojha

From: Huang Yiwei 

In a panic scenario where DRAM is used to store logs instead
of persistent storage, the data is copied out of RAM during a
warm reset. A missing check on prz->start (the write position)
can cause a crash, because it can hold any value and can point
outside the mapped region. Add the start check to avoid this.

Signed-off-by: Huang Yiwei 
Signed-off-by: Mukesh Ojha 
---
change in v2:
 - this is on top of first patchset.

 fs/pstore/ram_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c
index 0da012f..a15748a 100644
--- a/fs/pstore/ram_core.c
+++ b/fs/pstore/ram_core.c
@@ -514,7 +514,7 @@ static int persistent_ram_post_init(struct 
persistent_ram_zone *prz, u32 sig,
sig ^= PERSISTENT_RAM_SIG;
 
if (prz->buffer->sig == sig) {
-   if (buffer_size(prz) == 0) {
+   if (buffer_size(prz) == 0 && buffer_start(prz) == 0) {
pr_debug("found existing empty buffer\n");
return 0;
}
-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center,
Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative 
Project
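
The effect of the extra start check can be sketched as a small classifier. This is a userspace illustration under assumptions, not the ram_core.c code: `classify_zone()` and `enum zone_state` are hypothetical, and the sketch assumes that the validity test following the empty-buffer check rejects a size larger than the capacity or a start beyond size.

```c
#include <stddef.h>

enum zone_state { ZONE_EMPTY, ZONE_VALID, ZONE_CORRUPT };

/*
 * Classify a ring-buffer header. start is the write position, size the
 * number of valid bytes, and capacity the size of the mapped data area.
 * A zone counts as empty only when both size and start are zero, so a
 * stale non-zero start with size 0 falls through to the corruption test.
 */
static enum zone_state classify_zone(size_t start, size_t size,
                                     size_t capacity)
{
    if (size == 0 && start == 0)
        return ZONE_EMPTY;
    if (size > capacity || start > size)
        return ZONE_CORRUPT;
    return ZONE_VALID;
}
```

Under these assumptions a header with size 0 and start 100 is reported as corrupt rather than empty, which is the case the commit message describes.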

Re: [PATCH 01/13] staging: rtl8192e: remove blank line in bss_ht struct

2021-02-23 Thread Dan Carpenter

On Sat, Feb 20, 2021 at 03:54:05PM +, William Durand wrote:
> Fixes a checkpatch CHECK issue.
> 

All these patches have the same vague commit message.  It's okay if the
commit message basically restates the commit one line summary.  It
should say something like:

Fix a checkpatch warning about a blank line after an open curly brace.

Rename FooBar to foo_bar to silence a checkpatch warning about
CamelCase.

regards,
dan carpenter

Re: [PATCH] nand: brcmnand: fix OOB R/W with Hamming ECC

2021-02-23 Thread Miquel Raynal

Hi Álvaro,

Álvaro Fernández Rojas  wrote on Wed, 24 Feb 2021
08:16:58 +0100:

> Hi Florian,
> 
> > On 24 Feb 2021, at 4:46, Florian Fainelli wrote:
> > 
> > 
> > 
> > On 2/22/2021 12:16 PM, Álvaro Fernández Rojas wrote:
> >> Hamming ECC doesn't cover the OOB data, so reading or writing OOB shall
> >> always be done without ECC enabled.
> >> This is a problem when adding JFFS2 cleanmarkers to erased blocks. If JFFS2
> >> cleanmarkers are added to the OOB with ECC enabled, OOB bytes will be changed
> >> from ff ff ff to 00 00 00, reporting incorrect ECC errors.
> >> 
> >> Signed-off-by: Álvaro Fernández Rojas 
> > 
> > Should there be a Fixes: tag provided here for back porting to stable trees?
> 
> I think so, but the fixed commit would be the first one, right?
> 27c5b17cd1b10564fa36f8f51e4b4b41436ecc32

Yep, shouldn't be a problem.

Thanks,
Miquèl

Re: [PATCH v5 2/2] counter: add IRQ or GPIO based event counter

2021-02-23 Thread Oleksij Rempel

Hello William,

On Wed, Feb 24, 2021 at 11:34:06AM +0900, William Breathitt Gray wrote:
> On Tue, Feb 23, 2021 at 06:45:16PM +0100, Oleksij Rempel wrote:
> > Hello William,
> > 
> > Here is a cooled-down technical answer. Excuse me for overreacting.
> 
> Hello Oleksij,
> 
> Let me apologize too if I offended you in some way with my previous
> response, I assure you that was not my intention. I truly do believe
> this is a useful driver to have in the kernel and I want to make that
> happen; my concerns with your patch are purely technical in nature and 
> I'm certain we can find a solution working together.

No problem :)

> > On Tue, Feb 23, 2021 at 11:06:56AM +0100, Oleksij Rempel wrote:
> > > On Mon, Feb 22, 2021 at 10:43:00AM +0900, William Breathitt Gray wrote:
> > > > On Mon, Feb 15, 2021 at 10:17:37AM +0100, Oleksij Rempel wrote:
> > > > > > > +static irqreturn_t event_cnt_isr(int irq, void *dev_id)
> > > > > > > +{
> > > > > > > + struct event_cnt_priv *priv = dev_id;
> > > > > > > +
> > > > > > > + atomic_inc(&priv->count);
> > > > > > 
> > > > > > This is just used to count the number of interrupts right? I wonder 
> > > > > > if
> > > > > > we can do this smarter. For example, the kernel already keeps track 
> > > > > > of
> > > > > > number of interrupts that has occurred for any particular IRQ line 
> > > > > > on a
> > > > > > CPU (see the 'kstat_irqs' member of struct irq_desc, and the
> > > > > > show_interrupts() function in kernel/irq/proc.c). Would it make 
> > > > > > sense to
> > > > > > simply store the initial interrupt count on driver load or 
> > > > > > enablement,
> > > > > > and then return the difference during a count_read() callback?
> > > > > 
> > > > > This driver does not make a lot of sense without your chardev patches. As
> > > > > soon as these patches go mainline, this driver will be able to send
> > > > > events with a timestamp and counter state to user space.
> > > > > 
> > > > > In other words, we will need an irq handler anyway. In this case we
> > > > > can't save more RAM or CPU cycles by using system irq counters.
> > > > 
> > > > It's true that this driver will need an IRQ handler when the timestamp
> > > > > functionality is added, but deriving the count value is a different matter
> > > > regardless. There's already code in the kernel to retrieve the number of
> > > > interrupts, so it makes sense that we use that rather than rolling our
> > > > own -- at the very least to ensure the value we provide to users is
> > > > consistent with the ones already provided by other areas of the kernel.
> > 
> > The value provided by the driver is consistent only if it is not
> > overwritten by user. The driver provides an interface to reset/overwrite it.
> > At least after this step the value is not consistent.
> 
> I wasn't clear here so I apologize. What I would like is for this driver
> to maintain its own local count value derived from kstat_irqs_usr(). So
> for example, you can use the "count" member of your struct
> interrupt_cnt_priv to maintain this value (it can be unsigned int
> instead of atomic_t):
> 
> static int interrupt_cnt_read(struct counter_device *counter,
> struct counter_count *count, unsigned long *val)
> {
>   struct interrupt_cnt_priv *priv = counter->priv;
> 
>   *val = kstat_irqs_usr(priv->irq) - priv->count;
> 
>   return 0;
> }
> 
> static int interrupt_cnt_write(struct counter_device *counter,
>  struct counter_count *count,
>  const unsigned long val)
> {
>   struct interrupt_cnt_priv *priv = counter->priv;
> 
>   /* kstat_irqs_usr() returns unsigned int */
>   if (val != (unsigned int)val)
>   return -ERANGE;
> 
>   priv->count = val;
> 
>   return 0;
> }

I understand this part. There is no need to spend extra CPU cycles if
the interrupt was already counted. Just read it on user request and
calculate the offset if needed.

As soon as timestamp support is available, I will need to go back to
a local counter, because kstat_irqs_usr() takes a lot more CPU
cycles than a private counter (it sums over all per-CPU
counters). So it's better to increment a single variable than to call
kstat_irqs_usr() from the interrupt handler at an IRQ rate of several
tens of thousands of interrupts per second.
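
The "read it on user request and calculate the offset" idea can be sketched
in plain userspace C. This is only an illustration under assumptions:
total_events stands in for the per-IRQ total that kstat_irqs_usr() would
report, and struct offset_counter with its two helpers are hypothetical
names that are not part of the counter subsystem. Unlike the quoted snippet,
the write helper here stores a base value rather than the written value
itself; that is a choice made for this sketch only.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Hypothetical stand-in for the kernel's running per-IRQ total. */
static unsigned int total_events;

struct offset_counter {
	unsigned int base;	/* total that corresponds to a count of 0 */
};

/* Read: derive the count from the shared total, no private increment. */
static unsigned long offset_counter_read(const struct offset_counter *c)
{
	/* Unsigned subtraction wraps around in a well-defined way. */
	return total_events - c->base;
}

/* Write: pick the base so that the next read returns val. */
static int offset_counter_write(struct offset_counter *c, unsigned long val)
{
	/* The running total is an unsigned int, so larger values cannot fit. */
	if (val > UINT_MAX)
		return -ERANGE;

	c->base = total_events - (unsigned int)val;
	return 0;
}
```

With this layout nothing has to run per interrupt to keep the count, which
is the saving discussed above; the cost moves to the read path instead.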

> > > > We are talking about one or two lines of code. If we ran some
> > > > duplication search engine, it would find that a major part of the
> > > > kernel matches it.
> > > > 
> > > > Nevertheless, this driver provides a way to reset the counter. Why
> > > > should we drop this functionality for no advantage?
> > > 
> > > > To that end, I'd like to see your cnt_isr() function removed for this
> > > > patchset (you can bring it back once timestamp support is added).
> > 
> > > It makes no sense to request an interrupt without an interrupt service
> > > routine.
> > 
> >

[PATCH v2] mips: smp-bmips: fix CPU mappings

2021-02-23 Thread Álvaro Fernández Rojas

When booting bmips with SMP enabled on a BCM6358 running on CPU #1 instead of
CPU #0, the current CPU mapping code produces the following:
- smp_processor_id(): 0
- cpu_logical_map(0): 1
- cpu_number_map(0): 1

This is because SMP isn't supported on BCM6358 since it has a shared TLB, so
it is disabled and max_cpus is decreased from 2 to 1.

Signed-off-by: Álvaro Fernández Rojas 
Reviewed-by: Florian Fainelli 
---
 v2: Fix duplicated line

 arch/mips/kernel/smp-bmips.c | 27 +--
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/arch/mips/kernel/smp-bmips.c b/arch/mips/kernel/smp-bmips.c
index 359b176b665f..b6ef5f7312cf 100644
--- a/arch/mips/kernel/smp-bmips.c
+++ b/arch/mips/kernel/smp-bmips.c
@@ -134,17 +134,24 @@ static void __init bmips_smp_setup(void)
if (!board_ebase_setup)
board_ebase_setup = &bmips_ebase_setup;
 
-   __cpu_number_map[boot_cpu] = 0;
-   __cpu_logical_map[0] = boot_cpu;
-
-   for (i = 0; i < max_cpus; i++) {
-   if (i != boot_cpu) {
-   __cpu_number_map[i] = cpu;
-   __cpu_logical_map[cpu] = i;
-   cpu++;
+   if (max_cpus > 1) {
+   __cpu_number_map[boot_cpu] = 0;
+   __cpu_logical_map[0] = boot_cpu;
+
+   for (i = 0; i < max_cpus; i++) {
+   if (i != boot_cpu) {
+   __cpu_number_map[i] = cpu;
+   __cpu_logical_map[cpu] = i;
+   cpu++;
+   }
+   set_cpu_possible(i, 1);
+   set_cpu_present(i, 1);
}
-   set_cpu_possible(i, 1);
-   set_cpu_present(i, 1);
+   } else {
+   __cpu_number_map[0] = boot_cpu;
+   __cpu_logical_map[0] = 0;
+   set_cpu_possible(0, 1);
+   set_cpu_present(0, 1);
}
 }
 
-- 
2.20.1
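
For readers following along, the effect of the patch above can be reproduced
with a small userspace sketch. It is not the kernel code: the two arrays and
both functions are hypothetical stand-ins, the set_cpu_possible() and
set_cpu_present() calls are left out, and the initial value cpu = 1 is an
assumption because that declaration is not visible in the hunk.

```c
#include <assert.h>

#define NR_CPUS 2

/* Hypothetical stand-ins for __cpu_number_map and __cpu_logical_map. */
static int cpu_number_map[NR_CPUS];
static int cpu_logical_map[NR_CPUS];

/* Mapping as done by the removed lines of the diff. */
static void map_cpus_old(int boot_cpu, int max_cpus)
{
	int i, cpu = 1;	/* assumed initial value, not shown in the hunk */

	cpu_number_map[boot_cpu] = 0;
	cpu_logical_map[0] = boot_cpu;

	for (i = 0; i < max_cpus; i++) {
		if (i != boot_cpu) {
			cpu_number_map[i] = cpu;
			cpu_logical_map[cpu] = i;
			cpu++;
		}
	}
}

/* Mapping as done by the added lines of the diff. */
static void map_cpus_new(int boot_cpu, int max_cpus)
{
	if (max_cpus > 1) {
		/* The SMP case is unchanged by the patch. */
		map_cpus_old(boot_cpu, max_cpus);
	} else {
		cpu_number_map[0] = boot_cpu;
		cpu_logical_map[0] = 0;
	}
}
```

Calling map_cpus_old(1, 1) leaves cpu_logical_map[0] and cpu_number_map[0]
both at 1, which matches the values listed in the commit message, while
map_cpus_new(1, 1) sets cpu_logical_map[0] to 0.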

[PATCH v2] fs/cifs:simplify the return expression of cifs_swn_auth_info_krb

2021-02-23 Thread dingsenjie

From: dingsenjie 

Simplify the return expression of cifs_swn_auth_info_krb().

Signed-off-by: dingsenjie 
---
 fs/cifs/cifs_swn.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/fs/cifs/cifs_swn.c b/fs/cifs/cifs_swn.c
index d35f599..67b3f4e 100644
--- a/fs/cifs/cifs_swn.c
+++ b/fs/cifs/cifs_swn.c
@@ -34,13 +34,7 @@ struct cifs_swn_reg {
 
 static int cifs_swn_auth_info_krb(struct cifs_tcon *tcon, struct sk_buff *skb)
 {
-   int ret;
-
-   ret = nla_put_flag(skb, CIFS_GENL_ATTR_SWN_KRB_AUTH);
-   if (ret < 0)
-   return ret;
-
-   return 0;
+   return nla_put_flag(skb, CIFS_GENL_ATTR_SWN_KRB_AUTH);
 }
 
 static int cifs_swn_auth_info_ntlm(struct cifs_tcon *tcon, struct sk_buff *skb)
-- 
1.9.1

Re: [PATCH] drivers: staging: comedi: Fixed side effects from macro definition.

2021-02-23 Thread Greg Kroah-Hartman

A: http://en.wikipedia.org/wiki/Top_post
Q: Where do I find info about this thing called top-posting?
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top

On Wed, Feb 24, 2021 at 12:47:26PM +0530, chakravarthi Kulkarni wrote:
> Hi,
> 
> I tested it with unit test cases and it looks fine.
> int x = 10;
> NI_USUAL_PFI_SELECT(x++)
> 
> will not have side effects, as that is taken care of by using a local
> variable in the macro.

You ignored what Ian said about why this change was not ok :(

It's long deleted from my review queue, sorry.

greg k-h
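
As background on the side-effect question in this thread, here is a minimal
userspace sketch of how a macro that names its argument more than once
differs from a function. PICK_UNSAFE and pick_safe are hypothetical names
with a made-up mapping; they are not the comedi NI_USUAL_PFI_SELECT
definition, which is not quoted here.

```c
#include <assert.h>

/* Hypothetical macro: its argument appears, and is evaluated, twice. */
#define PICK_UNSAFE(x)	(((x) < 10) ? (1 + (x)) : (11 + (x)))

/* Same mapping as a function: the argument is evaluated exactly once. */
static inline int pick_safe(int x)
{
	return (x < 10) ? (1 + x) : (11 + x);
}
```

With int a = 5, PICK_UNSAFE(a++) increments a twice, whereas pick_safe(a++)
increments it once. A function also avoids relying on compiler extensions
for a local variable inside a macro.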

[PATCH] powerpc: remove unneeded semicolon

2021-02-23 Thread Jiapeng Chong

Fix the following coccicheck warnings:

./arch/powerpc/kernel/prom_init.c:2986:2-3: Unneeded semicolon.

Reported-by: Abaci Robot 
Signed-off-by: Jiapeng Chong 
---
 arch/powerpc/kernel/prom_init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index ccf77b9..41ed7e3 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -2983,7 +2983,7 @@ static void __init fixup_device_tree_efika_add_phy(void)
" 0x3 encode-int encode+"
" s\" interrupts\" property"
" finish-device");
-   };
+   }
 
/* Check for a PHY device node - if missing then create one and
 * give it's phandle to the ethernet node */
-- 
1.8.3.1

Re: [PATCH v1 2/4] clk: rockchip: add dt-binding header for rk3568

2021-02-23 Thread Heiko Stübner

Hi Elaine,

On Wednesday, 24 February 2021, 07:35:30 CET, elaine.zhang wrote:
> Hi, Heiko:
> 
> On 2021/2/23 at 6:45 PM, Heiko Stübner wrote:
> > Hi Elaine,
> >
> > On Tuesday, 23 February 2021, 10:53:50 CET, Elaine Zhang wrote:
> >> Add the dt-bindings header for the rk3568, that gets shared between
> >> the clock controller and the clock references in the dts.
> >> Add softreset ID for rk3568.
> >>
> >> Signed-off-by: Elaine Zhang 
> >> ---
> >>   include/dt-bindings/clock/rk3568-cru.h | 926 +
> >>   1 file changed, 926 insertions(+)
> >>   create mode 100644 include/dt-bindings/clock/rk3568-cru.h
> >>
> >> diff --git a/include/dt-bindings/clock/rk3568-cru.h 
> >> b/include/dt-bindings/clock/rk3568-cru.h
> >> new file mode 100644
> >> index ..22b0b8739b5d
> >> --- /dev/null
> >> +++ b/include/dt-bindings/clock/rk3568-cru.h
> >> @@ -0,0 +1,926 @@
> >> +/* SPDX-License-Identifier: GPL-2.0 */
> >> +/*
> >> + * Copyright (c) 2021 Rockchip Electronics Co. Ltd.
> >> + * Author: Elaine Zhang 
> >> + */
> >> +
> >> +#ifndef _DT_BINDINGS_CLK_ROCKCHIP_RK3568_H
> >> +#define _DT_BINDINGS_CLK_ROCKCHIP_RK3568_H
> >> +
> >> +/* pmucru-clocks indices */
> >> +
> >> +/* pmucru plls */
> >> +#define PLL_PPLL  1
> >> +#define PLL_HPLL  2
> >> +
> >> +/* pmucru clocks */
> >> +#define XIN_OSC0_DIV  4
> >> +#define CLK_RTC_32K   5
> >> +#define CLK_PMU   6
> >> +#define CLK_I2C0  7
> > can we change the prefix of CLK_* ids to SCLK_*
> > (for special clock), like on previous socs.
> >
> > Especially as some of them already have that SCLK_ prefix anyway.
> >
> > Having that 4-letter prefix makes reading these IDs easier as well ;-)
> 
> SCLK is for special clock, CLK is for common clock.
> 
> rk3568-cru.h is automatically generated from TRM using tools.
> Can we minimize the work of manual modification?
> Because of the increasing number of clocks, writing by hand often leads to
> mistakes. We use tools to generate rk3568-cru.h (100% tool-generated) and the
> register descriptions in clk-rk3568.c (50% tool-generated).

ok, sounds good.

I didn't realize you're using tools now, so yes the
autogenerated header can stay as it is then.


Heiko


> 
> >
> >
> > Thanks
> > Heiko
> >
> >> +#define CLK_RTC32K_FRAC   8
> >> +#define CLK_UART0_DIV 9
> >> +#define CLK_UART0_FRAC 10
> >> +#define SCLK_UART0 11
> >> +#define DBCLK_GPIO0   12
> >> +#define CLK_PWM0  13
> >> +#define CLK_CAPTURE_PWM0_NDFT 14
> >> +#define CLK_PMUPVTM   15
> >> +#define CLK_CORE_PMUPVTM  16
> >> +#define CLK_REF24M 17
> >> +#define XIN_OSC0_USBPHY0_G 18
> >> +#define CLK_USBPHY0_REF   19
> >> +#define XIN_OSC0_USBPHY1_G 20
> >> +#define CLK_USBPHY1_REF   21
> >> +#define XIN_OSC0_MIPIDSIPHY0_G 22
> >> +#define CLK_MIPIDSIPHY0_REF   23
> >> +#define XIN_OSC0_MIPIDSIPHY1_G 24
> >> +#define CLK_MIPIDSIPHY1_REF   25
> >> +#define CLK_WIFI_DIV  26
> >> +#define CLK_WIFI_OSC0 27
> >> +#define CLK_WIFI  28
> >> +#define CLK_PCIEPHY0_DIV  29
> >> +#define CLK_PCIEPHY0_OSC0 30
> >> +#define CLK_PCIEPHY0_REF  31
> >> +#define CLK_PCIEPHY1_DIV  32
> >> +#define CLK_PCIEPHY1_OSC0 33
> >> +#define CLK_PCIEPHY1_REF  34
> >> +#define CLK_PCIEPHY2_DIV  35
> >> +#define CLK_PCIEPHY2_OSC0 36
> >> +#define CLK_PCIEPHY2_REF  37
> >> +#define CLK_PCIE30PHY_REF_M   38
> >> +#define CLK_PCIE30PHY_REF_N   39
> >> +#define CLK_HDMI_REF  40
> >> +#define XIN_OSC0_EDPPHY_G 41
> >> +#define PCLK_PDPMU 42
> >> +#define PCLK_PMU  43
> >> +#define PCLK_UART0 44
> >> +#define PCLK_I2C0 45
> >> +#define PCLK_GPIO0 46
> >> +#define PCLK_PMUPVTM  47
> >> +#define PCLK_PWM0 48
> >> +#define CLK_PDPMU 49
> >> +#define SCLK_32K_IOE  50
> >> +
> >> +#define CLKPMU_NR_CLKS (SCLK_32K_IOE + 1)
> >> +
> >> +/* cru-clocks indices */
> >> +
> >> +/* cru plls */
> >> +#define PLL_APLL  1
> >> +#define PLL_DPLL  2
> >> +#define PLL_CPLL  3
> >> +#define PLL_GPLL  4
> >> +#define PLL_VPLL  5
> >> +#define PLL_NPLL  6
> >> +
> >> +/* cru clocks */
> >> +#define CPLL_333M 9
> >> +#define ARMCLK 10
> >> +#define USB480M   11
> >> +#define ACLK_CORE_NIU2BUS 18
> >> +#define CLK_CORE_PVTM 19
> >> +#define CLK_CORE_PVTM_CORE 20
> >> +#define CLK_CORE_PVTPLL   21
> >> +#define CLK_GPU_SRC   22
> >> +#define CLK_GPU_PRE_NDFT  23
> >> +#define CLK_GPU_PRE_MUX   24
> >> +#define ACLK_GPU_PRE  25
> >> +#define PCLK_GPU_PRE  26
> >> +#define CLK_GPU   27
> >> +#define CLK_GPU_NP5   28
> >> +#define PCLK_GPU_PVTM 29
> >>

[PATCH v2] vio: make remove callback return void

2021-02-23 Thread Uwe Kleine-König

The driver core ignores the return value of struct bus_type::remove()
because there is only little that can be done. To simplify the quest to
make this function return void, let struct vio_driver::remove() return
void, too. All users already unconditionally return 0, this commit makes
it obvious that returning an error code is a bad idea and makes it
obvious for future driver authors that returning an error code isn't
intended.

Note there are two nominally different implementations for a vio bus:
one in arch/sparc/kernel/vio.c and the other in
arch/powerpc/platforms/pseries/vio.c. I didn't care to check which
driver is using which of these busses (or if even some of them can be
used with both) and simply adapt all drivers and the two bus codes in
one go.

Note that for the powerpc implementation there is a semantic change:
Before this patch, vio_cmo_bus_remove(viodev) wasn't called for a device
that was bound to a driver without a remove callback. As the device
core still considers the device unbound after vio_bus_remove() returns,
calling this unconditionally is the consistent behaviour, which is
implemented here.

Reviewed-by: Tyrel Datwyler 
Acked-by: Lijun Pan 
Acked-by: Greg Kroah-Hartman 
Signed-off-by: Uwe Kleine-König 
---
Hello,

v1 (sent with Message-Id: <20210127215010.99954-1-...@kleine-koenig.org>)
had a precondition that was unfulfilled back then: a patch to
drivers/net/ethernet/ibm/ibmvnic.c. That patch already got into v5.11 as
5e9eff5dfa46 "ibmvnic: device remove has higher precedence over reset".
So the way is free for this patch.

Compared to v1 I rebased on a later linus/master and added acks.

Best regards
Uwe

 arch/powerpc/include/asm/vio.h   | 2 +-
 arch/powerpc/platforms/pseries/vio.c | 7 +++
 arch/sparc/include/asm/vio.h | 2 +-
 arch/sparc/kernel/ds.c   | 6 --
 arch/sparc/kernel/vio.c  | 4 ++--
 drivers/block/sunvdc.c   | 3 +--
 drivers/char/hw_random/pseries-rng.c | 3 +--
 drivers/char/tpm/tpm_ibmvtpm.c   | 4 +---
 drivers/crypto/nx/nx-842-pseries.c   | 4 +---
 drivers/crypto/nx/nx.c   | 4 +---
 drivers/misc/ibmvmc.c| 4 +---
 drivers/net/ethernet/ibm/ibmveth.c   | 4 +---
 drivers/net/ethernet/ibm/ibmvnic.c   | 4 +---
 drivers/net/ethernet/sun/ldmvsw.c| 4 +---
 drivers/net/ethernet/sun/sunvnet.c   | 3 +--
 drivers/scsi/ibmvscsi/ibmvfc.c   | 3 +--
 drivers/scsi/ibmvscsi/ibmvscsi.c | 4 +---
 drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c | 4 +---
 drivers/tty/hvc/hvcs.c   | 3 +--
 drivers/tty/vcc.c| 4 +---
 20 files changed, 22 insertions(+), 54 deletions(-)

diff --git a/arch/powerpc/include/asm/vio.h b/arch/powerpc/include/asm/vio.h
index 0cf52746531b..721c0d6715ac 100644
--- a/arch/powerpc/include/asm/vio.h
+++ b/arch/powerpc/include/asm/vio.h
@@ -113,7 +113,7 @@ struct vio_driver {
const char *name;
const struct vio_device_id *id_table;
int (*probe)(struct vio_dev *dev, const struct vio_device_id *id);
-   int (*remove)(struct vio_dev *dev);
+   void (*remove)(struct vio_dev *dev);
/* A driver must have a get_desired_dma() function to
 * be loaded in a CMO environment if it uses DMA.
 */
diff --git a/arch/powerpc/platforms/pseries/vio.c 
b/arch/powerpc/platforms/pseries/vio.c
index b2797cfe4e2b..9cb4fc839fd5 100644
--- a/arch/powerpc/platforms/pseries/vio.c
+++ b/arch/powerpc/platforms/pseries/vio.c
@@ -1261,7 +1261,6 @@ static int vio_bus_remove(struct device *dev)
struct vio_dev *viodev = to_vio_dev(dev);
struct vio_driver *viodrv = to_vio_driver(dev->driver);
struct device *devptr;
-   int ret = 1;
 
/*
 * Hold a reference to the device after the remove function is called
@@ -1270,13 +1269,13 @@ static int vio_bus_remove(struct device *dev)
devptr = get_device(dev);
 
if (viodrv->remove)
-   ret = viodrv->remove(viodev);
+   viodrv->remove(viodev);
 
-   if (!ret && firmware_has_feature(FW_FEATURE_CMO))
+   if (firmware_has_feature(FW_FEATURE_CMO))
vio_cmo_bus_remove(viodev);
 
put_device(devptr);
-   return ret;
+   return 0;
 }
 
 /**
diff --git a/arch/sparc/include/asm/vio.h b/arch/sparc/include/asm/vio.h
index 059f0eb678e0..8a1a83bbb6d5 100644
--- a/arch/sparc/include/asm/vio.h
+++ b/arch/sparc/include/asm/vio.h
@@ -362,7 +362,7 @@ struct vio_driver {
struct list_head node;
const struct vio_device_id  *id_table;
int (*probe)(struct vio_dev *dev, const struct vio_device_id *id);
-   int (*remove)(struct vio_dev *dev);
+   void (*remove)(struct vio_dev *dev);
void (*shutdown)(struct vio_dev *dev);
unsigned long   driver_data;
struct device_driver driver;
diff --git a/arch/sparc/kernel/ds.c

Re: [PATCH v2] mm: vmstat: fix /proc/sys/vm/stat_refresh generating false warnings

2021-02-23 Thread Hugh Dickins

On Thu, 6 Aug 2020, Andrew Morton wrote:
> On Thu, 6 Aug 2020 16:38:04 -0700 Roman Gushchin  wrote:

August, yikes, I thought it was much more recent.

> 
> > it seems that Hugh and I haven't reached a consensus here.
> > Can you please not merge this patch into 5.9, so we would have
> > more time to find a solution, acceptable for all?
> 
> No probs.  I already had a big red asterisk on it ;)

I've a suspicion that Andrew might be tiring of his big red asterisk,
and wanting to unload
mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings.patch
mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix.patch
mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix-2.patch
into 5.12.

I would prefer not, and reiterate my Nack: but no great harm will
befall the cosmos if he overrules that, and it does go through to
5.12 - I'll just want to revert it again later.  And I do think a
more straightforward way of suppressing those warnings would be just
to delete the code that issues them, rather than brushing them under
a carpet of overtuning.

I've been running mmotm with the patch below (shown as sign of good
faith, and for you to try, but not ready to go yet) for a few months
now - overriding your max_drift, restoring nr_writeback and friends to
the same checking, fixing the obvious reason why nr_zone_write_pending
and nr_writeback are seen negative occasionally (interrupt interrupting
to decrement those stats before they have even been incremented).
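
The ordering problem described in the paragraph above can be illustrated
with a single-threaded userspace sketch. Everything here is a hypothetical
stand-in: nr_writeback and flag_set model the statistic and the page flag,
and the irq callback models the interrupt arriving at the worst possible
moment. The correction for an already-set flag that the patch below performs
is left out to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>

static long nr_writeback;	/* stand-in for the real statistic */
static long lowest_seen;	/* lowest value the counter ever reached */
static bool flag_set;		/* stand-in for the page's writeback flag */

/* Record the counter's minimum so transient negatives are visible. */
static void note(void)
{
	if (nr_writeback < lowest_seen)
		lowest_seen = nr_writeback;
}

/* Completion path: clears the flag and decrements the counter. */
static void completion(void)
{
	if (flag_set) {
		flag_set = false;
		nr_writeback--;
		note();
	}
}

/* Old order: set the flag, let the "interrupt" run, then increment. */
static void start_inc_after(void (*irq)(void))
{
	flag_set = true;
	if (irq)
		irq();
	nr_writeback++;
	note();
}

/* New order: increment in advance, then set the flag. */
static void start_inc_before(void (*irq)(void))
{
	nr_writeback++;
	note();
	flag_set = true;
	if (irq)
		irq();
}
```

Passing completion as the callback, start_inc_after() lets the counter dip
to -1 before it returns to 0, while start_inc_before() never goes below 0.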

Two big BUTs (if not asterisks): since adding that patch, I have
usually forgotten all about it, so forgotten to run the script that
echoes /proc/sys/vm/stat_refresh at odd intervals while under load:
so have less data than I'd intended by now.  And secondly (and I've
just checked again this evening) I do still see nr_zone_write_pending
and nr_writeback occasionally caught negative while under load.  So,
there's something more at play, perhaps the predicted Gushchin Effect
(but wouldn't they go together if so? I've only seen them separately),
or maybe something else, I don't know.

Those are the only stats I've seen caught negative, but I don't have
CMA configured at all.  You mention nr_free_cma as the only(?) other
stat you've seen negative, that of course I won't see, but looking
at the source I now notice that NR_FREE_CMA_PAGES is incremented
and decremented according to page migratetype...

... internally we have another stat that's incremented and decremented
according to page migratetype, and that one has been seen negative too:
isn't page migratetype something that usually stays the same, but
sometimes the migratetype of the page's block can change, even while
some pages of it are allocated?  Not a stable basis for maintaining
stats, though won't matter much if they are only for display.

vmstat_refresh could just exempt nr_zone_write_pending, nr_writeback
and nr_free_cma from warnings, if we cannot find a fix to them: but
I see no reason to suppress warnings on all the other vmstats.

The patch I've been testing with:

--- mmotm/mm/page-writeback.c   2021-02-14 14:32:24.0 -0800
+++ hughd/mm/page-writeback.c   2021-02-20 18:01:11.264162616 -0800
@@ -2769,6 +2769,13 @@ int __test_set_page_writeback(struct pag
int ret, access_ret;

lock_page_memcg(page);
+   /*
+* Increment counts in advance, so that they will not go negative
+* if test_clear_page_writeback() comes in to decrement them.
+*/
+   inc_lruvec_page_state(page, NR_WRITEBACK);
+   inc_zone_page_state(page, NR_ZONE_WRITE_PENDING);
+
if (mapping && mapping_use_writeback_tags(mapping)) {
XA_STATE(xas, &mapping->i_pages, page_index(page));
struct inode *inode = mapping->host;
@@ -2804,9 +2811,14 @@ int __test_set_page_writeback(struct pag
} else {
ret = TestSetPageWriteback(page);
}
-   if (!ret) {
-   inc_lruvec_page_state(page, NR_WRITEBACK);
-   inc_zone_page_state(page, NR_ZONE_WRITE_PENDING);
+
+   if (WARN_ON_ONCE(ret)) {
+   /*
+* Correct counts in retrospect, if PageWriteback was already
+* set; but does any filesystem ever allow this to happen?
+*/
+   dec_lruvec_page_state(page, NR_WRITEBACK);
+   dec_zone_page_state(page, NR_ZONE_WRITE_PENDING);
}
unlock_page_memcg(page);
access_ret = arch_make_page_accessible(page);
--- mmotm/mm/vmstat.c   2021-02-20 17:59:44.838171232 -0800
+++ hughd/mm/vmstat.c   2021-02-20 18:01:11.272162661 -0800
@@ -1865,7 +1865,7 @@ int vmstat_refresh(struct ctl_table *tab

for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++) {
val = atomic_long_read(&vm_zone_stat[i]);
-   if (val < -max_drift) {
+   if (val < 0) {
pr_warn("%s: %s %ld\n",
__func__, zone_stat_name(i), val);

Re: [PATCH v3 2/3] dt-bindings: usb: generic-ehci: document spurious-oc flag

2021-02-23 Thread Álvaro Fernández Rojas

I didn’t change this, but I missed Alan’s Acked-by, so:
Acked-by: Alan Stern 

> On 23 Feb 2021, at 18:44, Álvaro Fernández Rojas wrote:
> 
> Over-current reporting isn't supported on some platforms such as bcm63xx.
> These devices will incorrectly report over-current if this flag isn't properly
> activated.
> 
> Signed-off-by: Álvaro Fernández Rojas 
> ---
> v3: no changes.
> v2: change flag name and improve documentation as suggested by Alan Stern.
> 
> Documentation/devicetree/bindings/usb/generic-ehci.yaml | 6 ++
> 1 file changed, 6 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/usb/generic-ehci.yaml 
> b/Documentation/devicetree/bindings/usb/generic-ehci.yaml
> index cf83f2d9afac..8089dc956ba3 100644
> --- a/Documentation/devicetree/bindings/usb/generic-ehci.yaml
> +++ b/Documentation/devicetree/bindings/usb/generic-ehci.yaml
> @@ -122,6 +122,12 @@ properties:
> description:
>   Set this flag to force EHCI reset after resume.
> 
> +  spurious-oc:
> +$ref: /schemas/types.yaml#/definitions/flag
> +description:
> +  Set this flag to indicate that the hardware sometimes turns on
> +  the OC bit when an over-current isn't actually present.
> +
>   companion:
> $ref: /schemas/types.yaml#/definitions/phandle
> description:
> -- 
> 2.20.1
>

Re: [PATCH] vdpa/mlx5: set_features should allow reset to zero

2021-02-23 Thread Michael S. Tsirkin

On Wed, Feb 24, 2021 at 02:53:08PM +0800, Jason Wang wrote:
> 
> > On 2021/2/24 2:46 PM, Michael S. Tsirkin wrote:
> > On Wed, Feb 24, 2021 at 02:04:36PM +0800, Jason Wang wrote:
> > > > On 2021/2/24 1:04 PM, Michael S. Tsirkin wrote:
> > > > On Tue, Feb 23, 2021 at 11:35:57AM -0800, Si-Wei Liu wrote:
> > > > > On 2/23/2021 5:26 AM, Michael S. Tsirkin wrote:
> > > > > > On Tue, Feb 23, 2021 at 10:03:57AM +0800, Jason Wang wrote:
> > > > > > > On 2021/2/23 9:12 AM, Si-Wei Liu wrote:
> > > > > > > > On 2/21/2021 11:34 PM, Michael S. Tsirkin wrote:
> > > > > > > > > On Mon, Feb 22, 2021 at 12:14:17PM +0800, Jason Wang wrote:
> > > > > > > > > > On 2021/2/19 7:54 PM, Si-Wei Liu wrote:
> > > > > > > > > > > Commit 452639a64ad8 ("vdpa: make sure set_features is 
> > > > > > > > > > > invoked
> > > > > > > > > > > for legacy") made an exception for legacy guests to reset
> > > > > > > > > > > features to 0, when config space is accessed before 
> > > > > > > > > > > features
> > > > > > > > > > > are set. We should relieve the verify_min_features() check
> > > > > > > > > > > and allow features reset to 0 for this case.
> > > > > > > > > > > 
> > > > > > > > > > > It's worth noting that not just legacy guests could access
> > > > > > > > > > > config space before features are set. For instance, when
> > > > > > > > > > > feature VIRTIO_NET_F_MTU is advertised some modern driver
> > > > > > > > > > > will try to access and validate the MTU present in the 
> > > > > > > > > > > config
> > > > > > > > > > > space before virtio features are set.
> > > > > > > > > > This looks like a spec violation:
> > > > > > > > > > 
> > > > > > > > > > "
> > > > > > > > > > 
> > > > > > > > > > The following driver-read-only field, mtu only exists if
> > > > > > > > > > VIRTIO_NET_F_MTU is
> > > > > > > > > > set.
> > > > > > > > > > This field specifies the maximum MTU for the driver to use.
> > > > > > > > > > "
> > > > > > > > > > 
> > > > > > > > > > Do we really want to workaround this?
> > > > > > > > > > 
> > > > > > > > > > Thanks
> > > > > > > > > And also:
> > > > > > > > > 
> > > > > > > > > The driver MUST follow this sequence to initialize a device:
> > > > > > > > > 1. Reset the device.
> > > > > > > > > 2. Set the ACKNOWLEDGE status bit: the guest OS has noticed 
> > > > > > > > > the device.
> > > > > > > > > 3. Set the DRIVER status bit: the guest OS knows how to drive 
> > > > > > > > > the
> > > > > > > > > device.
> > > > > > > > > 4. Read device feature bits, and write the subset of feature 
> > > > > > > > > bits
> > > > > > > > > understood by the OS and driver to the
> > > > > > > > > device. During this step the driver MAY read (but MUST NOT 
> > > > > > > > > write)
> > > > > > > > > the device-specific configuration
> > > > > > > > > fields to check that it can support the device before 
> > > > > > > > > accepting it.
> > > > > > > > > 5. Set the FEATURES_OK status bit. The driver MUST NOT accept 
> > > > > > > > > new
> > > > > > > > > feature bits after this step.
> > > > > > > > > 6. Re-read device status to ensure the FEATURES_OK bit is 
> > > > > > > > > still set:
> > > > > > > > > otherwise, the device does not
> > > > > > > > > support our subset of features and the device is unusable.
> > > > > > > > > 7. Perform device-specific setup, including discovery of 
> > > > > > > > > virtqueues
> > > > > > > > > for the device, optional per-bus setup,
> > > > > > > > > reading and possibly writing the device’s virtio configuration
> > > > > > > > > space, and population of virtqueues.
> > > > > > > > > 8. Set the DRIVER_OK status bit. At this point the device is 
> > > > > > > > > “live”.
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > so accessing config space before FEATURES_OK is a spec 
> > > > > > > > > violation, right?
> > > > > > > > It is, but it's not relevant to what this commit tries to 
> > > > > > > > address. I
> > > > > > > > thought the legacy guest still needs to be supported.
> > > > > > > > 
> > > > > > > > Having said that, a separate patch has to be posted to fix the guest 
> > > > > > > > driver
> > > > > > > > issue where this discrepancy is introduced to 
> > > > > > > > virtnet_validate() (since
> > > > > > > > commit fe36cbe067). But it's not technically related to this 
> > > > > > > > patch.
> > > > > > > > 
> > > > > > > > -Siwei
> > > > > > > I think it's a bug to read config space in validate, we should 
> > > > > > > move it to
> > > > > > > virtnet_probe().
> > > > > > > 
> > > > > > > Thanks
> > > > > > I take it back, reading but not writing seems to be explicitly 
> > > > > > allowed by spec.
> > > > > > So our way to detect a legacy guest is bogus, need to think what is
> > > > > > the best way to handle this.
> > > > > Then maybe revert commit fe36cbe067 and friends, and have QEMU detect 
> > > > > legacy
> > > > > guest? Supposedly only config space write access needs to be guarded 
> > > > > before
> > > > > setting FEATURES_OK.
> > > > > 
> > > > > -Siwie
> >

Re: [PATCH] vdpa/mlx5: set_features should allow reset to zero

2021-02-23 Thread Eli Cohen

On Wed, Feb 24, 2021 at 02:55:13PM +0800, Jason Wang wrote:
> 
> > On 2021/2/24 2:47 PM, Michael S. Tsirkin wrote:
> > On Wed, Feb 24, 2021 at 08:45:20AM +0200, Eli Cohen wrote:
> > > On Wed, Feb 24, 2021 at 12:17:58AM -0500, Michael S. Tsirkin wrote:
> > > > On Wed, Feb 24, 2021 at 11:20:01AM +0800, Jason Wang wrote:
> > > > > > On 2021/2/24 3:35 AM, Si-Wei Liu wrote:
> > > > > > 
> > > > > > On 2/23/2021 5:26 AM, Michael S. Tsirkin wrote:
> > > > > > > On Tue, Feb 23, 2021 at 10:03:57AM +0800, Jason Wang wrote:
> > > > > > > > > On 2021/2/23 9:12 AM, Si-Wei Liu wrote:
> > > > > > > > > On 2/21/2021 11:34 PM, Michael S. Tsirkin wrote:
> > > > > > > > > > On Mon, Feb 22, 2021 at 12:14:17PM +0800, Jason Wang wrote:
> > > > > > > > > > > > On 2021/2/19 7:54 PM, Si-Wei Liu wrote:
> > > > > > > > > > > > Commit 452639a64ad8 ("vdpa: make sure set_features is 
> > > > > > > > > > > > invoked
> > > > > > > > > > > > for legacy") made an exception for legacy guests to 
> > > > > > > > > > > > reset
> > > > > > > > > > > > features to 0, when config space is accessed before 
> > > > > > > > > > > > features
> > > > > > > > > > > > are set. We should relieve the verify_min_features() 
> > > > > > > > > > > > check
> > > > > > > > > > > > and allow features reset to 0 for this case.
> > > > > > > > > > > > 
> > > > > > > > > > > > It's worth noting that not just legacy guests could 
> > > > > > > > > > > > access
> > > > > > > > > > > > config space before features are set. For instance, when
> > > > > > > > > > > > feature VIRTIO_NET_F_MTU is advertised some modern 
> > > > > > > > > > > > driver
> > > > > > > > > > > > will try to access and validate the MTU present in the 
> > > > > > > > > > > > config
> > > > > > > > > > > > space before virtio features are set.
> > > > > > > > > > > This looks like a spec violation:
> > > > > > > > > > > 
> > > > > > > > > > > "
> > > > > > > > > > > 
> > > > > > > > > > > The following driver-read-only field, mtu only exists if
> > > > > > > > > > > VIRTIO_NET_F_MTU is
> > > > > > > > > > > set.
> > > > > > > > > > > This field specifies the maximum MTU for the driver to 
> > > > > > > > > > > use.
> > > > > > > > > > > "
> > > > > > > > > > > 
> > > > > > > > > > > Do we really want to workaround this?
> > > > > > > > > > > 
> > > > > > > > > > > Thanks
> > > > > > > > > > And also:
> > > > > > > > > > 
> > > > > > > > > > The driver MUST follow this sequence to initialize a device:
> > > > > > > > > > 1. Reset the device.
> > > > > > > > > > 2. Set the ACKNOWLEDGE status bit: the guest OS has
> > > > > > > > > > noticed the device.
> > > > > > > > > > 3. Set the DRIVER status bit: the guest OS knows how to 
> > > > > > > > > > drive the
> > > > > > > > > > device.
> > > > > > > > > > 4. Read device feature bits, and write the subset of 
> > > > > > > > > > feature bits
> > > > > > > > > > understood by the OS and driver to the
> > > > > > > > > > device. During this step the driver MAY read (but MUST NOT 
> > > > > > > > > > write)
> > > > > > > > > > the device-specific configuration
> > > > > > > > > > fields to check that it can support the device before 
> > > > > > > > > > accepting it.
> > > > > > > > > > 5. Set the FEATURES_OK status bit. The driver MUST NOT 
> > > > > > > > > > accept new
> > > > > > > > > > feature bits after this step.
> > > > > > > > > > 6. Re-read device status to ensure the FEATURES_OK bit is 
> > > > > > > > > > still set:
> > > > > > > > > > otherwise, the device does not
> > > > > > > > > > support our subset of features and the device is unusable.
> > > > > > > > > > 7. Perform device-specific setup, including discovery of 
> > > > > > > > > > virtqueues
> > > > > > > > > > for the device, optional per-bus setup,
> > > > > > > > > > reading and possibly writing the device’s virtio 
> > > > > > > > > > configuration
> > > > > > > > > > space, and population of virtqueues.
> > > > > > > > > > 8. Set the DRIVER_OK status bit. At this point the device 
> > > > > > > > > > is “live”.
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > so accessing config space before FEATURES_OK is a spec
> > > > > > > > > > violation, right?
> > > > > > > > > It is, but it's not relevant to what this commit tries to 
> > > > > > > > > address. I
> > > > > > > > > thought the legacy guest still needs to be supported.
> > > > > > > > > 
> > > > > > > > > Having said, a separate patch has to be posted to fix the 
> > > > > > > > > guest driver
> > > > > > > > > issue where this discrepancy is introduced to
> > > > > > > > > virtnet_validate() (since
> > > > > > > > > commit fe36cbe067). But it's not technically related to this 
> > > > > > > > > patch.
> > > > > > > > > 
> > > > > > > > > -Siwei
> > > > > > > > I think it's a bug to read config space in validate, we should
> > > > > > > > move it to
> > > > > > > > virtnet_probe().
> > > > > > > > 
> > > > > > > > Thanks
> > > > > > > I take it back, reading but not writing seems to be explicitly
> > > > > > >

Re: linux-next: build warning after merge of the block tree

2021-02-23 Thread Chaitanya Kulkarni

On 2/23/21 21:33, Stephen Rothwell wrote:
>>>   1f83bb4b4914 ("blktrace: add blk_fill_rwbs documentation comment")
>>>
>>> -- Cheers, Stephen Rothwell  
>> I've failed to understand this warning as rwbs is present in the doc header
>> and in the function parameter :-
> I presume it is the missing ':' after @rwbs in the comment.
I've sent a fix with your Reported-by; it would be great if you could provide
a Reviewed-by tag.
> -- Cheers, Stephen Rothwell

Re: [PATCH] nand: brcmnand: fix OOB R/W with Hamming ECC

2021-02-23 Thread Álvaro Fernández Rojas

Hi Florian,

> El 24 feb 2021, a las 4:46, Florian Fainelli  escribió:
> 
> 
> 
> On 2/22/2021 12:16 PM, Álvaro Fernández Rojas wrote:
>> Hamming ECC doesn't cover the OOB data, so reading or writing OOB shall
>> always be done without ECC enabled.
>> This is a problem when adding JFFS2 cleanmarkers to erased blocks. If JFFS2
>> cleanmarkers are added to the OOB with ECC enabled, OOB bytes will be changed
>> from ff ff ff to 00 00 00, reporting incorrect ECC errors.
>> 
>> Signed-off-by: Álvaro Fernández Rojas 
> 
> Should there be a Fixes: tag provided here for back porting to stable trees?

I think so, but the fixed commit would be the first one, right?
27c5b17cd1b10564fa36f8f51e4b4b41436ecc32

> -- 
> Florian

Best regards,
Álvaro.

[PATCH v2] kdb: Remove redundant function definitions/prototypes

2021-02-23 Thread Sumit Garg

Cleanup kdb code to get rid of unused function definitions/prototypes.

Signed-off-by: Sumit Garg 
---

Changes in v2:
- Keep kdbgetu64arg() the way it was.

 kernel/debug/kdb/kdb_private.h |  2 --
 kernel/debug/kdb/kdb_support.c | 18 --
 2 files changed, 20 deletions(-)

diff --git a/kernel/debug/kdb/kdb_private.h b/kernel/debug/kdb/kdb_private.h
index 3cf8d9e47939..b857a84de3b5 100644
--- a/kernel/debug/kdb/kdb_private.h
+++ b/kernel/debug/kdb/kdb_private.h
@@ -210,9 +210,7 @@ extern unsigned long kdb_task_state(const struct 
task_struct *p,
unsigned long mask);
 extern void kdb_ps_suppressed(void);
 extern void kdb_ps1(const struct task_struct *p);
-extern void kdb_print_nameval(const char *name, unsigned long val);
 extern void kdb_send_sig(struct task_struct *p, int sig);
-extern void kdb_meminfo_proc_show(void);
 extern char kdb_getchar(void);
 extern char *kdb_getstr(char *, size_t, const char *);
 extern void kdb_gdb_state_pass(char *buf);
diff --git a/kernel/debug/kdb/kdb_support.c b/kernel/debug/kdb/kdb_support.c
index 6226502ce049..b59aad1f0b55 100644
--- a/kernel/debug/kdb/kdb_support.c
+++ b/kernel/debug/kdb/kdb_support.c
@@ -665,24 +665,6 @@ unsigned long kdb_task_state(const struct task_struct *p, 
unsigned long mask)
return (mask & kdb_task_state_string(state)) != 0;
 }
 
-/*
- * kdb_print_nameval - Print a name and its value, converting the
- * value to a symbol lookup if possible.
- * Inputs:
- * namefield name to print
- * val value of field
- */
-void kdb_print_nameval(const char *name, unsigned long val)
-{
-   kdb_symtab_t symtab;
-   kdb_printf("  %-11.11s ", name);
-   if (kdbnearsym(val, &symtab))
-   kdb_symbol_print(val, &symtab,
-KDB_SP_VALUE|KDB_SP_SYMSIZE|KDB_SP_NEWLINE);
-   else
-   kdb_printf("0x%lx\n", val);
-}
-
 /* Last ditch allocator for debugging, so we can still debug even when
  * the GFP_ATOMIC pool has been exhausted.  The algorithms are tuned
  * for space usage, not for speed.  One smallish memory pool, the free
-- 
2.25.1

[PATCH] mm,hwpoison: return -EBUSY when page already poisoned

2021-02-23 Thread Aili Yao

When the page is already poisoned, another memory_failure() call on the
same page currently returns 0, meaning OK. For nested memory MCE handling,
this behavior may lead to a really serious problem. Example:

1. When LMCE is enabled, and two processes A and B running on different
cores X and Y access the same page, the page is found corrupted when
process A accesses it; an MCE is raised on core X and error handling for
it is just under way.

2. Then B accesses the page and triggers another MCE on core Y. Core Y
also starts error handling, sees TestSetPageHWPoison() return true, and
0 is returned.

3. kill_me_maybe() will check the return value:

static void kill_me_maybe(struct callback_head *cb)
{
	...
	if (!memory_failure(p->mce_addr >> PAGE_SHIFT, flags) &&
	    !(p->mce_kflags & MCE_IN_KERNEL_COPYIN)) {
		set_mce_nospec(p->mce_addr >> PAGE_SHIFT, p->mce_whole_page);
		sync_core();
		return;
	}
	...
}

4. The error handling for B will end, and nothing may happen if
kill-early is not set. We may let the wrong data go into effect.

Other callers that care about the return value of memory_failure() should
check why they want to process a memory error which has already been
processed. This behavior seems reasonable.

In kill_me_maybe(), log the fact that the memory may not have been
recovered, and kill the related process.

Signed-off-by: Aili Yao 
---
 arch/x86/kernel/cpu/mce/core.c | 2 ++
 mm/memory-failure.c| 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index e133ce1e562b..db4afc5bf15a 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1259,6 +1259,8 @@ static void kill_me_maybe(struct callback_head *cb)
}
 
if (p->mce_vaddr != (void __user *)-1l) {
+   pr_err("Memory error may not recovered: %#lx: Sending SIGBUS to 
%s:%d due to hardware memory corruption\n",
+   p->mce_addr >> PAGE_SHIFT, p->comm, p->pid);
force_sig_mceerr(BUS_MCEERR_AR, p->mce_vaddr, PAGE_SHIFT);
} else {
pr_err("Memory error not recovered");
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index e9481632fcd1..06f006174b8c 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1224,7 +1224,7 @@ static int memory_failure_hugetlb(unsigned long pfn, int 
flags)
if (TestSetPageHWPoison(head)) {
pr_err("Memory failure: %#lx: already hardware poisoned\n",
   pfn);
-   return 0;
+   return -EBUSY;
}
 
num_poisoned_pages_inc();
@@ -1420,7 +1420,7 @@ int memory_failure(unsigned long pfn, int flags)
if (TestSetPageHWPoison(p)) {
pr_err("Memory failure: %#lx: already hardware poisoned\n",
pfn);
-   return 0;
+   return -EBUSY;
}
 
orig_head = hpage = compound_head(p);
-- 
2.25.1
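
The effect of the return-value change described above can be sketched in
userspace. This is only a rough model under stated assumptions: struct
fake_page, test_set_hwpoison(), fake_memory_failure() and must_kill() are
hypothetical stand-ins, the MCE_IN_KERNEL_COPYIN condition is left out, and
recovery of a freshly poisoned page is simply assumed to succeed.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page and its HWPoison flag. */
struct fake_page {
	bool hwpoison;
};

/* Mimics a test-and-set: set the flag, return its previous value. */
static bool test_set_hwpoison(struct fake_page *p)
{
	bool old = p->hwpoison;

	p->hwpoison = true;
	return old;
}

/*
 * Simplified model of memory_failure() with the patch applied: a second
 * report for the same page yields -EBUSY instead of 0.
 */
static int fake_memory_failure(struct fake_page *p)
{
	if (test_set_hwpoison(p))
		return -EBUSY;
	return 0;	/* assumption: recovery of a fresh error succeeds */
}

/*
 * Simplified model of the caller's decision: a non-zero return means the
 * early "recovered" exit is skipped and the task gets SIGBUS.
 */
static bool must_kill(struct fake_page *p)
{
	return fake_memory_failure(p) != 0;
}
```

In this model the first report on a page takes the "recovered" path, while a
second report on the same page no longer looks like success to the caller.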

[PATCH] power: supply: max8997-charger: remove unneeded semicolon

2021-02-23 Thread Jiapeng Chong

Fix the following coccicheck warnings:

./drivers/power/supply/max8997_charger.c:266:3-4: Unneeded semicolon.

Reported-by: Abaci Robot 
Signed-off-by: Jiapeng Chong 
---
 drivers/power/supply/max8997_charger.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/power/supply/max8997_charger.c 
b/drivers/power/supply/max8997_charger.c
index 321bd6b..a90c018 100644
--- a/drivers/power/supply/max8997_charger.c
+++ b/drivers/power/supply/max8997_charger.c
@@ -263,7 +263,7 @@ static int max8997_battery_probe(struct platform_device 
*pdev)
if (ret) {
dev_err(&pdev->dev, "failed to register extcon 
notifier\n");
return ret;
-   };
+   }
}
 
return 0;
-- 
1.8.3.1

[PATCH] perf daemon: Fix compile error with Asan

2021-02-23 Thread Namhyung Kim

I'm seeing a build failure when building with address sanitizer.
It seems we could write to name[100] if the var is longer.

  $ make EXTRA_CFLAGS=-fsanitize=address
  ...
CC   builtin-daemon.o
  In function ‘get_session_name’,
inlined from ‘session_config’ at builtin-daemon.c:164:6,
inlined from ‘server_config’ at builtin-daemon.c:223:10:
  builtin-daemon.c:155:11: error: writing 1 byte into a region of size 0 
[-Werror=stringop-overflow=]
155 |  *session = 0;
|  ~^~~
  builtin-daemon.c: In function ‘server_config’:
  builtin-daemon.c:162:7: note: at offset 100 to object ‘name’ with size 100 
declared here
162 |  char name[100];
|   ^~~~

Fixes: c0666261ff38 ("perf daemon: Add config file support")
Signed-off-by: Namhyung Kim 
---
 tools/perf/builtin-daemon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-daemon.c b/tools/perf/builtin-daemon.c
index 617feaf020f6..8f9fc61691da 100644
--- a/tools/perf/builtin-daemon.c
+++ b/tools/perf/builtin-daemon.c
@@ -161,7 +161,7 @@ static int session_config(struct daemon *daemon, const char 
*var, const char *va
struct daemon_session *session;
char name[100];
 
-   if (get_session_name(var, name, sizeof(name)))
+   if (get_session_name(var, name, sizeof(name) - 1))
return -EINVAL;
 
var = strchr(var, '.');
-- 
2.30.0.617.g56c4b15f3c-goog
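
The off-by-one that the compiler reports above can be illustrated with a
small sketch. copy_until_dot() is a hypothetical helper written for this
note, not the perf source; it only assumes a copy loop that writes up to len
characters and then appends the terminator.

```c
#include <stddef.h>

/*
 * Hypothetical helper: copy characters from var into dst until a '.' or
 * the end of the string, writing at most len characters, then append a
 * terminating NUL.  The terminator lands after the copied characters, so
 * dst must hold len + 1 bytes.
 */
static size_t copy_until_dot(const char *var, char *dst, size_t len)
{
	size_t n = 0;

	while (var[n] != '.' && var[n] != '\0' && n < len) {
		dst[n] = var[n];
		n++;
	}
	dst[n] = '\0';	/* n can be as large as len */
	return n;
}
```

With a buffer declared as char name[100], passing sizeof(name) lets the
terminator be written at name[100]; passing sizeof(name) - 1 keeps the last
write at name[99].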

Re: [PATCH] vdpa/mlx5: set_features should allow reset to zero

2021-02-23 Thread Michael S. Tsirkin

On Wed, Feb 24, 2021 at 02:55:13PM +0800, Jason Wang wrote:
> 
> On 2021/2/24 2:47 下午, Michael S. Tsirkin wrote:
> > On Wed, Feb 24, 2021 at 08:45:20AM +0200, Eli Cohen wrote:
> > > On Wed, Feb 24, 2021 at 12:17:58AM -0500, Michael S. Tsirkin wrote:
> > > > On Wed, Feb 24, 2021 at 11:20:01AM +0800, Jason Wang wrote:
> > > > > On 2021/2/24 3:35 上午, Si-Wei Liu wrote:
> > > > > > 
> > > > > > On 2/23/2021 5:26 AM, Michael S. Tsirkin wrote:
> > > > > > > On Tue, Feb 23, 2021 at 10:03:57AM +0800, Jason Wang wrote:
> > > > > > > > On 2021/2/23 9:12 上午, Si-Wei Liu wrote:
> > > > > > > > > On 2/21/2021 11:34 PM, Michael S. Tsirkin wrote:
> > > > > > > > > > On Mon, Feb 22, 2021 at 12:14:17PM +0800, Jason Wang wrote:
> > > > > > > > > > > On 2021/2/19 7:54 下午, Si-Wei Liu wrote:
> > > > > > > > > > > > Commit 452639a64ad8 ("vdpa: make sure set_features is 
> > > > > > > > > > > > invoked
> > > > > > > > > > > > for legacy") made an exception for legacy guests to 
> > > > > > > > > > > > reset
> > > > > > > > > > > > features to 0, when config space is accessed before 
> > > > > > > > > > > > features
> > > > > > > > > > > > are set. We should relieve the verify_min_features() 
> > > > > > > > > > > > check
> > > > > > > > > > > > and allow features reset to 0 for this case.
> > > > > > > > > > > > 
> > > > > > > > > > > > It's worth noting that not just legacy guests could 
> > > > > > > > > > > > access
> > > > > > > > > > > > config space before features are set. For instance, when
> > > > > > > > > > > > feature VIRTIO_NET_F_MTU is advertised some modern 
> > > > > > > > > > > > driver
> > > > > > > > > > > > will try to access and validate the MTU present in the 
> > > > > > > > > > > > config
> > > > > > > > > > > > space before virtio features are set.
> > > > > > > > > > > This looks like a spec violation:
> > > > > > > > > > > 
> > > > > > > > > > > "
> > > > > > > > > > > 
> > > > > > > > > > > The following driver-read-only field, mtu only exists if
> > > > > > > > > > > VIRTIO_NET_F_MTU is
> > > > > > > > > > > set.
> > > > > > > > > > > This field specifies the maximum MTU for the driver to 
> > > > > > > > > > > use.
> > > > > > > > > > > "
> > > > > > > > > > > 
> > > > > > > > > > > Do we really want to workaround this?
> > > > > > > > > > > 
> > > > > > > > > > > Thanks
> > > > > > > > > > And also:
> > > > > > > > > > 
> > > > > > > > > > The driver MUST follow this sequence to initialize a device:
> > > > > > > > > > 1. Reset the device.
> > > > > > > > > > 2. Set the ACKNOWLEDGE status bit: the guest OS has
> > > > > > > > > > noticed the device.
> > > > > > > > > > 3. Set the DRIVER status bit: the guest OS knows how to 
> > > > > > > > > > drive the
> > > > > > > > > > device.
> > > > > > > > > > 4. Read device feature bits, and write the subset of 
> > > > > > > > > > feature bits
> > > > > > > > > > understood by the OS and driver to the
> > > > > > > > > > device. During this step the driver MAY read (but MUST NOT 
> > > > > > > > > > write)
> > > > > > > > > > the device-specific configuration
> > > > > > > > > > fields to check that it can support the device before 
> > > > > > > > > > accepting it.
> > > > > > > > > > 5. Set the FEATURES_OK status bit. The driver MUST NOT 
> > > > > > > > > > accept new
> > > > > > > > > > feature bits after this step.
> > > > > > > > > > 6. Re-read device status to ensure the FEATURES_OK bit is 
> > > > > > > > > > still set:
> > > > > > > > > > otherwise, the device does not
> > > > > > > > > > support our subset of features and the device is unusable.
> > > > > > > > > > 7. Perform device-specific setup, including discovery of 
> > > > > > > > > > virtqueues
> > > > > > > > > > for the device, optional per-bus setup,
> > > > > > > > > > reading and possibly writing the device’s virtio 
> > > > > > > > > > configuration
> > > > > > > > > > space, and population of virtqueues.
> > > > > > > > > > 8. Set the DRIVER_OK status bit. At this point the device 
> > > > > > > > > > is “live”.
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > so accessing config space before FEATURES_OK is a spec
> > > > > > > > > > violation, right?
> > > > > > > > > It is, but it's not relevant to what this commit tries to 
> > > > > > > > > address. I
> > > > > > > > > thought the legacy guest still needs to be supported.
> > > > > > > > > 
> > > > > > > > > > Having said that, a separate patch has to be posted to fix the 
> > > > > > > > > guest driver
> > > > > > > > > issue where this discrepancy is introduced to
> > > > > > > > > virtnet_validate() (since
> > > > > > > > > commit fe36cbe067). But it's not technically related to this 
> > > > > > > > > patch.
> > > > > > > > > 
> > > > > > > > > -Siwei
> > > > > > > > I think it's a bug to read config space in validate, we should
> > > > > > > > move it to
> > > > > > > > virtnet_probe().
> > > > > > > > 
> > > > > > > > Thanks
> > > > > > > I take it back, reading but not writing seems to be explicitly
> > > > > > >

Re: [PATCH v3 1/1] arm64: mm: correct the inside linear map range during hotplug check

2021-02-23 Thread Anshuman Khandual




On 2/16/21 8:33 PM, Pavel Tatashin wrote:
> Memory hotplug may fail on systems with CONFIG_RANDOMIZE_BASE because the
> linear map range is not checked correctly.
> 
> The start physical address that linear map covers can be actually at the
> end of the range because of randomization. Check that and if so reduce it
> to 0.
> 
> This can be verified on QEMU with setting kaslr-seed to ~0ul:
> 
> memstart_offset_seed = 0x
> START: __pa(_PAGE_OFFSET(vabits_actual)) = 9000c000
> END:   __pa(PAGE_END - 1) =  1000bfff

This would have tripped the check in mhp_get_pluggable_range()
with errors like the following, which is expected.

Hotplug memory [0x68000-0x68800] exceeds maximum addressable range 
[0x0-0x0]
Hotplug memory [0x6c000-0x6c800] exceeds maximum addressable range 
[0x0-0x0]
Hotplug memory [0x7-0x70800] exceeds maximum addressable range 
[0x0-0x0]
Hotplug memory [0x78000-0x78800] exceeds maximum addressable range 
[0x0-0x0]
Hotplug memory [0x7c000-0x7c800] exceeds maximum addressable range 
[0x0-0x0]

> 
> Signed-off-by: Pavel Tatashin 
> Fixes: 58284a901b42 ("arm64/mm: Validate hotplug range before creating linear 
> mapping")
> Tested-by: Tyler Hicks 
> ---
>  arch/arm64/mm/mmu.c | 21 +++--
>  1 file changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index ef7698c4e2f0..0d9c115e427f 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -1447,6 +1447,22 @@ static void __remove_pgd_mapping(pgd_t *pgdir, 
> unsigned long start, u64 size)
>  struct range arch_get_mappable_range(void)
>  {
>   struct range mhp_range;
> + u64 start_linear_pa = __pa(_PAGE_OFFSET(vabits_actual));
> + u64 end_linear_pa = __pa(PAGE_END - 1);
> +
> + if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
> + /*
> +  * Check for a wrap, it is possible because of randomized linear
> +  * mapping the start physical address is actually bigger than
> +  * the end physical address. In this case set start to zero
> +  * because [0, end_linear_pa] range must still be able to cover
> +  * all addressable physical addresses.
> +  */
> + if (start_linear_pa > end_linear_pa)
> + start_linear_pa = 0;
> + }
> +
> + WARN_ON(start_linear_pa > end_linear_pa);
>  
>   /*
>* Linear mapping region is the range [PAGE_OFFSET..(PAGE_END - 1)]
> @@ -1454,8 +1470,9 @@ struct range arch_get_mappable_range(void)
>* range which can be mapped inside this linear mapping range, must
>* also be derived from its end points.
>*/
> - mhp_range.start = __pa(_PAGE_OFFSET(vabits_actual));
> - mhp_range.end =  __pa(PAGE_END - 1);
> + mhp_range.start = start_linear_pa;
> + mhp_range.end =  end_linear_pa;
> +
>   return mhp_range;
>  }

LGTM.

Reviewed-by: Anshuman Khandual
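
The wrap check in the patch above can be modelled in userspace as a rough
sketch. struct pa_range and mappable_range() are hypothetical names, the
CONFIG_RANDOMIZE_BASE condition is left out, and any addresses fed to it are
purely illustrative rather than taken from a real system.

```c
#include <stdint.h>

/* Hypothetical inclusive physical address range. */
struct pa_range {
	uint64_t start;
	uint64_t end;
};

/*
 * If the computed start of the linear map lies above its end, the start
 * has wrapped, so the usable range is taken to begin at 0.
 */
static struct pa_range mappable_range(uint64_t start_linear_pa,
				      uint64_t end_linear_pa)
{
	struct pa_range r;

	if (start_linear_pa > end_linear_pa)
		start_linear_pa = 0;

	r.start = start_linear_pa;
	r.end = end_linear_pa;
	return r;
}
```

A range whose start is already below its end passes through unchanged, so
only the wrapped case is affected.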

Re: [PATCH -next] sound: n64: Fix return value check in n64audio_probe()

2021-02-23 Thread Lauri Kasanen

On Wed, 24 Feb 2021 01:38:03 +
Wei Yongjun  wrote:

> In case of error, the function devm_platform_ioremap_resource()
> returns ERR_PTR() and never returns NULL. The NULL test in the
> return value check should be replaced with IS_ERR().
>
> Fixes: 1448f8acf4cc ("sound: Add n64 driver")
> Reported-by: Hulk Robot 
> Signed-off-by: Wei Yongjun 
> ---
>  sound/mips/snd-n64.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)

Reviewed-by: Lauri Kasanen 

- Lauri
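
Why a NULL test misses this kind of failure can be shown with a userspace
re-creation of the error-pointer convention. err_ptr(), ptr_err(), is_err()
and fake_ioremap() are illustrative stand-ins written for this note; they
are not the kernel helpers themselves, and MAX_ERRNO is assumed to be 4095.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top MAX_ERRNO addresses. */
static inline bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical mapping call that reports failure as an error pointer. */
static void *fake_ioremap(bool fail)
{
	static int fake_regs;

	return fail ? err_ptr(-ENOMEM) : (void *)&fake_regs;
}
```

A failed call returns a non-NULL value here, so a check of the form
"if (!ptr)" never fires; only the is_err() style test detects it.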

[PATCH 2/2] ASoC: ak5558: Add MODULE_DEVICE_TABLE

2021-02-23 Thread Shengjiu Wang

Add the missing MODULE_DEVICE_TABLE so that the driver can be loaded
automatically at boot.

Fixes: 920884777480 ("ASoC: ak5558: Add support for AK5558 ADC driver")
Cc: 
Signed-off-by: Shengjiu Wang 
---
 sound/soc/codecs/ak5558.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sound/soc/codecs/ak5558.c b/sound/soc/codecs/ak5558.c
index 8a32b0139cb0..85bdd0534180 100644
--- a/sound/soc/codecs/ak5558.c
+++ b/sound/soc/codecs/ak5558.c
@@ -419,6 +419,7 @@ static const struct of_device_id ak5558_i2c_dt_ids[] 
__maybe_unused = {
{ .compatible = "asahi-kasei,ak5558"},
{ }
 };
+MODULE_DEVICE_TABLE(of, ak5558_i2c_dt_ids);
 
 static struct i2c_driver ak5558_i2c_driver = {
.driver = {
-- 
2.27.0

[PATCH 1/2] ASoC: ak4458: Add MODULE_DEVICE_TABLE

2021-02-23 Thread Shengjiu Wang

Add the missing MODULE_DEVICE_TABLE so that the driver can be loaded
automatically at boot.

Fixes: 08660086eff9 ("ASoC: ak4458: Add support for AK4458 DAC driver")
Cc: 
Signed-off-by: Shengjiu Wang 
---
 sound/soc/codecs/ak4458.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sound/soc/codecs/ak4458.c b/sound/soc/codecs/ak4458.c
index 472caad17012..85a1d00894a9 100644
--- a/sound/soc/codecs/ak4458.c
+++ b/sound/soc/codecs/ak4458.c
@@ -812,6 +812,7 @@ static const struct of_device_id ak4458_of_match[] = {
{ .compatible = "asahi-kasei,ak4497", .data = _drvdata},
{ },
 };
+MODULE_DEVICE_TABLE(of, ak4458_of_match);
 
 static struct i2c_driver ak4458_i2c_driver = {
.driver = {
-- 
2.27.0

Re: [PATCH v2 2/2] irqchip: add support for BCM6345 external interrupt controller

2021-02-23 Thread Álvaro Fernández Rojas

Hi Florian,

> El 24 feb 2021, a las 4:43, Florian Fainelli  escribió:
> 
> 
> 
> On 2/23/2021 12:43 PM, Álvaro Fernández Rojas wrote:
>> This interrupt controller is present on bcm63xx SoCs in order to generate
>> interrupts based on GPIO status changes.
>> 
>> Signed-off-by: Álvaro Fernández Rojas 
>> Signed-off-by: Jonas Gorski 
>> ---
> 
> [snip]
>> +static int __init bcm6345_ext_intc_of_init(struct device_node *node,
>> +   struct device_node *parent)
>> +{
>> +int num_irqs, ret = -EINVAL;
>> +unsigned i;
>> +void __iomem *base;
>> +int irqs[MAX_IRQS] = { 0 };
>> +u32 shift;
>> +bool toggle_clear_on_ack = false;
>> +
>> +num_irqs = of_irq_count(node);
>> +
>> +if (!num_irqs || num_irqs > MAX_IRQS)
>> +return -EINVAL;
>> +
>> +if (of_property_read_u32(node, "brcm,field-width", &shift))
>> +shift = 4;
> 
> This property is not documented in the binding, other than that:

Nice catch, I will add it in next version.

> 
> Reviewed-by: Florian Fainelli 
> -- 
> Florian

[PATCH v5] kdb: Simplify kdb commands registration

2021-02-23 Thread Sumit Garg

Simplify kdb command registration by using a linked list instead of a
static array for command storage.

Signed-off-by: Sumit Garg 
---

Changes in v5:
- Introduce new method: kdb_register_table() to register static kdb
  main and breakpoint command tables instead of using statically
  allocated commands.

Changes in v4:
- Fix kdb commands memory allocation issue prior to slab being available
  with an array of statically allocated commands. Now it works fine with
  kgdbwait.
- Fix a misc checkpatch warning.
- I have dropped Doug's review tag as I think this version includes a
  major fix that should be reviewed again.

Changes in v3:
- Remove redundant "if" check.
- Pick up review tag from Doug.

Changes in v2:
- Remove redundant NULL check for "cmd_name".
- Incorporate misc. comment.

 kernel/debug/kdb/kdb_bp.c  |  81 --
 kernel/debug/kdb/kdb_main.c| 472 -
 kernel/debug/kdb/kdb_private.h |   3 +
 3 files changed, 343 insertions(+), 213 deletions(-)

diff --git a/kernel/debug/kdb/kdb_bp.c b/kernel/debug/kdb/kdb_bp.c
index ec4940146612..c15a1c6abfd6 100644
--- a/kernel/debug/kdb/kdb_bp.c
+++ b/kernel/debug/kdb/kdb_bp.c
@@ -522,6 +522,60 @@ static int kdb_ss(int argc, const char **argv)
return KDB_CMD_SS;
 }
 
+static kdbtab_t bptab[] = {
+   {   .cmd_name = "bp",
+   .cmd_func = kdb_bp,
+   .cmd_usage = "[]",
+   .cmd_help = "Set/Display breakpoints",
+   .cmd_minlen = 0,
+   .cmd_flags = KDB_ENABLE_FLOW_CTRL | KDB_REPEAT_NO_ARGS,
+   },
+   {   .cmd_name = "bl",
+   .cmd_func = kdb_bp,
+   .cmd_usage = "[]",
+   .cmd_help = "Display breakpoints",
+   .cmd_minlen = 0,
+   .cmd_flags = KDB_ENABLE_FLOW_CTRL | KDB_REPEAT_NO_ARGS,
+   },
+   {   .cmd_name = "bc",
+   .cmd_func = kdb_bc,
+   .cmd_usage = "",
+   .cmd_help = "Clear Breakpoint",
+   .cmd_minlen = 0,
+   .cmd_flags = KDB_ENABLE_FLOW_CTRL,
+   },
+   {   .cmd_name = "be",
+   .cmd_func = kdb_bc,
+   .cmd_usage = "",
+   .cmd_help = "Enable Breakpoint",
+   .cmd_minlen = 0,
+   .cmd_flags = KDB_ENABLE_FLOW_CTRL,
+   },
+   {   .cmd_name = "bd",
+   .cmd_func = kdb_bc,
+   .cmd_usage = "",
+   .cmd_help = "Disable Breakpoint",
+   .cmd_minlen = 0,
+   .cmd_flags = KDB_ENABLE_FLOW_CTRL,
+   },
+   {   .cmd_name = "ss",
+   .cmd_func = kdb_ss,
+   .cmd_usage = "",
+   .cmd_help = "Single Step",
+   .cmd_minlen = 1,
+   .cmd_flags = KDB_ENABLE_FLOW_CTRL | KDB_REPEAT_NO_ARGS,
+   },
+};
+
+static kdbtab_t bphcmd = {
+   .cmd_name = "bph",
+   .cmd_func = kdb_bp,
+   .cmd_usage = "[]",
+   .cmd_help = "[datar [length]|dataw [length]]   Set hw brk",
+   .cmd_minlen = 0,
+   .cmd_flags = KDB_ENABLE_FLOW_CTRL | KDB_REPEAT_NO_ARGS,
+};
+
 /* Initialize the breakpoint table and register breakpoint commands. */
 
 void __init kdb_initbptab(void)
@@ -537,30 +591,7 @@ void __init kdb_initbptab(void)
for (i = 0, bp = kdb_breakpoints; i < KDB_MAXBPT; i++, bp++)
bp->bp_free = 1;
 
-   kdb_register_flags("bp", kdb_bp, "[]",
-   "Set/Display breakpoints", 0,
-   KDB_ENABLE_FLOW_CTRL | KDB_REPEAT_NO_ARGS);
-   kdb_register_flags("bl", kdb_bp, "[]",
-   "Display breakpoints", 0,
-   KDB_ENABLE_FLOW_CTRL | KDB_REPEAT_NO_ARGS);
+   kdb_register_table(bptab, ARRAY_SIZE(bptab));
if (arch_kgdb_ops.flags & KGDB_HW_BREAKPOINT)
-   kdb_register_flags("bph", kdb_bp, "[]",
-   "[datar [length]|dataw [length]]   Set hw brk", 0,
-   KDB_ENABLE_FLOW_CTRL | KDB_REPEAT_NO_ARGS);
-   kdb_register_flags("bc", kdb_bc, "",
-   "Clear Breakpoint", 0,
-   KDB_ENABLE_FLOW_CTRL);
-   kdb_register_flags("be", kdb_bc, "",
-   "Enable Breakpoint", 0,
-   KDB_ENABLE_FLOW_CTRL);
-   kdb_register_flags("bd", kdb_bc, "",
-   "Disable Breakpoint", 0,
-   KDB_ENABLE_FLOW_CTRL);
-
-   kdb_register_flags("ss", kdb_ss, "",
-   "Single Step", 1,
-   KDB_ENABLE_FLOW_CTRL | KDB_REPEAT_NO_ARGS);
-   /*
-* Architecture dependent initialization.
-*/
+   kdb_register_table(&bphcmd, 1);
 }
diff --git a/kernel/debug/kdb/kdb_main.c b/kernel/debug/kdb/kdb_main.c
index 930ac1b25ec7..1e0c2c37df94 100644
--- a/kernel/debug/kdb/kdb_main.c
+++ b/kernel/debug/kdb/kdb_main.c
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -84,15 +85,8 @@ static unsigned int kdb_continue_catastrophic =
 static unsigned int
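
The idea of registering statically allocated command tables on a linked
list, as the changelog above describes, can be sketched in userspace.
struct cmd, cmd_register(), cmd_register_table() and cmd_find() are
hypothetical and much simpler than the kernel's kdbtab_t and its helpers;
the sketch only assumes that each entry carries its own list linkage.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified command entry. */
struct cmd {
	const char *name;
	const char *help;
	struct cmd *next;	/* list linkage embedded in the entry */
};

static struct cmd *cmd_head;
static struct cmd **cmd_tail = &cmd_head;

/* Append one statically allocated entry; no allocation is needed. */
static void cmd_register(struct cmd *c)
{
	c->next = NULL;
	*cmd_tail = c;
	cmd_tail = &c->next;
}

/* Register every entry of a static table, preserving table order. */
static void cmd_register_table(struct cmd *table, size_t len)
{
	for (size_t i = 0; i < len; i++)
		cmd_register(&table[i]);
}

/* Walk the list to find a command by exact name. */
static struct cmd *cmd_find(const char *name)
{
	for (struct cmd *c = cmd_head; c; c = c->next)
		if (strcmp(c->name, name) == 0)
			return c;
	return NULL;
}
```

Because the linkage lives inside each entry, registration works before any
allocator is available, which is the constraint the v4 notes mention.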

Re: [PATCH] qtnfmac: remove meaningless labels

2021-02-23 Thread Sergey Matyukevich

> Some functions' labels are meaningless: the return statement directly
> follows the goto statement, so just remove them.
> 
> Signed-off-by: wengjianfeng 
> ---
>  drivers/net/wireless/quantenna/qtnfmac/cfg80211.c | 27 
> +--
>  1 file changed, 6 insertions(+), 21 deletions(-)

Thanks for the patch. 

Reviewed-by: Sergey Matyukevich 


Regards,
Sergey

Re: [PATCH] mips: smp-bmips: fix CPU mappings

2021-02-23 Thread Álvaro Fernández Rojas

Hi Florian,

> El 24 feb 2021, a las 4:45, Florian Fainelli  escribió:
> 
> 
> 
> On 2/23/2021 4:48 AM, Álvaro Fernández Rojas wrote:
>> When booting bmips with SMP enabled on a BCM6358 running on CPU #1 instead of
>> CPU #0, the current CPU mapping code produces the following:
>> - smp_processor_id(): 0
>> - cpu_logical_map(): 1
>> - cpu_number_map(): 1
>> 
>> This is because SMP isn't supported on BCM6358 since it has a shared TLB, so
>> it is disabled and max_cpus is decreased from 2 to 1.
>> 
>> Signed-off-by: Álvaro Fernández Rojas 
>> ---
>> arch/mips/kernel/smp-bmips.c | 27 +--
>> 1 file changed, 17 insertions(+), 10 deletions(-)
>> 
>> diff --git a/arch/mips/kernel/smp-bmips.c b/arch/mips/kernel/smp-bmips.c
>> index 359b176b665f..c4760cb48a67 100644
>> --- a/arch/mips/kernel/smp-bmips.c
>> +++ b/arch/mips/kernel/smp-bmips.c
>> @@ -134,17 +134,24 @@ static void __init bmips_smp_setup(void)
>>  if (!board_ebase_setup)
>>  board_ebase_setup = _ebase_setup;
>> 
>> -__cpu_number_map[boot_cpu] = 0;
>> -__cpu_logical_map[0] = boot_cpu;
>> -
>> -for (i = 0; i < max_cpus; i++) {
>> -if (i != boot_cpu) {
>> -__cpu_number_map[i] = cpu;
>> -__cpu_logical_map[cpu] = i;
>> -cpu++;
>> +if (max_cpus > 1) {
>> +__cpu_number_map[boot_cpu] = 0;
>> +__cpu_logical_map[0] = boot_cpu;
>> +
>> +for (i = 0; i < max_cpus; i++) {
>> +if (i != boot_cpu) {
>> +__cpu_number_map[i] = cpu;
>> +__cpu_logical_map[cpu] = i;
>> +cpu++;
>> +}
>> +set_cpu_possible(i, 1);
>> +set_cpu_present(i, 1);
>>  }
>> -set_cpu_possible(i, 1);
>> -set_cpu_present(i, 1);
>> +} else {
>> +__cpu_number_map[0] = boot_cpu;
>> +__cpu_logical_map[0] = 0;
>> +set_cpu_possible(0, 1);
>> +set_cpu_possible(0, 1);
> 
> Duplicate line, with that fixed:

Nice catch, it should be set_cpu_present().

> 
> Reviewed-by: Florian Fainelli 
> -- 
> Florian

Re: [PATCH V8 1/1] i2c: i2c-qcom-geni: Add shutdown callback for i2c

2021-02-23 Thread Stephen Boyd

Quoting ro...@codeaurora.org (2021-02-18 06:15:17)
> Hi Stephen,
> 
> On 2021-01-13 12:24, Stephen Boyd wrote:
> > Quoting Roja Rani Yarubandi (2021-01-08 07:05:45)
> >> diff --git a/drivers/i2c/busses/i2c-qcom-geni.c 
> >> b/drivers/i2c/busses/i2c-qcom-geni.c
> >> index 214b4c913a13..c3f584795911 100644
> >> --- a/drivers/i2c/busses/i2c-qcom-geni.c
> >> +++ b/drivers/i2c/busses/i2c-qcom-geni.c
> >> @@ -375,6 +375,32 @@ static void geni_i2c_tx_msg_cleanup(struct 
> >> geni_i2c_dev *gi2c,
> >> }
> >>  }
> >> 
> >> +static void geni_i2c_stop_xfer(struct geni_i2c_dev *gi2c)
> >> +{
> >> +   int ret;
> >> +   u32 geni_status;
> >> +   struct i2c_msg *cur;
> >> +
> >> +   /* Resume device, as runtime suspend can happen anytime during 
> >> transfer */
> >> +   ret = pm_runtime_get_sync(gi2c->se.dev);
> >> +   if (ret < 0) {
> >> +   dev_err(gi2c->se.dev, "Failed to resume device: %d\n", 
> >> ret);
> >> +   return;
> >> +   }
> >> +
> >> +   geni_status = readl_relaxed(gi2c->se.base + SE_GENI_STATUS);
> >> +   if (geni_status & M_GENI_CMD_ACTIVE) {
> >> +   cur = gi2c->cur;
> > 
> > Why don't we need to hold the spinlock gi2c::lock here?
> > 
> 
> I am not seeing any race here. May I know which race are you suspecting 
> here?

Sorry there are long delays between posting and replies to my review
comments. It takes me some time to remember what we're talking about
because this patch has dragged on for many months.

So my understanding is that gi2c::lock protects the 'cur' pointer. I
imagine this scenario might go bad

  CPU0  CPU1

  geni_i2c_stop_xfer()  
   ...  geni_i2c_rx_one_msg()
 gi2c->cur = cur1;
   cur = gi2c->cur;
   ...   geni_i2c_tx_one_msg()
 gi2c->cur = cur2;
   geni_i2c_abort_xfer()

   if (cur->flags & I2C_M_RD)

It's almost like we should combine the geni_i2c_abort_xfer() logic with
the rx/tx message cleanup functions so that it's all done under one
lock. Unfortunately it's complicated by the fact that there are various
completion waiting timeouts involved. Fun!

But even after all that, I don't see how the geni_i2c_stop_xfer() puts a
stop to future calls to geni_i2c_rx_one_msg() or geni_i2c_tx_one_msg().
The hardware isn't disabled from what I can tell. The irq isn't
disabled, the clks aren't turned off, etc. What is to stop an i2c device
from trying to use the bus after this shutdown function is called? If
anything, this function looks like a "flush", where we flush out any
pending transfer. Where's the "plug" operation that prevents any future
operations from following this call?

BTW, I see this is merged upstream. That's great, but it seems broken.
Please fix it or revert it out.

> 
> >> +   geni_i2c_abort_xfer(gi2c);
> >> +   if (cur->flags & I2C_M_RD)
> >> +   geni_i2c_rx_msg_cleanup(gi2c, cur);
> >> +   else
> >> +   geni_i2c_tx_msg_cleanup(gi2c, cur);
> >> +   }
> >> +
> >> +   pm_runtime_put_sync_suspend(gi2c->se.dev);
> >> +}
> >> +
> >>  static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct 
> >> i2c_msg *msg,
> >> u32 m_param)
> >>  {
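
One possible shape for the two concerns raised above (reading 'cur' without
the lock, and the lack of a "plug" against later transfers) is sketched
below in userspace. This is an assumption-laden illustration, not what the
driver does: struct ctrl, ctrl_start_xfer() and ctrl_stop() are
hypothetical, a pthread mutex stands in for the spinlock, and completion
timeouts are ignored entirely.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct msg {
	int flags;
};

/* Hypothetical, simplified controller state. */
struct ctrl {
	pthread_mutex_t lock;	/* protects cur and stopped */
	struct msg *cur;	/* message in flight, or NULL */
	bool stopped;		/* set once shutdown has begun */
};

/* Transfer path: refuse new work after shutdown, else publish cur. */
static int ctrl_start_xfer(struct ctrl *c, struct msg *m)
{
	int ret = 0;

	pthread_mutex_lock(&c->lock);
	if (c->stopped)
		ret = -ESHUTDOWN;
	else
		c->cur = m;
	pthread_mutex_unlock(&c->lock);
	return ret;
}

/*
 * Shutdown path: mark the controller stopped and take the in-flight
 * message in one critical section, so cleanup acts on exactly the
 * message that was active and nothing new can start afterwards.
 */
static struct msg *ctrl_stop(struct ctrl *c)
{
	struct msg *m;

	pthread_mutex_lock(&c->lock);
	c->stopped = true;
	m = c->cur;
	c->cur = NULL;
	pthread_mutex_unlock(&c->lock);
	return m;
}
```

The message returned by ctrl_stop() is the one whose flags would then decide
between the rx and tx cleanup paths.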

Re: problems with memory allocation and the alignment check

2021-02-23 Thread Michael J. Baars

On Mon, 2021-02-22 at 01:41 -0800, Andrew Pinski wrote:
> On Mon, Feb 22, 2021 at 1:37 AM Michael J. Baars
>  wrote:
> > On Mon, 2021-02-22 at 01:29 -0800, Andrew Pinski wrote:
> > > On Mon, Feb 22, 2021 at 1:17 AM Michael J. Baars
> > >  wrote:
> > > > Hi,
> > > > 
> > > > I just wrote this little program to demonstrate a possible flaw in both 
> > > > malloc and calloc.
> > > > 
> > > > If I allocate the simplest memory region from main(), one out of 
> > > > three optimization flags fail.
> > > > If I allocate the same region from a function, three out of three 
> > > > optimization flags fail.
> > > > 
> > > > Does someone know if this really is a flaw, and if so, is it a gcc or a 
> > > > kernel flaw?
> > > 
> > > There is no flaw.  GCC (kernel, glibc) all assume unaligned accesses
> > > on x86 will not cause an exception.
> > 
> > Is this just an assumption or more like a fact? I agree with you that byte 
> > aligned is more or less the same as unaligned.
> 
> It is an assumption that is even made inside GCC.  You can modify GCC
> not to assume that but you need to recompile all libraries and even
> check the assembly code that is included with most programs.
> Why are you enabling the alignment access check anyways?  What are you
> trying to do?
> If you are looking into a performance issue with unaligned accesses,
> may I suggest you look into perf to see if you can see unaligned
> accesses?

Next to performance and correctness, I always try to keep in mind that every
clock cycle will eventually end up on the energy bill, to avoid computers
costing ten times more on the energy bill than they do in the store.

If you look at the power consumption of the Playstation 1 vs that of the
Playstation 3, for example, you will see that the Playstation 1 uses 10 W
(10 W / 240 V = 0.04167 A max), while the Playstation 3 consumes
240 V * 1.7 A = 408 W. More than 40 times as much energy!!!

Code and style always go hand in hand. Try to keep your code as sleek as
possible and you will see that even an old computer can do a lot more than you
ever thought possible :)
Thanks,
Mischa.

> Thanks,
> Andrew
> 
> > > Thanks,
> > > Andrew
> > > 
> > > > Regards,
> > > > Mischa.
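
A minimal sketch related to the unaligned-access point discussed above, assuming plain C on x86-64. In C, dereferencing a misaligned pointer is undefined behaviour, so the usual portable idiom is to copy the bytes with memcpy(), which has no alignment requirement. The helper name load_u64_unaligned is made up for illustration and does not appear in the thread. This does not make a program safe to run with the x86 alignment check enabled, since, as noted above, library code may itself rely on unaligned accesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: read a 64-bit value from an address that may not be
 * 8-byte aligned. memcpy() copies byte by byte as far as the C language is
 * concerned, so no alignment assumption is made at the source level.
 */
uint64_t load_u64_unaligned(const void *p)
{
	uint64_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v));
	return v;
}
```

Whether the compiler turns the copy into a single (possibly unaligned) load instruction is an implementation detail of the target and optimization level.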
#include	

#include	"compression.h"

uint8_t	data_s[256] =	{
0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F, 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18, 0x19, 0x1A, 0x1B, 0x1C, 0x1D, 0x1E, 0x1F,
0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27, 0x28, 0x29, 0x2A, 0x2B, 0x2C, 0x2D, 0x2E, 0x2F, 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38, 0x39, 0x3A, 0x3B, 0x3C, 0x3D, 0x3E, 0x3F,
0x40, 0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x47, 0x48, 0x49, 0x4A, 0x4B, 0x4C, 0x4D, 0x4E, 0x4F, 0x50, 0x51, 0x52, 0x53, 0x54, 0x55, 0x56, 0x57, 0x58, 0x59, 0x5A, 0x5B, 0x5C, 0x5D, 0x5E, 0x5F,
0x60, 0x61, 0x62, 0x63, 0x64, 0x65, 0x66, 0x67, 0x68, 0x69, 0x6A, 0x6B, 0x6C, 0x6D, 0x6E, 0x6F, 0x70, 0x71, 0x72, 0x73, 0x74, 0x75, 0x76, 0x77, 0x78, 0x79, 0x7A, 0x7B, 0x7C, 0x7D, 0x7E, 0x7F,
0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, 0x88, 0x89, 0x8A, 0x8B, 0x8C, 0x8D, 0x8E, 0x8F, 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, 0x98, 0x99, 0x9A, 0x9B, 0x9C, 0x9D, 0x9E, 0x9F,
0xA0, 0xA1, 0xA2, 0xA3, 0xA4, 0xA5, 0xA6, 0xA7, 0xA8, 0xA9, 0xAA, 0xAB, 0xAC, 0xAD, 0xAE, 0xAF, 0xB0, 0xB1, 0xB2, 0xB3, 0xB4, 0xB5, 0xB6, 0xB7, 0xB8, 0xB9, 0xBA, 0xBB, 0xBC, 0xBD, 0xBE, 0xBF,
0xC0, 0xC1, 0xC2, 0xC3, 0xC4, 0xC5, 0xC6, 0xC7, 0xC8, 0xC9, 0xCA, 0xCB, 0xCC, 0xCD, 0xCE, 0xCF, 0xD0, 0xD1, 0xD2, 0xD3, 0xD4, 0xD5, 0xD6, 0xD7, 0xD8, 0xD9, 0xDA, 0xDB, 0xDC, 0xDD, 0xDE, 0xDF,
0xE0, 0xE1, 0xE2, 0xE3, 0xE4, 0xE5, 0xE6, 0xE7, 0xE8, 0xE9, 0xEA, 0xEB, 0xEC, 0xED, 0xEE, 0xEF, 0xF0, 0xF1, 0xF2, 0xF3, 0xF4, 0xF5, 0xF6, 0xF7, 0xF8, 0xF9, 0xFA, 0xFB, 0xFC, 0xFD, 0xFE, 0xFF
			};
/*
 :
   0:	48 89 f9             	mov    %rdi,%rcx
   3:	31 d2                	xor    %edx,%edx
   5:	b8 00 00 00 01       	mov    $0x1000000,%eax
   a:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)
  10:	48 83 e8 01          	sub    $0x1,%rax
  14:	75 fa                	jne    10
  16:	88 11                	mov    %dl,(%rcx)
  18:	48 83 c2 01          	add    $0x1,%rdx
  1c:	48 83 c1 01          	add    $0x1,%rcx
  20:	48 81 fa 00 01 00 00 	cmp    $0x100,%rdx
  27:	75 dc                	jne    5
  29:	c3                   	retq
  2a:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)
*/

void	compression_encode_prepare1	(struct compression* c)
{
	for	(uint64_t j = 0; j < (1 << 24); j++)
	for	(uint64_t i = 0; i < 256; i++)
	{
		c->data_t[i]=	i;
	}
}

void	compression_encode_prepare2	(struct compression* c)
{
	for	(uint64_t j = 0; j < (1 << 24); j++)
	asm	volatile	\
	(		\
	"	lea		%0   , %%rdi	\n"	\
	"	lea		%1   , %%rsi	\n"	\
	"	mov		$0x20, %%rcx	\n"	\
	"	rep		movsq		\n"	\
		: "=m"		(c->data_t)		\
		: "m"		(   data_s)		\
		: "%rcx", "%rsi", "%rdi"			\
	);
}

#ifndef	__COMPRESSION_H__
#define

Re: linux-next: Tree for Feb 24 [drivers/gpu/drm/amd/amdgpu/amdgpu.ko]

2021-02-23 Thread Randy Dunlap

On 2/23/21 7:36 PM, Stephen Rothwell wrote:
> Hi all,
> 
> Please do not add any changes destined for v5.13 to your linux-next
> included branches until after v5.12-rc1 has been released.
> 
> Changes since 20210223:
> 

on i386:

ERROR: modpost: "__udivdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
ERROR: modpost: "__umoddi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
ERROR: modpost: "__divdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!

Full randconfig file is attached.

-- 
~Randy
Reported-by: Randy Dunlap 

config-r7997.gz
Description: application/gzip

Re: [PATCH] fs/cifs:simplify the return expression of cifs_swn_auth_info_krb

2021-02-23 Thread kernel test robot

Hi,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on cifs/for-next]
[also build test ERROR on v5.11 next-20210223]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/dingsenjie-163-com/fs-cifs-simplify-the-return-expression-of-cifs_swn_auth_info_krb/20210224-122502
base:   git://git.samba.org/sfrench/cifs-2.6.git for-next
config: s390-randconfig-r022-20210223 (attached as .config)
compiler: s390-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/2b5765f734346617361817a3b6cefea209078b3f
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review dingsenjie-163-com/fs-cifs-simplify-the-return-expression-of-cifs_swn_auth_info_krb/20210224-122502
        git checkout 2b5765f734346617361817a3b6cefea209078b3f
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=s390

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   fs/cifs/cifs_swn.c: In function 'cifs_swn_auth_info_krb':
>> fs/cifs/cifs_swn.c:37:9: error: expected expression before '=' token
      37 |  return = nla_put_flag(skb, CIFS_GENL_ATTR_SWN_KRB_AUTH);
         |         ^
   fs/cifs/cifs_swn.c:38:1: error: control reaches end of non-void function [-Werror=return-type]
      38 | }
         | ^
   cc1: some warnings being treated as errors


vim +37 fs/cifs/cifs_swn.c

34  
35  static int cifs_swn_auth_info_krb(struct cifs_tcon *tcon, struct sk_buff *skb)
36  {
  > 37  return = nla_put_flag(skb, CIFS_GENL_ATTR_SWN_KRB_AUTH);
38  }
39  
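
Both errors above come from the stray '=' after 'return' on line 37. A minimal sketch of what the simplified function was presumably meant to look like follows. So that it compiles outside the kernel, struct sk_buff, struct cifs_tcon, nla_put_flag() and the attribute value below are hypothetical userspace stand-ins, not the real kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types, for illustration only. */
struct sk_buff { unsigned int flags; };
struct cifs_tcon { int unused; };

/* Made-up value; the real one is defined by the cifs netlink header. */
#define CIFS_GENL_ATTR_SWN_KRB_AUTH 1

/* Mock of nla_put_flag(): records the flag and reports success. */
static int nla_put_flag(struct sk_buff *skb, int attrtype)
{
	assert(skb != NULL);
	skb->flags |= 1u << attrtype;
	return 0;
}

/* The return value is passed through directly, with no '=' after return. */
static int cifs_swn_auth_info_krb(struct cifs_tcon *tcon, struct sk_buff *skb)
{
	(void)tcon; /* unused in this sketch */
	return nla_put_flag(skb, CIFS_GENL_ATTR_SWN_KRB_AUTH);
}
```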

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip

Re: [PATCH] vdpa/mlx5: set_features should allow reset to zero

2021-02-23 Thread Jason Wang




On 2021/2/24 2:47 PM, Michael S. Tsirkin wrote:

On Wed, Feb 24, 2021 at 08:45:20AM +0200, Eli Cohen wrote:

On Wed, Feb 24, 2021 at 12:17:58AM -0500, Michael S. Tsirkin wrote:

On Wed, Feb 24, 2021 at 11:20:01AM +0800, Jason Wang wrote:

On 2021/2/24 3:35 AM, Si-Wei Liu wrote:


On 2/23/2021 5:26 AM, Michael S. Tsirkin wrote:

On Tue, Feb 23, 2021 at 10:03:57AM +0800, Jason Wang wrote:

On 2021/2/23 9:12 AM, Si-Wei Liu wrote:

On 2/21/2021 11:34 PM, Michael S. Tsirkin wrote:

On Mon, Feb 22, 2021 at 12:14:17PM +0800, Jason Wang wrote:

On 2021/2/19 7:54 PM, Si-Wei Liu wrote:

Commit 452639a64ad8 ("vdpa: make sure set_features is invoked
for legacy") made an exception for legacy guests to reset
features to 0, when config space is accessed before features
are set. We should relieve the verify_min_features() check
and allow features reset to 0 for this case.

It's worth noting that not just legacy guests could access
config space before features are set. For instance, when
feature VIRTIO_NET_F_MTU is advertised some modern driver
will try to access and validate the MTU present in the config
space before virtio features are set.

This looks like a spec violation:

"

The following driver-read-only field, mtu only exists if
VIRTIO_NET_F_MTU is
set.
This field specifies the maximum MTU for the driver to use.
"

Do we really want to work around this?

Thanks

And also:

The driver MUST follow this sequence to initialize a device:
1. Reset the device.
2. Set the ACKNOWLEDGE status bit: the guest OS has
noticed the device.
3. Set the DRIVER status bit: the guest OS knows how to drive the
device.
4. Read device feature bits, and write the subset of feature bits
understood by the OS and driver to the
device. During this step the driver MAY read (but MUST NOT write)
the device-specific configuration
fields to check that it can support the device before accepting it.
5. Set the FEATURES_OK status bit. The driver MUST NOT accept new
feature bits after this step.
6. Re-read device status to ensure the FEATURES_OK bit is still set:
otherwise, the device does not
support our subset of features and the device is unusable.
7. Perform device-specific setup, including discovery of virtqueues
for the device, optional per-bus setup,
reading and possibly writing the device’s virtio configuration
space, and population of virtqueues.
8. Set the DRIVER_OK status bit. At this point the device is “live”.
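
The quoted sequence can be sketched roughly as follows. The status bit values and the VIRTIO_NET_F_MTU bit number are the ones defined by the virtio specification; struct mock_dev and mock_driver_init() are hypothetical names for an in-memory model and do not correspond to any kernel or QEMU code. Note that the config space (here, mtu) is only read, never written, during step 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Device status bits and feature bit number from the virtio specification. */
#define VIRTIO_STATUS_ACKNOWLEDGE 1u
#define VIRTIO_STATUS_DRIVER      2u
#define VIRTIO_STATUS_DRIVER_OK   4u
#define VIRTIO_STATUS_FEATURES_OK 8u
#define VIRTIO_STATUS_FAILED      128u
#define VIRTIO_NET_F_MTU          3

/* Hypothetical in-memory device model, for illustration only. */
struct mock_dev {
	uint8_t  status;
	uint64_t device_features; /* offered by the device */
	uint64_t driver_features; /* subset accepted by the driver */
	uint16_t mtu;             /* device-specific config field */
};

/* Returns 0 on success, -1 if the driver gives up on the device. */
int mock_driver_init(struct mock_dev *d, uint64_t supported, uint16_t max_mtu)
{
	assert(d != NULL);

	d->status = 0;                            /* 1. reset */
	d->status |= VIRTIO_STATUS_ACKNOWLEDGE;   /* 2. device noticed */
	d->status |= VIRTIO_STATUS_DRIVER;        /* 3. driver can drive it */

	/* 4. Read feature bits; config fields may be read but not written. */
	if ((d->device_features & (1ull << VIRTIO_NET_F_MTU)) &&
	    d->mtu > max_mtu) {
		d->status |= VIRTIO_STATUS_FAILED;
		return -1;
	}
	d->driver_features = d->device_features & supported;

	d->status |= VIRTIO_STATUS_FEATURES_OK;   /* 5. features final */

	/* 6. Re-read status. A real device may clear the bit; this mock
	 *    never does, so the branch only documents the required check. */
	if (!(d->status & VIRTIO_STATUS_FEATURES_OK)) {
		d->status |= VIRTIO_STATUS_FAILED;
		return -1;
	}

	/* 7. Device-specific setup (virtqueues etc.) is omitted here. */

	d->status |= VIRTIO_STATUS_DRIVER_OK;     /* 8. device is live */
	return 0;
}
```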


so accessing config space before FEATURES_OK is a spec
violation, right?

It is, but it's not relevant to what this commit tries to address. I
thought the legacy guest still needs to be supported.

Having said that, a separate patch has to be posted to fix the guest driver
issue where this discrepancy is introduced to
virtnet_validate() (since
commit fe36cbe067). But it's not technically related to this patch.

-Siwei

I think it's a bug to read config space in validate, we should
move it to
virtnet_probe().

Thanks

I take it back, reading but not writing seems to be explicitly
allowed by spec.
So our way to detect a legacy guest is bogus, need to think what is
the best way to handle this.

Then maybe revert commit fe36cbe067 and friends, and have QEMU detect
legacy guest? Supposedly only config space write access needs to be
guarded before setting FEATURES_OK.


I agree. My understanding is that all vDPA devices must be modern devices (since
VIRTIO_F_ACCESS_PLATFORM is mandated) rather than transitional devices.

Thanks

Well mlx5 has some code to handle legacy guests ...
Eli, could you comment? Is that support unused right now?


If you mean support for version 1.0, well the knob is there but it's not
set in the firmware I use. Not sure if we will support this.

Hmm you mean it's legacy only right now?
Well at some point you will want advanced goodies like RSS
and all that is gated on 1.0 ;)



So if my understanding is correct, the device/firmware is legacy but
requires VIRTIO_F_ACCESS_PLATFORM semantics? That looks like a spec violation.


Thanks





-Siwei


Rejecting the reset to 0 prematurely prevents the correct MTU and link
status from being loaded on the very first config space access, causing
issues such as the guest showing an inaccurate MTU value or failing to
reject an out-of-range MTU.

Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for
supported mlx5 devices")
Signed-off-by: Si-Wei Liu 
---
     drivers/vdpa/mlx5/net/mlx5_vnet.c | 15 +--
     1 file changed, 1 insertion(+), 14 deletions(-)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c
b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 7c1f789..540dd67 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1490,14 +1490,6 @@ static u64
mlx5_vdpa_get_features(struct vdpa_device *vdev)
     return mvdev->mlx_features;
     }
-static int verify_min_features(struct mlx5_vdpa_dev *mvdev,
u64 features)
-{
-    if (!(features & BIT_ULL(VIRTIO_F_ACCESS_PLATFORM)))
-    return -EOPNOTSUPP;
-
-    return 0;
-}
-
     static int setup_virtqueues(struct mlx5_vdpa_net

Re: [PATCH] crypto: testmgr - delete some redundant code

2021-02-23 Thread Eric Biggers

On Tue, Feb 23, 2021 at 11:42:04AM +0800, Kai Ye wrote:
> Delete the sg_data() function because its definition is the same as
> sg_virt(), and replace the calls to sg_data() with sg_virt().
> 
> Signed-off-by: Kai Ye 
> ---
>  crypto/testmgr.c | 11 +++
>  1 file changed, 3 insertions(+), 8 deletions(-)
> 
> diff --git a/crypto/testmgr.c b/crypto/testmgr.c
> index 9335999..e13e73c 100644
> --- a/crypto/testmgr.c
> +++ b/crypto/testmgr.c
> @@ -1168,11 +1168,6 @@ static inline int check_shash_op(const char *op, int err,
>   return err;
>  }
>  
> -static inline const void *sg_data(struct scatterlist *sg)
> -{
> - return page_address(sg_page(sg)) + sg->offset;
> -}
> -
>  /* Test one hash test vector in one configuration, using the shash API */
>  static int test_shash_vec_cfg(const struct hash_testvec *vec,
> const char *vec_name,
> @@ -1230,7 +1225,7 @@ static int test_shash_vec_cfg(const struct hash_testvec *vec,
>   return 0;
>   if (cfg->nosimd)
>   crypto_disable_simd_for_test();
> - err = crypto_shash_digest(desc, sg_data(&tsgl->sgl[0]),
> + err = crypto_shash_digest(desc, sg_virt(&tsgl->sgl[0]),
> tsgl->sgl[0].length, result);
>   if (cfg->nosimd)
>   crypto_reenable_simd_for_test();
> @@ -1266,7 +1261,7 @@ static int test_shash_vec_cfg(const struct hash_testvec *vec,
>   cfg->finalization_type == FINALIZATION_TYPE_FINUP) {
>   if (divs[i]->nosimd)
>   crypto_disable_simd_for_test();
> - err = crypto_shash_finup(desc, sg_data(&tsgl->sgl[i]),
> + err = crypto_shash_finup(desc, sg_virt(&tsgl->sgl[i]),
>tsgl->sgl[i].length, result);
>   if (divs[i]->nosimd)
>   crypto_reenable_simd_for_test();
> @@ -1278,7 +1273,7 @@ static int test_shash_vec_cfg(const struct hash_testvec *vec,
>   }
>   if (divs[i]->nosimd)
>   crypto_disable_simd_for_test();
> - err = crypto_shash_update(desc, sg_data(&tsgl->sgl[i]),
> + err = crypto_shash_update(desc, sg_virt(&tsgl->sgl[i]),
> tsgl->sgl[i].length);
>   if (divs[i]->nosimd)
>   crypto_reenable_simd_for_test();
> -- 

Looks good,

Reviewed-by: Eric Biggers 

- Eric

Re: [PATCH] vdpa/mlx5: set_features should allow reset to zero

2021-02-23 Thread Jason Wang




On 2021/2/24 2:46 PM, Michael S. Tsirkin wrote:

On Wed, Feb 24, 2021 at 02:04:36PM +0800, Jason Wang wrote:

On 2021/2/24 1:04 PM, Michael S. Tsirkin wrote:

On Tue, Feb 23, 2021 at 11:35:57AM -0800, Si-Wei Liu wrote:

On 2/23/2021 5:26 AM, Michael S. Tsirkin wrote:

On Tue, Feb 23, 2021 at 10:03:57AM +0800, Jason Wang wrote:

On 2021/2/23 9:12 AM, Si-Wei Liu wrote:

On 2/21/2021 11:34 PM, Michael S. Tsirkin wrote:

On Mon, Feb 22, 2021 at 12:14:17PM +0800, Jason Wang wrote:

On 2021/2/19 7:54 PM, Si-Wei Liu wrote:

Commit 452639a64ad8 ("vdpa: make sure set_features is invoked
for legacy") made an exception for legacy guests to reset
features to 0, when config space is accessed before features
are set. We should relieve the verify_min_features() check
and allow features reset to 0 for this case.

It's worth noting that not just legacy guests could access
config space before features are set. For instance, when
feature VIRTIO_NET_F_MTU is advertised some modern driver
will try to access and validate the MTU present in the config
space before virtio features are set.

This looks like a spec violation:

"

The following driver-read-only field, mtu only exists if
VIRTIO_NET_F_MTU is
set.
This field specifies the maximum MTU for the driver to use.
"

Do we really want to work around this?

Thanks

And also:

The driver MUST follow this sequence to initialize a device:
1. Reset the device.
2. Set the ACKNOWLEDGE status bit: the guest OS has noticed the device.
3. Set the DRIVER status bit: the guest OS knows how to drive the
device.
4. Read device feature bits, and write the subset of feature bits
understood by the OS and driver to the
device. During this step the driver MAY read (but MUST NOT write)
the device-specific configuration
fields to check that it can support the device before accepting it.
5. Set the FEATURES_OK status bit. The driver MUST NOT accept new
feature bits after this step.
6. Re-read device status to ensure the FEATURES_OK bit is still set:
otherwise, the device does not
support our subset of features and the device is unusable.
7. Perform device-specific setup, including discovery of virtqueues
for the device, optional per-bus setup,
reading and possibly writing the device’s virtio configuration
space, and population of virtqueues.
8. Set the DRIVER_OK status bit. At this point the device is “live”.


so accessing config space before FEATURES_OK is a spec violation, right?

It is, but it's not relevant to what this commit tries to address. I
thought the legacy guest still needs to be supported.

Having said that, a separate patch has to be posted to fix the guest driver
issue where this discrepancy is introduced to virtnet_validate() (since
commit fe36cbe067). But it's not technically related to this patch.

-Siwei

I think it's a bug to read config space in validate, we should move it to
virtnet_probe().

Thanks

I take it back, reading but not writing seems to be explicitly allowed by spec.
So our way to detect a legacy guest is bogus, need to think what is
the best way to handle this.

Then maybe revert commit fe36cbe067 and friends, and have QEMU detect legacy
guest? Supposedly only config space write access needs to be guarded before
setting FEATURES_OK.

-Siwei

Detecting it isn't enough though, we will need a new ioctl to notify
the kernel that it's a legacy guest. Ugh :(


I'm not sure I get this, how can we know if there's a legacy driver before
set_features()?

qemu knows for sure. It does not communicate this information to the
kernel right now unfortunately.



I may be missing something, but I still don't get how the new ioctl is
supposed to work.


Thanks





And I wonder what will happen if we just revert the set_features(0)?

Thanks





Rejecting the reset to 0 prematurely prevents the correct MTU and link
status from being loaded on the very first config space access, causing
issues such as the guest showing an inaccurate MTU value or failing to
reject an out-of-range MTU.

Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for
supported mlx5 devices")
Signed-off-by: Si-Wei Liu 
---
      drivers/vdpa/mlx5/net/mlx5_vnet.c | 15 +--
      1 file changed, 1 insertion(+), 14 deletions(-)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c
b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 7c1f789..540dd67 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1490,14 +1490,6 @@ static u64
mlx5_vdpa_get_features(struct vdpa_device *vdev)
      return mvdev->mlx_features;
      }
-static int verify_min_features(struct mlx5_vdpa_dev *mvdev,
u64 features)
-{
-    if (!(features & BIT_ULL(VIRTIO_F_ACCESS_PLATFORM)))
-    return -EOPNOTSUPP;
-
-    return 0;
-}
-
      static int setup_virtqueues(struct mlx5_vdpa_net *ndev)
      {
      int err;
@@ -1558,18 +1550,13 @@ static int
mlx5_vdpa_set_features(struct vdpa_device *vdev, u64
features)
      {
      struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
      struct mlx5_vdpa_net *ndev =

Re: [PATCH v1 2/4] clk: rockchip: add dt-binding header for rk3568

2021-02-23 Thread elaine.zhang


Hi, Heiko:

On 2021/2/23 6:45 PM, Heiko Stübner wrote:

Hi Elaine,

On Tuesday, 23 February 2021, 10:53:50 CET, Elaine Zhang wrote:

Add the dt-bindings header for the rk3568, that gets shared between
the clock controller and the clock references in the dts.
Add softreset ID for rk3568.

Signed-off-by: Elaine Zhang 
---
  include/dt-bindings/clock/rk3568-cru.h | 926 +
  1 file changed, 926 insertions(+)
  create mode 100644 include/dt-bindings/clock/rk3568-cru.h

diff --git a/include/dt-bindings/clock/rk3568-cru.h b/include/dt-bindings/clock/rk3568-cru.h
new file mode 100644
index ..22b0b8739b5d
--- /dev/null
+++ b/include/dt-bindings/clock/rk3568-cru.h
@@ -0,0 +1,926 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Rockchip Electronics Co. Ltd.
+ * Author: Elaine Zhang 
+ */
+
+#ifndef _DT_BINDINGS_CLK_ROCKCHIP_RK3568_H
+#define _DT_BINDINGS_CLK_ROCKCHIP_RK3568_H
+
+/* pmucru-clocks indices */
+
+/* pmucru plls */
+#define PLL_PPLL   1
+#define PLL_HPLL   2
+
+/* pmucru clocks */
+#define XIN_OSC0_DIV	4
+#define CLK_RTC_32K	5
+#define CLK_PMU	6
+#define CLK_I2C0	7

can we change the prefix of CLK_* ids to SCLK_*
(for special clock), like on previous socs.

Especially as some of them already have that SCLK_ prefix anyway.

Having that 4-letter prefix makes reading these IDs easier as well ;-)


SCLK is for special clocks; CLK is for common clocks.

rk3568-cru.h is generated automatically from the TRM using tools.
Can we minimize the manual modifications?
With the growing number of clocks, writing them by hand often leads to
mistakes. We use tools to generate rk3568-cru.h (100% tool-generated) and the
register descriptions in clk-rk3568.c (50% tool-generated).




Thanks
Heiko


+#define CLK_RTC32K_FRAC	8
+#define CLK_UART0_DIV	9
+#define CLK_UART0_FRAC	10
+#define SCLK_UART0	11
+#define DBCLK_GPIO0	12
+#define CLK_PWM0	13
+#define CLK_CAPTURE_PWM0_NDFT	14
+#define CLK_PMUPVTM	15
+#define CLK_CORE_PMUPVTM	16
+#define CLK_REF24M	17
+#define XIN_OSC0_USBPHY0_G	18
+#define CLK_USBPHY0_REF	19
+#define XIN_OSC0_USBPHY1_G	20
+#define CLK_USBPHY1_REF	21
+#define XIN_OSC0_MIPIDSIPHY0_G	22
+#define CLK_MIPIDSIPHY0_REF	23
+#define XIN_OSC0_MIPIDSIPHY1_G	24
+#define CLK_MIPIDSIPHY1_REF	25
+#define CLK_WIFI_DIV	26
+#define CLK_WIFI_OSC0	27
+#define CLK_WIFI	28
+#define CLK_PCIEPHY0_DIV	29
+#define CLK_PCIEPHY0_OSC0	30
+#define CLK_PCIEPHY0_REF	31
+#define CLK_PCIEPHY1_DIV	32
+#define CLK_PCIEPHY1_OSC0	33
+#define CLK_PCIEPHY1_REF	34
+#define CLK_PCIEPHY2_DIV	35
+#define CLK_PCIEPHY2_OSC0	36
+#define CLK_PCIEPHY2_REF	37
+#define CLK_PCIE30PHY_REF_M	38
+#define CLK_PCIE30PHY_REF_N	39
+#define CLK_HDMI_REF	40
+#define XIN_OSC0_EDPPHY_G	41
+#define PCLK_PDPMU	42
+#define PCLK_PMU	43
+#define PCLK_UART0	44
+#define PCLK_I2C0	45
+#define PCLK_GPIO0	46
+#define PCLK_PMUPVTM	47
+#define PCLK_PWM0	48
+#define CLK_PDPMU	49
+#define SCLK_32K_IOE	50
+
+#define CLKPMU_NR_CLKS (SCLK_32K_IOE + 1)
+
+/* cru-clocks indices */
+
+/* cru plls */
+#define PLL_APLL   1
+#define PLL_DPLL   2
+#define PLL_CPLL   3
+#define PLL_GPLL   4
+#define PLL_VPLL   5
+#define PLL_NPLL   6
+
+/* cru clocks */
+#define CPLL_333M	9
+#define ARMCLK	10
+#define USB480M	11
+#define ACLK_CORE_NIU2BUS	18
+#define CLK_CORE_PVTM	19
+#define CLK_CORE_PVTM_CORE	20
+#define CLK_CORE_PVTPLL	21
+#define CLK_GPU_SRC	22
+#define CLK_GPU_PRE_NDFT	23
+#define CLK_GPU_PRE_MUX	24
+#define ACLK_GPU_PRE	25
+#define PCLK_GPU_PRE	26
+#define CLK_GPU	27
+#define CLK_GPU_NP5	28
+#define PCLK_GPU_PVTM	29
+#define CLK_GPU_PVTM	30
+#define CLK_GPU_PVTM_CORE	31
+#define CLK_GPU_PVTPLL	32
+#define CLK_NPU_SRC	33
+#define CLK_NPU_PRE_NDFT	34
+#define CLK_NPU	35
+#define CLK_NPU_NP5	36
+#define HCLK_NPU_PRE	37
+#define PCLK_NPU_PRE	38
+#define ACLK_NPU_PRE	39
+#define ACLK_NPU	40
+#define HCLK_NPU	41
+#define PCLK_NPU_PVTM	42
+#define CLK_NPU_PVTM	43
+#define CLK_NPU_PVTM_CORE	44
+#define CLK_NPU_PVTPLL	45
+#define CLK_DDRPHY1X_SRC	46
+#define CLK_DDRPHY1X_HWFFC_SRC	47
+#define CLK_DDR1X	48
+#define CLK_MSCH	49
+#define CLK24_DDRMON

Re: [PATCH] mtd: rawnand: qcom: update last code word register

2021-02-23 Thread Miquel Raynal

Hello,

mda...@codeaurora.org wrote on Wed, 24 Feb 2021 10:09:48 +0530:

> On 2021-02-24 01:13, mda...@codeaurora.org wrote:
> > On 2021-02-23 22:04, Miquel Raynal wrote:
> >> Hello,
> >>
> >> Md Sadre Alam  wrote on Tue, 23 Feb 2021 01:34:27 +0530:
> >>
> >>> From QPIC version 2.0 onwards new register got added to read last
> >>
> >>                                a new
> >>
> >>> codeword. This change will add the READ_LOCATION_LAST_CW_n register.
> >>
> >> Add support for this READ_LOCATION_LAST_CW_n register.
> >>
> >>> For first three code word READ_LOCATION_n register will be
> >>> use.For last code word READ_LOCATION_LAST_CW_n register will be
> >>> use.
> >>
> >> "
> >> In the case of QPIC v2, codewords 0, 1 and 2 will be accessed through
> >> READ_LOCATION_n, while codeword 3 will be accessed through
> >> READ_LOCATION_LAST_CW_n.
> >> "
> >>
> >> When I read my own sentence, I feel that there is something wrong.
> >> If there are only 4 codewords, I guess a QPIC v2 is able to use
> >> READ_LOCATION_3 or READ_LOCATION_LAST_CW_0 interchangeably. Isn't it?
> >>
> >> I guess the point of having these "last_cw_n" registers is to support
> >> up to 8 codewords, am I wrong? If this the case, the current patch
> >> completely fails doing that I don't get the point of such change.
> > 
> > This register is only use to read last code word.
> > 
> > I have address all the comments from all the previous sub sequent
> > patches and pushed
> > all patches in only one series.
> > 
> > Please check.  
> 
>   The registers READ_LOCATION & READ_LOCATION_LAST are not associated with 
> number of code words.
>   These two registers are used to access the location inside a code word.

Ok. Can you please explain what a location is, then? Or point me to a
datasheet that explains it.

Bottom-line question: why have READ_LOCATION_0, _1, ... and
READ_LOCATION_LAST_0, _1, etc.?

> So whether we are having 4 code words
>   or 8 code words it doesn't matter. If we wanted access the location within 
> normal code word we have to
>   use READ_LOCATION register and if we wanted to access location in last code 
> word then we have to use
>   READ_LOCATION_LAST.
> >
> >>> Signed-off-by: Md Sadre Alam

Thanks,
Miquèl

Re: [PATCH] vdpa/mlx5: set_features should allow reset to zero

2021-02-23 Thread Michael S. Tsirkin

On Wed, Feb 24, 2021 at 08:45:20AM +0200, Eli Cohen wrote:
> On Wed, Feb 24, 2021 at 12:17:58AM -0500, Michael S. Tsirkin wrote:
> > On Wed, Feb 24, 2021 at 11:20:01AM +0800, Jason Wang wrote:
> > > 
> > > > On 2021/2/24 3:35 AM, Si-Wei Liu wrote:
> > > > 
> > > > 
> > > > On 2/23/2021 5:26 AM, Michael S. Tsirkin wrote:
> > > > > On Tue, Feb 23, 2021 at 10:03:57AM +0800, Jason Wang wrote:
> > > > > > > On 2021/2/23 9:12 AM, Si-Wei Liu wrote:
> > > > > > > 
> > > > > > > On 2/21/2021 11:34 PM, Michael S. Tsirkin wrote:
> > > > > > > > On Mon, Feb 22, 2021 at 12:14:17PM +0800, Jason Wang wrote:
> > > > > > > > > > On 2021/2/19 7:54 PM, Si-Wei Liu wrote:
> > > > > > > > > > Commit 452639a64ad8 ("vdpa: make sure set_features is 
> > > > > > > > > > invoked
> > > > > > > > > > for legacy") made an exception for legacy guests to reset
> > > > > > > > > > features to 0, when config space is accessed before features
> > > > > > > > > > are set. We should relieve the verify_min_features() check
> > > > > > > > > > and allow features reset to 0 for this case.
> > > > > > > > > > 
> > > > > > > > > > It's worth noting that not just legacy guests could access
> > > > > > > > > > config space before features are set. For instance, when
> > > > > > > > > > feature VIRTIO_NET_F_MTU is advertised some modern driver
> > > > > > > > > > will try to access and validate the MTU present in the 
> > > > > > > > > > config
> > > > > > > > > > space before virtio features are set.
> > > > > > > > > This looks like a spec violation:
> > > > > > > > > 
> > > > > > > > > "
> > > > > > > > > 
> > > > > > > > > The following driver-read-only field, mtu only exists if
> > > > > > > > > VIRTIO_NET_F_MTU is
> > > > > > > > > set.
> > > > > > > > > This field specifies the maximum MTU for the driver to use.
> > > > > > > > > "
> > > > > > > > > 
> > > > > > > > > Do we really want to workaround this?
> > > > > > > > > 
> > > > > > > > > Thanks
> > > > > > > > And also:
> > > > > > > > 
> > > > > > > > The driver MUST follow this sequence to initialize a device:
> > > > > > > > 1. Reset the device.
> > > > > > > > 2. Set the ACKNOWLEDGE status bit: the guest OS has
> > > > > > > > noticed the device.
> > > > > > > > 3. Set the DRIVER status bit: the guest OS knows how to drive 
> > > > > > > > the
> > > > > > > > device.
> > > > > > > > 4. Read device feature bits, and write the subset of feature 
> > > > > > > > bits
> > > > > > > > understood by the OS and driver to the
> > > > > > > > device. During this step the driver MAY read (but MUST NOT 
> > > > > > > > write)
> > > > > > > > the device-specific configuration
> > > > > > > > fields to check that it can support the device before accepting 
> > > > > > > > it.
> > > > > > > > 5. Set the FEATURES_OK status bit. The driver MUST NOT accept 
> > > > > > > > new
> > > > > > > > feature bits after this step.
> > > > > > > > 6. Re-read device status to ensure the FEATURES_OK bit is still 
> > > > > > > > set:
> > > > > > > > otherwise, the device does not
> > > > > > > > support our subset of features and the device is unusable.
> > > > > > > > 7. Perform device-specific setup, including discovery of 
> > > > > > > > virtqueues
> > > > > > > > for the device, optional per-bus setup,
> > > > > > > > reading and possibly writing the device’s virtio configuration
> > > > > > > > space, and population of virtqueues.
> > > > > > > > 8. Set the DRIVER_OK status bit. At this point the device is 
> > > > > > > > “live”.
> > > > > > > > 
> > > > > > > > 
> > > > > > > > so accessing config space before FEATURES_OK is a spec
> > > > > > > > violation, right?
> > > > > > > It is, but it's not relevant to what this commit tries to 
> > > > > > > address. I
> > > > > > > thought the legacy guest still needs to be supported.
> > > > > > > 
> > > > > > > > Having said that, a separate patch has to be posted to fix the guest 
> > > > > > > driver
> > > > > > > issue where this discrepancy is introduced to
> > > > > > > virtnet_validate() (since
> > > > > > > commit fe36cbe067). But it's not technically related to this 
> > > > > > > patch.
> > > > > > > 
> > > > > > > -Siwei
> > > > > > 
> > > > > > I think it's a bug to read config space in validate, we should
> > > > > > move it to
> > > > > > virtnet_probe().
> > > > > > 
> > > > > > Thanks
> > > > > I take it back, reading but not writing seems to be explicitly
> > > > > allowed by spec.
> > > > > So our way to detect a legacy guest is bogus, need to think what is
> > > > > the best way to handle this.
> > > > Then maybe revert commit fe36cbe067 and friends, and have QEMU detect
> > > > legacy guest? Supposedly only config space write access needs to be
> > > > guarded before setting FEATURES_OK.
> > > 
> > > 
> > > I agree. My understanding is that all vDPA must be modern device (since
> > > > VIRTIO_F_ACCESS_PLATFORM is mandated) instead of transitional device.
> > > 
> > > Thanks
> > 
> > Well mlx5 has some code to handle legacy guests ...
> > Eli, could you

Re: [PATCH v2] cpufreq: schedutil: Call sugov_update_next_freq() before check to fast_switch_enabled

2021-02-23 Thread Viresh Kumar

On 24-02-21, 14:39, Yue Hu wrote:
> From: Yue Hu 
> 
> Note that sugov_update_next_freq() may return false, that means the
> caller sugov_fast_switch() will do nothing except fast switch check.
> 
> Similarly, sugov_deferred_update() also has unnecessary operations
> of raw_spin_{lock,unlock} in sugov_update_single_freq() for that case.
> 
> So, let's call sugov_update_next_freq() before the fast switch check
> to avoid unnecessary behaviors above. Accordingly, update interface
> definition to sugov_deferred_update() and remove sugov_fast_switch()
> since we will call cpufreq_driver_fast_switch() directly instead.
> 
> Signed-off-by: Yue Hu 
> ---
> v2: remove sugov_fast_switch() and call cpufreq_driver_fast_switch()
> directly instead, also update minor log message.
> 
>  kernel/sched/cpufreq_schedutil.c | 29 -
>  1 file changed, 12 insertions(+), 17 deletions(-)

Acked-by: Viresh Kumar 
-- 
viresh
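
The control flow described in the patch can be sketched as follows. This is a simplified userspace model under stated assumptions: struct mock_policy and the mock_* functions are hypothetical stand-ins, the real sugov_update_next_freq() checks more state than a plain frequency comparison, and the counters only stand in for the fast-switch call and the lock/unlock section of the deferred path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the governor's per-policy state. */
struct mock_policy {
	bool fast_switch_enabled;
	unsigned int next_freq;
	unsigned int fast_switches; /* fast-path switches performed */
	unsigned int lock_sections; /* lock/unlock pairs taken */
};

/* Returns false when nothing changes, so the caller can stop early. */
bool mock_update_next_freq(struct mock_policy *p, unsigned int freq)
{
	assert(p != NULL);
	if (p->next_freq == freq)
		return false;
	p->next_freq = freq;
	return true;
}

/* The update check comes first; only then is the switch path chosen. */
void mock_update_single_freq(struct mock_policy *p, unsigned int freq)
{
	if (!mock_update_next_freq(p, freq))
		return; /* no fast-switch check, no locking */

	if (p->fast_switch_enabled)
		p->fast_switches++;  /* models the direct fast switch */
	else
		p->lock_sections++;  /* models lock + deferred update */
}
```

With this ordering, an unchanged frequency costs neither a fast-switch check nor a lock round trip in the model, which is the saving the commit message describes.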

Re: [PATCH] vdpa/mlx5: set_features should allow reset to zero

2021-02-23 Thread Michael S. Tsirkin

On Wed, Feb 24, 2021 at 02:04:36PM +0800, Jason Wang wrote:
> 
> On 2021/2/24 1:04 PM, Michael S. Tsirkin wrote:
> > On Tue, Feb 23, 2021 at 11:35:57AM -0800, Si-Wei Liu wrote:
> > > 
> > > On 2/23/2021 5:26 AM, Michael S. Tsirkin wrote:
> > > > On Tue, Feb 23, 2021 at 10:03:57AM +0800, Jason Wang wrote:
> > > > > > On 2021/2/23 9:12 AM, Si-Wei Liu wrote:
> > > > > > On 2/21/2021 11:34 PM, Michael S. Tsirkin wrote:
> > > > > > > On Mon, Feb 22, 2021 at 12:14:17PM +0800, Jason Wang wrote:
> > > > > > > > > On 2021/2/19 7:54 PM, Si-Wei Liu wrote:
> > > > > > > > > Commit 452639a64ad8 ("vdpa: make sure set_features is invoked
> > > > > > > > > for legacy") made an exception for legacy guests to reset
> > > > > > > > > features to 0, when config space is accessed before features
> > > > > > > > > are set. We should relieve the verify_min_features() check
> > > > > > > > > and allow features reset to 0 for this case.
> > > > > > > > > 
> > > > > > > > > It's worth noting that not just legacy guests could access
> > > > > > > > > config space before features are set. For instance, when
> > > > > > > > > feature VIRTIO_NET_F_MTU is advertised some modern driver
> > > > > > > > > will try to access and validate the MTU present in the config
> > > > > > > > > space before virtio features are set.
> > > > > > > > This looks like a spec violation:
> > > > > > > > 
> > > > > > > > "
> > > > > > > > 
> > > > > > > > The following driver-read-only field, mtu only exists if
> > > > > > > > VIRTIO_NET_F_MTU is
> > > > > > > > set.
> > > > > > > > This field specifies the maximum MTU for the driver to use.
> > > > > > > > "
> > > > > > > > 
> > > > > > > > Do we really want to workaround this?
> > > > > > > > 
> > > > > > > > Thanks
> > > > > > > And also:
> > > > > > > 
> > > > > > > The driver MUST follow this sequence to initialize a device:
> > > > > > > 1. Reset the device.
> > > > > > > 2. Set the ACKNOWLEDGE status bit: the guest OS has noticed the 
> > > > > > > device.
> > > > > > > 3. Set the DRIVER status bit: the guest OS knows how to drive the
> > > > > > > device.
> > > > > > > 4. Read device feature bits, and write the subset of feature bits
> > > > > > > understood by the OS and driver to the
> > > > > > > device. During this step the driver MAY read (but MUST NOT write)
> > > > > > > the device-specific configuration
> > > > > > > fields to check that it can support the device before accepting 
> > > > > > > it.
> > > > > > > 5. Set the FEATURES_OK status bit. The driver MUST NOT accept new
> > > > > > > feature bits after this step.
> > > > > > > 6. Re-read device status to ensure the FEATURES_OK bit is still 
> > > > > > > set:
> > > > > > > otherwise, the device does not
> > > > > > > support our subset of features and the device is unusable.
> > > > > > > 7. Perform device-specific setup, including discovery of 
> > > > > > > virtqueues
> > > > > > > for the device, optional per-bus setup,
> > > > > > > reading and possibly writing the device’s virtio configuration
> > > > > > > space, and population of virtqueues.
> > > > > > > 8. Set the DRIVER_OK status bit. At this point the device is 
> > > > > > > “live”.
> > > > > > > 
> > > > > > > 
> > > > > > > so accessing config space before FEATURES_OK is a spec violation, 
> > > > > > > right?
> > > > > > It is, but it's not relevant to what this commit tries to address. I
> > > > > > thought the legacy guest still needs to be supported.
> > > > > > 
> > > > > > Having said, a separate patch has to be posted to fix the guest 
> > > > > > driver
> > > > > > issue where this discrepancy is introduced to virtnet_validate() 
> > > > > > (since
> > > > > > commit fe36cbe067). But it's not technically related to this patch.
> > > > > > 
> > > > > > -Siwei
> > > > > I think it's a bug to read config space in validate, we should move 
> > > > > it to
> > > > > virtnet_probe().
> > > > > 
> > > > > Thanks
> > > > I take it back, reading but not writing seems to be explicitly allowed 
> > > > by spec.
> > > > So our way to detect a legacy guest is bogus, need to think what is
> > > > the best way to handle this.
> > > Then maybe revert commit fe36cbe067 and friends, and have QEMU detect 
> > > legacy
> > > guest? Supposedly only config space write access needs to be guarded 
> > > before
> > > setting FEATURES_OK.
> > > 
> > > > -Siwei
> > Detecting it isn't enough though, we will need a new ioctl to notify
> > the kernel that it's a legacy guest. Ugh :(
> 
> 
> I'm not sure I get this, how can we know if there's a legacy driver before
> set_features()?

qemu knows for sure. It does not communicate this information to the
kernel right now unfortunately.

> And I wonder what will happen if we just revert the set_features(0)?
> 
> Thanks
> 
> 
> > 
> > 
> > > > > > > > > Rejecting reset to 0
> > > > > > > > > prematurely causes correct MTU and link status unable to load
> > > > > > > > > for the very first config space access, rendering issues like
> > > > > > > > > guest
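
The rule the thread converges on (device-specific config space may be read, but not written, before FEATURES_OK) can be sketched as a small check on the device status byte. This is only an illustration under one reading of the quoted initialization sequence: the status bit values are those from the virtio specification, while the helper name config_access_allowed() is hypothetical and does not exist in any driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Device status bits as defined by the virtio specification. */
#define VIRTIO_CONFIG_S_ACKNOWLEDGE	1
#define VIRTIO_CONFIG_S_DRIVER		2
#define VIRTIO_CONFIG_S_DRIVER_OK	4
#define VIRTIO_CONFIG_S_FEATURES_OK	8

/*
 * Hypothetical helper: may the driver touch device-specific config
 * space while the device is in the given status?
 *
 * Reads are permitted once DRIVER is set (step 4 of the sequence),
 * writes only after FEATURES_OK has been set (step 7).
 */
static bool config_access_allowed(uint8_t status, bool is_write)
{
	if (!(status & VIRTIO_CONFIG_S_DRIVER))
		return false;
	if (is_write)
		return (status & VIRTIO_CONFIG_S_FEATURES_OK) != 0;
	return true;
}
```

Under this model a config read before set_features() says nothing about whether the guest is legacy, which is why the detection heuristic is called bogus above.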

Re: [PATCH] vdpa/mlx5: set_features should allow reset to zero

2021-02-23 Thread Eli Cohen

On Wed, Feb 24, 2021 at 12:17:58AM -0500, Michael S. Tsirkin wrote:
> On Wed, Feb 24, 2021 at 11:20:01AM +0800, Jason Wang wrote:
> > 
> > On 2021/2/24 3:35 上午, Si-Wei Liu wrote:
> > > 
> > > 
> > > On 2/23/2021 5:26 AM, Michael S. Tsirkin wrote:
> > > > On Tue, Feb 23, 2021 at 10:03:57AM +0800, Jason Wang wrote:
> > > > > On 2021/2/23 9:12 上午, Si-Wei Liu wrote:
> > > > > > 
> > > > > > On 2/21/2021 11:34 PM, Michael S. Tsirkin wrote:
> > > > > > > On Mon, Feb 22, 2021 at 12:14:17PM +0800, Jason Wang wrote:
> > > > > > > > On 2021/2/19 7:54 下午, Si-Wei Liu wrote:
> > > > > > > > > Commit 452639a64ad8 ("vdpa: make sure set_features is invoked
> > > > > > > > > for legacy") made an exception for legacy guests to reset
> > > > > > > > > features to 0, when config space is accessed before features
> > > > > > > > > are set. We should relieve the verify_min_features() check
> > > > > > > > > and allow features reset to 0 for this case.
> > > > > > > > > 
> > > > > > > > > It's worth noting that not just legacy guests could access
> > > > > > > > > config space before features are set. For instance, when
> > > > > > > > > feature VIRTIO_NET_F_MTU is advertised some modern driver
> > > > > > > > > will try to access and validate the MTU present in the config
> > > > > > > > > space before virtio features are set.
> > > > > > > > This looks like a spec violation:
> > > > > > > > 
> > > > > > > > "
> > > > > > > > 
> > > > > > > > The following driver-read-only field, mtu only exists if
> > > > > > > > VIRTIO_NET_F_MTU is
> > > > > > > > set.
> > > > > > > > This field specifies the maximum MTU for the driver to use.
> > > > > > > > "
> > > > > > > > 
> > > > > > > > Do we really want to workaround this?
> > > > > > > > 
> > > > > > > > Thanks
> > > > > > > And also:
> > > > > > > 
> > > > > > > The driver MUST follow this sequence to initialize a device:
> > > > > > > 1. Reset the device.
> > > > > > > 2. Set the ACKNOWLEDGE status bit: the guest OS has
> > > > > > > noticed the device.
> > > > > > > 3. Set the DRIVER status bit: the guest OS knows how to drive the
> > > > > > > device.
> > > > > > > 4. Read device feature bits, and write the subset of feature bits
> > > > > > > understood by the OS and driver to the
> > > > > > > device. During this step the driver MAY read (but MUST NOT write)
> > > > > > > the device-specific configuration
> > > > > > > fields to check that it can support the device before accepting 
> > > > > > > it.
> > > > > > > 5. Set the FEATURES_OK status bit. The driver MUST NOT accept new
> > > > > > > feature bits after this step.
> > > > > > > 6. Re-read device status to ensure the FEATURES_OK bit is still 
> > > > > > > set:
> > > > > > > otherwise, the device does not
> > > > > > > support our subset of features and the device is unusable.
> > > > > > > 7. Perform device-specific setup, including discovery of 
> > > > > > > virtqueues
> > > > > > > for the device, optional per-bus setup,
> > > > > > > reading and possibly writing the device’s virtio configuration
> > > > > > > space, and population of virtqueues.
> > > > > > > 8. Set the DRIVER_OK status bit. At this point the device is 
> > > > > > > “live”.
> > > > > > > 
> > > > > > > 
> > > > > > > so accessing config space before FEATURES_OK is a spec
> > > > > > > violation, right?
> > > > > > It is, but it's not relevant to what this commit tries to address. I
> > > > > > thought the legacy guest still needs to be supported.
> > > > > > 
> > > > > > Having said, a separate patch has to be posted to fix the guest 
> > > > > > driver
> > > > > > issue where this discrepancy is introduced to
> > > > > > virtnet_validate() (since
> > > > > > commit fe36cbe067). But it's not technically related to this patch.
> > > > > > 
> > > > > > -Siwei
> > > > > 
> > > > > I think it's a bug to read config space in validate, we should
> > > > > move it to
> > > > > virtnet_probe().
> > > > > 
> > > > > Thanks
> > > > I take it back, reading but not writing seems to be explicitly
> > > > allowed by spec.
> > > > So our way to detect a legacy guest is bogus, need to think what is
> > > > the best way to handle this.
> > > Then maybe revert commit fe36cbe067 and friends, and have QEMU detect
> > > legacy guest? Supposedly only config space write access needs to be
> > > guarded before setting FEATURES_OK.
> > 
> > 
> > > I agree. My understanding is that all vDPA devices must be modern devices
> > > (since VIRTIO_F_ACCESS_PLATFORM is mandated) rather than transitional devices.
> > 
> > Thanks
> 
> Well mlx5 has some code to handle legacy guests ...
> Eli, could you comment? Is that support unused right now?
> 

If you mean support for version 1.0, well, the knob is there but it's not
set in the firmware I use. Not sure if we will support this.

> 
> > 
> > > 
> > > > -Siwei
> > > 
> > > > > > > 
> > > > > > > > > Rejecting reset to 0
> > > > > > > > > prematurely causes correct MTU and link status unable to load
> > > > > > > > > for the very first config space

Re: [RFC PATCH v5 11/19] virtio/vsock: dequeue callback for SOCK_SEQPACKET

2021-02-23 Thread Michael S. Tsirkin

On Wed, Feb 24, 2021 at 08:07:48AM +0300, Arseny Krasnov wrote:
> 
> On 23.02.2021 17:17, Michael S. Tsirkin wrote:
> > On Thu, Feb 18, 2021 at 08:39:37AM +0300, Arseny Krasnov wrote:
> >> This adds the transport callback and its logic for SEQPACKET dequeue.
> >> The callback fetches RW packets from the socket's rx queue until the whole
> >> record is copied (if the user's buffer is full, the user is not woken up).
> >> This is done to avoid stalling the sender: if we wake up the user and it
> >> leaves the syscall, nobody will send a credit update for the rest of the
> >> record, and the sender will wait for the receiver to enter the read
> >> syscall again. So if the user buffer is full, we just send a credit update
> >> and drop the data. If SEQ_BEGIN is found during the copy (and not all data
> >> was copied), copying is restarted by resetting the user's iov iterator
> >> (the previous unfinished data is dropped).
> >>
> >> Signed-off-by: Arseny Krasnov 
> >> ---
> >>  include/linux/virtio_vsock.h|  10 +++
> >>  include/uapi/linux/virtio_vsock.h   |  16 
> >>  net/vmw_vsock/virtio_transport_common.c | 114 
> >>  3 files changed, 140 insertions(+)
> >>
> >> diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
> >> index dc636b727179..003d06ae4a85 100644
> >> --- a/include/linux/virtio_vsock.h
> >> +++ b/include/linux/virtio_vsock.h
> >> @@ -36,6 +36,11 @@ struct virtio_vsock_sock {
> >>u32 rx_bytes;
> >>u32 buf_alloc;
> >>struct list_head rx_queue;
> >> +
> >> +  /* For SOCK_SEQPACKET */
> >> +  u32 user_read_seq_len;
> >> +  u32 user_read_copied;
> >> +  u32 curr_rx_msg_cnt;
> >
> > wrap these in a struct to make it clearer they
> > are related?
> Ack
> >
> >>  };
> >>  
> >>  struct virtio_vsock_pkt {
> >> @@ -80,6 +85,11 @@ virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
> >>   struct msghdr *msg,
> >>   size_t len, int flags);
> >>  
> >> +int
> >> +virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
> >> + struct msghdr *msg,
> >> + int flags,
> >> + bool *msg_ready);
> >>  s64 virtio_transport_stream_has_data(struct vsock_sock *vsk);
> >>  s64 virtio_transport_stream_has_space(struct vsock_sock *vsk);
> >>  
> >> diff --git a/include/uapi/linux/virtio_vsock.h 
> >> b/include/uapi/linux/virtio_vsock.h
> >> index 1d57ed3d84d2..cf9c165e5cca 100644
> >> --- a/include/uapi/linux/virtio_vsock.h
> >> +++ b/include/uapi/linux/virtio_vsock.h
> >> @@ -63,8 +63,14 @@ struct virtio_vsock_hdr {
> >>__le32  fwd_cnt;
> >>  } __attribute__((packed));
> >>  
> >> +struct virtio_vsock_seq_hdr {
> >> +  __le32  msg_cnt;
> >> +  __le32  msg_len;
> >> +} __attribute__((packed));
> >> +
> >>  enum virtio_vsock_type {
> >>VIRTIO_VSOCK_TYPE_STREAM = 1,
> >> +  VIRTIO_VSOCK_TYPE_SEQPACKET = 2,
> >>  };
> >>  
> >>  enum virtio_vsock_op {
> >> @@ -83,6 +89,11 @@ enum virtio_vsock_op {
> >>VIRTIO_VSOCK_OP_CREDIT_UPDATE = 6,
> >>/* Request the peer to send the credit info to us */
> >>VIRTIO_VSOCK_OP_CREDIT_REQUEST = 7,
> >> +
> >> +  /* Record begin for SOCK_SEQPACKET */
> >> +  VIRTIO_VSOCK_OP_SEQ_BEGIN = 8,
> >> +  /* Record end for SOCK_SEQPACKET */
> >> +  VIRTIO_VSOCK_OP_SEQ_END = 9,
> >>  };
> >>  
> >>  /* VIRTIO_VSOCK_OP_SHUTDOWN flags values */
> >> @@ -91,4 +102,9 @@ enum virtio_vsock_shutdown {
> >>VIRTIO_VSOCK_SHUTDOWN_SEND = 2,
> >>  };
> >>  
> >> +/* VIRTIO_VSOCK_OP_RW flags values */
> >> +enum virtio_vsock_rw {
> >> +  VIRTIO_VSOCK_RW_EOR = 1,
> >> +};
> >> +
> >>  #endif /* _UAPI_LINUX_VIRTIO_VSOCK_H */
> > Probably a good idea to also have a feature bit gating
> > this functionality.
> 
> IIUC this also requires some qemu patch, because in the current
> implementation of the vsock device in qemu, there is no 'set_features'
> callback for such a device. This callback will handle the guest's write
> to the feature register by calling the vhost kernel backend, where this
> bit will be processed by the host.

Well patching userspace to make use of a kernel feature
is par for the course, isn't it?

> 
> IMHO I'm not sure that SEQPACKET support needs a feature
> bit - it is just two new ops for the virtio vsock protocol, and from the
> point of view of the virtio device it is the same as STREAM. Maybe it is
> needed for cases when a client tries to connect to a server which doesn't
> support SEQPACKET, so without the bit the result will be "Connection reset
> by peer", and with such a bit the client will know that the server doesn't
> support it and 'socket(SOCK_SEQPACKET)' will return an error?

Yes, better error handling would be one reason to do it like this.

-- 
MST
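
The dequeue behaviour described in the quoted commit message can be modelled in a few lines of userspace C. This is a simplified sketch, not the patch's implementation: the packet type, the op names and seqpacket_dequeue() are stand-ins invented for illustration, and credit updates are not modelled at all.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the RW, SEQ_BEGIN and SEQ_END ops. */
enum pkt_op { OP_RW, OP_SEQ_BEGIN, OP_SEQ_END };

struct pkt {
	enum pkt_op op;
	const char *data;	/* payload, only meaningful for OP_RW */
	size_t len;
};

/*
 * Copy one record from pkts[] into buf and return the number of bytes
 * copied. *msg_ready is set once SEQ_END has been seen.
 *
 * - Payload that does not fit into buf is dropped, not kept for later.
 * - A SEQ_BEGIN seen in the middle of a record restarts the copy, so
 *   the unfinished data copied so far is discarded.
 */
static size_t seqpacket_dequeue(const struct pkt *pkts, size_t npkts,
				char *buf, size_t buflen, int *msg_ready)
{
	size_t copied = 0;
	size_t i;

	*msg_ready = 0;

	for (i = 0; i < npkts; i++) {
		switch (pkts[i].op) {
		case OP_SEQ_BEGIN:
			copied = 0;
			break;
		case OP_RW: {
			size_t room = buflen - copied;
			size_t n = pkts[i].len < room ? pkts[i].len : room;

			memcpy(buf + copied, pkts[i].data, n);
			copied += n;
			break;
		}
		case OP_SEQ_END:
			*msg_ready = 1;
			return copied;
		}
	}

	return copied;
}
```

The point of the model is that the reader keeps consuming packets until SEQ_END even when its buffer is already full, which is what keeps the sender from stalling.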

[PATCH v2] cpufreq: schedutil: Call sugov_update_next_freq() before check to fast_switch_enabled

2021-02-23 Thread Yue Hu

From: Yue Hu 

Note that sugov_update_next_freq() may return false, which means the
caller sugov_fast_switch() will do nothing except the fast switch check.

Similarly, sugov_deferred_update() also has unnecessary operations
of raw_spin_{lock,unlock} in sugov_update_single_freq() for that case.

So, let's call sugov_update_next_freq() before the fast switch check
to avoid the unnecessary work described above. Accordingly, update interface
definition to sugov_deferred_update() and remove sugov_fast_switch()
since we will call cpufreq_driver_fast_switch() directly instead.

Signed-off-by: Yue Hu 
---
v2: remove sugov_fast_switch() and call cpufreq_driver_fast_switch()
directly instead, also update minor log message.

 kernel/sched/cpufreq_schedutil.c | 29 -
 1 file changed, 12 insertions(+), 17 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 41e498b..65fe2c8 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -114,19 +114,8 @@ static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
return true;
 }
 
-static void sugov_fast_switch(struct sugov_policy *sg_policy, u64 time,
- unsigned int next_freq)
+static void sugov_deferred_update(struct sugov_policy *sg_policy)
 {
-   if (sugov_update_next_freq(sg_policy, time, next_freq))
-   cpufreq_driver_fast_switch(sg_policy->policy, next_freq);
-}
-
-static void sugov_deferred_update(struct sugov_policy *sg_policy, u64 time,
- unsigned int next_freq)
-{
-   if (!sugov_update_next_freq(sg_policy, time, next_freq))
-   return;
-
if (!sg_policy->work_in_progress) {
sg_policy->work_in_progress = true;
irq_work_queue(&sg_policy->irq_work);
@@ -368,16 +357,19 @@ static void sugov_update_single_freq(struct update_util_data *hook, u64 time,
sg_policy->cached_raw_freq = cached_freq;
}
 
+   if (!sugov_update_next_freq(sg_policy, time, next_f))
+   return;
+
/*
 * This code runs under rq->lock for the target CPU, so it won't run
 * concurrently on two different CPUs for the same target and it is not
 * necessary to acquire the lock in the fast switch case.
 */
if (sg_policy->policy->fast_switch_enabled) {
-   sugov_fast_switch(sg_policy, time, next_f);
+   cpufreq_driver_fast_switch(sg_policy->policy, next_f);
} else {
raw_spin_lock(&sg_policy->update_lock);
-   sugov_deferred_update(sg_policy, time, next_f);
+   sugov_deferred_update(sg_policy);
raw_spin_unlock(&sg_policy->update_lock);
}
 }
@@ -456,12 +448,15 @@ static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu, u64 time)
if (sugov_should_update_freq(sg_policy, time)) {
next_f = sugov_next_freq_shared(sg_cpu, time);
 
+   if (!sugov_update_next_freq(sg_policy, time, next_f))
+   goto unlock;
+
if (sg_policy->policy->fast_switch_enabled)
-   sugov_fast_switch(sg_policy, time, next_f);
+   cpufreq_driver_fast_switch(sg_policy->policy, next_f);
else
-   sugov_deferred_update(sg_policy, time, next_f);
+   sugov_deferred_update(sg_policy);
}
-
+unlock:
raw_spin_unlock(&sg_policy->update_lock);
 }
 
-- 
1.9.1

Re: [PATCH] ath11k: qmi: use %pad to format dma_addr_t

2021-02-23 Thread Kalle Valo

Geert Uytterhoeven  wrote:

> If CONFIG_ARCH_DMA_ADDR_T_64BIT=n:
> 
> drivers/net/wireless/ath/ath11k/qmi.c: In function 
> ‘ath11k_qmi_respond_fw_mem_request’:
> drivers/net/wireless/ath/ath11k/qmi.c:1690:8: warning: format ‘%llx’ 
> expects argument of type ‘long long unsigned int’, but argument 5 has type 
> ‘dma_addr_t’ {aka ‘unsigned int’} [-Wformat=]
>  1690 |"qmi req mem_seg[%d] 0x%llx %u %u\n", i,
> |^~~~
>  1691 | ab->qmi.target_mem[i].paddr,
> | ~~~
> |  |
> |  dma_addr_t {aka unsigned int}
> drivers/net/wireless/ath/ath11k/debug.h:64:30: note: in definition of 
> macro ‘ath11k_dbg’
>64 |   __ath11k_dbg(ar, dbg_mask, fmt, ##__VA_ARGS__); \
> |  ^~~
> drivers/net/wireless/ath/ath11k/qmi.c:1690:34: note: format string is 
> defined here
>  1690 |"qmi req mem_seg[%d] 0x%llx %u %u\n", i,
> |   ~~~^
> |  |
> |  long long unsigned int
> |   %x
> 
> Fixes: d5395a5486596308 ("ath11k: qmi: add debug message for allocated memory 
> segment addresses and sizes")
> Signed-off-by: Geert Uytterhoeven 

Patch applied to wireless-drivers.git, thanks.

ebb9d34e073d ath11k: qmi: use %pad to format dma_addr_t

-- 
https://patchwork.kernel.org/project/linux-wireless/patch/20210221182754.2071863-1-ge...@linux-m68k.org/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
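
The warning above comes from pairing a fixed-width format with a type whose width depends on the configuration. In the kernel, %pad solves this by taking a pointer to the dma_addr_t. That specifier exists only in the kernel's printk, so the userspace sketch below shows the other common approach, casting to a fixed wide type. The typedef and format_paddr() are hypothetical names used only for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Stands in for dma_addr_t on a configuration where it is 32 bits wide. */
typedef uint32_t dma_addr_model_t;

/*
 * Format an address portably: the cast makes the argument match %llx
 * regardless of how wide the underlying type is.
 */
static int format_paddr(char *buf, size_t len, dma_addr_model_t paddr)
{
	return snprintf(buf, len, "0x%llx", (unsigned long long)paddr);
}
```

Passing the 32-bit value directly to %llx without the cast is what triggers -Wformat in the report.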

[PATCH] powerpc/syscall: Force inlining of __prep_irq_for_enabled_exit()

2021-02-23 Thread Christophe Leroy

As reported by the kernel test robot, a randconfig with a large number of
debugging options can lead to a build failure due to an undefined reference
to replay_soft_interrupts() on ppc32.

This is due to gcc not seeing that __prep_irq_for_enabled_exit()
always returns true on ppc32 because it doesn't inline it for
some reason.

Force inlining of __prep_irq_for_enabled_exit() to fix the build.

Reported-by: kernel test robot 
Signed-off-by: Christophe Leroy 
Fixes: 344bb20b159d ("powerpc/syscall: Make interrupt.c buildable on PPC32")
---
 arch/powerpc/kernel/interrupt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index 398cd86b6ada..2ef3c4051bb9 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -149,7 +149,7 @@ notrace long system_call_exception(long r3, long r4, long r5,
  * enabled when the interrupt handler returns (indicating a process-context /
  * synchronous interrupt) then irqs_enabled should be true.
  */
-static notrace inline bool __prep_irq_for_enabled_exit(bool clear_ri)
+static notrace __always_inline bool __prep_irq_for_enabled_exit(bool clear_ri)
 {
/* This must be done with RI=1 because tracing may touch vmaps */
trace_hardirqs_on();
-- 
2.25.0
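
The mechanism behind the fix can be sketched in userspace as follows. This is an illustration only, assuming GCC or Clang attribute syntax: the macro and function names are invented, and unlike in the kernel the slow path is defined here, so the sketch links at any optimization level.

```c
#include <stdbool.h>

/* Assumption: GCC/Clang attribute syntax, as wrapped by __always_inline. */
#define model_always_inline inline __attribute__((__always_inline__))

static int slow_path_calls;

/* Stands in for the function that has no definition on ppc32. */
static void slow_path_model(void)
{
	slow_path_calls++;
}

/* Always true in this configuration, like the helper on ppc32. */
static model_always_inline bool prep_model(void)
{
	return true;
}

static bool exit_model(void)
{
	if (prep_model())
		return true;

	/*
	 * Dead code once prep_model() is inlined. If the compiler keeps
	 * prep_model() out of line, it cannot prove this, and a reference
	 * to the slow path survives into the object file.
	 */
	slow_path_model();
	return false;
}
```

Forcing the inline removes the dependence on the compiler's inlining heuristics, which debugging options can change.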

Re: [PATCH v4 1/1] kernel/crash_core: Add crashkernel=auto for vmcore creation

2021-02-23 Thread Kairui Song

On Wed, Feb 24, 2021 at 1:45 AM Saeed Mirzamohammadi
 wrote:
>
> This adds crashkernel=auto feature to configure reserved memory for
> vmcore creation. CONFIG_CRASH_AUTO_STR is defined to be set for
> different kernel distributions and different archs based on their
> needs.
>
> Signed-off-by: Saeed Mirzamohammadi 
> Signed-off-by: John Donnelly 
> Tested-by: John Donnelly 
> ---
>  Documentation/admin-guide/kdump/kdump.rst |  3 ++-
>  .../admin-guide/kernel-parameters.txt |  6 ++
>  arch/Kconfig  | 20 +++
>  kernel/crash_core.c   |  7 +++
>  4 files changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/admin-guide/kdump/kdump.rst 
> b/Documentation/admin-guide/kdump/kdump.rst
> index 75a9dd98e76e..ae030111e22a 100644
> --- a/Documentation/admin-guide/kdump/kdump.rst
> +++ b/Documentation/admin-guide/kdump/kdump.rst
> @@ -285,7 +285,8 @@ This would mean:
>  2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M
>  3) if the RAM size is larger than 2G, then reserve 128M
>
> -
> +Or you can use crashkernel=auto to choose the crash kernel memory size
> +based on the recommended configuration set for each arch.
>
>  Boot into System Kernel
>  ===
> diff --git a/Documentation/admin-guide/kernel-parameters.txt 
> b/Documentation/admin-guide/kernel-parameters.txt
> index 9e3cdb271d06..a5deda5c85fe 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -747,6 +747,12 @@
> a memory unit (amount[KMG]). See also
> Documentation/admin-guide/kdump/kdump.rst for an 
> example.
>
> +   crashkernel=auto
> +   [KNL] This parameter will set the reserved memory for
> +   the crash kernel based on the value of the 
> CRASH_AUTO_STR
> +   that is the best effort estimation for each arch. See 
> also
> +   arch/Kconfig for further details.
> +
> crashkernel=size[KMG],high
> [KNL, X86-64] range could be above 4G. Allow kernel
> to allocate physical memory region from top, so could
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 24862d15f3a3..23d047548772 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -14,6 +14,26 @@ menu "General architecture-dependent options"
>  config CRASH_CORE
> bool
>
> +config CRASH_AUTO_STR
> +   string "Memory reserved for crash kernel"
> +   depends on CRASH_CORE
> +   default "1G-64G:128M,64G-1T:256M,1T-:512M"
> +   help
> + This configures the reserved memory dependent
> + on the value of System RAM. The syntax is:
> + crashkernel=<range1>:<size1>[,<range2>:<size2>,...][@offset]
> + range=start-[end]
> +
> + For example:
> + crashkernel=512M-2G:64M,2G-:128M
> +
> + This would mean:
> +
> + 1) if the RAM is smaller than 512M, then don't reserve anything
> +(this is the "rescue" case)
> + 2) if the RAM size is between 512M and 2G (exclusive), then 
> reserve 64M
> + 3) if the RAM size is larger than 2G, then reserve 128M
> +
>  config KEXEC_CORE
> select CRASH_CORE
> bool
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index 825284baaf46..90f9e4bb6704 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -7,6 +7,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include 
>  #include 
> @@ -250,6 +251,12 @@ static int __init __parse_crashkernel(char *cmdline,
> if (suffix)
> return parse_crashkernel_suffix(ck_cmdline, crash_size,
> suffix);
> +#ifdef CONFIG_CRASH_AUTO_STR
> +   if (strncmp(ck_cmdline, "auto", 4) == 0) {
> +   ck_cmdline = CONFIG_CRASH_AUTO_STR;
> +   pr_info("Using crashkernel=auto, the size chosen is a best 
> effort estimation.\n");
> +   }
> +#endif
> /*
>  * if the commandline contains a ':', then that's the extended
>  * syntax -- if not, it must be the classic syntax
> --
> 2.27.0
>
>
> ___
> kexec mailing list
> ke...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
>

Thanks for helping push crashkernel=auto upstream.
This patch works well.

Tested-by: Kairui Song 


--
Best Regards,
Kairui Song
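
How a range string such as the CRASH_AUTO_STR default maps RAM size to a reservation can be sketched as below. This is a userspace illustration, not the kernel's parser: parse_size() and pick_crash_size() are hypothetical names, the optional @offset suffix is not handled, and the kernel's additional sanity checks are omitted.

```c
#include <stdlib.h>

/* Parse "<number>[KMGT]" and advance *s past what was consumed. */
static unsigned long long parse_size(const char **s)
{
	char *end;
	unsigned long long v = strtoull(*s, &end, 0);

	switch (*end) {
	case 'T': case 't':
		v <<= 10;
		/* fall through */
	case 'G': case 'g':
		v <<= 10;
		/* fall through */
	case 'M': case 'm':
		v <<= 10;
		/* fall through */
	case 'K': case 'k':
		v <<= 10;
		end++;
		break;
	default:
		break;
	}

	*s = end;
	return v;
}

/*
 * Walk "start-[end]:size,..." and return the size of the first range
 * with start <= system_ram < end. Returns 0 if no range matches or
 * the string is malformed. A missing end means "no upper bound".
 */
static unsigned long long pick_crash_size(const char *spec,
					  unsigned long long system_ram)
{
	const char *cur = spec;

	while (*cur) {
		unsigned long long start, end = ~0ULL, size;

		start = parse_size(&cur);
		if (*cur != '-')
			return 0;
		cur++;

		if (*cur != ':')
			end = parse_size(&cur);
		if (*cur != ':')
			return 0;
		cur++;

		size = parse_size(&cur);
		if (system_ram >= start && system_ram < end)
			return size;

		if (*cur == ',')
			cur++;
		else if (*cur)
			return 0;
	}

	return 0;
}
```

With the default string "1G-64G:128M,64G-1T:256M,1T-:512M", a machine with 8G of RAM falls into the first range, and one with less than 1G gets no reservation.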

Re: linux-next: Signed-off-by missing for commit in the kbuild tree

2021-02-23 Thread Masahiro Yamada

On Wed, Feb 24, 2021 at 12:46 PM Stephen Rothwell  wrote:
>
> Hi all,
>
> Commit
>
>   67cbb9c55759 ("Makefile: reuse CC_VERSION_TEXT")
>
> is missing a Signed-off-by from its committer.

Thanks for catching this.
I will fix it for tomorrow's linux-next.



> --
> Cheers,
> Stephen Rothwell



-- 
Best Regards
Masahiro Yamada

[PATCH v4 2/2] drm/bridge: anx7625: disable regulators when power off

2021-02-23 Thread Hsin-Yi Wang

When suspending the driver, anx7625_power_standby() will be called to
turn off reset-gpios and enable-gpios. However, power supplies are not
disabled. To save power, the driver can get the power supply regulators
and turn them off in anx7625_power_standby().

Signed-off-by: Hsin-Yi Wang 
Reviewed-by: Robert Foss 
---
 drivers/gpu/drm/bridge/analogix/anx7625.c | 34 +++
 drivers/gpu/drm/bridge/analogix/anx7625.h |  1 +
 2 files changed, 35 insertions(+)

diff --git a/drivers/gpu/drm/bridge/analogix/anx7625.c 
b/drivers/gpu/drm/bridge/analogix/anx7625.c
index 65cc05982f826..23283ba0c4f93 100644
--- a/drivers/gpu/drm/bridge/analogix/anx7625.c
+++ b/drivers/gpu/drm/bridge/analogix/anx7625.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -875,12 +876,25 @@ static int sp_tx_edid_read(struct anx7625_data *ctx,
 static void anx7625_power_on(struct anx7625_data *ctx)
 {
struct device *dev = &ctx->client->dev;
+   int ret, i;
 
if (!ctx->pdata.low_power_mode) {
DRM_DEV_DEBUG_DRIVER(dev, "not low power mode!\n");
return;
}
 
+   for (i = 0; i < ARRAY_SIZE(ctx->pdata.supplies); i++) {
+   ret = regulator_enable(ctx->pdata.supplies[i].consumer);
+   if (ret < 0) {
+   DRM_DEV_DEBUG_DRIVER(dev, "cannot enable supply %d: %d\n",
+i, ret);
+   goto reg_err;
+   }
+   usleep_range(2000, 2100);
+   }
+
+   usleep_range(4000, 4100);
+
/* Power on pin enable */
gpiod_set_value(ctx->pdata.gpio_p_on, 1);
usleep_range(10000, 11000);
@@ -889,11 +903,16 @@ static void anx7625_power_on(struct anx7625_data *ctx)
usleep_range(10000, 11000);
 
DRM_DEV_DEBUG_DRIVER(dev, "power on !\n");
+   return;
+reg_err:
+   for (--i; i >= 0; i--)
+   regulator_disable(ctx->pdata.supplies[i].consumer);
 }
 
 static void anx7625_power_standby(struct anx7625_data *ctx)
 {
struct device *dev = &ctx->client->dev;
+   int ret;
 
if (!ctx->pdata.low_power_mode) {
DRM_DEV_DEBUG_DRIVER(dev, "not low power mode!\n");
@@ -904,6 +923,12 @@ static void anx7625_power_standby(struct anx7625_data *ctx)
usleep_range(1000, 1100);
gpiod_set_value(ctx->pdata.gpio_p_on, 0);
usleep_range(1000, 1100);
+
+   ret = regulator_bulk_disable(ARRAY_SIZE(ctx->pdata.supplies),
+ctx->pdata.supplies);
+   if (ret < 0)
+   DRM_DEV_DEBUG_DRIVER(dev, "cannot disable supplies %d\n", ret);
+
DRM_DEV_DEBUG_DRIVER(dev, "power down\n");
 }
 
@@ -1742,6 +1767,15 @@ static int anx7625_i2c_probe(struct i2c_client *client,
platform->client = client;
i2c_set_clientdata(client, platform);
 
+   pdata->supplies[0].supply = "vdd10";
+   pdata->supplies[1].supply = "vdd18";
+   pdata->supplies[2].supply = "vdd33";
+   ret = devm_regulator_bulk_get(dev, ARRAY_SIZE(pdata->supplies),
+ pdata->supplies);
+   if (ret) {
+   DRM_DEV_ERROR(dev, "fail to get power supplies: %d\n", ret);
+   return ret;
+   }
anx7625_init_gpio(platform);
 
atomic_set(&platform->power_status, 0);
diff --git a/drivers/gpu/drm/bridge/analogix/anx7625.h 
b/drivers/gpu/drm/bridge/analogix/anx7625.h
index 193ad86c54503..e4a086b3a3d7b 100644
--- a/drivers/gpu/drm/bridge/analogix/anx7625.h
+++ b/drivers/gpu/drm/bridge/analogix/anx7625.h
@@ -350,6 +350,7 @@ struct s_edid_data {
 struct anx7625_platform_data {
struct gpio_desc *gpio_p_on;
struct gpio_desc *gpio_reset;
+   struct regulator_bulk_data supplies[3];
struct drm_bridge *panel_bridge;
int intp_irq;
u32 low_power_mode;
-- 
2.30.1.766.gb4fecdf3b7-goog

[PATCH v4 1/2] dt-bindings: drm/bridge: anx7625: Add power supplies

2021-02-23 Thread Hsin-Yi Wang

anx7625 requires 3 power supply regulators.

Signed-off-by: Hsin-Yi Wang 
Reviewed-by: Rob Herring 
Reviewed-by: Robert Foss 
---
v3->v4: rebase to drm-misc/for-linux-next
---
 .../bindings/display/bridge/analogix,anx7625.yaml | 15 +++
 1 file changed, 15 insertions(+)

diff --git 
a/Documentation/devicetree/bindings/display/bridge/analogix,anx7625.yaml 
b/Documentation/devicetree/bindings/display/bridge/analogix,anx7625.yaml
index c789784efe306..ab48ab2f4240d 100644
--- a/Documentation/devicetree/bindings/display/bridge/analogix,anx7625.yaml
+++ b/Documentation/devicetree/bindings/display/bridge/analogix,anx7625.yaml
@@ -34,6 +34,15 @@ properties:
 description: used for reset chip control, RESET_N pin B7.
 maxItems: 1
 
+  vdd10-supply:
+description: Regulator that provides the supply 1.0V power.
+
+  vdd18-supply:
+description: Regulator that provides the supply 1.8V power.
+
+  vdd33-supply:
+description: Regulator that provides the supply 3.3V power.
+
   ports:
 $ref: /schemas/graph.yaml#/properties/ports
 
@@ -55,6 +64,9 @@ properties:
 required:
   - compatible
   - reg
+  - vdd10-supply
+  - vdd18-supply
+  - vdd33-supply
   - ports
 
 additionalProperties: false
@@ -72,6 +84,9 @@ examples:
 reg = <0x58>;
 enable-gpios = < 45 GPIO_ACTIVE_HIGH>;
 reset-gpios = < 73 GPIO_ACTIVE_HIGH>;
+vdd10-supply = <_mipibrdg>;
+vdd18-supply = <_mipibrdg>;
+vdd33-supply = <_mipibrdg>;
 
 ports {
 #address-cells = <1>;
-- 
2.30.1.766.gb4fecdf3b7-goog

[v8,7/7] MAINTAINERS: Add Jianjun Wang as MediaTek PCI co-maintainer

2021-02-23 Thread Jianjun Wang

Update entry for MediaTek PCIe controller, add Jianjun Wang
as MediaTek PCI co-maintainer.

Signed-off-by: Jianjun Wang 
Acked-by: Ryder Lee 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 546aa66428c9..bef7f4017473 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13826,6 +13826,7 @@ F:  drivers/pci/controller/dwc/pcie-histb.c
 
 PCIE DRIVER FOR MEDIATEK
 M: Ryder Lee 
+M: Jianjun Wang 
 L: linux-...@vger.kernel.org
 L: linux-media...@lists.infradead.org
 S: Supported
-- 
2.25.1

[v8,3/7] PCI: mediatek-gen3: Add MediaTek Gen3 driver for MT8192

2021-02-23 Thread Jianjun Wang

MediaTek's PCIe host controller has three hardware generations. The new
generation is a standalone bridge that supports Gen3 speed and is
compatible with Gen2 and Gen1 speeds.

Add support for new Gen3 controller which can be found on MT8192.

Signed-off-by: Jianjun Wang 
Acked-by: Ryder Lee 
---
 drivers/pci/controller/Kconfig  |  13 +
 drivers/pci/controller/Makefile |   1 +
 drivers/pci/controller/pcie-mediatek-gen3.c | 457 
 3 files changed, 471 insertions(+)
 create mode 100644 drivers/pci/controller/pcie-mediatek-gen3.c

diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
index 64e2f5e379aa..b242b17025b3 100644
--- a/drivers/pci/controller/Kconfig
+++ b/drivers/pci/controller/Kconfig
@@ -242,6 +242,19 @@ config PCIE_MEDIATEK
  Say Y here if you want to enable PCIe controller support on
  MediaTek SoCs.
 
+config PCIE_MEDIATEK_GEN3
+   tristate "MediaTek Gen3 PCIe controller"
+   depends on ARCH_MEDIATEK || COMPILE_TEST
+   depends on PCI_MSI_IRQ_DOMAIN
+   help
+ Adds support for PCIe Gen3 MAC controller for MediaTek SoCs.
+ This PCIe controller is compatible with Gen3, Gen2 and Gen1 speed,
+ and supports up to 256 MSI interrupt numbers for
+ multi-function devices.
+
+ Say Y here if you want to enable Gen3 PCIe controller support on
+ MediaTek SoCs.
+
 config PCIE_TANGO_SMP8759
bool "Tango SMP8759 PCIe controller (DANGEROUS)"
depends on ARCH_TANGO && PCI_MSI && OF
diff --git a/drivers/pci/controller/Makefile b/drivers/pci/controller/Makefile
index 04c6edc285c5..df5d77d72a9d 100644
--- a/drivers/pci/controller/Makefile
+++ b/drivers/pci/controller/Makefile
@@ -27,6 +27,7 @@ obj-$(CONFIG_PCIE_ROCKCHIP) += pcie-rockchip.o
 obj-$(CONFIG_PCIE_ROCKCHIP_EP) += pcie-rockchip-ep.o
 obj-$(CONFIG_PCIE_ROCKCHIP_HOST) += pcie-rockchip-host.o
 obj-$(CONFIG_PCIE_MEDIATEK) += pcie-mediatek.o
+obj-$(CONFIG_PCIE_MEDIATEK_GEN3) += pcie-mediatek-gen3.o
 obj-$(CONFIG_PCIE_TANGO_SMP8759) += pcie-tango.o
 obj-$(CONFIG_VMD) += vmd.o
 obj-$(CONFIG_PCIE_BRCMSTB) += pcie-brcmstb.o
diff --git a/drivers/pci/controller/pcie-mediatek-gen3.c 
b/drivers/pci/controller/pcie-mediatek-gen3.c
new file mode 100644
index 000000000000..c602beb9afec
--- /dev/null
+++ b/drivers/pci/controller/pcie-mediatek-gen3.c
@@ -0,0 +1,457 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * MediaTek PCIe host controller driver.
+ *
+ * Copyright (c) 2020 MediaTek Inc.
+ * Author: Jianjun Wang 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "../pci.h"
+
+#define PCIE_SETTING_REG   0x80
+#define PCIE_PCI_IDS_1 0x9c
+#define PCI_CLASS(class)   (class << 8)
+#define PCIE_RC_MODE   BIT(0)
+
+#define PCIE_CFGNUM_REG            0x140
+#define PCIE_CFG_DEVFN(devfn)  ((devfn) & GENMASK(7, 0))
+#define PCIE_CFG_BUS(bus)  (((bus) << 8) & GENMASK(15, 8))
+#define PCIE_CFG_BYTE_EN(bytes)    (((bytes) << 16) & GENMASK(19, 16))
+#define PCIE_CFG_FORCE_BYTE_EN BIT(20)
+#define PCIE_CFG_OFFSET_ADDR   0x1000
+#define PCIE_CFG_HEADER(bus, devfn) \
+   (PCIE_CFG_BUS(bus) | PCIE_CFG_DEVFN(devfn))
+
+#define PCIE_RST_CTRL_REG  0x148
+#define PCIE_MAC_RSTB  BIT(0)
+#define PCIE_PHY_RSTB  BIT(1)
+#define PCIE_BRG_RSTB  BIT(2)
+#define PCIE_PE_RSTB   BIT(3)
+
+#define PCIE_LTSSM_STATUS_REG  0x150
+
+#define PCIE_LINK_STATUS_REG   0x154
+#define PCIE_PORT_LINKUP   BIT(8)
+
+#define PCIE_TRANS_TABLE_BASE_REG  0x800
+#define PCIE_ATR_SRC_ADDR_MSB_OFFSET   0x4
+#define PCIE_ATR_TRSL_ADDR_LSB_OFFSET  0x8
+#define PCIE_ATR_TRSL_ADDR_MSB_OFFSET  0xc
+#define PCIE_ATR_TRSL_PARAM_OFFSET 0x10
+#define PCIE_ATR_TLB_SET_OFFSET    0x20
+
+#define PCIE_MAX_TRANS_TABLES  8
+#define PCIE_ATR_EN            BIT(0)
+#define PCIE_ATR_SIZE(size) \
+   (((((size) - 1) << 1) & GENMASK(6, 1)) | PCIE_ATR_EN)
+#define PCIE_ATR_ID(id)        ((id) & GENMASK(3, 0))
+#define PCIE_ATR_TYPE_MEM  PCIE_ATR_ID(0)
+#define PCIE_ATR_TYPE_IO   PCIE_ATR_ID(1)
+#define PCIE_ATR_TLP_TYPE(type)    (((type) << 16) & GENMASK(18, 16))
+#define PCIE_ATR_TLP_TYPE_MEM  PCIE_ATR_TLP_TYPE(0)
+#define PCIE_ATR_TLP_TYPE_IO   PCIE_ATR_TLP_TYPE(2)
+
+/**
+ * struct mtk_pcie_port - PCIe port information
+ * @dev: pointer to PCIe device
+ * @base: IO mapped register base
+ * @reg_base: Physical register base
+ * @mac_reset: mac reset control
+ * @phy_reset: phy reset control
+ * @phy: PHY controller block
+ * @clks: PCIe clocks
+ * @num_clks: PCIe clocks count for this port
+ */
+struct mtk_pcie_port {
+   struct device *dev;
+   void __iomem *base;
+

[v8,5/7] PCI: mediatek-gen3: Add MSI support

2021-02-23 Thread Jianjun Wang

Add MSI support for MediaTek Gen3 PCIe controller.

This PCIe controller supports up to 256 MSI vectors, the MSI hardware
block diagram is as follows:

  +-+
  | GIC |
  +-+
 ^
 |
 port->irq
 |
 +-+-+-+-+-+-+-+-+
 |0|1|2|3|4|5|6|7| (PCIe intc)
 +-+-+-+-+-+-+-+-+
  ^ ^   ^
  | |...|
  +---+ +--++---+
  |||
+-+-+---+--+--+  +-+-+---+--+--+  +-+-+---+--+--+
|0|1|...|30|31|  |0|1|...|30|31|  |0|1|...|30|31| (MSI sets)
+-+-+---+--+--+  +-+-+---+--+--+  +-+-+---+--+--+
 ^ ^  ^  ^^ ^  ^  ^^ ^  ^  ^
 | |  |  || |  |  || |  |  |  (MSI vectors)
 | |  |  || |  |  || |  |  |

  (MSI SET0)   (MSI SET1)  ...   (MSI SET7)

The 256 MSI vectors are organized into 8 sets. Each set has its own
MSI message address and supports 32 MSI vectors for generating
interrupts.
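
As a rough illustration of the layout described above, the following user-space sketch splits a flat MSI vector number into its set and its bit within that set. The constants are the ones the patch defines; the `msi_location` type and `msi_locate()` helper are hypothetical names used only here, and the divide/modulo split is an assumption about how a flat hwirq number would be mapped onto the sets.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the register definitions in this patch. */
#define PCIE_MSI_SET_NUM        8
#define PCIE_MSI_IRQS_PER_SET   32
#define PCIE_MSI_IRQS_NUM       (PCIE_MSI_IRQS_PER_SET * PCIE_MSI_SET_NUM)
#define PCIE_MSI_SET_BASE_REG   0xc00
#define PCIE_MSI_SET_OFFSET     0x10

/* Hypothetical helper type: where one MSI vector lives. */
struct msi_location {
	unsigned int set;   /* which of the 8 sets (0..7) */
	unsigned int bit;   /* bit inside that set (0..31) */
	uint32_t set_reg;   /* register offset of that set's block */
};

/* Assumed mapping: consecutive vectors fill set 0 first, then set 1, ... */
static struct msi_location msi_locate(unsigned int hwirq)
{
	struct msi_location loc;

	assert(hwirq < PCIE_MSI_IRQS_NUM);
	loc.set = hwirq / PCIE_MSI_IRQS_PER_SET;
	loc.bit = hwirq % PCIE_MSI_IRQS_PER_SET;
	loc.set_reg = PCIE_MSI_SET_BASE_REG + loc.set * PCIE_MSI_SET_OFFSET;
	return loc;
}
```

Under these assumptions, vector 33 lands in set 1, bit 1, and the last set's register block starts at offset 0xc70.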

Signed-off-by: Jianjun Wang 
Acked-by: Ryder Lee 
---
 drivers/pci/controller/pcie-mediatek-gen3.c | 277 
 1 file changed, 277 insertions(+)

diff --git a/drivers/pci/controller/pcie-mediatek-gen3.c 
b/drivers/pci/controller/pcie-mediatek-gen3.c
index 8b3b5f838b69..3cbec22ece0c 100644
--- a/drivers/pci/controller/pcie-mediatek-gen3.c
+++ b/drivers/pci/controller/pcie-mediatek-gen3.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -48,12 +49,29 @@
 #define PCIE_LINK_STATUS_REG   0x154
 #define PCIE_PORT_LINKUP   BIT(8)
 
+#define PCIE_MSI_SET_NUM   8
+#define PCIE_MSI_IRQS_PER_SET  32
+#define PCIE_MSI_IRQS_NUM \
+   (PCIE_MSI_IRQS_PER_SET * PCIE_MSI_SET_NUM)
+
 #define PCIE_INT_ENABLE_REG        0x180
+#define PCIE_MSI_ENABLE            GENMASK(PCIE_MSI_SET_NUM + 8 - 1, 8)
+#define PCIE_MSI_SHIFT 8
 #define PCIE_INTX_SHIFT            24
 #define PCIE_INTX_ENABLE \
GENMASK(PCIE_INTX_SHIFT + PCI_NUM_INTX - 1, PCIE_INTX_SHIFT)
 
 #define PCIE_INT_STATUS_REG        0x184
+#define PCIE_MSI_SET_ENABLE_REG    0x190
+#define PCIE_MSI_SET_ENABLE        GENMASK(PCIE_MSI_SET_NUM - 1, 0)
+
+#define PCIE_MSI_SET_BASE_REG  0xc00
+#define PCIE_MSI_SET_OFFSET        0x10
+#define PCIE_MSI_SET_STATUS_OFFSET 0x04
+#define PCIE_MSI_SET_ENABLE_OFFSET 0x08
+
+#define PCIE_MSI_SET_ADDR_HI_BASE  0xc80
+#define PCIE_MSI_SET_ADDR_HI_OFFSET    0x04
 
 #define PCIE_TRANS_TABLE_BASE_REG  0x800
 #define PCIE_ATR_SRC_ADDR_MSB_OFFSET   0x4
@@ -73,6 +91,16 @@
 #define PCIE_ATR_TLP_TYPE_MEM  PCIE_ATR_TLP_TYPE(0)
 #define PCIE_ATR_TLP_TYPE_IO   PCIE_ATR_TLP_TYPE(2)
 
+/**
+ * struct mtk_msi_set - MSI information for each set
+ * @base: IO mapped register base
+ * @msg_addr: MSI message address
+ */
+struct mtk_msi_set {
+   void __iomem *base;
+   phys_addr_t msg_addr;
+};
+
 /**
  * struct mtk_pcie_port - PCIe port information
  * @dev: pointer to PCIe device
@@ -86,6 +114,11 @@
  * @irq: PCIe controller interrupt number
  * @irq_lock: lock protecting IRQ register access
  * @intx_domain: legacy INTx IRQ domain
+ * @msi_domain: MSI IRQ domain
+ * @msi_bottom_domain: MSI IRQ bottom domain
+ * @msi_sets: MSI sets information
+ * @lock: lock protecting IRQ bit map
+ * @msi_irq_in_use: bit map for assigned MSI IRQ
  */
 struct mtk_pcie_port {
struct device *dev;
@@ -100,6 +133,11 @@ struct mtk_pcie_port {
int irq;
raw_spinlock_t irq_lock;
struct irq_domain *intx_domain;
+   struct irq_domain *msi_domain;
+   struct irq_domain *msi_bottom_domain;
+   struct mtk_msi_set msi_sets[PCIE_MSI_SET_NUM];
+   struct mutex lock;
+   DECLARE_BITMAP(msi_irq_in_use, PCIE_MSI_IRQS_NUM);
 };
 
 /**
@@ -197,6 +235,35 @@ static int mtk_pcie_set_trans_table(struct mtk_pcie_port 
*port,
return 0;
 }
 
+static void mtk_pcie_enable_msi(struct mtk_pcie_port *port)
+{
+   int i;
+   u32 val;
+
+   val = readl_relaxed(port->base + PCIE_MSI_SET_ENABLE_REG);
+   val |= PCIE_MSI_SET_ENABLE;
+   writel_relaxed(val, port->base + PCIE_MSI_SET_ENABLE_REG);
+
+   val = readl_relaxed(port->base + PCIE_INT_ENABLE_REG);
+   val |= PCIE_MSI_ENABLE;
+   writel_relaxed(val, port->base + PCIE_INT_ENABLE_REG);
+
+   for (i = 0; i < PCIE_MSI_SET_NUM; i++) {
+   struct mtk_msi_set *msi_set = &port->msi_sets[i];
+
+   msi_set->base = port->base + PCIE_MSI_SET_BASE_REG +
+   i * PCIE_MSI_SET_OFFSET;
+   msi_set->msg_addr = port->reg_base + PCIE_MSI_SET_BASE_REG +
+   i * PCIE_MSI_SET_OFFSET;
+
+   /* Configure the MSI capture address */
+

[v8,4/7] PCI: mediatek-gen3: Add INTx support

2021-02-23 Thread Jianjun Wang

Add INTx support for MediaTek Gen3 PCIe controller.
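
For orientation, the INTx mask and unmask callbacks in this patch come down to clearing or setting one bit in the interrupt enable register. The sketch below models only that bit arithmetic in user space. `BIT()` and `GENMASK()` are simplified 32-bit stand-ins for the kernel macros, and the `intx_*` helpers are hypothetical names that operate on a plain value instead of an MMIO register.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified 32-bit stand-ins for the kernel's BIT()/GENMASK(). */
#define BIT(n)          (1u << (n))
#define GENMASK(h, l)   ((~0u << (l)) & (~0u >> (31 - (h))))

#define PCI_NUM_INTX      4
#define PCIE_INTX_SHIFT   24
#define PCIE_INTX_ENABLE \
	GENMASK(PCIE_INTX_SHIFT + PCI_NUM_INTX - 1, PCIE_INTX_SHIFT)

/* Clear the enable bit of one INTx line (hwirq 0..3 = INTA..INTD). */
static uint32_t intx_mask(uint32_t enable, unsigned int hwirq)
{
	assert(hwirq < PCI_NUM_INTX);
	return enable & ~BIT(hwirq + PCIE_INTX_SHIFT);
}

/* Set the enable bit of one INTx line. */
static uint32_t intx_unmask(uint32_t enable, unsigned int hwirq)
{
	assert(hwirq < PCI_NUM_INTX);
	return enable | BIT(hwirq + PCIE_INTX_SHIFT);
}

/* Clear all four INTx enable bits, as done when the port starts up. */
static uint32_t intx_mask_all(uint32_t enable)
{
	return enable & ~PCIE_INTX_ENABLE;
}
```

In the driver itself the read-modify-write happens under `irq_lock`, which this sketch leaves out.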

Signed-off-by: Jianjun Wang 
Acked-by: Ryder Lee 
---
 drivers/pci/controller/pcie-mediatek-gen3.c | 176 
 1 file changed, 176 insertions(+)

diff --git a/drivers/pci/controller/pcie-mediatek-gen3.c 
b/drivers/pci/controller/pcie-mediatek-gen3.c
index c602beb9afec..8b3b5f838b69 100644
--- a/drivers/pci/controller/pcie-mediatek-gen3.c
+++ b/drivers/pci/controller/pcie-mediatek-gen3.c
@@ -9,6 +9,9 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -45,6 +48,13 @@
 #define PCIE_LINK_STATUS_REG   0x154
 #define PCIE_PORT_LINKUP   BIT(8)
 
+#define PCIE_INT_ENABLE_REG        0x180
+#define PCIE_INTX_SHIFT            24
+#define PCIE_INTX_ENABLE \
+   GENMASK(PCIE_INTX_SHIFT + PCI_NUM_INTX - 1, PCIE_INTX_SHIFT)
+
+#define PCIE_INT_STATUS_REG        0x184
+
 #define PCIE_TRANS_TABLE_BASE_REG  0x800
 #define PCIE_ATR_SRC_ADDR_MSB_OFFSET   0x4
 #define PCIE_ATR_TRSL_ADDR_LSB_OFFSET  0x8
@@ -73,6 +83,9 @@
  * @phy: PHY controller block
  * @clks: PCIe clocks
  * @num_clks: PCIe clocks count for this port
+ * @irq: PCIe controller interrupt number
+ * @irq_lock: lock protecting IRQ register access
+ * @intx_domain: legacy INTx IRQ domain
  */
 struct mtk_pcie_port {
struct device *dev;
@@ -83,6 +96,10 @@ struct mtk_pcie_port {
struct phy *phy;
struct clk_bulk_data *clks;
int num_clks;
+
+   int irq;
+   raw_spinlock_t irq_lock;
+   struct irq_domain *intx_domain;
 };
 
 /**
@@ -199,6 +216,11 @@ static int mtk_pcie_startup_port(struct mtk_pcie_port 
*port)
val |= PCI_CLASS(PCI_CLASS_BRIDGE_PCI << 8);
writel_relaxed(val, port->base + PCIE_PCI_IDS_1);
 
+   /* Mask all INTx interrupts */
+   val = readl_relaxed(port->base + PCIE_INT_ENABLE_REG);
+   val &= ~PCIE_INTX_ENABLE;
+   writel_relaxed(val, port->base + PCIE_INT_ENABLE_REG);
+
/* Assert all reset signals */
val = readl_relaxed(port->base + PCIE_RST_CTRL_REG);
val |= PCIE_MAC_RSTB | PCIE_PHY_RSTB | PCIE_BRG_RSTB | PCIE_PE_RSTB;
@@ -262,6 +284,154 @@ static int mtk_pcie_startup_port(struct mtk_pcie_port 
*port)
return 0;
 }
 
+static int mtk_pcie_set_affinity(struct irq_data *data,
+const struct cpumask *mask, bool force)
+{
+   return -EINVAL;
+}
+
+static void mtk_intx_mask(struct irq_data *data)
+{
+   struct mtk_pcie_port *port = irq_data_get_irq_chip_data(data);
+   unsigned long flags;
+   u32 val;
+
+   raw_spin_lock_irqsave(&port->irq_lock, flags);
+   val = readl_relaxed(port->base + PCIE_INT_ENABLE_REG);
+   val &= ~BIT(data->hwirq + PCIE_INTX_SHIFT);
+   writel_relaxed(val, port->base + PCIE_INT_ENABLE_REG);
+   raw_spin_unlock_irqrestore(&port->irq_lock, flags);
+}
+
+static void mtk_intx_unmask(struct irq_data *data)
+{
+   struct mtk_pcie_port *port = irq_data_get_irq_chip_data(data);
+   unsigned long flags;
+   u32 val;
+
+   raw_spin_lock_irqsave(&port->irq_lock, flags);
+   val = readl_relaxed(port->base + PCIE_INT_ENABLE_REG);
+   val |= BIT(data->hwirq + PCIE_INTX_SHIFT);
+   writel_relaxed(val, port->base + PCIE_INT_ENABLE_REG);
+   raw_spin_unlock_irqrestore(&port->irq_lock, flags);
+}
+
+/**
+ * mtk_intx_eoi
+ * @data: pointer to chip specific data
+ *
+ * As an emulated level IRQ, its interrupt status will remain
+ * until the corresponding de-assert message is received; hence that
+ * the status can only be cleared when the interrupt has been serviced.
+ */
+static void mtk_intx_eoi(struct irq_data *data)
+{
+   struct mtk_pcie_port *port = irq_data_get_irq_chip_data(data);
+   unsigned long hwirq;
+
+   hwirq = data->hwirq + PCIE_INTX_SHIFT;
+   writel_relaxed(BIT(hwirq), port->base + PCIE_INT_STATUS_REG);
+}
+
+static struct irq_chip mtk_intx_irq_chip = {
+   .irq_enable = mtk_intx_unmask,
+   .irq_disable= mtk_intx_mask,
+   .irq_mask   = mtk_intx_mask,
+   .irq_unmask = mtk_intx_unmask,
+   .irq_eoi= mtk_intx_eoi,
+   .irq_set_affinity   = mtk_pcie_set_affinity,
+   .name   = "INTx",
+};
+
+static int mtk_pcie_intx_map(struct irq_domain *domain, unsigned int irq,
+irq_hw_number_t hwirq)
+{
+   irq_set_chip_data(irq, domain->host_data);
+   irq_set_chip_and_handler_name(irq, &mtk_intx_irq_chip,
+ handle_fasteoi_irq, "INTx");
+   return 0;
+}
+
+static const struct irq_domain_ops intx_domain_ops = {
+   .map = mtk_pcie_intx_map,
+};
+
+static int mtk_pcie_init_irq_domains(struct mtk_pcie_port *port)
+{
+   struct device *dev = port->dev;
+   struct device_node *intc_node, *node = dev->of_node;
+
+   raw_spin_lock_init(&port->irq_lock);
+
+   /* Setup INTx */
+

[v8,1/7] dt-bindings: PCI: mediatek-gen3: Add YAML schema

2021-02-23 Thread Jianjun Wang

Add YAML schemas documentation for Gen3 PCIe controller on
MediaTek SoCs.

Signed-off-by: Jianjun Wang 
Acked-by: Ryder Lee 
---
 .../bindings/pci/mediatek-pcie-gen3.yaml  | 181 ++
 1 file changed, 181 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml

diff --git a/Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml 
b/Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml
new file mode 100644
index ..e7b1f9892da4
--- /dev/null
+++ b/Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml
@@ -0,0 +1,181 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/pci/mediatek-pcie-gen3.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Gen3 PCIe controller on MediaTek SoCs
+
+maintainers:
+  - Jianjun Wang 
+
+description: |+
+  PCIe Gen3 MAC controller for MediaTek SoCs, it supports Gen3 speed
+  and compatible with Gen2, Gen1 speed.
+
+  This PCIe controller supports up to 256 MSI vectors, the MSI hardware
+  block diagram is as follows:
+
++-+
+| GIC |
++-+
+   ^
+   |
+   port->irq
+   |
+   +-+-+-+-+-+-+-+-+
+   |0|1|2|3|4|5|6|7| (PCIe intc)
+   +-+-+-+-+-+-+-+-+
+^ ^   ^
+| |...|
++---+ +--++---+
+|||
+  +-+-+---+--+--+  +-+-+---+--+--+  +-+-+---+--+--+
+  |0|1|...|30|31|  |0|1|...|30|31|  |0|1|...|30|31| (MSI sets)
+  +-+-+---+--+--+  +-+-+---+--+--+  +-+-+---+--+--+
+   ^ ^  ^  ^^ ^  ^  ^^ ^  ^  ^
+   | |  |  || |  |  || |  |  |  (MSI vectors)
+   | |  |  || |  |  || |  |  |
+
+(MSI SET0)   (MSI SET1)  ...   (MSI SET7)
+
+  With 256 MSI vectors supported, the MSI vectors are composed of 8 sets,
+  each set has its own address for MSI message, and supports 32 MSI vectors
+  to generate interrupt.
+
+allOf:
+  - $ref: /schemas/pci/pci-bus.yaml#
+
+properties:
+  compatible:
+const: mediatek,mt8192-pcie
+
+  reg:
+maxItems: 1
+
+  reg-names:
+items:
+  - const: pcie-mac
+
+  interrupts:
+maxItems: 1
+
+  ranges:
+minItems: 1
+maxItems: 8
+
+  resets:
+minItems: 1
+maxItems: 2
+
+  reset-names:
+minItems: 1
+maxItems: 2
+items:
+  - const: phy
+  - const: mac
+
+  clocks:
+maxItems: 6
+
+  clock-names:
+items:
+  - const: pl_250m
+  - const: tl_26m
+  - const: tl_96m
+  - const: tl_32k
+  - const: peri_26m
+  - const: top_133m
+
+  assigned-clocks:
+maxItems: 1
+
+  assigned-clock-parents:
+maxItems: 1
+
+  phys:
+maxItems: 1
+
+  '#interrupt-cells':
+const: 1
+
+  interrupt-controller:
+description: Interrupt controller node for handling legacy PCI interrupts.
+type: object
+properties:
+  '#address-cells':
+const: 0
+  '#interrupt-cells':
+const: 1
+  interrupt-controller: true
+
+required:
+  - '#address-cells'
+  - '#interrupt-cells'
+  - interrupt-controller
+
+additionalProperties: false
+
+required:
+  - compatible
+  - reg
+  - reg-names
+  - interrupts
+  - ranges
+  - clocks
+  - '#interrupt-cells'
+  - interrupt-controller
+
+unevaluatedProperties: false
+
+examples:
+  - |
+#include 
+#include 
+
+bus {
+#address-cells = <2>;
+#size-cells = <2>;
+
+pcie: pcie@1123 {
+compatible = "mediatek,mt8192-pcie";
+device_type = "pci";
+#address-cells = <3>;
+#size-cells = <2>;
+reg = <0x00 0x1123 0x00 0x4000>;
+reg-names = "pcie-mac";
+interrupts = ;
+bus-range = <0x00 0xff>;
+ranges = <0x8200 0x00 0x1200 0x00
+  0x1200 0x00 0x100>;
+clocks = < 44>,
+ < 40>,
+ < 43>,
+ < 97>,
+ < 99>,
+ < 111>;
+clock-names = "pl_250m", "tl_26m", "tl_96m",
+  "tl_32k", "peri_26m", "top_133m";
+assigned-clocks = < 50>;
+assigned-clock-parents = < 91>;
+
+phys = <>;
+phy-names = "pcie-phy";
+
+resets = <_rst 2>,
+ <_rst 3>;
+reset-names = "phy", "mac";
+
+#interrupt-cells = <1>;
+interrupt-map-mask = <0 0 0 0x7>;
+interrupt-map = <0 0 0 1 _intc 0>,
+<0 0 0 2 _intc 1>,
+<0 0 0 3 _intc 2>,
+<0 0 0 4 _intc 3>;
+pcie_intc: interrupt-controller {
+  #address-cells =

[v8,2/7] PCI: Export pci_pio_to_address() for module use

2021-02-23 Thread Jianjun Wang

This interface will be used by PCI host drivers for PIO translation,
export it to support compiling those drivers as kernel modules.

Signed-off-by: Jianjun Wang 
---
 drivers/pci/pci.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b9fecc25d213..7ce72a82bec5 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4047,6 +4047,7 @@ phys_addr_t pci_pio_to_address(unsigned long pio)
 
return address;
 }
+EXPORT_SYMBOL(pci_pio_to_address);
 
 unsigned long __weak pci_address_to_pio(phys_addr_t address)
 {
-- 
2.25.1

[v8,6/7] PCI: mediatek-gen3: Add system PM support

2021-02-23 Thread Jianjun Wang

Add suspend_noirq and resume_noirq callback functions to implement
PM system suspend hooks for MediaTek Gen3 PCIe controller.

On system suspend, put the PCIe link into the L2 state and pull down
the PERST# pin, then gate the MAC layer clocks and power off the
physical layer to save power.

On system resume, re-establish the PCIe link and restore the related
control register values.
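
The L2 check during suspend can be pictured with the user-space sketch below. It reuses the LTSSM field definitions from this patch, but the polling loop is a simplified stand-in for `readl_poll_timeout()`: it counts polls instead of microseconds. `struct fake_link`, `fake_read_ltssm()` and `wait_for_l2()` are hypothetical names, and the non-L2 state value 0x10 is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified 32-bit stand-in for the kernel's GENMASK(). */
#define GENMASK(h, l)   ((~0u << (l)) & (~0u >> (31 - (h))))

#define PCIE_LTSSM_STATE_MASK     GENMASK(28, 24)
#define PCIE_LTSSM_STATE(val)     (((val) & PCIE_LTSSM_STATE_MASK) >> 24)
#define PCIE_LTSSM_STATE_L2_IDLE  0x14

/* Hypothetical fake link that reaches L2 after a number of reads. */
struct fake_link {
	unsigned int polls_until_l2;
};

static uint32_t fake_read_ltssm(void *ctx)
{
	struct fake_link *link = ctx;

	if (link->polls_until_l2 > 0) {
		link->polls_until_l2--;
		return 0x10u << 24;	/* arbitrary non-L2 state */
	}
	return (uint32_t)PCIE_LTSSM_STATE_L2_IDLE << 24;
}

/* Returns 0 once the link reports L2 idle, -1 if it never does. */
static int wait_for_l2(uint32_t (*read_status)(void *), void *ctx,
		       unsigned int max_polls)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		if (PCIE_LTSSM_STATE(read_status(ctx)) ==
		    PCIE_LTSSM_STATE_L2_IDLE)
			return 0;
	}
	return -1;
}
```

The driver returns the poll helper's error code to the PM core on timeout, which aborts the suspend; the sketch models that with a plain -1.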

Signed-off-by: Jianjun Wang 
Acked-by: Ryder Lee 
---
 drivers/pci/controller/pcie-mediatek-gen3.c | 84 +
 1 file changed, 84 insertions(+)

diff --git a/drivers/pci/controller/pcie-mediatek-gen3.c 
b/drivers/pci/controller/pcie-mediatek-gen3.c
index fde9de591888..fd13540d37fe 100644
--- a/drivers/pci/controller/pcie-mediatek-gen3.c
+++ b/drivers/pci/controller/pcie-mediatek-gen3.c
@@ -45,6 +45,9 @@
 #define PCIE_PE_RSTB   BIT(3)
 
 #define PCIE_LTSSM_STATUS_REG  0x150
+#define PCIE_LTSSM_STATE_MASK  GENMASK(28, 24)
+#define PCIE_LTSSM_STATE(val)  ((val & PCIE_LTSSM_STATE_MASK) >> 24)
+#define PCIE_LTSSM_STATE_L2_IDLE   0x14
 
 #define PCIE_LINK_STATUS_REG   0x154
 #define PCIE_PORT_LINKUP   BIT(8)
@@ -73,6 +76,9 @@
 #define PCIE_MSI_SET_ADDR_HI_BASE  0xc80
 #define PCIE_MSI_SET_ADDR_HI_OFFSET0x04
 
+#define PCIE_ICMD_PM_REG   0x198
+#define PCIE_TURN_OFF_LINK BIT(4)
+
 #define PCIE_TRANS_TABLE_BASE_REG  0x800
 #define PCIE_ATR_SRC_ADDR_MSB_OFFSET   0x4
 #define PCIE_ATR_TRSL_ADDR_LSB_OFFSET  0x8
@@ -892,6 +898,83 @@ static int mtk_pcie_remove(struct platform_device *pdev)
return 0;
 }
 
+static int __maybe_unused mtk_pcie_turn_off_link(struct mtk_pcie_port *port)
+{
+   u32 val;
+
+   val = readl_relaxed(port->base + PCIE_ICMD_PM_REG);
+   val |= PCIE_TURN_OFF_LINK;
+   writel_relaxed(val, port->base + PCIE_ICMD_PM_REG);
+
+   /* Check the link is L2 */
+   return readl_poll_timeout(port->base + PCIE_LTSSM_STATUS_REG, val,
+ (PCIE_LTSSM_STATE(val) ==
+  PCIE_LTSSM_STATE_L2_IDLE), 20,
+  50 * USEC_PER_MSEC);
+}
+
+static int __maybe_unused mtk_pcie_suspend_noirq(struct device *dev)
+{
+   struct mtk_pcie_port *port = dev_get_drvdata(dev);
+   int err;
+   u32 val;
+
+   /* Trigger link to L2 state */
+   err = mtk_pcie_turn_off_link(port);
+   if (err) {
+   dev_err(port->dev, "can not enter L2 state\n");
+   return err;
+   }
+
+   /* Pull down the PERST# pin */
+   val = readl_relaxed(port->base + PCIE_RST_CTRL_REG);
+   val |= PCIE_PE_RSTB;
+   writel_relaxed(val, port->base + PCIE_RST_CTRL_REG);
+
+   dev_dbg(port->dev, "entered L2 state successfully\n");
+
+   clk_bulk_disable_unprepare(port->num_clks, port->clks);
+
+   reset_control_assert(port->mac_reset);
+
+   phy_power_off(port->phy);
+   reset_control_assert(port->phy_reset);
+
+   return 0;
+}
+
+static int __maybe_unused mtk_pcie_resume_noirq(struct device *dev)
+{
+   struct mtk_pcie_port *port = dev_get_drvdata(dev);
+   int err;
+
+   reset_control_deassert(port->phy_reset);
+   phy_power_on(port->phy);
+
+   reset_control_deassert(port->mac_reset);
+
+   err = clk_bulk_prepare_enable(port->num_clks, port->clks);
+   if (err) {
+   dev_dbg(dev, "failed to enable PCIe clocks\n");
+   return err;
+   }
+
+   err = mtk_pcie_startup_port(port);
+   if (err) {
+   dev_err(port->dev, "resume failed\n");
+   return err;
+   }
+
+   dev_dbg(port->dev, "resume done\n");
+
+   return 0;
+}
+
+static const struct dev_pm_ops mtk_pcie_pm_ops = {
+   SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(mtk_pcie_suspend_noirq,
+ mtk_pcie_resume_noirq)
+};
+
 static const struct of_device_id mtk_pcie_of_match[] = {
{ .compatible = "mediatek,mt8192-pcie" },
{},
@@ -903,6 +986,7 @@ static struct platform_driver mtk_pcie_driver = {
.driver = {
.name = "mtk-pcie",
.of_match_table = mtk_pcie_of_match,
+   .pm = &mtk_pcie_pm_ops,
},
 };
 
-- 
2.25.1

[v8,0/7] PCI: mediatek: Add new generation controller support

2021-02-23 Thread Jianjun Wang

This patch series adds pcie-mediatek-gen3.c and a dt-bindings file to
support the new generation PCIe controller.

Changes in v8:
1. Add irq_lock to protect IRQ register access;
2. Mask all INTx interrupts when starting up the port;
3. Remove activate/deactivate callbacks from bottom_domain_ops;
4. Add unmask/mask callbacks in mtk_msi_bottom_irq_chip;
5. Add property information for reg-names.

Changes in v7:
1. Split the driver patch to core PCIe, INTx, MSI and PM patches;
2. Reshape MSI init and handle flow, use msi_bottom_domain to cover all sets;
3. Replace readl/writel with their relaxed version;
4. Add MSI description in binding document;
5. Add pl_250m clock in binding document.

Changes in v6:
1. Export pci_pio_to_address() to support compiling as kernel module;
2. Replace usleep_range(100 * 1000, 120 * 1000) with msleep(100);
3. Replace dev_notice with dev_err;
4. Fix MSI get hwirq flow;
5. Fix warning for possible recursive locking in mtk_pcie_set_affinity.

Changes in v5:
1. Remove unused macros
2. Modify the config read/write callbacks, set the config byte field
   in TLP header and use pci_generic_config_read32/write32
   to access the config space
3. Fix the settings of the translation window so that both MEM and IO
   regions work properly
4. Fix typos

Changes in v4:
1. Fix PCIe power up/down flow
2. Use "mac" and "phy" for reset names
3. Add clock names
4. Fix the variables type

Changes in v3:
1. Remove standard property in binding document
2. Return error number when get_optional* API throws an error
3. Use the bulk clk APIs

Changes in v2:
1. Fix the typo of dt-bindings patch
2. Remove the unnecessary properties in binding document
3. Dispose of the irq mappings of the msi top domain on irq teardown

Jianjun Wang (7):
  dt-bindings: PCI: mediatek-gen3: Add YAML schema
  PCI: Export pci_pio_to_address() for module use
  PCI: mediatek-gen3: Add MediaTek Gen3 driver for MT8192
  PCI: mediatek-gen3: Add INTx support
  PCI: mediatek-gen3: Add MSI support
  PCI: mediatek-gen3: Add system PM support
  MAINTAINERS: Add Jianjun Wang as MediaTek PCI co-maintainer

 .../bindings/pci/mediatek-pcie-gen3.yaml  | 181 
 MAINTAINERS   |   1 +
 drivers/pci/controller/Kconfig|  13 +
 drivers/pci/controller/Makefile   |   1 +
 drivers/pci/controller/pcie-mediatek-gen3.c   | 994 ++
 drivers/pci/pci.c |   1 +
 6 files changed, 1191 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml
 create mode 100644 drivers/pci/controller/pcie-mediatek-gen3.c

-- 
2.25.1

[RFC net-next] net: dsa: rtl8366rb: support bridge offloading

2021-02-23 Thread DENG Qingfang

Use port isolation registers to configure bridge offloading.
Remove the VLAN init, as we have proper CPU tag and bridge offloading
support now.
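
To make the intended register effect concrete, here is a small user-space model of the port isolation masks. It is only a sketch of what the join and leave operations are meant to achieve, not the driver code: the registers are a plain array, `struct fake_switch` and the `sw_*` helpers are hypothetical, and the CPU port number of 5 is an assumption. The bit layout (bit 0 enables isolation, bits 7..1 hold the mask of permitted ports) follows the definitions in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified 32-bit stand-in for the kernel's BIT(). */
#define BIT(n)  (1u << (n))

#define PORT_NUM_CPU  5          /* assumption: user ports are 0..4 */
#define PORT_ISO_EN   BIT(0)     /* isolation enable */

/* Hypothetical model: one isolation register per user port. */
struct fake_switch {
	uint32_t iso[PORT_NUM_CPU];
	int bridge[PORT_NUM_CPU];    /* bridge id, -1 when standalone */
};

/* Start with every user port allowed to reach only the CPU port. */
static void sw_init(struct fake_switch *sw)
{
	int i;

	for (i = 0; i < PORT_NUM_CPU; i++) {
		sw->iso[i] = PORT_ISO_EN | BIT(PORT_NUM_CPU + 1);
		sw->bridge[i] = -1;
	}
}

/* Let 'port' and every current member of 'bridge' reach each other. */
static void sw_join(struct fake_switch *sw, int port, int bridge)
{
	uint32_t peers = 0;
	int i;

	for (i = 0; i < PORT_NUM_CPU; i++) {
		if (i == port || sw->bridge[i] != bridge)
			continue;
		sw->iso[i] |= BIT(port + 1);
		peers |= BIT(i);
	}
	sw->iso[port] |= peers << 1;
	sw->bridge[port] = bridge;
}

/* Undo sw_join(): remove 'port' from its bridge peers' masks. */
static void sw_leave(struct fake_switch *sw, int port)
{
	int bridge = sw->bridge[port];
	uint32_t peers = 0;
	int i;

	for (i = 0; i < PORT_NUM_CPU; i++) {
		if (i == port || sw->bridge[i] != bridge)
			continue;
		sw->iso[i] &= ~BIT(port + 1);
		peers |= BIT(i);
	}
	sw->iso[port] &= ~(peers << 1);
	sw->bridge[port] = -1;
}
```

The model is single-threaded, so it says nothing about concurrent join and leave calls.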

Signed-off-by: DENG Qingfang 
---
This is not tested, as I don't have an RTL8366RB board. I also think there
is a potential race condition in port_bridge_{join,leave}.

 drivers/net/dsa/rtl8366rb.c | 73 ++---
 1 file changed, 67 insertions(+), 6 deletions(-)

diff --git a/drivers/net/dsa/rtl8366rb.c b/drivers/net/dsa/rtl8366rb.c
index a89093bc6c6a..9f6e2b361216 100644
--- a/drivers/net/dsa/rtl8366rb.c
+++ b/drivers/net/dsa/rtl8366rb.c
@@ -300,6 +300,12 @@
 #define RTL8366RB_INTERRUPT_STATUS_REG 0x0442
 #define RTL8366RB_NUM_INTERRUPT    14 /* 0..13 */
 
+/* Port isolation registers */
+#define RTL8366RB_PORT_ISO_BASE    0x0F08
+#define RTL8366RB_PORT_ISO(pnum)   (RTL8366RB_PORT_ISO_BASE + (pnum))
+#define RTL8366RB_PORT_ISO_EN  BIT(0)
+#define RTL8366RB_PORT_ISO_PORTS_MASK  GENMASK(7, 1)
+
 /* bits 0..5 enable force when cleared */
 #define RTL8366RB_MAC_FORCE_CTRL_REG   0x0F11
 
@@ -835,6 +841,15 @@ static int rtl8366rb_setup(struct dsa_switch *ds)
if (ret)
return ret;
 
+   /* Isolate user ports */
+   for (i = 0; i < RTL8366RB_PORT_NUM_CPU; i++) {
+   ret = regmap_write(smi->map, RTL8366RB_PORT_ISO(i),
+  RTL8366RB_PORT_ISO_EN |
+  BIT(RTL8366RB_PORT_NUM_CPU + 1));
+   if (ret)
+   return ret;
+   }
+
/* Set up the "green ethernet" feature */
ret = rtl8366rb_jam_table(rtl8366rb_green_jam,
  ARRAY_SIZE(rtl8366rb_green_jam), smi, false);
@@ -963,10 +978,6 @@ static int rtl8366rb_setup(struct dsa_switch *ds)
return ret;
}
 
-   ret = rtl8366_init_vlan(smi);
-   if (ret)
-   return ret;
-
ret = rtl8366rb_setup_cascaded_irq(smi);
if (ret)
dev_info(smi->dev, "no interrupt support\n");
@@ -977,8 +988,6 @@ static int rtl8366rb_setup(struct dsa_switch *ds)
return -ENODEV;
}
 
-   ds->configure_vlan_while_not_filtering = false;
-
return 0;
 }
 
@@ -1127,6 +1136,56 @@ rtl8366rb_port_disable(struct dsa_switch *ds, int port)
rb8366rb_set_port_led(smi, port, false);
 }
 
+static int
+rtl8366rb_port_bridge_join(struct dsa_switch *ds, int port,
+  struct net_device *bridge)
+{
+   struct realtek_smi *smi = ds->priv;
+   unsigned int port_bitmap = 0;
+   int ret, i;
+
+   for (i = 0; i < RTL8366RB_PORT_NUM_CPU; i++) {
+   if (i == port)
+   continue;
+   if (dsa_to_port(ds, i)->bridge_dev != bridge)
+   continue;
+   ret = regmap_update_bits(smi->map, RTL8366RB_PORT_ISO(i),
+BIT(port + 1), BIT(port + 1));
+   if (ret)
+   return ret;
+
+   port_bitmap |= BIT(i);
+   }
+
+   return regmap_update_bits(smi->map, RTL8366RB_PORT_ISO(port),
+ port_bitmap << 1, port_bitmap << 1);
+}
+
+static int
+rtl8366rb_port_bridge_leave(struct dsa_switch *ds, int port,
+   struct net_device *bridge)
+{
+   struct realtek_smi *smi = ds->priv;
+   unsigned int port_bitmap = 0;
+   int ret, i;
+
+   for (i = 0; i < RTL8366RB_PORT_NUM_CPU; i++) {
+   if (i == port)
+   continue;
+   if (dsa_to_port(ds, i)->bridge_dev != bridge)
+   continue;
+   ret = regmap_update_bits(smi->map, RTL8366RB_PORT_ISO(i),
+BIT(port + 1), 0);
+   if (ret)
+   return ret;
+
+   port_bitmap |= BIT(i);
+   }
+
+   return regmap_update_bits(smi->map, RTL8366RB_PORT_ISO(port),
+ port_bitmap << 1, 0);
+}
+
 static int rtl8366rb_change_mtu(struct dsa_switch *ds, int port, int new_mtu)
 {
struct realtek_smi *smi = ds->priv;
@@ -1510,6 +1569,8 @@ static const struct dsa_switch_ops rtl8366rb_switch_ops = 
{
.get_strings = rtl8366_get_strings,
.get_ethtool_stats = rtl8366_get_ethtool_stats,
.get_sset_count = rtl8366_get_sset_count,
+   .port_bridge_join = rtl8366rb_port_bridge_join,
+   .port_bridge_leave = rtl8366rb_port_bridge_leave,
.port_vlan_filtering = rtl8366_vlan_filtering,
.port_vlan_add = rtl8366_vlan_add,
.port_vlan_del = rtl8366_vlan_del,
-- 
2.25.1

Re: [PATCH 2/2] units: Use the HZ_PER_KHZ macro

2021-02-23 Thread Chanwoo Choi

On 2/24/21 5:30 AM, Daniel Lezcano wrote:
> The HZ_PER_KHZ macro definition is duplicated in different subsystems.
> 
> The macro now exists in include/linux/units.h, make use of it and
> remove all the duplicated ones.
> 
> Signed-off-by: Daniel Lezcano 
> ---
>  drivers/devfreq/devfreq.c | 2 +-
>  drivers/iio/light/as73211.c   | 3 +--
>  drivers/thermal/devfreq_cooling.c | 2 +-
>  3 files changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> index 6aa10de792b3..4c636c336ace 100644
> --- a/drivers/devfreq/devfreq.c
> +++ b/drivers/devfreq/devfreq.c
> @@ -26,6 +26,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include "governor.h"
>  
>  #define CREATE_TRACE_POINTS
> @@ -33,7 +34,6 @@
>  
>  #define IS_SUPPORTED_FLAG(f, name) ((f & DEVFREQ_GOV_FLAG_##name) ? true : 
> false)
>  #define IS_SUPPORTED_ATTR(f, name) ((f & DEVFREQ_GOV_ATTR_##name) ? true : 
> false)
> -#define HZ_PER_KHZ   1000
>  
>  static struct class *devfreq_class;
>  static struct dentry *devfreq_debugfs;
> diff --git a/drivers/iio/light/as73211.c b/drivers/iio/light/as73211.c
> index 7b32dfaee9b3..3ba2378df3dd 100644
> --- a/drivers/iio/light/as73211.c
> +++ b/drivers/iio/light/as73211.c
> @@ -24,8 +24,7 @@
>  #include 
>  #include 
>  #include 
> -
> -#define HZ_PER_KHZ 1000
> +#include 
>  
>  #define AS73211_DRV_NAME "as73211"
>  
> diff --git a/drivers/thermal/devfreq_cooling.c 
> b/drivers/thermal/devfreq_cooling.c
> index fed3121ff2a1..fa5b8b0c7604 100644
> --- a/drivers/thermal/devfreq_cooling.c
> +++ b/drivers/thermal/devfreq_cooling.c
> @@ -19,10 +19,10 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  
> -#define HZ_PER_KHZ   1000
>  #define SCALE_ERROR_MITIGATION   100
>  
>  static DEFINE_IDA(devfreq_ida);
> 

For devfreq part,
Acked-by: Chanwoo Choi 

-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

Re: [PATCH] cpufreq: schedutil: Call sugov_update_next_freq() before check to fast_switch_enabled

2021-02-23 Thread Yue Hu

On Wed, 24 Feb 2021 11:32:36 +0530
Viresh Kumar  wrote:

> On 24-02-21, 13:42, Yue Hu wrote:
> > From: Yue Hu 
> > 
> > Note that sugov_update_next_freq() may return false, that means the
> > caller sugov_fast_switch() will do nothing except fast switch check.
> > 
> > Similarly, sugov_deferred_update() also has unnecessary operations
> > of raw_spin_{lock,unlock} in sugov_update_single_freq() for that case.
> > 
> > So, let's call sugov_update_next_freq() before the fast switch check
> > to avoid unnecessary behaviors above. Update the related interface
> > definitions accordingly.
> > 
> > Signed-off-by: Yue Hu 
> > ---
> >  kernel/sched/cpufreq_schedutil.c | 28 ++--
> >  1 file changed, 14 insertions(+), 14 deletions(-)
> > 
> > diff --git a/kernel/sched/cpufreq_schedutil.c 
> > b/kernel/sched/cpufreq_schedutil.c
> > index 41e498b..d23e5be 100644
> > --- a/kernel/sched/cpufreq_schedutil.c
> > +++ b/kernel/sched/cpufreq_schedutil.c
> > @@ -114,19 +114,13 @@ static bool sugov_update_next_freq(struct 
> > sugov_policy *sg_policy, u64 time,
> > return true;
> >  }
> >  
> > -static void sugov_fast_switch(struct sugov_policy *sg_policy, u64 time,
> > - unsigned int next_freq)
> > +static void sugov_fast_switch(struct sugov_policy *sg_policy, unsigned int 
> > next_freq)
> >  {
> > -   if (sugov_update_next_freq(sg_policy, time, next_freq))
> > -   cpufreq_driver_fast_switch(sg_policy->policy, next_freq);
> > +   cpufreq_driver_fast_switch(sg_policy->policy, next_freq);  
> 
> I will call this directly instead, no need of the wrapper anymore.

I will fix it in the next version.

Thank you.

> 
> >  }
> >  
> > -static void sugov_deferred_update(struct sugov_policy *sg_policy, u64 time,
> > - unsigned int next_freq)
> > +static void sugov_deferred_update(struct sugov_policy *sg_policy)
> >  {
> > -   if (!sugov_update_next_freq(sg_policy, time, next_freq))
> > -   return;
> > -
> > if (!sg_policy->work_in_progress) {
> > sg_policy->work_in_progress = true;
> > > irq_work_queue(&sg_policy->irq_work);
> > @@ -368,16 +362,19 @@ static void sugov_update_single_freq(struct 
> > update_util_data *hook, u64 time,
> > sg_policy->cached_raw_freq = cached_freq;
> > }
> >  
> > +   if (!sugov_update_next_freq(sg_policy, time, next_f))
> > +   return;
> > +
> > /*
> >  * This code runs under rq->lock for the target CPU, so it won't run
> >  * concurrently on two different CPUs for the same target and it is not
> >  * necessary to acquire the lock in the fast switch case.
> >  */
> > if (sg_policy->policy->fast_switch_enabled) {
> > -   sugov_fast_switch(sg_policy, time, next_f);
> > +   sugov_fast_switch(sg_policy, next_f);
> > } else {
> > > raw_spin_lock(&sg_policy->update_lock);
> > > -   sugov_deferred_update(sg_policy, time, next_f);
> > > +   sugov_deferred_update(sg_policy);
> > > raw_spin_unlock(&sg_policy->update_lock);
> > }
> >  }
> > @@ -456,12 +453,15 @@ static unsigned int sugov_next_freq_shared(struct 
> > sugov_cpu *sg_cpu, u64 time)
> > if (sugov_should_update_freq(sg_policy, time)) {
> > next_f = sugov_next_freq_shared(sg_cpu, time);
> >  
> > +   if (!sugov_update_next_freq(sg_policy, time, next_f))
> > +   goto unlock;
> > +
> > if (sg_policy->policy->fast_switch_enabled)
> > -   sugov_fast_switch(sg_policy, time, next_f);
> > +   sugov_fast_switch(sg_policy, next_f);
> > else
> > -   sugov_deferred_update(sg_policy, time, next_f);
> > +   sugov_deferred_update(sg_policy);
> > }
> > -
> > +unlock:
> > > raw_spin_unlock(&sg_policy->update_lock);
> >  }  
>
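
The control-flow change discussed in this thread can be sketched in user space as follows. This is a simplified model, not the scheduler code: `struct fake_policy` and the counters are hypothetical, and `update_next_freq()` only models the "frequency unchanged, nothing to do" case of `sugov_update_next_freq()`. The point it shows is that checking for a change first avoids taking the lock when nothing would be done.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the governor's per-policy state. */
struct fake_policy {
	unsigned int next_freq;
	bool fast_switch_enabled;
	unsigned int fast_switches;     /* fast-switch calls made */
	unsigned int deferred_updates;  /* deferred updates queued */
	unsigned int locks_taken;       /* times the update lock was taken */
};

/* Simplified: report false when the requested frequency is unchanged. */
static bool update_next_freq(struct fake_policy *p, unsigned int next_freq)
{
	if (p->next_freq == next_freq)
		return false;
	p->next_freq = next_freq;
	return true;
}

/* Patched flow: bail out before the fast-switch check and the lock. */
static void update_single_freq(struct fake_policy *p, unsigned int next_freq)
{
	if (!update_next_freq(p, next_freq))
		return;

	if (p->fast_switch_enabled) {
		p->fast_switches++;
	} else {
		p->locks_taken++;       /* models raw_spin_lock() */
		p->deferred_updates++;
	}
}
```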

Re: [PATCH] vdpa/mlx5: set_features should allow reset to zero

2021-02-23 Thread Jason Wang




On 2021/2/24 1:04 下午, Michael S. Tsirkin wrote:

On Tue, Feb 23, 2021 at 11:35:57AM -0800, Si-Wei Liu wrote:


On 2/23/2021 5:26 AM, Michael S. Tsirkin wrote:

On Tue, Feb 23, 2021 at 10:03:57AM +0800, Jason Wang wrote:

On 2021/2/23 9:12 上午, Si-Wei Liu wrote:

On 2/21/2021 11:34 PM, Michael S. Tsirkin wrote:

On Mon, Feb 22, 2021 at 12:14:17PM +0800, Jason Wang wrote:

On 2021/2/19 7:54 下午, Si-Wei Liu wrote:

Commit 452639a64ad8 ("vdpa: make sure set_features is invoked
for legacy") made an exception for legacy guests to reset
features to 0, when config space is accessed before features
are set. We should relieve the verify_min_features() check
and allow features reset to 0 for this case.

It's worth noting that not just legacy guests could access
config space before features are set. For instance, when
feature VIRTIO_NET_F_MTU is advertised some modern driver
will try to access and validate the MTU present in the config
space before virtio features are set.

This looks like a spec violation:

"

The following driver-read-only field, mtu only exists if
VIRTIO_NET_F_MTU is
set.
This field specifies the maximum MTU for the driver to use.
"

Do we really want to work around this?

Thanks

And also:

The driver MUST follow this sequence to initialize a device:
1. Reset the device.
2. Set the ACKNOWLEDGE status bit: the guest OS has noticed the device.
3. Set the DRIVER status bit: the guest OS knows how to drive the
device.
4. Read device feature bits, and write the subset of feature bits
understood by the OS and driver to the
device. During this step the driver MAY read (but MUST NOT write)
the device-specific configuration
fields to check that it can support the device before accepting it.
5. Set the FEATURES_OK status bit. The driver MUST NOT accept new
feature bits after this step.
6. Re-read device status to ensure the FEATURES_OK bit is still set:
otherwise, the device does not
support our subset of features and the device is unusable.
7. Perform device-specific setup, including discovery of virtqueues
for the device, optional per-bus setup,
reading and possibly writing the device’s virtio configuration
space, and population of virtqueues.
8. Set the DRIVER_OK status bit. At this point the device is “live”.


so accessing config space before FEATURES_OK is a spec violation, right?

It is, but it's not relevant to what this commit tries to address. I
thought the legacy guest still needs to be supported.

Having said that, a separate patch has to be posted to fix the guest driver
issue where this discrepancy is introduced to virtnet_validate() (since
commit fe36cbe067). But it's not technically related to this patch.

-Siwei

I think it's a bug to read config space in validate, we should move it to
virtnet_probe().

Thanks

I take it back, reading but not writing seems to be explicitly allowed by spec.
So our way to detect a legacy guest is bogus, need to think what is
the best way to handle this.

Then maybe revert commit fe36cbe067 and friends, and have QEMU detect legacy
guest? Supposedly only config space write access needs to be guarded before
setting FEATURES_OK.
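
For reference, the rule being discussed can be modelled in a few lines of userspace C. This is only a sketch for a modern (non-legacy) driver: the status bit values are the ones from the virtio specification, while the toy_* helpers are hypothetical names invented here and are not kernel or QEMU API. It encodes the reading of the spec quoted above: reads of device-specific config are permitted from step 4 on, writes only once FEATURES_OK is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Device status bits as defined by the virtio specification. */
#define VIRTIO_CONFIG_S_ACKNOWLEDGE 1
#define VIRTIO_CONFIG_S_DRIVER      2
#define VIRTIO_CONFIG_S_DRIVER_OK   4
#define VIRTIO_CONFIG_S_FEATURES_OK 8

/* Hypothetical helper: may a modern driver read config space now? */
static bool toy_cfg_read_allowed(uint8_t status)
{
	/* Step 4 starts after DRIVER is set; from there the driver MAY read. */
	return (status & VIRTIO_CONFIG_S_DRIVER) != 0;
}

/* Hypothetical helper: may a modern driver write config space now? */
static bool toy_cfg_write_allowed(uint8_t status)
{
	/* Writes belong to step 7, which comes after FEATURES_OK. */
	return (status & VIRTIO_CONFIG_S_FEATURES_OK) != 0;
}

/* Walk steps 1 to 5 of the quoted sequence and check the rule at each point. */
static void toy_walk_init_sequence(void)
{
	uint8_t status = 0;                      /* step 1: reset */

	assert(!toy_cfg_read_allowed(status));
	status |= VIRTIO_CONFIG_S_ACKNOWLEDGE;   /* step 2 */
	status |= VIRTIO_CONFIG_S_DRIVER;        /* step 3 */
	assert(toy_cfg_read_allowed(status));    /* step 4: MAY read */
	assert(!toy_cfg_write_allowed(status));  /* step 4: MUST NOT write */
	status |= VIRTIO_CONFIG_S_FEATURES_OK;   /* step 5 */
	assert(toy_cfg_write_allowed(status));   /* step 7 may write */
}
```

Under this model an early config read says nothing about whether the guest is legacy, which is the reason the detection heuristic is being questioned.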

-Siwie

Detecting it isn't enough though, we will need a new ioctl to notify
the kernel that it's a legacy guest. Ugh :(



I'm not sure I get this, how can we know if there's a legacy driver 
before set_features()?


And I wonder what will happen if we just revert the set_features(0)?

Thanks






Rejecting reset to 0
prematurely causes correct MTU and link status unable to load
for the very first config space access, rendering issues like
guest showing inaccurate MTU value, or failure to reject
out-of-range MTU.

Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for
supported mlx5 devices")
Signed-off-by: Si-Wei Liu 
---
     drivers/vdpa/mlx5/net/mlx5_vnet.c | 15 +--
     1 file changed, 1 insertion(+), 14 deletions(-)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c
b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 7c1f789..540dd67 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1490,14 +1490,6 @@ static u64
mlx5_vdpa_get_features(struct vdpa_device *vdev)
     return mvdev->mlx_features;
     }
-static int verify_min_features(struct mlx5_vdpa_dev *mvdev,
u64 features)
-{
-    if (!(features & BIT_ULL(VIRTIO_F_ACCESS_PLATFORM)))
-    return -EOPNOTSUPP;
-
-    return 0;
-}
-
     static int setup_virtqueues(struct mlx5_vdpa_net *ndev)
     {
     int err;
@@ -1558,18 +1550,13 @@ static int
mlx5_vdpa_set_features(struct vdpa_device *vdev, u64
features)
     {
     struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
     struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
-    int err;
     print_features(mvdev, features, true);
-    err = verify_min_features(mvdev, features);
-    if (err)
-    return err;
-
     ndev->mvdev.actual_features = features &
ndev->mvdev.mlx_features;
     ndev->config.mtu = cpu_to_mlx5vdpa16(mvdev, ndev->mtu);
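
To make the failure mode in the commit message concrete, here is a userspace sketch of the check the patch removes. It assumes only what the diff shows plus the bit number of VIRTIO_F_ACCESS_PLATFORM (33) from the virtio specification; TOY_BIT_ULL and toy_verify_min_features are stand-in names, and the mvdev argument is dropped because the check does not use it.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define VIRTIO_F_ACCESS_PLATFORM 33    /* feature bit number from the virtio spec */
#define TOY_BIT_ULL(n) (1ULL << (n))   /* stand-in for the kernel's BIT_ULL() */

/* Userspace copy of the removed check (hypothetical name). */
static int toy_verify_min_features(uint64_t features)
{
	if (!(features & TOY_BIT_ULL(VIRTIO_F_ACCESS_PLATFORM)))
		return -EOPNOTSUPP;

	return 0;
}
```

A feature reset passes features == 0, which cannot contain bit 33, so the old code returned -EOPNOTSUPP before actual_features or config.mtu were updated.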

Re: [PATCH] vdpa/mlx5: set_features should allow reset to zero

2021-02-23 Thread Jason Wang




On 2021/2/24 1:17 PM, Michael S. Tsirkin wrote:

On Wed, Feb 24, 2021 at 11:20:01AM +0800, Jason Wang wrote:

On 2021/2/24 3:35 AM, Si-Wei Liu wrote:


On 2/23/2021 5:26 AM, Michael S. Tsirkin wrote:

On Tue, Feb 23, 2021 at 10:03:57AM +0800, Jason Wang wrote:

On 2021/2/23 9:12 AM, Si-Wei Liu wrote:

On 2/21/2021 11:34 PM, Michael S. Tsirkin wrote:

On Mon, Feb 22, 2021 at 12:14:17PM +0800, Jason Wang wrote:

On 2021/2/19 7:54 PM, Si-Wei Liu wrote:

Commit 452639a64ad8 ("vdpa: make sure set_features is invoked
for legacy") made an exception for legacy guests to reset
features to 0, when config space is accessed before features
are set. We should relieve the verify_min_features() check
and allow features reset to 0 for this case.

It's worth noting that not just legacy guests could access
config space before features are set. For instance, when
feature VIRTIO_NET_F_MTU is advertised some modern driver
will try to access and validate the MTU present in the config
space before virtio features are set.

This looks like a spec violation:

"

The following driver-read-only field, mtu only exists if
VIRTIO_NET_F_MTU is
set.
This field specifies the maximum MTU for the driver to use.
"

Do we really want to work around this?

Thanks

And also:

The driver MUST follow this sequence to initialize a device:
1. Reset the device.
2. Set the ACKNOWLEDGE status bit: the guest OS has
noticed the device.
3. Set the DRIVER status bit: the guest OS knows how to drive the
device.
4. Read device feature bits, and write the subset of feature bits
understood by the OS and driver to the
device. During this step the driver MAY read (but MUST NOT write)
the device-specific configuration
fields to check that it can support the device before accepting it.
5. Set the FEATURES_OK status bit. The driver MUST NOT accept new
feature bits after this step.
6. Re-read device status to ensure the FEATURES_OK bit is still set:
otherwise, the device does not
support our subset of features and the device is unusable.
7. Perform device-specific setup, including discovery of virtqueues
for the device, optional per-bus setup,
reading and possibly writing the device’s virtio configuration
space, and population of virtqueues.
8. Set the DRIVER_OK status bit. At this point the device is “live”.


so accessing config space before FEATURES_OK is a spec
violation, right?

It is, but it's not relevant to what this commit tries to address. I
thought the legacy guest still needs to be supported.

Having said that, a separate patch has to be posted to fix the guest driver
issue where this discrepancy is introduced to
virtnet_validate() (since
commit fe36cbe067). But it's not technically related to this patch.

-Siwei

I think it's a bug to read config space in validate, we should
move it to
virtnet_probe().

Thanks

I take it back, reading but not writing seems to be explicitly
allowed by spec.
So our way to detect a legacy guest is bogus, need to think what is
the best way to handle this.

Then maybe revert commit fe36cbe067 and friends, and have QEMU detect
legacy guest? Supposedly only config space write access needs to be
guarded before setting FEATURES_OK.


I agree. My understanding is that all vDPA devices must be modern devices (since
VIRTIO_F_ACCESS_PLATFORM is mandated) rather than transitional devices.

Thanks

Well mlx5 has some code to handle legacy guests ...



My understanding is that, even if mlx5 is a modern device, it can still
support legacy guests since the device seen by the guest is emulated by QEMU.
QEMU can just present a transitional device to the guest, but negotiate
VIRTIO_F_ACCESS_PLATFORM. (Actually this is what is done now.)


Thanks



Eli, could you comment? Is that support unused right now?



-Siwie


Rejecting reset to 0
prematurely causes correct MTU and link status unable to load
for the very first config space access, rendering issues like
guest showing inaccurate MTU value, or failure to reject
out-of-range MTU.

Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for
supported mlx5 devices")
Signed-off-by: Si-Wei Liu 
---
     drivers/vdpa/mlx5/net/mlx5_vnet.c | 15 +--
     1 file changed, 1 insertion(+), 14 deletions(-)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c
b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 7c1f789..540dd67 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1490,14 +1490,6 @@ static u64
mlx5_vdpa_get_features(struct vdpa_device *vdev)
     return mvdev->mlx_features;
     }
-static int verify_min_features(struct mlx5_vdpa_dev *mvdev,
u64 features)
-{
-    if (!(features & BIT_ULL(VIRTIO_F_ACCESS_PLATFORM)))
-    return -EOPNOTSUPP;
-
-    return 0;
-}
-
     static int setup_virtqueues(struct mlx5_vdpa_net *ndev)
     {
     int err;
@@ -1558,18 +1550,13 @@ static int
mlx5_vdpa_set_features(struct vdpa_device *vdev, u64
features)
     {
     struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
     struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
-

Re: [PATCH] cpufreq: schedutil: Call sugov_update_next_freq() before check to fast_switch_enabled

2021-02-23 Thread Viresh Kumar

On 24-02-21, 13:42, Yue Hu wrote:
> From: Yue Hu 
> 
> Note that sugov_update_next_freq() may return false, that means the
> caller sugov_fast_switch() will do nothing except fast switch check.
> 
> Similarly, sugov_deferred_update() also has unnecessary operations
> of raw_spin_{lock,unlock} in sugov_update_single_freq() for that case.
> 
> So, let's call sugov_update_next_freq() before the fast switch check
> to avoid unnecessary behaviors above. Update the related interface
> definitions accordingly.
> 
> Signed-off-by: Yue Hu 
> ---
>  kernel/sched/cpufreq_schedutil.c | 28 ++--
>  1 file changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/kernel/sched/cpufreq_schedutil.c 
> b/kernel/sched/cpufreq_schedutil.c
> index 41e498b..d23e5be 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -114,19 +114,13 @@ static bool sugov_update_next_freq(struct sugov_policy 
> *sg_policy, u64 time,
>   return true;
>  }
>  
> -static void sugov_fast_switch(struct sugov_policy *sg_policy, u64 time,
> -   unsigned int next_freq)
> +static void sugov_fast_switch(struct sugov_policy *sg_policy, unsigned int 
> next_freq)
>  {
> - if (sugov_update_next_freq(sg_policy, time, next_freq))
> - cpufreq_driver_fast_switch(sg_policy->policy, next_freq);
> + cpufreq_driver_fast_switch(sg_policy->policy, next_freq);

I will call this directly instead, no need of the wrapper anymore.

>  }
>  
> -static void sugov_deferred_update(struct sugov_policy *sg_policy, u64 time,
> -   unsigned int next_freq)
> +static void sugov_deferred_update(struct sugov_policy *sg_policy)
>  {
> - if (!sugov_update_next_freq(sg_policy, time, next_freq))
> - return;
> -
>   if (!sg_policy->work_in_progress) {
>   sg_policy->work_in_progress = true;
>   irq_work_queue(&sg_policy->irq_work);
> @@ -368,16 +362,19 @@ static void sugov_update_single_freq(struct 
> update_util_data *hook, u64 time,
>   sg_policy->cached_raw_freq = cached_freq;
>   }
>  
> + if (!sugov_update_next_freq(sg_policy, time, next_f))
> + return;
> +
>   /*
>* This code runs under rq->lock for the target CPU, so it won't run
>* concurrently on two different CPUs for the same target and it is not
>* necessary to acquire the lock in the fast switch case.
>*/
>   if (sg_policy->policy->fast_switch_enabled) {
> - sugov_fast_switch(sg_policy, time, next_f);
> + sugov_fast_switch(sg_policy, next_f);
>   } else {
>   raw_spin_lock(&sg_policy->update_lock);
> - sugov_deferred_update(sg_policy, time, next_f);
> + sugov_deferred_update(sg_policy);
>   raw_spin_unlock(&sg_policy->update_lock);
>   }
>  }
> @@ -456,12 +453,15 @@ static unsigned int sugov_next_freq_shared(struct 
> sugov_cpu *sg_cpu, u64 time)
>   if (sugov_should_update_freq(sg_policy, time)) {
>   next_f = sugov_next_freq_shared(sg_cpu, time);
>  
> + if (!sugov_update_next_freq(sg_policy, time, next_f))
> + goto unlock;
> +
>   if (sg_policy->policy->fast_switch_enabled)
> - sugov_fast_switch(sg_policy, time, next_f);
> + sugov_fast_switch(sg_policy, next_f);
>   else
> - sugov_deferred_update(sg_policy, time, next_f);
> + sugov_deferred_update(sg_policy);
>   }
> -
> +unlock:
>   raw_spin_unlock(&sg_policy->update_lock);
>  }

-- 
viresh

[PATCH] cpufreq: schedutil: Call sugov_update_next_freq() before check to fast_switch_enabled

2021-02-23 Thread Yue Hu

From: Yue Hu 

Note that sugov_update_next_freq() may return false, that means the
caller sugov_fast_switch() will do nothing except fast switch check.

Similarly, sugov_deferred_update() also has unnecessary operations
of raw_spin_{lock,unlock} in sugov_update_single_freq() for that case.

So, let's call sugov_update_next_freq() before the fast switch check
to avoid unnecessary behaviors above. Update the related interface
definitions accordingly.

Signed-off-by: Yue Hu 
---
 kernel/sched/cpufreq_schedutil.c | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 41e498b..d23e5be 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -114,19 +114,13 @@ static bool sugov_update_next_freq(struct sugov_policy 
*sg_policy, u64 time,
return true;
 }
 
-static void sugov_fast_switch(struct sugov_policy *sg_policy, u64 time,
- unsigned int next_freq)
+static void sugov_fast_switch(struct sugov_policy *sg_policy, unsigned int 
next_freq)
 {
-   if (sugov_update_next_freq(sg_policy, time, next_freq))
-   cpufreq_driver_fast_switch(sg_policy->policy, next_freq);
+   cpufreq_driver_fast_switch(sg_policy->policy, next_freq);
 }
 
-static void sugov_deferred_update(struct sugov_policy *sg_policy, u64 time,
- unsigned int next_freq)
+static void sugov_deferred_update(struct sugov_policy *sg_policy)
 {
-   if (!sugov_update_next_freq(sg_policy, time, next_freq))
-   return;
-
if (!sg_policy->work_in_progress) {
sg_policy->work_in_progress = true;
irq_work_queue(&sg_policy->irq_work);
@@ -368,16 +362,19 @@ static void sugov_update_single_freq(struct 
update_util_data *hook, u64 time,
sg_policy->cached_raw_freq = cached_freq;
}
 
+   if (!sugov_update_next_freq(sg_policy, time, next_f))
+   return;
+
/*
 * This code runs under rq->lock for the target CPU, so it won't run
 * concurrently on two different CPUs for the same target and it is not
 * necessary to acquire the lock in the fast switch case.
 */
if (sg_policy->policy->fast_switch_enabled) {
-   sugov_fast_switch(sg_policy, time, next_f);
+   sugov_fast_switch(sg_policy, next_f);
} else {
raw_spin_lock(&sg_policy->update_lock);
-   sugov_deferred_update(sg_policy, time, next_f);
+   sugov_deferred_update(sg_policy);
raw_spin_unlock(&sg_policy->update_lock);
}
 }
@@ -456,12 +453,15 @@ static unsigned int sugov_next_freq_shared(struct 
sugov_cpu *sg_cpu, u64 time)
if (sugov_should_update_freq(sg_policy, time)) {
next_f = sugov_next_freq_shared(sg_cpu, time);
 
+   if (!sugov_update_next_freq(sg_policy, time, next_f))
+   goto unlock;
+
if (sg_policy->policy->fast_switch_enabled)
-   sugov_fast_switch(sg_policy, time, next_f);
+   sugov_fast_switch(sg_policy, next_f);
else
-   sugov_deferred_update(sg_policy, time, next_f);
+   sugov_deferred_update(sg_policy);
}
-
+unlock:
raw_spin_unlock(&sg_policy->update_lock);
 }
 
-- 
1.9.1
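
The effect of the reordering can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel code: struct toy_policy and the toy_* functions are invented for illustration, the "nothing to do" test is reduced to a plain frequency comparison (the real sugov_update_next_freq() has more conditions), and the counters merely record what would have been a lock/unlock pair or a frequency switch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; none of these are the kernel's types or functions. */
struct toy_policy {
	unsigned int next_freq;
	bool fast_switch_enabled;
	int lock_ops;   /* counts lock+unlock pairs taken */
	int switches;   /* counts frequency changes issued */
};

/* Simplified idea of sugov_update_next_freq(): false means "nothing to do". */
static bool toy_update_next_freq(struct toy_policy *p, unsigned int next_freq)
{
	if (p->next_freq == next_freq)
		return false;

	p->next_freq = next_freq;
	return true;
}

/* Patched ordering: bail out first, then choose the switch path. */
static void toy_update_single_freq(struct toy_policy *p, unsigned int next_freq)
{
	if (!toy_update_next_freq(p, next_freq))
		return;             /* no lock taken, no switch attempted */

	if (p->fast_switch_enabled) {
		p->switches++;      /* stands in for cpufreq_driver_fast_switch() */
	} else {
		p->lock_ops++;      /* stands in for raw_spin_lock()/raw_spin_unlock() */
		p->switches++;      /* stands in for queuing the deferred update */
	}
}
```

With the old ordering the deferred path would have counted a lock/unlock pair even when the frequency was unchanged; here an unchanged frequency leaves both counters at zero.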

Re: linux-next: build warning after merge of the block tree

2021-02-23 Thread Chaitanya Kulkarni

On 2/23/21 21:33, Stephen Rothwell wrote:
>> I've failed to understand this warning as rwbs is present in the doc header
>> and in the function parameter :-
> I presume it is the missing ':' after @rwbs in the comment.
Thanks, I was looking at the wrong places all this time, will send a fix.

I'll setup doc generation using sphinx on my machine, is there
a particular command line that you have used for these warnings ?

this is mail test since the mail also rejected

2021-02-23 Thread campion test

[PATCH v2 2/3] scsi: ufs-qcom: Disable interrupt in reset path

2021-02-23 Thread Can Guo

From: Nitin Rawat 

Disable interrupt in reset path to flush pending IRQ handler in order to
avoid possible NoC issues.

Signed-off-by: Nitin Rawat 
Signed-off-by: Can Guo 
---
 drivers/scsi/ufs/ufs-qcom.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/scsi/ufs/ufs-qcom.c b/drivers/scsi/ufs/ufs-qcom.c
index f97d7b0..a9dc8d7 100644
--- a/drivers/scsi/ufs/ufs-qcom.c
+++ b/drivers/scsi/ufs/ufs-qcom.c
@@ -253,12 +253,17 @@ static int ufs_qcom_host_reset(struct ufs_hba *hba)
 {
int ret = 0;
struct ufs_qcom_host *host = ufshcd_get_variant(hba);
+   bool reenable_intr = false;
 
if (!host->core_reset) {
dev_warn(hba->dev, "%s: reset control not set\n", __func__);
goto out;
}
 
+   reenable_intr = hba->is_irq_enabled;
+   disable_irq(hba->irq);
+   hba->is_irq_enabled = false;
+
ret = reset_control_assert(host->core_reset);
if (ret) {
dev_err(hba->dev, "%s: core_reset assert failed, err = %d\n",
@@ -280,6 +285,11 @@ static int ufs_qcom_host_reset(struct ufs_hba *hba)
 
usleep_range(1000, 1100);
 
+   if (reenable_intr) {
+   enable_irq(hba->irq);
+   hba->is_irq_enabled = true;
+   }
+
 out:
return ret;
 }
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project.
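
The save/disable/restore shape of this change can be sketched in userspace as follows. Everything here is a toy model: struct toy_hba and toy_host_reset are hypothetical, only the is_irq_enabled field name is taken from the patch, and a boolean stands in for the real IRQ line. In the kernel, disable_irq() additionally waits for a running handler to finish, which is the "flush" the commit message refers to; the model does not attempt to show that.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state; field names other than is_irq_enabled are made up. */
struct toy_hba {
	bool is_irq_enabled;    /* mirrors hba->is_irq_enabled in the patch */
	bool line_masked;       /* toy stand-in for the IRQ line state */
	bool reset_ran_masked;  /* records whether the reset saw a masked line */
};

static int toy_host_reset(struct toy_hba *hba)
{
	/* Remember whether the interrupt was enabled on entry. */
	bool reenable_intr = hba->is_irq_enabled;

	/* Stands in for disable_irq(): mask before touching the reset line. */
	hba->line_masked = true;
	hba->is_irq_enabled = false;

	/* Stands in for the reset_control_assert()/deassert() sequence. */
	hba->reset_ran_masked = hba->line_masked;

	/* Only undo what this function changed. */
	if (reenable_intr) {
		hba->line_masked = false;
		hba->is_irq_enabled = true;
	}

	return 0;
}
```

The point of reenable_intr is that a caller who entered with the interrupt already off gets it back in the same state.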

[PATCH v2 1/3] scsi: ufs: Minor adjustments to error handling

2021-02-23 Thread Can Guo

In error handling prepare stage, after SCSI requests are blocked, do a
down/up_write(clk_scaling_lock) to clean up the queuecommand() path.
Meanwhile, stop eeh_work in case it disturbs error recovery. Moreover,
reset ufshcd_state at the entrance of ufshcd_probe_hba(), since it may be
called multiple times during error recovery.

Signed-off-by: Can Guo 
---
 drivers/scsi/ufs/ufshcd.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 80620c8..013eb73 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -4987,6 +4987,7 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct 
ufshcd_lrb *lrbp)
 * UFS device needs urgent BKOPs.
 */
if (!hba->pm_op_in_progress &&
+   !ufshcd_eh_in_progress(hba) &&
ufshcd_is_exception_event(lrbp->ucd_rsp_ptr) &&
schedule_work(&hba->eeh_work)) {
/*
@@ -5784,13 +5785,20 @@ static void ufshcd_err_handling_prepare(struct ufs_hba 
*hba)
ufshcd_suspend_clkscaling(hba);
ufshcd_clk_scaling_allow(hba, false);
}
+   ufshcd_scsi_block_requests(hba);
+   /* Drain ufshcd_queuecommand() */
+   down_write(&hba->clk_scaling_lock);
+   up_write(&hba->clk_scaling_lock);
+   cancel_work_sync(&hba->eeh_work);
 }
 
 static void ufshcd_err_handling_unprepare(struct ufs_hba *hba)
 {
+   ufshcd_scsi_unblock_requests(hba);
ufshcd_release(hba);
if (ufshcd_is_clkscaling_supported(hba))
ufshcd_clk_scaling_suspend(hba, false);
+   ufshcd_clear_ua_wluns(hba);
pm_runtime_put(hba->dev);
 }
 
@@ -5882,8 +5890,8 @@ static void ufshcd_err_handler(struct work_struct *work)
spin_unlock_irqrestore(hba->host->host_lock, flags);
ufshcd_err_handling_prepare(hba);
spin_lock_irqsave(hba->host->host_lock, flags);
-   ufshcd_scsi_block_requests(hba);
-   hba->ufshcd_state = UFSHCD_STATE_RESET;
+   if (hba->ufshcd_state != UFSHCD_STATE_ERROR)
+   hba->ufshcd_state = UFSHCD_STATE_RESET;
 
/* Complete requests that have door-bell cleared by h/w */
ufshcd_complete_requests(hba);
@@ -6042,12 +6050,8 @@ static void ufshcd_err_handler(struct work_struct *work)
}
ufshcd_clear_eh_in_progress(hba);
spin_unlock_irqrestore(hba->host->host_lock, flags);
-   ufshcd_scsi_unblock_requests(hba);
ufshcd_err_handling_unprepare(hba);
up(&hba->host_sem);
-
-   if (!err && needs_reset)
-   ufshcd_clear_ua_wluns(hba);
 }
 
 /**
@@ -7858,6 +7862,8 @@ static int ufshcd_probe_hba(struct ufs_hba *hba, bool 
async)
unsigned long flags;
ktime_t start = ktime_get();
 
+   hba->ufshcd_state = UFSHCD_STATE_RESET;
+
ret = ufshcd_link_startup(hba);
if (ret)
goto out;
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project.

[PATCH v2 3/3] scsi: ufs: Remove redundant checks of !hba in suspend/resume callbacks

2021-02-23 Thread Can Guo

Runtime and system suspend/resume can only come after hba probe invokes
platform_set_drvdata(pdev, hba), meaning hba cannot be NULL in these PM
callbacks, so remove the checks of !hba.

Signed-off-by: Can Guo 
---
 drivers/scsi/ufs/ufshcd.c | 21 -
 1 file changed, 21 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 013eb73..2517ef1 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -95,8 +95,6 @@
   16, 4, buf, __len, false);\
 } while (0)
 
-static bool early_suspend;
-
 int ufshcd_dump_regs(struct ufs_hba *hba, size_t offset, size_t len,
 const char *prefix)
 {
@@ -8978,11 +8976,6 @@ int ufshcd_system_suspend(struct ufs_hba *hba)
int ret = 0;
ktime_t start = ktime_get();
 
-   if (!hba) {
-   early_suspend = true;
-   return 0;
-   }
-
down(&hba->host_sem);
 
if (!hba->is_powered)
@@ -9034,14 +9027,6 @@ int ufshcd_system_resume(struct ufs_hba *hba)
int ret = 0;
ktime_t start = ktime_get();
 
-   if (!hba)
-   return -EINVAL;
-
-   if (unlikely(early_suspend)) {
-   early_suspend = false;
-   down(&hba->host_sem);
-   }
-
if (!hba->is_powered || pm_runtime_suspended(hba->dev))
/*
 * Let the runtime resume take care of resuming
@@ -9074,9 +9059,6 @@ int ufshcd_runtime_suspend(struct ufs_hba *hba)
int ret = 0;
ktime_t start = ktime_get();
 
-   if (!hba)
-   return -EINVAL;
-
if (!hba->is_powered)
goto out;
else
@@ -9115,9 +9097,6 @@ int ufshcd_runtime_resume(struct ufs_hba *hba)
int ret = 0;
ktime_t start = ktime_get();
 
-   if (!hba)
-   return -EINVAL;
-
if (!hba->is_powered)
goto out;
else
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project.

Re: [PATCH v8 17/22] counter: Add character device interface

2021-02-23 Thread William Breathitt Gray

On Sun, Feb 14, 2021 at 06:06:12PM +, Jonathan Cameron wrote:
> On Fri, 12 Feb 2021 21:13:41 +0900
> William Breathitt Gray  wrote:
> 
> > This patch introduces a character device interface for the Counter
> > subsystem. Device data is exposed through standard character device read
> > operations. Device data is gathered when a Counter event is pushed by
> > the respective Counter device driver. Configuration is handled via ioctl
> > operations on the respective Counter character device node.
> > 
> > Cc: David Lechner 
> > Cc: Gwendal Grignou 
> > Cc: Dan Carpenter 
> > Cc: Oleksij Rempel 
> > Signed-off-by: William Breathitt Gray 
> 
> Hi William,
> 
> A few minor comments.  Mostly seems to have come together well and
> makes sense to me.
> 
> Jonathan
> 
> > ---
> >  drivers/counter/Makefile |   2 +-
> >  drivers/counter/counter-chrdev.c | 496 +++
> >  drivers/counter/counter-chrdev.h |  16 +
> >  drivers/counter/counter-core.c   |  37 ++-
> >  include/linux/counter.h  |  45 +++
> >  include/uapi/linux/counter.h |  70 +
> >  6 files changed, 661 insertions(+), 5 deletions(-)
> >  create mode 100644 drivers/counter/counter-chrdev.c
> >  create mode 100644 drivers/counter/counter-chrdev.h
> > 
> 
> ...
> 
> > diff --git a/drivers/counter/counter-core.c b/drivers/counter/counter-core.c
> > index bcf672e1fc0d..c137fcb97d9c 100644
> > --- a/drivers/counter/counter-core.c
> > +++ b/drivers/counter/counter-core.c
> > @@ -5,12 +5,16 @@
> >   */
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> > +#include "counter-chrdev.h"
> >  #include "counter-sysfs.h"
> >  
> >  /* Provides a unique ID for each counter device */
> > @@ -33,6 +37,8 @@ static struct bus_type counter_bus_type = {
> > .name = "counter"
> >  };
> >  
> > +static dev_t counter_devt;
> > +
> >  /**
> >   * counter_register - register Counter to the system
> >   * @counter:   pointer to Counter to register
> > @@ -54,7 +60,6 @@ int counter_register(struct counter_device *const counter)
> > if (counter->id < 0)
> > return counter->id;
> >  
> > -   /* Configure device structure for Counter */
> 
> Not sure why this comment gets removed here.

This comment wasn't supposed to be removed. I'll revert this.

> > dev->type = &counter_device_type;
> > dev->bus = &counter_bus_type;
> > if (counter->parent) {
> > @@ -65,18 +70,25 @@ int counter_register(struct counter_device *const 
> > counter)
> > device_initialize(dev);
> > dev_set_drvdata(dev, counter);
> >  
> > +   /* Add Counter character device */
> > +   err = counter_chrdev_add(counter, counter_devt);
> > +   if (err < 0)
> > +   goto err_free_id;
> > +
> > /* Add Counter sysfs attributes */
> > err = counter_sysfs_add(counter);
> > if (err < 0)
> > -   goto err_free_id;
> > +   goto err_remove_chrdev;
> >  
> > /* Add device to system */
> > err = device_add(dev);
> > if (err < 0)
> > -   goto err_free_id;
> > +   goto err_remove_chrdev;
> 
> It might be worth thinking about using cdev_device_add()
> though will require a slightly different order of adding.

I think using cdev_device_add() should be possible. I'll adjust
counter_chrdev_add() accordingly to account for this.
 
> >  
> > return 0;
> >  
> > +err_remove_chrdev:
> > +   counter_chrdev_remove(counter);
> >  err_free_id:
> > put_device(dev);
> > return err;
> > @@ -138,13 +150,30 @@ int devm_counter_register(struct device *dev,
> >  }
> >  EXPORT_SYMBOL_GPL(devm_counter_register);
> >  
> > +#define COUNTER_DEV_MAX 256
> > +
> >  static int __init counter_init(void)
> >  {
> > -   return bus_register(&counter_bus_type);
> > +   int err;
> > +
> > +   err = bus_register(&counter_bus_type);
> > +   if (err < 0)
> > +   return err;
> > +
> > +   err = alloc_chrdev_region(&counter_devt, 0, COUNTER_DEV_MAX, "counter");
> > +   if (err < 0)
> > +   goto err_unregister_bus;
> > +
> > +   return 0;
> > +
> > +err_unregister_bus:
> > +   bus_unregister(&counter_bus_type);
> > +   return err;
> >  }
> >  
> >  static void __exit counter_exit(void)
> >  {
> > +   unregister_chrdev_region(counter_devt, COUNTER_DEV_MAX);
> > bus_unregister(&counter_bus_type);
> >  }
> >  
> 
> ...
> 
> > diff --git a/include/uapi/linux/counter.h b/include/uapi/linux/counter.h
> > index 6113938a6044..3d647a5383b8 100644
> > --- a/include/uapi/linux/counter.h
> > +++ b/include/uapi/linux/counter.h
> > @@ -6,6 +6,19 @@
> >  #ifndef _UAPI_COUNTER_H_
> >  #define _UAPI_COUNTER_H_
> >  
> > +#include 
> > +#include 
> > +
> > +/* Component type definitions */
> > +enum counter_component_type {
> > +   COUNTER_COMPONENT_NONE,
> > +   COUNTER_COMPONENT_SIGNAL,
> > +   COUNTER_COMPONENT_COUNT,
> > +   COUNTER_COMPONENT_FUNCTION,
> > +   COUNTER_COMPONENT_SYNAPSE_ACTION,
> > +   COUNTER_COMPONENT_EXTENSION,
> > +};
> > +
> >  /*

Re: linux-next: build warning after merge of the block tree

2021-02-23 Thread Stephen Rothwell

Hi Chaitanya,

On Wed, 24 Feb 2021 05:25:49 + Chaitanya Kulkarni 
 wrote:
>
> On 2/23/21 18:31, Stephen Rothwell wrote:
> > Hi all,
> >
> > After merging the block tree, today's linux-next build (htmldocs)
> > produced this warning:
> >
> > kernel/trace/blktrace.c:1878: warning: Function parameter or member 'rwbs' 
> > not described in 'blk_fill_rwbs'`
> >
> > Introduced by commit
> >
> >   1f83bb4b4914 ("blktrace: add blk_fill_rwbs documentation comment")
> >
> > -- Cheers, Stephen Rothwell  
> I've failed to understand this warning as rwbs is present in the doc header
> and in the function parameter :-

I presume it is the missing ':' after @rwbs in the comment.

-- 
Cheers,
Stephen Rothwell
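
For anyone hitting the same warning: kernel-doc only recognises a parameter description when the name is followed by a colon, i.e. "@name: text". A minimal sketch of the expected layout is below; toy_fill_flags is a hypothetical function written for this note, it is not blk_fill_rwbs() and does not follow its encoding.

```c
#include <assert.h>
#include <string.h>

/**
 * toy_fill_flags - fill a flag string for a toy request
 * @buf:	buffer to be filled (note the ':' after the parameter name)
 * @is_write:	non-zero for a write request
 *
 * Without the ':' the line is treated as ordinary text and the
 * parameter is reported as not described.
 */
static void toy_fill_flags(char *buf, int is_write)
{
	buf[0] = is_write ? 'W' : 'R';
	buf[1] = '\0';
}
```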



Re: [PATCH] kdb: Remove redundant function definitions/prototypes

2021-02-23 Thread Sumit Garg

On Tue, 23 Feb 2021 at 21:39, Doug Anderson  wrote:
>
> Hi,
>
> On Tue, Feb 23, 2021 at 4:01 AM Sumit Garg  wrote:
> >
> > @@ -103,7 +103,6 @@ extern int kdb_getword(unsigned long *, unsigned long, 
> > size_t);
> >  extern int kdb_putword(unsigned long, unsigned long, size_t);
> >
> >  extern int kdbgetularg(const char *, unsigned long *);
> > -extern int kdbgetu64arg(const char *, u64 *);
>
> IMO you should leave kdbgetu64arg() the way it was.  It is symmetric
> to all of the other similar functions and even if there are no
> external users of kdbgetu64arg() now it seems like it makes sense to
> keep it matching.
>

Okay, will keep kdbgetu64arg() the way it was.

-Sumit

>
> > @@ -209,9 +208,7 @@ extern unsigned long kdb_task_state(const struct 
> > task_struct *p,
> > unsigned long mask);
> >  extern void kdb_ps_suppressed(void);
> >  extern void kdb_ps1(const struct task_struct *p);
> > -extern void kdb_print_nameval(const char *name, unsigned long val);
> >  extern void kdb_send_sig(struct task_struct *p, int sig);
> > -extern void kdb_meminfo_proc_show(void);
>
> Getting rid of kdb_print_nameval() / kdb_meminfo_proc_show() makes sense to 
> me.
>
>
> >  extern char kdb_getchar(void);
> >  extern char *kdb_getstr(char *, size_t, const char *);
> >  extern void kdb_gdb_state_pass(char *buf);
> > diff --git a/kernel/debug/kdb/kdb_support.c b/kernel/debug/kdb/kdb_support.c
> > index 6226502ce049..b59aad1f0b55 100644
> > --- a/kernel/debug/kdb/kdb_support.c
> > +++ b/kernel/debug/kdb/kdb_support.c
> > @@ -665,24 +665,6 @@ unsigned long kdb_task_state(const struct task_struct 
> > *p, unsigned long mask)
> > return (mask & kdb_task_state_string(state)) != 0;
> >  }
> >
> > -/*
> > - * kdb_print_nameval - Print a name and its value, converting the
> > - * value to a symbol lookup if possible.
> > - * Inputs:
> > - * namefield name to print
> > - * val value of field
> > - */
> > -void kdb_print_nameval(const char *name, unsigned long val)
> > -{
> > -   kdb_symtab_t symtab;
> > -   kdb_printf("  %-11.11s ", name);
> > > -   if (kdbnearsym(val, &symtab))
> > > -   kdb_symbol_print(val, &symtab,
> > -
> > KDB_SP_VALUE|KDB_SP_SYMSIZE|KDB_SP_NEWLINE);
> > -   else
> > -   kdb_printf("0x%lx\n", val);
> > -}
> > -
>
> Getting rid of kdb_print_nameval() makes sense to me.
>
> -Doug

Re: [PATCH v2 03/10] mm: don't pass "enum lru_list" to lru list addition functions

2021-02-23 Thread Yu Zhao

On Tue, Feb 23, 2021 at 02:50:11PM -0800, Andrew Morton wrote:
> On Tue, 26 Jan 2021 15:14:38 -0700 Yu Zhao  wrote:
> 
> > On Tue, Jan 26, 2021 at 10:01:11PM +, Matthew Wilcox wrote:
> > > On Fri, Jan 22, 2021 at 03:05:53PM -0700, Yu Zhao wrote:
> > > > +++ b/mm/swap.c
> > > > @@ -231,7 +231,7 @@ static void pagevec_move_tail_fn(struct page *page, 
> > > > struct lruvec *lruvec)
> > > > if (!PageUnevictable(page)) {
> > > > del_page_from_lru_list(page, lruvec, page_lru(page));
> > > > ClearPageActive(page);
> > > > -   add_page_to_lru_list_tail(page, lruvec, page_lru(page));
> > > > +   add_page_to_lru_list_tail(page, lruvec);
> > > > __count_vm_events(PGROTATED, thp_nr_pages(page));
> > > > }
> > > 
> > > Is it profitable to do ...
> > > 
> > > - del_page_from_lru_list(page, lruvec, page_lru(page));
> > > + enum lru_list lru = page_lru(page);
> > > + del_page_from_lru_list(page, lruvec, lru);
> > >   ClearPageActive(page);
> > > - add_page_to_lru_list_tail(page, lruvec, page_lru(page));
> > > + lru &= ~LRU_ACTIVE;
> > > + add_page_to_lru_list_tail(page, lruvec, lru);
> > 
> > Ok, now we want to trade readability for size. Sure, I'll see how
> > much we could squeeze.
> 
> As nothing has happened here and the code bloat issue remains, I'll
> hold this series out of 5.12-rc1.

Sorry for the slow response. I was trying to ascertain why
page_lru(), a tiny helper, could bloat vmlinux by O(KB). It turned out
compound_head() included in Page{Active,Unevictable} is a nuisance in
our case. Testing PG_{active,unevictable} against
compound_head(page)->flags is really unnecessary because all lru
operations are eventually done on page->lru not
compound_head(page)->lru. With the following change, which sacrifices
a bit of readability, vmlinux shrinks by 998 bytes with Clang but grows
by 227 bytes with GCC, which IMO is a win. (We use Clang by default.)


diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 355ea1ee32bd..ec0878a3cdfe 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -46,14 +46,12 @@ static __always_inline void __clear_page_lru_flags(struct page *page)
 {
VM_BUG_ON_PAGE(!PageLRU(page), page);
 
-   __ClearPageLRU(page);
-
/* this shouldn't happen, so leave the flags to bad_page() */
-   if (PageActive(page) && PageUnevictable(page))
+   if ((page->flags & (BIT(PG_active) | BIT(PG_unevictable))) ==
+   (BIT(PG_active) | BIT(PG_unevictable)))
return;
 
-   __ClearPageActive(page);
-   __ClearPageUnevictable(page);
+   page->flags &= ~(BIT(PG_lru) | BIT(PG_active) | BIT(PG_unevictable));
 }
 
 /**
@@ -65,18 +63,12 @@ static __always_inline void __clear_page_lru_flags(struct page *page)
  */
 static __always_inline enum lru_list page_lru(struct page *page)
 {
-   enum lru_list lru;
+   unsigned long flags = READ_ONCE(page->flags);
 
VM_BUG_ON_PAGE(PageActive(page) && PageUnevictable(page), page);
 
-   if (PageUnevictable(page))
-   return LRU_UNEVICTABLE;
-
-   lru = page_is_file_lru(page) ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON;
-   if (PageActive(page))
-   lru += LRU_ACTIVE;
-
-   return lru;
+   return (flags & BIT(PG_unevictable)) ? LRU_UNEVICTABLE :
+  (LRU_FILE * !(flags & BIT(PG_swapbacked)) + !!(flags & BIT(PG_active)));
 }
 
 static __always_inline void add_page_to_lru_list(struct page *page,


I'll post this as a separate patch. Below is the bloat-o-meter output
collected on top of c03c21ba6f4e.

$ ./scripts/bloat-o-meter ../vmlinux.clang.orig ../vmlinux.clang
add/remove: 0/1 grow/shrink: 7/10 up/down: 191/-1189 (-998)
Function                                     old     new   delta
lru_lazyfree_fn                              848     893     +45
lru_deactivate_file_fn                      1037    1075     +38
perf_trace_mm_lru_insertion                  515     548     +33
check_move_unevictable_pages                 983    1006     +23
__activate_page                              706     729     +23
trace_event_raw_event_mm_lru_insertion       476     497     +21
lru_deactivate_fn                            691     699      +8
__bpf_trace_mm_lru_insertion                  13      11      -2
__traceiter_mm_lru_insertion                  67      62      -5
move_pages_to_lru                            964     881     -83
__pagevec_lru_add_fn                         665     581     -84
isolate_lru_page                             524     419    -105
__munlock_pagevec                           1609    1481    -128
isolate_migratepages_block                  3370    3237    -133
__page_cache_release                         556     413    -143
lruvec_lru_size                              151       -    -151
release_pages                               1025     866    -159
pagevec_move_tail_fn
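
For anyone checking the arithmetic in the new page_lru(), here is a minimal
user-space sketch, not the kernel code. It assumes the enum lru_list layout
from include/linux/mmzone.h (LRU_ACTIVE == 1, LRU_FILE == 2,
LRU_UNEVICTABLE == 4). The PG_* bit positions and the names lru_branchy(),
lru_branchless() and lru_forms_agree() are made up for illustration.

```c
#include <assert.h>

/* Mirrors the layout of enum lru_list in include/linux/mmzone.h. */
enum lru_list {
	LRU_INACTIVE_ANON = 0,
	LRU_ACTIVE_ANON = 1,
	LRU_INACTIVE_FILE = 2,
	LRU_ACTIVE_FILE = 3,
	LRU_UNEVICTABLE = 4,
};
#define LRU_ACTIVE 1
#define LRU_FILE   2

/* Hypothetical bit positions; the real ones come from enum pageflags. */
#define PG_active      0
#define PG_unevictable 1
#define PG_swapbacked  2
#define BIT(n) (1UL << (n))

/* The old form: one test per flag. */
static enum lru_list lru_branchy(unsigned long flags)
{
	enum lru_list lru;

	if (flags & BIT(PG_unevictable))
		return LRU_UNEVICTABLE;

	/* page_is_file_lru() is !PageSwapBacked() */
	lru = (flags & BIT(PG_swapbacked)) ? LRU_INACTIVE_ANON : LRU_INACTIVE_FILE;
	if (flags & BIT(PG_active))
		lru += LRU_ACTIVE;
	return lru;
}

/* The new form: the list index is computed from the flag bits directly. */
static enum lru_list lru_branchless(unsigned long flags)
{
	return (flags & BIT(PG_unevictable)) ? LRU_UNEVICTABLE :
	       (LRU_FILE * !(flags & BIT(PG_swapbacked)) + !!(flags & BIT(PG_active)));
}

/* Returns 1 if both forms agree for every combination of the three flags. */
static int lru_forms_agree(void)
{
	unsigned long flags;

	for (flags = 0; flags < 8; flags++)
		if (lru_branchy(flags) != lru_branchless(flags))
			return 0;
	return 1;
}
```

The sketch only checks that the two forms pick the same list; it says
nothing about the compound_head() cost, which is where the size difference
comes from.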

Re: [PATCH] kernel: debug: Handle breakpoints in kernel .init.text section

2021-02-23 Thread Sumit Garg

On Tue, 23 Feb 2021 at 18:24, Daniel Thompson
 wrote:
>
> On Tue, Feb 23, 2021 at 02:33:50PM +0530, Sumit Garg wrote:
> > Thanks Doug for your comments.
> >
> > On Tue, 23 Feb 2021 at 05:28, Doug Anderson  wrote:
> > > > To be clear there is still a very small window between call to
> > > > free_initmem() and system_state = SYSTEM_RUNNING which can lead to
> > > > removal of freed .init.text section breakpoints but I think we can live
> > > > with that.
> > >
> > > I know kdb / kgdb tries to keep out of the way of the rest of the
> > > system and so there's a bias to just try to infer the state of the
> > > rest of the system, but this feels like a halfway solution when really
> > > a cleaner solution really wouldn't intrude much on the main kernel.
> > > It seems like it's at least worth asking if we can just add a call
> > > like kgdb_drop_init_breakpoints() into main.c.  Then we don't have to
> > > try to guess the state...
>
> Just for the record, +1. This would be a better approach.
>
>
> > Sounds reasonable, will post RFC for this. I think we should call such
> > function as kgdb_free_init_mem() in similar way as:
> > - kprobe_free_init_mem()
> > - ftrace_free_init_mem()
>
> As is matching the names...
>
>
> > @@ -378,8 +382,13 @@ int dbg_deactivate_sw_breakpoints(void)
> > int i;
> >
> > for (i = 0; i < KGDB_MAX_BREAKPOINTS; i++) {
> > -   if (kgdb_break[i].state != BP_ACTIVE)
> > +   if (kgdb_break[i].state < BP_ACTIVE_INIT)
> > +   continue;
> > +   if (system_state >= SYSTEM_RUNNING &&
> > +   kgdb_break[i].state == BP_ACTIVE_INIT) {
> > +   kgdb_break[i].state = BP_UNDEFINED;
> > continue;
> > +   }
> > > error = kgdb_arch_remove_breakpoint(&kgdb_break[i]);
> > if (error) {
> > pr_info("BP remove failed: %lx\n",
> >
> > >
> > > > +   kgdb_break[i].state = BP_ACTIVE;
> > > > +   else
> > > > +   kgdb_break[i].state = BP_ACTIVE_INIT;
> > >
> > > I don't really see what the "BP_ACTIVE_INIT" state gets you.  Why not
> > > just leave it as "BP_ACTIVE" and put all the logic fully in
> > > dbg_deactivate_sw_breakpoints()?
> >
> > Please see my response above.
> >
> > [which was]
> > > "BP_ACTIVE_INIT" state is added specifically to handle this scenario
> > > as to keep track of breakpoints that actually belong to the .init.text
> > > section. And we should be able to again set breakpoints after free
> > > since below change in this patch would mark them as "BP_UNDEFINED":
>
> This answer does not say whether the BP_ACTIVE_INIT state needs to be
> per-breakpoint state or whether we can infer it from the global state.
>
> Changing the state of breakpoints in .init is a one-shot activity
> whether it is triggered explicitly (e.g. kgdb_free_init_mem) or implicitly
> (run the first time dbg_deactivate_sw_breakpoints is called with the system
> state >= running).
>
> As Doug has suggested it is quite possible to unify all the logic to
> handle .init within a single function by running that function when the
> state changes globally.
>

Ah, I see. Thanks for further clarification. Will get rid of
BP_ACTIVE_INIT state.

-Sumit

>
> Daniel.
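
What Doug and Daniel describe could look roughly like the following. This
is a user-space sketch under assumptions, not the RFC: struct bkpt, the
init_start/init_end bounds and in_init_text() are stand-ins for the kernel's
own breakpoint structure and init-section check, and kgdb_free_init_mem()
is only the name proposed above. The return value exists purely so the
sketch can be checked.

```c
#include <assert.h>

#define KGDB_MAX_BREAKPOINTS 4	/* stand-in value, small on purpose */

/* Same states as the existing enum; no BP_ACTIVE_INIT is needed. */
enum bpstate { BP_UNDEFINED = 0, BP_REMOVED, BP_SET, BP_ACTIVE };

/* Stand-in for the kernel's breakpoint bookkeeping. */
struct bkpt {
	unsigned long addr;
	enum bpstate state;
};

static struct bkpt kgdb_break[KGDB_MAX_BREAKPOINTS];

/* Stand-in bounds of .init.text. */
static unsigned long init_start = 0x1000, init_end = 0x2000;

static int in_init_text(unsigned long addr)
{
	return addr >= init_start && addr < init_end;
}

/*
 * One-shot hook, meant to be called from the init-memory teardown path:
 * forget every breakpoint that lives in .init.text, whatever its state.
 * Returns the number of breakpoints dropped.
 */
static int kgdb_free_init_mem(void)
{
	int i, dropped = 0;

	for (i = 0; i < KGDB_MAX_BREAKPOINTS; i++) {
		if (kgdb_break[i].state == BP_UNDEFINED)
			continue;
		if (in_init_text(kgdb_break[i].addr)) {
			kgdb_break[i].state = BP_UNDEFINED;
			dropped++;
		}
	}
	return dropped;
}
```

With the decision keyed on the address, dbg_deactivate_sw_breakpoints()
would not need to look at system_state at all.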

Re: [PATCH] arm64: enable GENERIC_FIND_FIRST_BIT

2021-02-23 Thread Yury Norov

On Tue, Dec 08, 2020 at 10:35:50AM +, Will Deacon wrote:
> On Mon, Dec 07, 2020 at 05:59:16PM -0800, Yury Norov wrote:
> > (CC: Alexey Klimov)
> > 
> > On Mon, Dec 7, 2020 at 3:25 AM Will Deacon  wrote:
> > >
> > > On Sat, Dec 05, 2020 at 08:54:06AM -0800, Yury Norov wrote:
> > > > ARM64 doesn't implement find_first_{zero}_bit in arch code and doesn't
> > > > enable it in config. It leads to using find_next_bit() which is less
> > > > efficient:
> > >
> > > [...]
> > >
> > > > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > > > index 1515f6f153a0..2b90ef1f548e 100644
> > > > --- a/arch/arm64/Kconfig
> > > > +++ b/arch/arm64/Kconfig
> > > > @@ -106,6 +106,7 @@ config ARM64
> > > >   select GENERIC_CPU_AUTOPROBE
> > > >   select GENERIC_CPU_VULNERABILITIES
> > > >   select GENERIC_EARLY_IOREMAP
> > > > + select GENERIC_FIND_FIRST_BIT
> > >
> > > Does this actually make any measurable difference? The disassembly with
> > > or without this is _very_ similar for me (clang 11).
> > >
> > > Will
> > 
> > On A-53 find_first_bit() is almost twice faster than find_next_bit(),
> > according to
> > lib/find_bit_benchmark. (Thanks to Alexey for testing.)
> 
> I guess it's more compiler dependent than anything else, and it's a pity
> that find_next_bit() isn't implemented in terms of the generic
> find_first_bit() tbh, but if the numbers are as you suggest then I don't
> have a problem selecting this on arm64.

Ping?
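
For context, a simplified user-space sketch of why a dedicated
find_first_bit() can be cheaper: a search from an arbitrary offset has to
mask off the bits below the offset in its first word, while a search from
bit 0 can just scan whole words. The sketch_* functions are illustrative
only, not the lib/find_bit.c implementation, and __builtin_ctzl assumes
GCC or Clang.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Scan whole words from the start; returns size if no bit is set. */
static unsigned long sketch_find_first_bit(const unsigned long *addr,
					   unsigned long size)
{
	unsigned long idx, bit;

	for (idx = 0; idx * BITS_PER_LONG < size; idx++) {
		if (addr[idx]) {
			bit = idx * BITS_PER_LONG + __builtin_ctzl(addr[idx]);
			return bit < size ? bit : size;
		}
	}
	return size;
}

/* Same search from an arbitrary offset: the first word needs masking. */
static unsigned long sketch_find_next_bit(const unsigned long *addr,
					  unsigned long size,
					  unsigned long start)
{
	unsigned long idx, word, bit;

	if (start >= size)
		return size;

	idx = start / BITS_PER_LONG;
	/* Drop the bits below 'start' in the first word. */
	word = addr[idx] & (~0UL << (start % BITS_PER_LONG));

	while (!word) {
		idx++;
		if (idx * BITS_PER_LONG >= size)
			return size;
		word = addr[idx];
	}
	bit = idx * BITS_PER_LONG + __builtin_ctzl(word);
	return bit < size ? bit : size;
}
```

Calling sketch_find_next_bit(addr, size, 0) gives the same answer as
sketch_find_first_bit(addr, size), just with the extra bounds check and
mask on the way in.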

Re: linux-next: build warning after merge of the block tree

2021-02-23 Thread Chaitanya Kulkarni

Stephen,

On 2/23/21 18:31, Stephen Rothwell wrote:
> Hi all,
>
> After merging the block tree, today's linux-next build (htmldocs)
> produced this warning:
>
> kernel/trace/blktrace.c:1878: warning: Function parameter or member 'rwbs' not described in 'blk_fill_rwbs'
>
> Introduced by commit
>
>   1f83bb4b4914 ("blktrace: add blk_fill_rwbs documentation comment")
>
> -- Cheers, Stephen Rothwell

I've failed to understand this warning, as rwbs is present both in the doc
header and in the function parameters:

/**
 * blk_fill_rwbs - Fill the buffer rwbs by mapping op to character string.
 * @rwbs buffer to be filled
 * @op: REQ_OP_XXX for the tracepoint
 *
 * Description:
 * Maps the REQ_OP_XXX to character and fills the buffer provided by the
 * caller with resulting string.
 *
 **/
void blk_fill_rwbs(char *rwbs, unsigned int op)
{
int i = 0;

if (op & REQ_PREFLUSH)
rwbs[i++] = 'F';

switch (op & REQ_OP_MASK) {
case REQ_OP_WRITE:
case REQ_OP_WRITE_SAME:
rwbs[i++] = 'W';
break;
case REQ_OP_DISCARD:
rwbs[i++] = 'D';
break;
case REQ_OP_SECURE_ERASE:
rwbs[i++] = 'D';
rwbs[i++] = 'E';
break;
case REQ_OP_FLUSH:
rwbs[i++] = 'F';
break;
case REQ_OP_READ:
rwbs[i++] = 'R';
break;
default:
rwbs[i++] = 'N';
}

if (op & REQ_FUA)
rwbs[i++] = 'F';
if (op & REQ_RAHEAD)
rwbs[i++] = 'A';
if (op & REQ_SYNC)
rwbs[i++] = 'S';
if (op & REQ_META)
rwbs[i++] = 'M';

rwbs[i] = '\0';
}
EXPORT_SYMBOL_GPL(blk_fill_rwbs);
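
One possible reading, assuming the header above is quoted exactly:
kernel-doc expects each parameter line in the form "@name: description",
and the @rwbs line has no colon after the name, so the parser may not
tie that line to the parameter. Below is a sketch of the header with the
colon added, on top of a cut-down user-space copy of the function. The
REQ_* values are stand-ins for illustration; the real ones live in
include/linux/blk_types.h.

```c
#include <assert.h>
#include <string.h>

/* Stand-in values for illustration only. */
#define REQ_OP_MASK   0xffu
#define REQ_OP_READ   0u
#define REQ_OP_WRITE  1u
#define REQ_SYNC      (1u << 8)
#define REQ_META      (1u << 9)

/**
 * blk_fill_rwbs - Fill the buffer rwbs by mapping op to character string.
 * @rwbs:	buffer to be filled
 * @op:		REQ_OP_XXX for the tracepoint
 *
 * Description:
 *     Maps the REQ_OP_XXX to character and fills the buffer provided by the
 *     caller with resulting string.
 */
static void blk_fill_rwbs(char *rwbs, unsigned int op)
{
	int i = 0;

	/* Cut down to two ops and two flags to keep the sketch short. */
	switch (op & REQ_OP_MASK) {
	case REQ_OP_WRITE:
		rwbs[i++] = 'W';
		break;
	case REQ_OP_READ:
		rwbs[i++] = 'R';
		break;
	default:
		rwbs[i++] = 'N';
	}

	if (op & REQ_SYNC)
		rwbs[i++] = 'S';
	if (op & REQ_META)
		rwbs[i++] = 'M';

	rwbs[i] = '\0';
}
```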

RE: [Linuxarm] Re: [PATCH for-next 00/32] spin lock usage optimization for SCSI drivers

2021-02-23 Thread Finn Thain

On Tue, 23 Feb 2021, Song Bao Hua (Barry Song) wrote:

> > 
> > Regarding m68k, your analysis overlooks the timing issue. E.g. patch 
> > 11/32 could be a problem because removing the irqsave would allow PDMA 
> > transfers to be interrupted. Aside from the timing issues, I agree 
> > with your analysis above regarding m68k.
> 
> You mentioned you need realtime so you want an interrupt to be able to 
> preempt another one.

That's not what I said. But for the sake of discussion, yes, I do know 
people who run Linux on ARM hardware (if Android vendor kernels can be 
called "Linux") and who would benefit from realtime support on those 
devices.

> Now you said you want an interrupt not to be preempted as it will make a 
> timing issue.

mac_esp deliberately constrains segment sizes so that it can harmlessly 
disable interrupts for the duration of the transfer.

Maybe the irqsave in this driver is over-cautious. Who knows? The PDMA 
timing problem relates to SCSI bus signalling and the tolerance of real-
world SCSI devices to same. The other problem is that the PDMA logic 
circuit is undocumented hardware. So there may be further timing 
requirements lurking there. Therefore, patch 11/32 is too risky.

> If this PDMA transfer will have some problem when it is preempted, I 
> believe we need some enhanced ways to handle this, otherwise, once we 
> enable preempt_rt or threaded_irq, it will get the timing issue. so here 
> it needs a clear comment and IRQF_NO_THREAD if this is the case.
> 

People who require fast response times cannot expect random drivers or 
platforms to meet such requirements. I fear you may be asking too much 
from Mac Quadra machines.

> > 
> > With regard to other architectures and platforms, in specific cases, 
> > e.g. where there's never more than one IRQ involved, then I could 
> > agree that your assumptions probably hold and an irqsave would be 
> > probably redundant.
> > 
> > When you find a redundant irqsave, to actually patch it would bring a 
> > risk of regression with little or no reward. It's not my place to veto 
> > this entire patch series on that basis but IMO this kind of churn is 
> > misguided.
> 
> Nope.
> 
> I would say the real misguidance is that the code adds one lock while it 
> doesn't need the lock. Easily we can add redundant locks or exaggerate 
> the coverage range of locks, but the smarter way is that people add 
> locks only when they really need the lock by considering concurrency and 
> realtime performance.
> 

You appear to be debating a strawman. No-one is advocating excessive 
locking in new code.

> Thanks
> Barry
>

Re: [PATCH] vdpa/mlx5: set_features should allow reset to zero

2021-02-23 Thread Michael S. Tsirkin

On Wed, Feb 24, 2021 at 11:20:01AM +0800, Jason Wang wrote:
> 
> On 2021/2/24 3:35 AM, Si-Wei Liu wrote:
> > 
> > 
> > On 2/23/2021 5:26 AM, Michael S. Tsirkin wrote:
> > > On Tue, Feb 23, 2021 at 10:03:57AM +0800, Jason Wang wrote:
> > > > > On 2021/2/23 9:12 AM, Si-Wei Liu wrote:
> > > > > 
> > > > > On 2/21/2021 11:34 PM, Michael S. Tsirkin wrote:
> > > > > > On Mon, Feb 22, 2021 at 12:14:17PM +0800, Jason Wang wrote:
> > > > > > > > On 2021/2/19 7:54 PM, Si-Wei Liu wrote:
> > > > > > > > Commit 452639a64ad8 ("vdpa: make sure set_features is invoked
> > > > > > > > for legacy") made an exception for legacy guests to reset
> > > > > > > > features to 0, when config space is accessed before features
> > > > > > > > are set. We should relieve the verify_min_features() check
> > > > > > > > and allow features reset to 0 for this case.
> > > > > > > > 
> > > > > > > > It's worth noting that not just legacy guests could access
> > > > > > > > config space before features are set. For instance, when
> > > > > > > > feature VIRTIO_NET_F_MTU is advertised some modern driver
> > > > > > > > will try to access and validate the MTU present in the config
> > > > > > > > space before virtio features are set.
> > > > > > > This looks like a spec violation:
> > > > > > > 
> > > > > > > "
> > > > > > > 
> > > > > > > The following driver-read-only field, mtu only exists if
> > > > > > > VIRTIO_NET_F_MTU is
> > > > > > > set.
> > > > > > > This field specifies the maximum MTU for the driver to use.
> > > > > > > "
> > > > > > > 
> > > > > > > Do we really want to workaround this?
> > > > > > > 
> > > > > > > Thanks
> > > > > > And also:
> > > > > > 
> > > > > > The driver MUST follow this sequence to initialize a device:
> > > > > > 1. Reset the device.
> > > > > > 2. Set the ACKNOWLEDGE status bit: the guest OS has
> > > > > > noticed the device.
> > > > > > 3. Set the DRIVER status bit: the guest OS knows how to drive the
> > > > > > device.
> > > > > > 4. Read device feature bits, and write the subset of feature bits
> > > > > > understood by the OS and driver to the
> > > > > > device. During this step the driver MAY read (but MUST NOT write)
> > > > > > the device-specific configuration
> > > > > > fields to check that it can support the device before accepting it.
> > > > > > 5. Set the FEATURES_OK status bit. The driver MUST NOT accept new
> > > > > > feature bits after this step.
> > > > > > 6. Re-read device status to ensure the FEATURES_OK bit is still set:
> > > > > > otherwise, the device does not
> > > > > > support our subset of features and the device is unusable.
> > > > > > 7. Perform device-specific setup, including discovery of virtqueues
> > > > > > for the device, optional per-bus setup,
> > > > > > reading and possibly writing the device’s virtio configuration
> > > > > > space, and population of virtqueues.
> > > > > > 8. Set the DRIVER_OK status bit. At this point the device is “live”.
> > > > > > 
> > > > > > 
> > > > > > so accessing config space before FEATURES_OK is a spec
> > > > > > violation, right?
> > > > > It is, but it's not relevant to what this commit tries to address. I
> > > > > thought the legacy guest still needs to be supported.
> > > > > 
> > > > > Having said, a separate patch has to be posted to fix the guest driver
> > > > > issue where this discrepancy is introduced to
> > > > > virtnet_validate() (since
> > > > > commit fe36cbe067). But it's not technically related to this patch.
> > > > > 
> > > > > -Siwei
> > > > 
> > > > I think it's a bug to read config space in validate, we should
> > > > move it to
> > > > virtnet_probe().
> > > > 
> > > > Thanks
> > > I take it back, reading but not writing seems to be explicitly
> > > allowed by spec.
> > > So our way to detect a legacy guest is bogus, need to think what is
> > > the best way to handle this.
> > Then maybe revert commit fe36cbe067 and friends, and have QEMU detect
> > legacy guest? Supposedly only config space write access needs to be
> > guarded before setting FEATURES_OK.
> 
> 
> I agree. My understanding is that all vDPA devices must be modern devices (since
> VIRTIO_F_ACCESS_PLATFORM is mandated) instead of transitional devices.
> 
> Thanks

Well mlx5 has some code to handle legacy guests ...
Eli, could you comment? Is that support unused right now?


> 
> > 
> > -Siwie
> > 
> > > > > > 
> > > > > > > > Rejecting reset to 0
> > > > > > > > prematurely causes correct MTU and link status unable to load
> > > > > > > > for the very first config space access, rendering issues like
> > > > > > > > guest showing inaccurate MTU value, or failure to reject
> > > > > > > > out-of-range MTU.
> > > > > > > > 
> > > > > > > > Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for
> > > > > > > > supported mlx5 devices")
> > > > > > > > Signed-off-by: Si-Wei Liu 
> > > > > > > > ---
> > > > > > > >     drivers/vdpa/mlx5/net/mlx5_vnet.c | 15 +--
> > > > > > > >     1 file changed, 1 insertion(+), 14 deletions(-)
> > > > > > > >

Re: [PATCH v10 2/5] sched: CGroup tagging interface for core scheduling

2021-02-23 Thread Josh Don

On Tue, Feb 23, 2021 at 11:26 AM Chris Hyser  wrote:
>
> On 2/23/21 4:05 AM, Peter Zijlstra wrote:
> > On Mon, Feb 22, 2021 at 11:00:37PM -0500, Chris Hyser wrote:
> >> On 1/22/21 8:17 PM, Joel Fernandes (Google) wrote:
> >> While trying to test the new prctl() code I'm working on, I ran into a bug I
> >> chased back into this v10 code. Under a fair amount of stress, when the
> >> function __sched_core_update_cookie() is ultimately called from
> >> sched_core_fork(), the system deadlocks or otherwise non-visibly crashes.
> >> I've not had much success figuring out why/what. I'm running with LOCKDEP on
> >> and seeing no complaints. Duplicating it only requires setting a cookie on a
> >> task and forking a bunch of threads ... all of which then want to update
> >> their cookie.
> >
> > Can you share the code and reproducer?
>
> Attached is a tarball with c code (source) and scripts. Just run ./setup_bug which will compile the source and start a
> bash with a cs cookie. Then run ./show_bug which dumps the cookie and then fires off some processes and threads. Note
> the cs_clone command is not doing any core sched prctls for this test (not needed and currently coded for a diff prctl
> interface). It just creates processes and threads. I see this hang almost instantly.
>
> Josh, I did verify that this occurs on Joel's coresched tree both with and w/o the kprot patch and that should exactly
> correspond to these patches.
>
> -chrish
>

I think I've gotten to the root of this. In the fork code, our cases
for inheriting task_cookie are inverted for CLONE_THREAD vs
!CLONE_THREAD. As a result, we are creating a new cookie per-thread,
rather than inheriting from the parent. Now this is actually ok; I'm
not observing a scalability problem with creating this many cookies.
However, it means that overall throughput of your binary is cut in
~half, since none of the threads can share a core. Note that I never
saw an indefinite deadlock, just ~2x runtime for your binary vs the
control. I've verified that both a) manually hardcoding all threads to
be able to share regardless of cookie, and b) using a machine with 6
cores instead of 2, both allow your binary to complete in the same
amount of time as without the new API.

[PATCH v5 2/2] ufs: sysfs: Resume the proper scsi device

2021-02-23 Thread Asutosh Das

Resume the actual scsi device whose unit descriptor is being
accessed, instead of the hba alone.

Signed-off-by: Asutosh Das 
---
 drivers/scsi/ufs/ufs-sysfs.c | 26 +++---
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/ufs/ufs-sysfs.c b/drivers/scsi/ufs/ufs-sysfs.c
index acc54f5..34481e3 100644
--- a/drivers/scsi/ufs/ufs-sysfs.c
+++ b/drivers/scsi/ufs/ufs-sysfs.c
@@ -297,10 +297,10 @@ static ssize_t ufs_sysfs_read_desc_param(struct ufs_hba *hba,
goto out;
}
 
-   pm_runtime_get_sync(hba->dev);
+   scsi_autopm_get_device(hba->sdev_ufs_device);
ret = ufshcd_read_desc_param(hba, desc_id, desc_index,
param_offset, desc_buf, param_size);
-   pm_runtime_put_sync(hba->dev);
+   scsi_autopm_put_device(hba->sdev_ufs_device);
if (ret) {
ret = -EINVAL;
goto out;
@@ -678,7 +678,7 @@ static ssize_t _name##_show(struct device *dev, 
\
up(&hba->host_sem); \
return -ENOMEM; \
}   \
-   pm_runtime_get_sync(hba->dev);  \
+   scsi_autopm_get_device(hba->sdev_ufs_device);   \
ret = ufshcd_query_descriptor_retry(hba,\
UPIU_QUERY_OPCODE_READ_DESC, QUERY_DESC_IDN_DEVICE, \
0, 0, desc_buf, _len); \
@@ -695,7 +695,7 @@ static ssize_t _name##_show(struct device *dev, 
\
goto out;   \
ret = sysfs_emit(buf, "%s\n", desc_buf);\
 out:   \
-   pm_runtime_put_sync(hba->dev);  \
+   scsi_autopm_put_device(hba->sdev_ufs_device);   \
kfree(desc_buf);\
up(&hba->host_sem); \
return ret; \
@@ -744,10 +744,10 @@ static ssize_t _name##_show(struct device *dev,   
\
}   \
if (ufshcd_is_wb_flags(QUERY_FLAG_IDN##_uname)) \
index = ufshcd_wb_get_query_index(hba); \
-   pm_runtime_get_sync(hba->dev);  \
+   scsi_autopm_get_device(hba->sdev_ufs_device);   \
ret = ufshcd_query_flag(hba, UPIU_QUERY_OPCODE_READ_FLAG,   \
QUERY_FLAG_IDN##_uname, index, );  \
-   pm_runtime_put_sync(hba->dev);  \
+   scsi_autopm_put_device(hba->sdev_ufs_device);   \
if (ret) {  \
ret = -EINVAL;  \
goto out;   \
@@ -813,10 +813,10 @@ static ssize_t _name##_show(struct device *dev,   
\
}   \
if (ufshcd_is_wb_attrs(QUERY_ATTR_IDN##_uname)) \
index = ufshcd_wb_get_query_index(hba); \
-   pm_runtime_get_sync(hba->dev);  \
+   scsi_autopm_get_device(hba->sdev_ufs_device);   \
ret = ufshcd_query_attr(hba, UPIU_QUERY_OPCODE_READ_ATTR,   \
QUERY_ATTR_IDN##_uname, index, 0, );  \
-   pm_runtime_put_sync(hba->dev);  \
+   scsi_autopm_put_device(hba->sdev_ufs_device);   \
if (ret) {  \
ret = -EINVAL;  \
goto out;   \
@@ -899,11 +899,15 @@ static ssize_t _pname##_show(struct device *dev,  
\
struct scsi_device *sdev = to_scsi_device(dev); \
struct ufs_hba *hba = shost_priv(sdev->host);   \
u8 lun = ufshcd_scsi_to_upiu_lun(sdev->lun);\
+   int ret;\
if (!ufs_is_valid_unit_desc_lun(&hba->dev_info, lun,\
_duname##_DESC_PARAM##_puname)) \
return -EINVAL; \
-   return ufs_sysfs_read_desc_param(hba, QUERY_DESC_IDN_##_duname, \
+   scsi_autopm_get_device(sdev);

[PATCH v5 1/2] scsi: ufs: Enable power management for wlun

2021-02-23 Thread Asutosh Das

During runtime-suspend of the ufs host, the scsi devices are
already suspended and so are the queues associated with them.
But the ufs host sends SSU to the wlun during its runtime-suspend.
During that process blk_queue_enter() checks whether the queue is
suspended. If it is, it waits for the queue to resume, and never
returns.
The commit
(d55d15a33: scsi: block: Do not accept any requests while suspended)
adds the check if the queue is in suspended state in blk_queue_enter().

Call trace:
 __switch_to+0x174/0x2c4
 __schedule+0x478/0x764
 schedule+0x9c/0xe0
 blk_queue_enter+0x158/0x228
 blk_mq_alloc_request+0x40/0xa4
 blk_get_request+0x2c/0x70
 __scsi_execute+0x60/0x1c4
 ufshcd_set_dev_pwr_mode+0x124/0x1e4
 ufshcd_suspend+0x208/0x83c
 ufshcd_runtime_suspend+0x40/0x154
 ufshcd_pltfrm_runtime_suspend+0x14/0x20
 pm_generic_runtime_suspend+0x28/0x3c
 __rpm_callback+0x80/0x2a4
 rpm_suspend+0x308/0x614
 rpm_idle+0x158/0x228
 pm_runtime_work+0x84/0xac
 process_one_work+0x1f0/0x470
 worker_thread+0x26c/0x4c8
 kthread+0x13c/0x320
 ret_from_fork+0x10/0x18

Fix this by registering the ufs device wlun as a scsi driver and
registering it for block runtime-pm. Also make it a
supplier for all other luns. That way, this device wlun
suspends after all the consumers and resumes after
the hba resumes.

Co-developed-by: Can Guo 
Signed-off-by: Can Guo 
Signed-off-by: Asutosh Das 
---
 drivers/scsi/ufs/ufs-qcom.c  |   2 +
 drivers/scsi/ufs/ufshcd-pci.c|  24 --
 drivers/scsi/ufs/ufshcd-pltfrm.c |  29 +++
 drivers/scsi/ufs/ufshcd-pltfrm.h |   4 +
 drivers/scsi/ufs/ufshcd.c| 491 +++
 drivers/scsi/ufs/ufshcd.h|   5 +
 include/trace/events/ufs.h   |  20 ++
 7 files changed, 454 insertions(+), 121 deletions(-)

diff --git a/drivers/scsi/ufs/ufs-qcom.c b/drivers/scsi/ufs/ufs-qcom.c
index f97d7b0..8cd8cfd 100644
--- a/drivers/scsi/ufs/ufs-qcom.c
+++ b/drivers/scsi/ufs/ufs-qcom.c
@@ -1546,6 +1546,8 @@ static const struct dev_pm_ops ufs_qcom_pm_ops = {
.runtime_suspend = ufshcd_pltfrm_runtime_suspend,
.runtime_resume  = ufshcd_pltfrm_runtime_resume,
.runtime_idle= ufshcd_pltfrm_runtime_idle,
+   .prepare= ufshcd_pltfrm_prepare,
+   .resume_early   = ufshcd_pltfrm_resume_early,
 };
 
 static struct platform_driver ufs_qcom_pltform = {
diff --git a/drivers/scsi/ufs/ufshcd-pci.c b/drivers/scsi/ufs/ufshcd-pci.c
index fadd566..ab84d56 100644
--- a/drivers/scsi/ufs/ufshcd-pci.c
+++ b/drivers/scsi/ufs/ufshcd-pci.c
@@ -247,29 +247,6 @@ static int ufshcd_pci_resume(struct device *dev)
return ufshcd_system_resume(dev_get_drvdata(dev));
 }
 
-/**
- * ufshcd_pci_poweroff - suspend-to-disk poweroff function
- * @dev: pointer to PCI device handle
- *
- * Returns 0 if successful
- * Returns non-zero otherwise
- */
-static int ufshcd_pci_poweroff(struct device *dev)
-{
-   struct ufs_hba *hba = dev_get_drvdata(dev);
-   int spm_lvl = hba->spm_lvl;
-   int ret;
-
-   /*
-* For poweroff we need to set the UFS device to PowerDown mode.
-* Force spm_lvl to ensure that.
-*/
-   hba->spm_lvl = 5;
-   ret = ufshcd_system_suspend(hba);
-   hba->spm_lvl = spm_lvl;
-   return ret;
-}
-
 #endif /* !CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM
@@ -370,7 +347,6 @@ static const struct dev_pm_ops ufshcd_pci_pm_ops = {
.resume = ufshcd_pci_resume,
.freeze = ufshcd_pci_suspend,
.thaw   = ufshcd_pci_resume,
-   .poweroff   = ufshcd_pci_poweroff,
.restore= ufshcd_pci_resume,
 #endif
SET_RUNTIME_PM_OPS(ufshcd_pci_runtime_suspend,
diff --git a/drivers/scsi/ufs/ufshcd-pltfrm.c b/drivers/scsi/ufs/ufshcd-pltfrm.c
index 1a69949..84550dc 100644
--- a/drivers/scsi/ufs/ufshcd-pltfrm.c
+++ b/drivers/scsi/ufs/ufshcd-pltfrm.c
@@ -217,6 +217,35 @@ int ufshcd_pltfrm_runtime_idle(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(ufshcd_pltfrm_runtime_idle);
 
+int ufshcd_pltfrm_prepare(struct device *dev)
+{
+   struct ufs_hba *hba = dev_get_drvdata(dev);
+
+   /*
+* SCSI assumes that runtime-pm and system-pm for scsi drivers
+* are same. And it doesn't wake up the device for system-suspend
+* if it's runtime suspended. But ufs doesn't follow that.
+* The rpm-lvl and spm-lvl can be different in ufs.
+* Force it to honor system-suspend.
+*/
+   scsi_autopm_get_device(hba->sdev_ufs_device);
+   /* Refer ufshcd_pltfrm_resume_early() */
+   pm_runtime_get_noresume(&hba->sdev_ufs_device->sdev_gendev);
+   scsi_autopm_put_device(hba->sdev_ufs_device);
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(ufshcd_pltfrm_prepare);
+
+int ufshcd_pltfrm_resume_early(struct device *dev)
+{
+   struct ufs_hba *hba = dev_get_drvdata(dev);
+
+   pm_runtime_put_noidle(&hba->sdev_ufs_device->sdev_gendev);
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(ufshcd_pltfrm_resume_early);
 #endif /* CONFIG_PM */
 
 void

Re: [RFC PATCH v5 11/19] virtio/vsock: dequeue callback for SOCK_SEQPACKET

2021-02-23 Thread Arseny Krasnov



On 23.02.2021 17:17, Michael S. Tsirkin wrote:
> On Thu, Feb 18, 2021 at 08:39:37AM +0300, Arseny Krasnov wrote:
>> This adds transport callback and its logic for SEQPACKET dequeue.
>> Callback fetches RW packets from rx queue of socket until whole record
>> is copied (if user's buffer is full, user is not woken up). This is done
>> to not stall sender, because if we wake up user and it leaves syscall,
>> nobody will send credit update for rest of record, and sender will wait
>> for next enter of read syscall at receiver's side. So if user buffer is
>> full, we just send credit update and drop data. If during copy SEQ_BEGIN
>> was found (and not all data was copied), copying is restarted by resetting
>> user's iov iterator (previous unfinished data is dropped).
>>
>> Signed-off-by: Arseny Krasnov 
>> ---
>>  include/linux/virtio_vsock.h|  10 +++
>>  include/uapi/linux/virtio_vsock.h   |  16 
>>  net/vmw_vsock/virtio_transport_common.c | 114 
>>  3 files changed, 140 insertions(+)
>>
>> diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
>> index dc636b727179..003d06ae4a85 100644
>> --- a/include/linux/virtio_vsock.h
>> +++ b/include/linux/virtio_vsock.h
>> @@ -36,6 +36,11 @@ struct virtio_vsock_sock {
>>  u32 rx_bytes;
>>  u32 buf_alloc;
>>  struct list_head rx_queue;
>> +
>> +/* For SOCK_SEQPACKET */
>> +u32 user_read_seq_len;
>> +u32 user_read_copied;
>> +u32 curr_rx_msg_cnt;
>
> wrap these in a struct to make it clearer they
> are related?
Ack
>
>>  };
>>  
>>  struct virtio_vsock_pkt {
>> @@ -80,6 +85,11 @@ virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
>> struct msghdr *msg,
>> size_t len, int flags);
>>  
>> +int
>> +virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
>> +   struct msghdr *msg,
>> +   int flags,
>> +   bool *msg_ready);
>>  s64 virtio_transport_stream_has_data(struct vsock_sock *vsk);
>>  s64 virtio_transport_stream_has_space(struct vsock_sock *vsk);
>>  
>> diff --git a/include/uapi/linux/virtio_vsock.h 
>> b/include/uapi/linux/virtio_vsock.h
>> index 1d57ed3d84d2..cf9c165e5cca 100644
>> --- a/include/uapi/linux/virtio_vsock.h
>> +++ b/include/uapi/linux/virtio_vsock.h
>> @@ -63,8 +63,14 @@ struct virtio_vsock_hdr {
>>  __le32  fwd_cnt;
>>  } __attribute__((packed));
>>  
>> +struct virtio_vsock_seq_hdr {
>> +__le32  msg_cnt;
>> +__le32  msg_len;
>> +} __attribute__((packed));
>> +
>>  enum virtio_vsock_type {
>>  VIRTIO_VSOCK_TYPE_STREAM = 1,
>> +VIRTIO_VSOCK_TYPE_SEQPACKET = 2,
>>  };
>>  
>>  enum virtio_vsock_op {
>> @@ -83,6 +89,11 @@ enum virtio_vsock_op {
>>  VIRTIO_VSOCK_OP_CREDIT_UPDATE = 6,
>>  /* Request the peer to send the credit info to us */
>>  VIRTIO_VSOCK_OP_CREDIT_REQUEST = 7,
>> +
>> +/* Record begin for SOCK_SEQPACKET */
>> +VIRTIO_VSOCK_OP_SEQ_BEGIN = 8,
>> +/* Record end for SOCK_SEQPACKET */
>> +VIRTIO_VSOCK_OP_SEQ_END = 9,
>>  };
>>  
>>  /* VIRTIO_VSOCK_OP_SHUTDOWN flags values */
>> @@ -91,4 +102,9 @@ enum virtio_vsock_shutdown {
>>  VIRTIO_VSOCK_SHUTDOWN_SEND = 2,
>>  };
>>  
>> +/* VIRTIO_VSOCK_OP_RW flags values */
>> +enum virtio_vsock_rw {
>> +VIRTIO_VSOCK_RW_EOR = 1,
>> +};
>> +
>>  #endif /* _UAPI_LINUX_VIRTIO_VSOCK_H */
> Probably a good idea to also have a feature bit gating
> this functionality.

IIUC this also requires a qemu patch, because the current implementation
of the vsock device in qemu has no 'set_features' callback. That callback
would handle the guest's write to the feature register by calling the
vhost kernel backend, where the bit would be processed by the host.

I'm not sure that SEQPACKET support needs a feature bit: it is just two
new ops for the virtio vsock protocol, and from the point of view of the
virtio device it is the same as STREAM. Maybe it is needed for the case
where a client tries to connect to a server which doesn't support
SEQPACKET: without the bit the result will be "Connection reset by peer",
while with the bit the client will know that the server doesn't support
it and 'socket(SOCK_SEQPACKET)' will return an error?

>

Re: [PATCH] vdpa/mlx5: set_features should allow reset to zero

2021-02-23 Thread Michael S. Tsirkin

On Tue, Feb 23, 2021 at 11:35:57AM -0800, Si-Wei Liu wrote:
> 
> 
> On 2/23/2021 5:26 AM, Michael S. Tsirkin wrote:
> > On Tue, Feb 23, 2021 at 10:03:57AM +0800, Jason Wang wrote:
> > > > On 2021/2/23 9:12 AM, Si-Wei Liu wrote:
> > > > 
> > > > On 2/21/2021 11:34 PM, Michael S. Tsirkin wrote:
> > > > > On Mon, Feb 22, 2021 at 12:14:17PM +0800, Jason Wang wrote:
> > > > > > > On 2021/2/19 7:54 PM, Si-Wei Liu wrote:
> > > > > > > Commit 452639a64ad8 ("vdpa: make sure set_features is invoked
> > > > > > > for legacy") made an exception for legacy guests to reset
> > > > > > > features to 0, when config space is accessed before features
> > > > > > > are set. We should relieve the verify_min_features() check
> > > > > > > and allow features reset to 0 for this case.
> > > > > > > 
> > > > > > > It's worth noting that not just legacy guests could access
> > > > > > > config space before features are set. For instance, when
> > > > > > > feature VIRTIO_NET_F_MTU is advertised some modern driver
> > > > > > > will try to access and validate the MTU present in the config
> > > > > > > space before virtio features are set.
> > > > > > This looks like a spec violation:
> > > > > > 
> > > > > > "
> > > > > > 
> > > > > > The following driver-read-only field, mtu only exists if
> > > > > > VIRTIO_NET_F_MTU is
> > > > > > set.
> > > > > > This field specifies the maximum MTU for the driver to use.
> > > > > > "
> > > > > > 
> > > > > > Do we really want to workaround this?
> > > > > > 
> > > > > > Thanks
> > > > > And also:
> > > > > 
> > > > > The driver MUST follow this sequence to initialize a device:
> > > > > 1. Reset the device.
> > > > > > 2. Set the ACKNOWLEDGE status bit: the guest OS has noticed the device.
> > > > > 3. Set the DRIVER status bit: the guest OS knows how to drive the
> > > > > device.
> > > > > 4. Read device feature bits, and write the subset of feature bits
> > > > > understood by the OS and driver to the
> > > > > device. During this step the driver MAY read (but MUST NOT write)
> > > > > the device-specific configuration
> > > > > fields to check that it can support the device before accepting it.
> > > > > 5. Set the FEATURES_OK status bit. The driver MUST NOT accept new
> > > > > feature bits after this step.
> > > > > 6. Re-read device status to ensure the FEATURES_OK bit is still set:
> > > > > otherwise, the device does not
> > > > > support our subset of features and the device is unusable.
> > > > > 7. Perform device-specific setup, including discovery of virtqueues
> > > > > for the device, optional per-bus setup,
> > > > > reading and possibly writing the device’s virtio configuration
> > > > > space, and population of virtqueues.
> > > > > 8. Set the DRIVER_OK status bit. At this point the device is “live”.
> > > > > 
> > > > > 
> > > > > > so accessing config space before FEATURES_OK is a spec violation, right?
> > > > It is, but it's not relevant to what this commit tries to address. I
> > > > thought the legacy guest still needs to be supported.
> > > > 
> > > > Having said that, a separate patch has to be posted to fix the guest driver
> > > > issue where this discrepancy is introduced to virtnet_validate() (since
> > > > commit fe36cbe067). But it's not technically related to this patch.
> > > > 
> > > > -Siwei
> > > 
> > > I think it's a bug to read config space in validate, we should move it to
> > > virtnet_probe().
> > > 
> > > Thanks
> > I take it back, reading but not writing seems to be explicitly allowed by
> > the spec.
> > So our way to detect a legacy guest is bogus, need to think what is
> > the best way to handle this.
> Then maybe revert commit fe36cbe067 and friends, and have QEMU detect legacy
> guest? Supposedly only config space write access needs to be guarded before
> setting FEATURES_OK.
> 
> -Siwei

Detecting it isn't enough though, we will need a new ioctl to notify
the kernel that it's a legacy guest. Ugh :(
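
For reference, a minimal sketch of why the "config access before
FEATURES_OK means legacy" heuristic misfires. It only models the
initialization sequence quoted above. The status bit values are the ones
defined by the virtio spec; every sketch_* name is hypothetical and is
not code from vdpa, vhost or QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Device status bits as defined by the virtio specification. */
#define ST_ACKNOWLEDGE	1u
#define ST_DRIVER	2u
#define ST_DRIVER_OK	4u
#define ST_FEATURES_OK	8u

/* Status a spec-following driver has set once it has completed step N (1..8). */
static uint8_t sketch_status_after_step(int step)
{
	uint8_t status = 0;

	assert(step >= 1 && step <= 8);
	if (step >= 2)
		status |= ST_ACKNOWLEDGE;
	if (step >= 3)
		status |= ST_DRIVER;
	if (step >= 5)
		status |= ST_FEATURES_OK;
	if (step >= 8)
		status |= ST_DRIVER_OK;
	return status;
}

/* Step 4 lets the driver read device-specific config while negotiating. */
static bool sketch_config_read_allowed(uint8_t status)
{
	return (status & ST_DRIVER) != 0;
}

/* Writes to config space are only expected from step 7 on. */
static bool sketch_config_write_allowed(uint8_t status)
{
	return (status & ST_FEATURES_OK) != 0;
}

/* The heuristic under discussion: any access before FEATURES_OK is "legacy". */
static bool sketch_looks_legacy(uint8_t status)
{
	return (status & ST_FEATURES_OK) == 0;
}
```

A modern driver reading the MTU during step 4 has status
ACKNOWLEDGE|DRIVER, so the read is allowed by the spec and yet
sketch_looks_legacy() returns true for it. Only a write at that point
would be a reliable signal.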


> > > > > 
> > > > > > > > Rejecting reset to 0
> > > > > > > > prematurely prevents the correct MTU and link status from
> > > > > > > > loading on the very first config space access, causing issues
> > > > > > > > such as the guest showing an inaccurate MTU value or failing
> > > > > > > > to reject an out-of-range MTU.
> > > > > > > 
> > > > > > > Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for
> > > > > > > supported mlx5 devices")
> > > > > > > Signed-off-by: Si-Wei Liu 
> > > > > > > ---
> > > > > > >     drivers/vdpa/mlx5/net/mlx5_vnet.c | 15 +--
> > > > > > >     1 file changed, 1 insertion(+), 14 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > index 7c1f789..540dd67 100644
> > > > > > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > @@ -1490,14 +1490,6 @@ static u64
> > > > > > > mlx5_vdpa_get_features(struct vdpa_device *vdev)
> > > > > > >     return

Re: [PATCH 00/33] Network fs helper library & fscache kiocb API [ver #3]

2021-02-23 Thread Steve French

On Tue, Feb 23, 2021 at 2:28 PM Matthew Wilcox  wrote:
>
> On Mon, Feb 15, 2021 at 11:22:20PM -0600, Steve French wrote:
> > On Mon, Feb 15, 2021 at 8:10 PM Matthew Wilcox  wrote:
> > > The switch from readpages to readahead does help in a couple of corner
> > > cases.  For example, if you have two processes reading the same file at
> > > the same time, one will now block on the other (due to the page lock)
> > > rather than submitting a mess of overlapping and partial reads.
> >
> > Do you have a simple repro example of this we could try (fio, dbench, iozone
> > etc) to get some objective perf data?
>
> I don't.  The problem was noted by the f2fs people, so maybe they have a
> reproducer.
>
> > My biggest worry is making sure that the switch to netfs doesn't degrade
> > performance (which might be a low bar now since current network file copy
> > perf seems to significantly lag at least Windows), and in some easy to
> > understand scenarios want to make sure it actually helps perf.
>
> I had a question about that ... you've mentioned having 4x4MB reads
> outstanding as being the way to get optimum performance.  Is there a
> significant performance difference between 4x4MB, 16x1MB and 64x256kB?
> I'm concerned about having "too large" an I/O on the wire at a given time.
> For example, with a 1Gbps link, you get 250MB/s.  That's a minimum
> latency of 16us for a 4kB page, but 16ms for a 4MB page.
>
> "For very simple tasks, people can perceive latencies down to 2 ms or less"
> (https://danluu.com/input-lag/)
> so going all the way to 4MB I/Os takes us into the perceptible latency
> range, whereas a 256kB I/O is only 1ms.
>
> So could you do some experiments with fio doing direct I/O to see if
> it takes significantly longer to do, say, 1TB of I/O in 4MB chunks vs
> 256kB chunks?  Obviously use threads to keep lots of I/Os outstanding.

That is a good question and it has been months since I have done experiments
with something similar.   Obviously this will vary depending on RDMA or not and
multichannel or not - but assuming the 'normal' low end network configuration -
ie a 1Gbps link and no RDMA or multichannel I could do some more recent
experiments.

In the past what I had noticed was that server performance for simple
workloads like cp or grep increased with network I/O size to a point:
smaller than 256K packet size was bad. Performance improved
significantly from 256K to 512K to 1MB, but only very
slightly from 1MB to 2MB to 4MB and sometimes degraded at 8MB
(IIRC 8MB is the max commonly supported by SMB3 servers),
but this is with only one adapter (no multichannel) and 1Gb adapters.

But in those examples there wasn't a lot of concurrency on the wire.

I did some experiments with increasing the read ahead size
(which causes more than one async read to be issued by cifs.ko
but presumably does still result in some 'dead time')
which seemed to help perf of some sequential read examples
(e.g. grep or cp) to some servers but I didn't try enough variety
of server targets to feel confident about that change, especially
if netfs is coming.

e.g. a change I experimented with was:
 sb->s_bdi->ra_pages = cifs_sb->ctx->rsize / PAGE_SIZE
to
 sb->s_bdi->ra_pages = 2 * cifs_sb->ctx->rsize / PAGE_SIZE

and it did seem to help a little.
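
To put numbers on that change, here is a small sketch of the readahead
window arithmetic. It assumes 4KB pages; SKETCH_PAGE_SIZE and
sketch_ra_pages() are hypothetical names for illustration, not part of
cifs.ko.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4KB pages */

/*
 * Readahead window in pages for a given rsize, scaled by a multiplier.
 * multiplier 1 matches "rsize / PAGE_SIZE", 2 matches "2 * rsize / PAGE_SIZE".
 */
static unsigned long sketch_ra_pages(unsigned long rsize, unsigned long multiplier)
{
	assert(multiplier > 0);
	return multiplier * rsize / SKETCH_PAGE_SIZE;
}
```

With a 4MB rsize this gives 1024 pages for the original setting and 2048
pages for the doubled one, i.e. room for two rsize-sized reads in the
readahead window instead of one.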

I would expect that 8x1MB (ie trying to keep eight 1MB reads in process)
should keep the network mostly busy and not lead to too much dead time
on server, client or network, and is 'good enough' in many read ahead
use cases (at least for non-RDMA, and non-multichannel on a slower
network) to keep the pipe full. I would expect the performance to be
similar to the equivalent using 2MB reads (e.g. 4x2MB) and perhaps
better than 2x4MB.  Below 1MB i/o size on the wire I would expect to
see degradation due to packet processing and task switching overhead.
Would definitely be worth doing more experimentation here.
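
As a rough aid for those experiments, a sketch of the per-I/O wire time
arithmetic from the quoted mail. It ignores protocol overhead and
round trips, and sketch_wire_time_us() is a hypothetical helper. Note
that the quoted figures use 250MB/s, while a 1Gbps link nominally
carries 125MB/s, so the rate is left as a parameter.

```c
#include <assert.h>
#include <stdint.h>

/* Microseconds needed to move io_bytes at bytes_per_sec (integer division, rounds down). */
static uint64_t sketch_wire_time_us(uint64_t io_bytes, uint64_t bytes_per_sec)
{
	assert(bytes_per_sec > 0);
	return io_bytes * 1000000ULL / bytes_per_sec;
}
```

At 250MB/s this gives about 16us for a 4kB page, about 1ms for 256kB and
about 16.8ms for 4MB, matching the quoted estimates. At 125MB/s each of
those doubles, which puts even a 256kB I/O at roughly 2ms.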
-- 
Thanks,

Steve

[PATCH v24 3/4] scsi: ufs: Prepare HPB read for cached sub-region

2021-02-23 Thread Daejun Park

This patch changes the read I/O to the HPB read I/O.

If the logical address of the read I/O belongs to active sub-region, the
HPB driver modifies the read I/O command to HPB read. It modifies the UPIU
command of UFS instead of modifying the existing SCSI command.

In the HPB version 1.0, the maximum read I/O size that can be converted to
HPB read is 4KB.

The dirty map of the active sub-region prevents an incorrect HPB read that
would use a stale physical page number, one that a previous write I/O has
since updated.
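
The dirty-map idea can be sketched in userspace as follows. This is a
simplified illustration, not the driver code: it uses tiny fixed-size
sub-regions and plain bool arrays instead of kernel bitmaps, and all
sk_* names and sizes are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SK_ENTRIES_PER_SRGN 8	/* hypothetical; real sub-regions are far larger */
#define SK_NUM_SRGN 4		/* hypothetical */

struct sk_map {
	bool dirty[SK_NUM_SRGN][SK_ENTRIES_PER_SRGN];
};

static void sk_map_init(struct sk_map *m)
{
	memset(m, 0, sizeof(*m));
}

/* Write path: mark cnt entries dirty, spilling into the following sub-regions. */
static void sk_set_dirty(struct sk_map *m, int srgn_idx, int offset, int cnt)
{
	assert(srgn_idx >= 0 && offset >= 0 && offset < SK_ENTRIES_PER_SRGN);
	while (cnt > 0 && srgn_idx < SK_NUM_SRGN) {
		int room = SK_ENTRIES_PER_SRGN - offset;
		int n = cnt < room ? cnt : room;

		for (int i = 0; i < n; i++)
			m->dirty[srgn_idx][offset + i] = true;
		cnt -= n;
		offset = 0;
		srgn_idx++;
	}
}

/* Read path: true if any entry in the range is dirty, so HPB read must be skipped. */
static bool sk_test_dirty(const struct sk_map *m, int srgn_idx, int offset, int cnt)
{
	assert(srgn_idx >= 0 && offset >= 0 && offset < SK_ENTRIES_PER_SRGN);
	while (cnt > 0 && srgn_idx < SK_NUM_SRGN) {
		int room = SK_ENTRIES_PER_SRGN - offset;
		int n = cnt < room ? cnt : room;

		for (int i = 0; i < n; i++)
			if (m->dirty[srgn_idx][offset + i])
				return true;
		cnt -= n;
		offset = 0;
		srgn_idx++;
	}
	return false;
}
```

A write covering four entries from offset 6 of sub-region 0 dirties two
entries there and two at the start of sub-region 1. A later read that
overlaps any of them falls back to a normal read, while a read of
untouched entries can still be converted to an HPB read.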

Reviewed-by: Can Guo 
Reviewed-by: Bart Van Assche 
Acked-by: Avri Altman 
Tested-by: Bean Huo 
Signed-off-by: Daejun Park 
---
 drivers/scsi/ufs/ufshcd.c |   2 +
 drivers/scsi/ufs/ufshpb.c | 253 +-
 drivers/scsi/ufs/ufshpb.h |   2 +
 3 files changed, 254 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 5852ff44c3cc..851c01a26207 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -2656,6 +2656,8 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd)
 
lrbp->req_abort_skip = false;
 
+   ufshpb_prep(hba, lrbp);
+
ufshcd_comp_scsi_upiu(hba, lrbp);
 
err = ufshcd_map_sg(hba, lrbp);
diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
index 5d7b60c273cc..1a4a9c548ef3 100644
--- a/drivers/scsi/ufs/ufshpb.c
+++ b/drivers/scsi/ufs/ufshpb.c
@@ -46,6 +46,29 @@ static void ufshpb_set_state(struct ufshpb_lu *hpb, int state)
atomic_set(&hpb->hpb_state, state);
 }
 
+static int ufshpb_is_valid_srgn(struct ufshpb_region *rgn,
+   struct ufshpb_subregion *srgn)
+{
+   return rgn->rgn_state != HPB_RGN_INACTIVE &&
+   srgn->srgn_state == HPB_SRGN_VALID;
+}
+
+static bool ufshpb_is_read_cmd(struct scsi_cmnd *cmd)
+{
+   return req_op(cmd->request) == REQ_OP_READ;
+}
+
+static bool ufshpb_is_write_or_discard_cmd(struct scsi_cmnd *cmd)
+{
+   return op_is_write(req_op(cmd->request)) ||
+  op_is_discard(req_op(cmd->request));
+}
+
+static bool ufshpb_is_support_chunk(int transfer_len)
+{
+   return transfer_len <= HPB_MULTI_CHUNK_HIGH;
+}
+
 static bool ufshpb_is_general_lun(int lun)
 {
return lun < UFS_UPIU_MAX_UNIT_NUM_ID;
@@ -80,8 +103,8 @@ static void ufshpb_kick_map_work(struct ufshpb_lu *hpb)
 }
 
 static bool ufshpb_is_hpb_rsp_valid(struct ufs_hba *hba,
-struct ufshcd_lrb *lrbp,
-struct utp_hpb_rsp *rsp_field)
+   struct ufshcd_lrb *lrbp,
+   struct utp_hpb_rsp *rsp_field)
 {
/* Check HPB_UPDATE_ALERT */
if (!(lrbp->ucd_rsp_ptr->header.dword_2 &
@@ -107,6 +130,230 @@ static bool ufshpb_is_hpb_rsp_valid(struct ufs_hba *hba,
return true;
 }
 
+static void ufshpb_set_ppn_dirty(struct ufshpb_lu *hpb, int rgn_idx,
+int srgn_idx, int srgn_offset, int cnt)
+{
+   struct ufshpb_region *rgn;
+   struct ufshpb_subregion *srgn;
+   int set_bit_len;
+   int bitmap_len;
+
+next_srgn:
+   rgn = hpb->rgn_tbl + rgn_idx;
+   srgn = rgn->srgn_tbl + srgn_idx;
+
+   if (likely(!srgn->is_last))
+   bitmap_len = hpb->entries_per_srgn;
+   else
+   bitmap_len = hpb->last_srgn_entries;
+
+   if ((srgn_offset + cnt) > bitmap_len)
+   set_bit_len = bitmap_len - srgn_offset;
+   else
+   set_bit_len = cnt;
+
+   if (rgn->rgn_state != HPB_RGN_INACTIVE &&
+   srgn->srgn_state == HPB_SRGN_VALID)
+   bitmap_set(srgn->mctx->ppn_dirty, srgn_offset, set_bit_len);
+
+   srgn_offset = 0;
+   if (++srgn_idx == hpb->srgns_per_rgn) {
+   srgn_idx = 0;
+   rgn_idx++;
+   }
+
+   cnt -= set_bit_len;
+   if (cnt > 0)
+   goto next_srgn;
+}
+
+static bool ufshpb_test_ppn_dirty(struct ufshpb_lu *hpb, int rgn_idx,
+ int srgn_idx, int srgn_offset, int cnt)
+{
+   struct ufshpb_region *rgn;
+   struct ufshpb_subregion *srgn;
+   int bitmap_len;
+   int bit_len;
+
+next_srgn:
+   rgn = hpb->rgn_tbl + rgn_idx;
+   srgn = rgn->srgn_tbl + srgn_idx;
+
+   if (likely(!srgn->is_last))
+   bitmap_len = hpb->entries_per_srgn;
+   else
+   bitmap_len = hpb->last_srgn_entries;
+
+   if (!ufshpb_is_valid_srgn(rgn, srgn))
+   return true;
+
+   /*
+* If the region state is active, mctx must be allocated.
+* In this case, check whether the region is evicted or
+* mctx allocation failed.
+*/
+   if (unlikely(!srgn->mctx)) {
+   dev_err(&hpb->sdev_ufs_lu->sdev_dev,
+   "no mctx in region %d subregion %d.\n",
+   srgn->rgn_idx, srgn->srgn_idx);
+   return true;
+   }

[PATCH v24 4/4] scsi: ufs: Add HPB 2.0 support

2021-02-23 Thread Daejun Park

This patch adds support for HPB 2.0.

HPB 2.0 supports reads of varying sizes from 4KB to 512KB.
A read of up to 32KB is issued as a single HPB read.
A read of 36KB to 512KB is issued as a combination of a write buffer
command and an HPB read command, which delivers more PPNs.
The write buffer command may not be issued immediately due to busy tags.
To use HPB read more aggressively, the driver can requeue the write buffer
command. The requeue threshold is implemented as a timeout and can be
modified with the requeue_timeout_ms entry in sysfs.

Signed-off-by: Daejun Park 
---
 Documentation/ABI/testing/sysfs-driver-ufs |  35 +-
 drivers/scsi/ufs/ufs-sysfs.c   |   2 +
 drivers/scsi/ufs/ufs.h |   3 +-
 drivers/scsi/ufs/ufshcd.c  |  18 +-
 drivers/scsi/ufs/ufshcd.h  |   7 +
 drivers/scsi/ufs/ufshpb.c  | 616 +++--
 drivers/scsi/ufs/ufshpb.h  |  67 ++-
 7 files changed, 669 insertions(+), 79 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-driver-ufs b/Documentation/ABI/testing/sysfs-driver-ufs
index bf5cb8846de1..0017eaf89cbe 100644
--- a/Documentation/ABI/testing/sysfs-driver-ufs
+++ b/Documentation/ABI/testing/sysfs-driver-ufs
@@ -1253,14 +1253,14 @@ Description:   This entry shows the number of HPB pinned regions assigned to
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/hit_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/hit_cnt
 Date:  February 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of reads that changed to HPB read.
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/miss_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/miss_cnt
 Date:  February 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of reads that cannot be changed to
@@ -1268,7 +1268,7 @@ Description:   This entry shows the number of reads that cannot be changed to
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/rb_noti_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_noti_cnt
 Date:  February 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of response UPIUs that has
@@ -1276,7 +1276,7 @@ Description:   This entry shows the number of response UPIUs that has
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/rb_active_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_active_cnt
 Date:  February 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of active sub-regions recommended by
@@ -1284,7 +1284,7 @@ Description:   This entry shows the number of active sub-regions recommended by
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/rb_inactive_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_inactive_cnt
 Date:  February 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of inactive regions recommended by
@@ -1292,10 +1292,33 @@ Description:   This entry shows the number of inactive regions recommended by
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/map_req_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/map_req_cnt
 Date:  February 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of read buffer commands for
activating sub-regions recommended by response UPIUs.
 
The file is read only.
+
+What:  /sys/class/scsi_device/*/device/hpb_param_sysfs/requeue_timeout_ms
+Date:  February 2021
+Contact:   Daejun Park 
+Description:   This entry shows the requeue timeout threshold for write buffer
+   command in ms. This value can be changed by writing proper integer to
+   this entry.
+
+What:  /sys/bus/platform/drivers/ufshcd/*/attributes/max_data_size_hpb_single_cmd
+Date:  February 2021
+Contact:   Daejun Park 
+Description:   This entry shows the maximum HPB data size for using single HPB
+   command.
+
+   ===  ======
+   00h  4KB
+   01h  8KB
+   02h  12KB
+   ...
+   FFh  1024KB
+   ===  ======
+
+   The file is read only.
diff --git a/drivers/scsi/ufs/ufs-sysfs.c b/drivers/scsi/ufs/ufs-sysfs.c
index 2546e7a1ac4f..00fb519406cf 100644
--- a/drivers/scsi/ufs/ufs-sysfs.c
+++ b/drivers/scsi/ufs/ufs-sysfs.c
@@ -841,6 +841,7 @@ out:
\
 static DEVICE_ATTR_RO(_name)
