Re: [PATCH] drm/fb-helper: only unmap if buffer not null

2021-03-01 Thread Tong Zhang
Hi Thomas,

I think the issue is caused by the following;
please correct me if I'm wrong.

drm_fb_helper_single_fb_probe() can fail with
"Cannot find any crtc or sizes",
in which case fb_helper->funcs->fb_probe is never called
and fb_helper->buffer remains NULL.
A subsequent modprobe -r then ends up calling
drm_client_buffer_vunmap(NULL) in drm_fbdev_cleanup().
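
The failure mode and the proposed guard can be sketched in user space roughly as follows. This is only an illustration under stated assumptions: the sketch_* types and functions are hypothetical stand-ins, not the DRM structures or the real drm_client_buffer_vunmap().

```c
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's DRM structures. */
struct sketch_buffer {
	int mapped;			/* 1 while a mapping exists */
};

struct sketch_fb_helper {
	struct sketch_buffer *buffer;	/* stays NULL if fb_probe never ran */
	int has_shadow;			/* stands in for the shadow-buffer case */
};

/* Like the real unmap helper, this dereferences its argument unconditionally. */
static void sketch_buffer_vunmap(struct sketch_buffer *buf)
{
	buf->mapped = 0;
}

/*
 * Cleanup with the guard from the patch: the unmap is skipped when no
 * buffer was ever created. Returns 1 if an unmap happened, 0 otherwise.
 */
static int sketch_fbdev_cleanup(struct sketch_fb_helper *helper)
{
	if (helper->has_shadow)
		return 0;	/* the shadow buffer would be freed here instead */
	if (helper->buffer) {
		sketch_buffer_vunmap(helper->buffer);
		return 1;
	}
	return 0;		/* fb_probe never ran, nothing to unmap */
}
```

Without the buffer check, the never-probed case would pass NULL straight into the unmap helper, which matches the oops at offset 0x10 in the trace below.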

Best,
- Tong

On Mon, Mar 1, 2021 at 3:26 AM Thomas Zimmermann  wrote:
>
> Hi
>
> On 28.02.21 at 05:46, Tong Zhang wrote:
> > drm_fbdev_cleanup() can be called when fb_helper->buffer is null, hence
> > fb_helper->buffer should be checked before calling
> > drm_client_buffer_vunmap(). This buffer is also checked in
> > drm_client_framebuffer_delete(), so we should also do the same thing for
> > drm_client_buffer_vunmap().
>
> I think a lot of drivers are affected by this problem; probably most of
> the ones that use the generic fbdev code. How did you produce the error?
>
> What I'm more concerned about is why the buffer is NULL. Was there no
> hotplug event? Do you have a display attached?
>
> Best regards
> Thomas
>
>
> >
> > [  199.128742] RIP: 0010:drm_client_buffer_vunmap+0xd/0x20
> > [  199.129031] Code: 43 18 48 8b 53 20 49 89 45 00 49 89 55 08 5b 44 89 e0 
> > 41 5c 41 5d 41 5e 5d
> > c3 0f 1f 00 53 48 89 fb 48 8d 7f 10 e8 73 7d a1 ff <48> 8b 7b 10 48 8d 73 
> > 18 5b e9 75 53 fc ff 0
> > f 1f 44 00 00 48 b8 00
> > [  199.130041] RSP: 0018:888103f3fc88 EFLAGS: 00010282
> > [  199.130329] RAX: 0001 RBX:  RCX: 
> > 8214d46d
> > [  199.130733] RDX: 1079c6b9 RSI: 0246 RDI: 
> > 83ce35c8
> > [  199.131119] RBP: 888103d25458 R08: 0001 R09: 
> > fbfff0791761
> > [  199.131505] R10: 83c8bb07 R11: fbfff0791760 R12: 
> > 
> > [  199.131891] R13: 888103d25468 R14: 888103d25418 R15: 
> > 888103f18120
> > [  199.132277] FS:  7f36fdcbb6a0() GS:88815b40() 
> > knlGS:
> > [  199.132721] CS:  0010 DS:  ES:  CR0: 80050033
> > [  199.133033] CR2: 0010 CR3: 000103d26000 CR4: 
> > 06f0
> > [  199.133420] DR0:  DR1:  DR2: 
> > 
> > [  199.133807] DR3:  DR6: fffe0ff0 DR7: 
> > 0400
> > [  199.134195] Call Trace:
> > [  199.134333]  drm_fbdev_cleanup+0x179/0x1a0
> > [  199.134562]  drm_fbdev_client_unregister+0x2b/0x40
> > [  199.134828]  drm_client_dev_unregister+0xa8/0x180
> > [  199.135088]  drm_dev_unregister+0x61/0x110
> > [  199.135315]  mgag200_pci_remove+0x38/0x52 [mgag200]
> > [  199.135586]  pci_device_remove+0x62/0xe0
> > [  199.135806]  device_release_driver_internal+0x148/0x270
> > [  199.136094]  driver_detach+0x76/0xe0
> > [  199.136294]  bus_remove_driver+0x7e/0x100
> > [  199.136521]  pci_unregister_driver+0x28/0xf0
> > [  199.136759]  __x64_sys_delete_module+0x268/0x300
> > [  199.137016]  ? __ia32_sys_delete_module+0x300/0x300
> > [  199.137285]  ? call_rcu+0x3e4/0x580
> > [  199.137481]  ? fpregs_assert_state_consistent+0x4d/0x60
> > [  199.137767]  ? exit_to_user_mode_prepare+0x2f/0x130
> > [  199.138037]  do_syscall_64+0x33/0x40
> > [  199.138237]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [  199.138517] RIP: 0033:0x7f36fdc3dcf7
> >
> > Signed-off-by: Tong Zhang 
> > ---
> >   drivers/gpu/drm/drm_fb_helper.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/drm_fb_helper.c 
> > b/drivers/gpu/drm/drm_fb_helper.c
> > index b9a616737c0e..f6baa2046124 100644
> > --- a/drivers/gpu/drm/drm_fb_helper.c
> > +++ b/drivers/gpu/drm/drm_fb_helper.c
> > @@ -2048,7 +2048,7 @@ static void drm_fbdev_cleanup(struct drm_fb_helper 
> > *fb_helper)
> >
> >   if (shadow)
> >   vfree(shadow);
> > - else
> > + else if (fb_helper->buffer)
> >   drm_client_buffer_vunmap(fb_helper->buffer);
> >
> >   drm_client_framebuffer_delete(fb_helper->buffer);
> >
>
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
>


Re: [PATCH] PATCH Documentation translations:translate sound/hd-audio/controls to chinese

2021-03-01 Thread huangjianghui
On Mon, Mar 01, 2021 at 02:16:36PM -0700, Jonathan Corbet wrote:
> 
> So you have sent me two versions of this in the last 24 hours with no
> indication of what has changed or why I should prefer one over the
> other.  Always include that information (under the "---" line) when you
> send updated versions.
> 
> It looks like you got a Reviewed-by tag from Alex on the other version,
> but that doesn't appear here; why?
> 
> [...]
> 
Thank you very much for your advice. I sent two emails, and as a
result I lost the Reviewed-by tag from Alex. I am sorry for the
confusion this caused.

[...]
> So you list a bunch of files here, but most of them are not added in
> your patch.  That will, of course, break the docs build.
> 
> 
> What are all of these files?
> 
> Please fix these issues and make sure that the docs build runs correctly
> before resubmitting.
> 
> Thanks,
> 
> jon
> 
> 
In the next patch, I deleted the index entries for the untranslated files.
I also used checkpatch.pl to check for errors and tried building the
htmldocs on my PC.

Thanks,

Huang Jianghui
>From 47c05e4fe540e938fc7a9fa13a2f8698579a4744 Mon Sep 17 00:00:00 2001
From: hjh 
Date: Tue, 2 Mar 2021 10:52:18 +0800
Subject: [PATCH] PATCH Documentation translations:translate
 sound/hd-audio/controls to chinese

Signed-off-by: hjh 
---
 Documentation/translations/zh_CN/index.rst|   1 +
 .../zh_CN/sound/hd-audio/controls.rst | 102 ++
 .../zh_CN/sound/hd-audio/index.rst|  14 +++
 .../translations/zh_CN/sound/index.rst|  22 
 4 files changed, 139 insertions(+)
 create mode 100644 Documentation/translations/zh_CN/sound/hd-audio/controls.rst
 create mode 100644 Documentation/translations/zh_CN/sound/hd-audio/index.rst
 create mode 100644 Documentation/translations/zh_CN/sound/index.rst

diff --git a/Documentation/translations/zh_CN/index.rst b/Documentation/translations/zh_CN/index.rst
index be6f11176200..2767dacfe86d 100644
--- a/Documentation/translations/zh_CN/index.rst
+++ b/Documentation/translations/zh_CN/index.rst
@@ -20,6 +20,7 @@
process/index
filesystems/index
arm64/index
+   sound/index
 
 目录和表格
 --
diff --git a/Documentation/translations/zh_CN/sound/hd-audio/controls.rst b/Documentation/translations/zh_CN/sound/hd-audio/controls.rst
new file mode 100644
index ..54c028ab9a40
--- /dev/null
+++ b/Documentation/translations/zh_CN/sound/hd-audio/controls.rst
@@ -0,0 +1,102 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Chinese translator: Huang Jianghui 
+-
+.. include:: ../../disclaimer-zh_CN.rst
+以下为正文
+-
+==
+高清音频编解码器特定混音器控件
+==
+
+
+此文件解释特定于编解码器的混音器控件.
+
+瑞昱编解码器
+
+
+声道模式
+  这是一个用于更改环绕声道设置的枚举控件,仅在环绕声道打开时显示出现。
+  它给出要使用的通道数:"2ch","4ch","6ch",和"8ch"。根据配置,这还控
+  制多I/O插孔的插孔重分配。
+
+自动静音模式
+  这是一个枚举控件,用于更改耳机和线路输出插孔的自动静音行为。如果内
+  置扬声器、耳机和/或线路输出插孔在机器上可用,则显示该控件。当只有
+  耳机或者线路输出的时候,它给出”禁用“和”启用“状态。当启用后,插孔插
+  入后扬声器会自动静音。
+
+  当耳机和线路输出插孔都存在时,它给出”禁用“、”仅扬声器“和”线路输出+扬
+  声器“。当”仅扬声器“被选择,插入耳机或者线路输出插孔可使扬声器静音,
+  但不会使线路输出静音。当线路输出+扬声器被选择,插入耳机插孔会同时使扬
+  声器和线路输出静音。
+
+
+矽玛特编解码器
+--
+
+模拟环回
+   此控件启用/禁用模拟环回电路。只有在编解码器提示中将”loopback“设置为真
+   时才会出现(见HD-Audio.txt)。请注意,在某些编解码器上,模拟环回和正常
+   PCM播放是独占的,即当此选项打开时,您将听不到任何PCM流。
+
+交换中置/低频
+   交换中置和低频通道顺序,通常情况下,左侧对应中置,右侧对应低频,启动此
+   项后,左边低频,右边中置。
+
+耳机作为线路输出
+   当此控制开启时,将耳机视为线路输出插孔。也就是说,耳机不会自动静音其他
+   线路输出,没有耳机放大器被设置到引脚上。
+
+麦克风插口模式、线路插孔模式等
+   这些枚举控制输入插孔引脚的方向和偏置。根据插孔类型,它可以设置为”麦克风
+   输入“和”线路输入“以确定输入偏置,或者当引脚是环绕声道的多I/O插孔时,它
+   可以设置为”线路输出“。
+
+
+威盛编解码器
+
+
+智能5.1
+   一个枚举控件,用于为环绕输出重新分配多个I/O插孔的任务。当它打开时,相应
+   的输入插孔(通常是线路输入和麦克风输入)被切换为环绕和中央低频输出插孔。
+
+独立耳机
+   启用此枚举控制时,耳机输出从单个流(第三个PCM,如hw:0,2)而不是主流路由。
+   如果耳机DAC与侧边或中央低频通道DAC共享,则DAC将自动切换到耳机。
+
+环回混合
+   一个用于确定是否启动了模拟环回路由的枚举控件。当它启用后,模拟环回路由到
+   前置通道。同样,耳机与扬声器输出也采用相同的路径。作为一个副作用,当设置
+   此模式后,单个音量控制将不再适用于耳机和扬声器,因为只有一个DAC连接到混
+   音器小部件。
+
+动态电源控制
+   此控件决定是否启动每个插孔的动态电源控制检测。启用时,根据插孔的插入情况
+   动态更改组件的电源状态(D0/D3)以节省电量消耗。但是,如果您的系统没有提
+   供正确的插孔检测,这将无法工作;在这种情况下,请关闭此控件。
+
+插孔检测
+   此控件仅为VT1708编解码器提供,它不会为每个插孔插拔提供适当的未请求事件。
+   当此控件打开,驱动将轮询插孔检测,以便耳机自动静音可以工作,而关闭此控
+   件将降低功耗。
+
+
+科胜讯编解码器
+--
+
+自动静音模式
+   见瑞昱解码器
+
+
+
+模拟编解码器
+
+
+通道模式
+   这是一个用于更改环绕声道设置的枚举控件,仅在环绕声道可用时显示。它提供了能
+   被使用的通道数:”2ch“、”4ch“和”6ch“。根据配置,这还控制多I/O插孔的插孔重
+   分配。
+
+独立耳机
+   启动此枚举控制后,耳机输出从单个流(第三个PCM,如hw:0,2)而不是主流路由。
diff --git a/Documentation/translations/zh_CN/sound/hd-audio/index.rst b/Documentation/translations/zh_CN/sound/hd-audio/index.rst
new file mode 100644
index ..d9885d53b069
--- /dev/null
+++ b/Documentation/translations/zh_CN/sound/hd-audio/index.rst
@@ -0,0 +1,14 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: ../../disclaimer-zh_CN.rst
+
+:Original: :doc:`../../../../sound/hd-audio/index`
+:Translator: Huang Jianghui 
+
+
+高清音频
+
+
+.. toctree::
+   

Re: Question about the "EXPERIMENTAL" tag for dax in XFS

2021-03-01 Thread Darrick J. Wong
On Mon, Mar 01, 2021 at 12:55:53PM -0800, Dan Williams wrote:
> On Sun, Feb 28, 2021 at 2:39 PM Dave Chinner  wrote:
> >
> > On Sat, Feb 27, 2021 at 03:40:24PM -0800, Dan Williams wrote:
> > > On Sat, Feb 27, 2021 at 2:36 PM Dave Chinner  wrote:
> > > > On Fri, Feb 26, 2021 at 02:41:34PM -0800, Dan Williams wrote:
> > > > > On Fri, Feb 26, 2021 at 1:28 PM Dave Chinner  
> > > > > wrote:
> > > > > > On Fri, Feb 26, 2021 at 12:59:53PM -0800, Dan Williams wrote:
> > > > it points to, check if it points to the PMEM that is being removed,
> > > > grab the page it points to, map that to the relevant struct page,
> > > > run collect_procs() on that page, then kill the user processes that
> > > > map that page.
> > > >
> > > > So why can't we walk the ptes, check the physical pages that they
> > > > map to, and if they map to a pmem page, go poison that
> > > > page, which kills any user process that maps it?
> > > >
> > > > i.e. I can't see how unexpected pmem device unplug is any different
> > > > to an MCE delivering a hwpoison event to a DAX mapped page.
> > >
> > > I guess the tradeoff is walking a long list of inodes vs walking a
> > > large array of pages.
> >
> > Not really. You're assuming all a filesystem has to do is invalidate
> > everything if a device goes away, and that's not true. Finding if an
> > inode has a mapping that spans a specific device in a multi-device
> > filesystem can be a lot more complex than that. Just walking inodes
> > is easy - determining which inodes need invalidation is the hard
> > part.
> 
> That inode-to-device level of specificity is not needed for the same
> reason that drop_caches does not need to be specific. If the wrong
> page is unmapped a re-fault will bring it back, and re-fault will fail
> for the pages that are successfully removed.
> 
> > That's where ->corrupt_range() comes in - the filesystem is already
> > set up to do reverse mapping from physical range to inode(s)
> > offsets...
> 
> Sure, but what is the need to get to that level of specificity with
> the filesystem for something that should rarely happen in the course
> of normal operation outside of a mistake?

I can't tell if we're conflating the "a bunch of your pmem went bad"
case with the "all your dimms fell out of the machine" case.

If, say, a single cacheline's worth of pmem goes bad on a node with 2TB
of pmem, I certainly want that level of specificity.  Just notify the
users of the dead piece, don't flush the whole machine down the drain.

> > > There's likely always more pages than inodes, but perhaps it's more
> > > efficient to walk the 'struct page' array than sb->s_inodes?
> >
> > I really don't see why you seem to be telling us that invalidation is an
> > either/or choice. There's more ways to convert physical block
> > address -> inode file offset and mapping index than brute force
> > inode cache walks
> 
> Yes, but I was trying to map it to an existing mechanism and the
> internals of drop_pagecache_sb() are, in coarse terms, close to what
> needs to happen here.

Yes.  XFS (with rmap enabled) can do all the iteration and walking in
that function except for the invalidate_mapping_* call itself.  The goal
of this series is first to wire up a callback within both the block and
pmem subsystems so that they can take notifications and reverse-map them
through the storage stack until they reach an fs superblock.

Once the information has reached XFS, it can use its own reverse
mappings to figure out which pages of which inodes are now targeted.
The future of DAX hw error handling can be that you throw the spitwad at
us, and it's our problem to distill that into mm invalidation calls.
XFS' reverse mapping data is indexed by storage location and isn't
sharded by address_space, so (except for the DIMMs falling out), we
don't need to walk the entire inode list or scan the entire mapping.
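
To make the "indexed by storage location" point concrete, here is a rough user-space sketch of a reverse-map lookup. It is not XFS code: the sketch_* record layout, the sorted-array storage, and the function name are all assumptions chosen for illustration. Given a bad physical range, it finds only the (inode, file offset) extents that overlap it, without visiting unrelated inodes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reverse-mapping record: physical extent -> owner. */
struct sketch_rmap_rec {
	uint64_t phys_start;	/* start of the extent on the device */
	uint64_t len;		/* extent length in bytes */
	uint64_t inode;		/* owning inode number */
	uint64_t file_off;	/* file offset that phys_start maps to */
};

/* One affected file range reported back to the caller. */
struct sketch_hit {
	uint64_t inode;
	uint64_t file_off;
	uint64_t len;
};

/*
 * recs must be sorted by phys_start and must not overlap. Fills hits[]
 * with the file ranges touched by [bad_start, bad_start + bad_len) and
 * returns how many were found (at most cap).
 */
static size_t sketch_rmap_query(const struct sketch_rmap_rec *recs, size_t n,
				uint64_t bad_start, uint64_t bad_len,
				struct sketch_hit *hits, size_t cap)
{
	uint64_t bad_end = bad_start + bad_len;
	size_t lo = 0, hi = n, found = 0;

	/* Binary search for the first record that ends after bad_start. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (recs[mid].phys_start + recs[mid].len <= bad_start)
			lo = mid + 1;
		else
			hi = mid;
	}

	/* Walk forward only while records still overlap the bad range. */
	for (size_t i = lo; i < n && recs[i].phys_start < bad_end &&
			    found < cap; i++) {
		uint64_t rec_end = recs[i].phys_start + recs[i].len;
		uint64_t s = recs[i].phys_start > bad_start ?
			     recs[i].phys_start : bad_start;
		uint64_t e = rec_end < bad_end ? rec_end : bad_end;

		hits[found].inode = recs[i].inode;
		hits[found].file_off = recs[i].file_off +
				       (s - recs[i].phys_start);
		hits[found].len = e - s;
		found++;
	}
	return found;
}
```

With a single bad cacheline, a lookup of this shape touches one record; only the "all DIMMs fell out" case degenerates into visiting everything.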

Between XFS and DAX and mm, the mm already has the invalidation calls,
xfs already has the distiller, and so all we need is that first bit.
The current mm code doesn't fully solve the problem, nor does it need
to, since it handles DRAM errors acceptably* already.

* Actually, the hwpoison code should _also_ be calling ->corrupted_range
when DRAM goes bad so that we can detect metadata failures and either
reload the buffer or (if it was dirty) shut down.

> >
> > .
> >
> > > > IOWs, what needs to happen at this point is very filesystem
> > > > specific. Assuming that "device unplug == filesystem dead" is not
> > > > correct, nor is specifying a generic action that assumes the
> > > > filesystem is dead because a device it is using went away.
> > >
> > > Ok, I think I set this discussion in the wrong direction implying any
> > > mapping of this action to a "filesystem dead" event. It's just a "zap
> > > all ptes" event and upper layers recover from there.
> >
> > Yes, that's exactly what ->corrupt_range() is intended for. It
> > allows the filesystem to lock out access to the bad range
> > and then 

Re: [x86, build] 6dafca9780: WARNING:at_arch/x86/kernel/ftrace.c:#ftrace_verify_code

2021-03-01 Thread Sami Tolvanen
On Mon, Mar 1, 2021 at 6:47 PM Steven Rostedt  wrote:
>
> On Mon, 1 Mar 2021 18:40:32 -0800
> Sami Tolvanen  wrote:
>
> > On Mon, Mar 01, 2021 at 08:15:26PM -0500, Steven Rostedt wrote:
> > > On Mon, 1 Mar 2021 16:03:51 -0800
> > > Sami Tolvanen  wrote:
> > > >
> > > > > ret = ftrace_verify_code(rec->ip, old);
> > > > > +
> > > > > +   if (__is_defined(CC_USING_NOP_MCOUNT) && ret && 
> > > > > old_nop) {
> > > > > +   /* Compiler could have put in P6_NOP5 */
> > > > > +   old = P6_NOP5;
> > > > > +   ret = ftrace_verify_code(rec->ip, old);
> > > > > +   }
> > > > > +
> > > >
> > > > Wouldn't that still hit WARN(1) in the initial ftrace_verify_code()
> > > > call if ideal_nops doesn't match?
> > >
> > > That was too quickly written ;-)
> > >
> > > Take 2:
> > >
> > > [ with fixes for setting p6_nop ]
> >
> > Thanks, I tested this with the config from the build bot, and I can
> > confirm that it fixes the issue for me.
> >
> > I also tested a quick patch to disable the __fentry__ conversion in
> > objtool, and it seems to work too, but it's probably a good idea to
> > fix the issue with CC_USING_NOP_MCOUNT in any case.
>
> Thanks for testing, I'll make this into a proper patch and start
> testing it internally. I'm assuming you want this to go into the -rc
> release and possibly stable?

Sounds good, thank you. Yes, getting the fix to -rc would be great.
I'm not sure if it's necessary in -stable though, the objtool patch is
only in -rc1.
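
For readers following the verification question quoted above, the idea of accepting either nop before warning can be sketched in user space as below. This is not the eventual patch: the sketch_* names and the warning counter are hypothetical, and the "ideal" nop bytes are an illustrative stand-in for whatever the running CPU prefers. Only the P6_NOP5 byte sequence is taken as given.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_INSN_SIZE 5

/* P6_NOP5 byte sequence. */
static const unsigned char sketch_p6_nop5[SKETCH_INSN_SIZE] = {
	0x0f, 0x1f, 0x44, 0x00, 0x00
};

/* Illustrative stand-in for the nop the running CPU would prefer. */
static const unsigned char sketch_ideal_nop5[SKETCH_INSN_SIZE] = {
	0x66, 0x66, 0x66, 0x66, 0x90
};

/* Counts how often verification failed for every candidate. */
static int sketch_warnings;

/*
 * Accept the call site if it holds either nop. The warning is raised
 * only after all candidates have been tried, so a compiler-emitted
 * P6_NOP5 does not trip it on the first comparison.
 */
static int sketch_verify_nop(const unsigned char *code)
{
	if (memcmp(code, sketch_ideal_nop5, SKETCH_INSN_SIZE) == 0)
		return 0;
	if (memcmp(code, sketch_p6_nop5, SKETCH_INSN_SIZE) == 0)
		return 0;
	sketch_warnings++;
	return -1;
}
```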

Sami


[PATCH v9 2/2] ufs: sysfs: Resume the proper scsi device

2021-03-01 Thread Asutosh Das
Resume the actual scsi device whose unit descriptor is being
accessed, instead of the hba alone.

Signed-off-by: Asutosh Das 
---
 drivers/scsi/ufs/ufs-sysfs.c | 26 +++---
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/ufs/ufs-sysfs.c b/drivers/scsi/ufs/ufs-sysfs.c
index acc54f5..34481e3 100644
--- a/drivers/scsi/ufs/ufs-sysfs.c
+++ b/drivers/scsi/ufs/ufs-sysfs.c
@@ -297,10 +297,10 @@ static ssize_t ufs_sysfs_read_desc_param(struct ufs_hba 
*hba,
goto out;
}
 
-   pm_runtime_get_sync(hba->dev);
+   scsi_autopm_get_device(hba->sdev_ufs_device);
ret = ufshcd_read_desc_param(hba, desc_id, desc_index,
param_offset, desc_buf, param_size);
-   pm_runtime_put_sync(hba->dev);
+   scsi_autopm_put_device(hba->sdev_ufs_device);
if (ret) {
ret = -EINVAL;
goto out;
@@ -678,7 +678,7 @@ static ssize_t _name##_show(struct device *dev, 
\
up(&hba->host_sem); \
return -ENOMEM; \
}   \
-   pm_runtime_get_sync(hba->dev);  \
+   scsi_autopm_get_device(hba->sdev_ufs_device);   \
ret = ufshcd_query_descriptor_retry(hba,\
UPIU_QUERY_OPCODE_READ_DESC, QUERY_DESC_IDN_DEVICE, \
0, 0, desc_buf, _len); \
@@ -695,7 +695,7 @@ static ssize_t _name##_show(struct device *dev, 
\
goto out;   \
ret = sysfs_emit(buf, "%s\n", desc_buf);\
 out:   \
-   pm_runtime_put_sync(hba->dev);  \
+   scsi_autopm_put_device(hba->sdev_ufs_device);   \
kfree(desc_buf);\
up(&hba->host_sem); \
return ret; \
@@ -744,10 +744,10 @@ static ssize_t _name##_show(struct device *dev,   
\
}   \
if (ufshcd_is_wb_flags(QUERY_FLAG_IDN##_uname)) \
index = ufshcd_wb_get_query_index(hba); \
-   pm_runtime_get_sync(hba->dev);  \
+   scsi_autopm_get_device(hba->sdev_ufs_device);   \
ret = ufshcd_query_flag(hba, UPIU_QUERY_OPCODE_READ_FLAG,   \
QUERY_FLAG_IDN##_uname, index, );  \
-   pm_runtime_put_sync(hba->dev);  \
+   scsi_autopm_put_device(hba->sdev_ufs_device);   \
if (ret) {  \
ret = -EINVAL;  \
goto out;   \
@@ -813,10 +813,10 @@ static ssize_t _name##_show(struct device *dev,   
\
}   \
if (ufshcd_is_wb_attrs(QUERY_ATTR_IDN##_uname)) \
index = ufshcd_wb_get_query_index(hba); \
-   pm_runtime_get_sync(hba->dev);  \
+   scsi_autopm_get_device(hba->sdev_ufs_device);   \
ret = ufshcd_query_attr(hba, UPIU_QUERY_OPCODE_READ_ATTR,   \
QUERY_ATTR_IDN##_uname, index, 0, );  \
-   pm_runtime_put_sync(hba->dev);  \
+   scsi_autopm_put_device(hba->sdev_ufs_device);   \
if (ret) {  \
ret = -EINVAL;  \
goto out;   \
@@ -899,11 +899,15 @@ static ssize_t _pname##_show(struct device *dev,  
\
struct scsi_device *sdev = to_scsi_device(dev); \
struct ufs_hba *hba = shost_priv(sdev->host);   \
u8 lun = ufshcd_scsi_to_upiu_lun(sdev->lun);\
+   int ret;\
if (!ufs_is_valid_unit_desc_lun(&hba->dev_info, lun,\
_duname##_DESC_PARAM##_puname)) \
return -EINVAL; \
-   return ufs_sysfs_read_desc_param(hba, QUERY_DESC_IDN_##_duname, \
+   scsi_autopm_get_device(sdev); 

[PATCH v9 1/2] scsi: ufs: Enable power management for wlun

2021-03-01 Thread Asutosh Das
During runtime-suspend of the ufs host, the scsi devices are
already suspended, and so are the queues associated with them.
But the ufs host sends SSU to the wlun during its runtime-suspend.
In that path, blk_queue_enter() checks whether the queue is
suspended. If it is, it waits for the queue to resume and never
returns.
The commit
(d55d15a33: scsi: block: Do not accept any requests while suspended)
added this suspended-state check to blk_queue_enter().

Call trace:
 __switch_to+0x174/0x2c4
 __schedule+0x478/0x764
 schedule+0x9c/0xe0
 blk_queue_enter+0x158/0x228
 blk_mq_alloc_request+0x40/0xa4
 blk_get_request+0x2c/0x70
 __scsi_execute+0x60/0x1c4
 ufshcd_set_dev_pwr_mode+0x124/0x1e4
 ufshcd_suspend+0x208/0x83c
 ufshcd_runtime_suspend+0x40/0x154
 ufshcd_pltfrm_runtime_suspend+0x14/0x20
 pm_generic_runtime_suspend+0x28/0x3c
 __rpm_callback+0x80/0x2a4
 rpm_suspend+0x308/0x614
 rpm_idle+0x158/0x228
 pm_runtime_work+0x84/0xac
 process_one_work+0x1f0/0x470
 worker_thread+0x26c/0x4c8
 kthread+0x13c/0x320
 ret_from_fork+0x10/0x18

Fix this by registering the ufs device wlun as a scsi driver and
registering it for block runtime-pm. Also make it a
supplier for all other luns. That way, the device wlun
suspends after all its consumers and resumes after the
hba resumes.
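
The ordering described here can be illustrated with a small user-space model. It is a sketch under simplifying assumptions, not the runtime-PM core: struct sketch_dev and both functions are hypothetical, the hba is modelled as just another supplier in the chain, and -1 stands in for a "busy" error.

```c
#include <stddef.h>

/* Hypothetical device with one optional supplier. */
struct sketch_dev {
	int active;			/* 1 while resumed */
	int active_consumers;		/* resumed devices depending on us */
	struct sketch_dev *supplier;	/* must be active before we are */
};

/* Resume walks up the chain first, so suppliers come up before consumers. */
static void sketch_resume(struct sketch_dev *dev)
{
	if (dev->active)
		return;
	if (dev->supplier) {
		sketch_resume(dev->supplier);
		dev->supplier->active_consumers++;
	}
	dev->active = 1;
}

/*
 * Suspend is refused while any consumer is still active, so a supplier
 * can only go down after everything that depends on it.
 * Returns 0 on success, -1 if the device is still in use.
 */
static int sketch_suspend(struct sketch_dev *dev)
{
	if (!dev->active)
		return 0;
	if (dev->active_consumers > 0)
		return -1;
	dev->active = 0;
	if (dev->supplier)
		dev->supplier->active_consumers--;
	return 0;
}
```

With an hba -> wlun -> lun chain, resuming the lun brings up the hba and the wlun first, and the wlun cannot suspend until the lun has.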

Co-developed-by: Can Guo 
Signed-off-by: Can Guo 
Signed-off-by: Asutosh Das 
---
 drivers/scsi/ufs/cdns-pltfrm.c |   2 +
 drivers/scsi/ufs/tc-dwc-g210-pci.c |   2 +
 drivers/scsi/ufs/ufs-exynos.c  |   2 +
 drivers/scsi/ufs/ufs-hisi.c|   2 +
 drivers/scsi/ufs/ufs-mediatek.c|   2 +
 drivers/scsi/ufs/ufs-qcom.c|   2 +
 drivers/scsi/ufs/ufshcd-pci.c  |  32 +-
 drivers/scsi/ufs/ufshcd.c  | 609 +++--
 drivers/scsi/ufs/ufshcd.h  |   7 +
 include/trace/events/ufs.h |  20 ++
 10 files changed, 493 insertions(+), 187 deletions(-)

diff --git a/drivers/scsi/ufs/cdns-pltfrm.c b/drivers/scsi/ufs/cdns-pltfrm.c
index 149391f..3e70c23 100644
--- a/drivers/scsi/ufs/cdns-pltfrm.c
+++ b/drivers/scsi/ufs/cdns-pltfrm.c
@@ -319,6 +319,8 @@ static const struct dev_pm_ops cdns_ufs_dev_pm_ops = {
.runtime_suspend = ufshcd_pltfrm_runtime_suspend,
.runtime_resume  = ufshcd_pltfrm_runtime_resume,
.runtime_idle= ufshcd_pltfrm_runtime_idle,
+   .prepare = ufshcd_suspend_prepare,
+   .complete   = ufshcd_resume_complete,
 };
 
 static struct platform_driver cdns_ufs_pltfrm_driver = {
diff --git a/drivers/scsi/ufs/tc-dwc-g210-pci.c 
b/drivers/scsi/ufs/tc-dwc-g210-pci.c
index 67a6a61..b01db12 100644
--- a/drivers/scsi/ufs/tc-dwc-g210-pci.c
+++ b/drivers/scsi/ufs/tc-dwc-g210-pci.c
@@ -148,6 +148,8 @@ static const struct dev_pm_ops tc_dwc_g210_pci_pm_ops = {
.runtime_suspend = tc_dwc_g210_pci_runtime_suspend,
.runtime_resume  = tc_dwc_g210_pci_runtime_resume,
.runtime_idle= tc_dwc_g210_pci_runtime_idle,
+   .prepare = ufshcd_suspend_prepare,
+   .complete   = ufshcd_resume_complete,
 };
 
 static const struct pci_device_id tc_dwc_g210_pci_tbl[] = {
diff --git a/drivers/scsi/ufs/ufs-exynos.c b/drivers/scsi/ufs/ufs-exynos.c
index 267943a1..45c0b02 100644
--- a/drivers/scsi/ufs/ufs-exynos.c
+++ b/drivers/scsi/ufs/ufs-exynos.c
@@ -1268,6 +1268,8 @@ static const struct dev_pm_ops exynos_ufs_pm_ops = {
.runtime_suspend = ufshcd_pltfrm_runtime_suspend,
.runtime_resume  = ufshcd_pltfrm_runtime_resume,
.runtime_idle= ufshcd_pltfrm_runtime_idle,
+   .prepare = ufshcd_suspend_prepare,
+   .complete   = ufshcd_resume_complete,
 };
 
 static struct platform_driver exynos_ufs_pltform = {
diff --git a/drivers/scsi/ufs/ufs-hisi.c b/drivers/scsi/ufs/ufs-hisi.c
index 0aa5813..d463b44 100644
--- a/drivers/scsi/ufs/ufs-hisi.c
+++ b/drivers/scsi/ufs/ufs-hisi.c
@@ -574,6 +574,8 @@ static const struct dev_pm_ops ufs_hisi_pm_ops = {
.runtime_suspend = ufshcd_pltfrm_runtime_suspend,
.runtime_resume  = ufshcd_pltfrm_runtime_resume,
.runtime_idle= ufshcd_pltfrm_runtime_idle,
+   .prepare = ufshcd_suspend_prepare,
+   .complete   = ufshcd_resume_complete,
 };
 
 static struct platform_driver ufs_hisi_pltform = {
diff --git a/drivers/scsi/ufs/ufs-mediatek.c b/drivers/scsi/ufs/ufs-mediatek.c
index c55202b..df1eabb 100644
--- a/drivers/scsi/ufs/ufs-mediatek.c
+++ b/drivers/scsi/ufs/ufs-mediatek.c
@@ -1097,6 +1097,8 @@ static const struct dev_pm_ops ufs_mtk_pm_ops = {
.runtime_suspend = ufshcd_pltfrm_runtime_suspend,
.runtime_resume  = ufshcd_pltfrm_runtime_resume,
.runtime_idle= ufshcd_pltfrm_runtime_idle,
+   .prepare = ufshcd_suspend_prepare,
+   .complete   = ufshcd_resume_complete,
 };
 
 static struct platform_driver ufs_mtk_pltform = {
diff --git a/drivers/scsi/ufs/ufs-qcom.c b/drivers/scsi/ufs/ufs-qcom.c
index f97d7b0..9aa098a 100644
--- 

Re: [External] Re: [PATCH 2/5] mm: memcontrol: make page_memcg{_rcu} only applicable for non-kmem page

2021-03-01 Thread Muchun Song
On Tue, Mar 2, 2021 at 2:11 AM Shakeel Butt  wrote:
>
> On Sun, Feb 28, 2021 at 10:25 PM Muchun Song  wrote:
> >
> > We want to reuse the obj_cgroup APIs to reparent the kmem pages when
> > the memcg is offlined. If we do this, we should store an object cgroup
> > pointer to page->memcg_data for the kmem pages.
> >
> > Finally, page->memcg_data can have 3 different meanings.
> >
> >   1) For the slab pages, page->memcg_data points to an object cgroups
> >  vector.
> >
> >   2) For the kmem pages (exclude the slab pages), page->memcg_data
> >  points to an object cgroup.
> >
> >   3) For the user pages (e.g. the LRU pages), page->memcg_data points
> >  to a memory cgroup.
> >
> > Currently we always get the memcg associated with a page via page_memcg
> > or page_memcg_rcu. page_memcg_check is special, it has to be used in
> > cases when it's not known if a page has an associated memory cgroup
> > pointer or an object cgroups vector. Because the page->memcg_data of
> > the kmem page is not pointing to a memory cgroup in the later patch,
> > the page_memcg and page_memcg_rcu cannot be applicable for the kmem
> > pages. In this patch, we introduce page_memcg_kmem to get the memcg
> > associated with the kmem pages. And make page_memcg and page_memcg_rcu
> > no longer apply to the kmem pages.
> >
> > In the end, there are 4 helpers to get the memcg associated with a
> > page. The usage is as follows.
> >
> >   1) Get the memory cgroup associated with a non-kmem page (e.g. the LRU
> >  pages).
> >
> >  - page_memcg()
> >  - page_memcg_rcu()
>
> Can you rename these to page_memcg_lru[_rcu] to make them explicitly
> for LRU pages?

Yes. Will do. Thanks.

>
> >
> >   2) Get the memory cgroup associated with a kmem page (exclude the slab
> >  pages).
> >
> >  - page_memcg_kmem()
> >
> >   3) Get the memory cgroup associated with a page. It has to be used in
> >  cases when it's not known if a page has an associated memory cgroup
> >  pointer or an object cgroups vector. Returns NULL for slab pages or
> >  uncharged pages, otherwise, returns memory cgroup for charged pages
> >  (e.g. kmem pages, LRU pages).
> >
> >  - page_memcg_check()
> >
> > In some place, we use page_memcg to check whether the page is charged.
> > Now we introduce page_memcg_charged helper to do this.
> >
> > This is a preparation for reparenting the kmem pages. To support reparent
> > kmem pages, we just need to adjust page_memcg_kmem and page_memcg_check in
> > the later patch.
> >
> > Signed-off-by: Muchun Song 
> > ---
> [snip]
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -855,10 +855,11 @@ void __mod_lruvec_page_state(struct page *page, enum 
> > node_stat_item idx,
> >  int val)
> >  {
> > struct page *head = compound_head(page); /* rmap on tail pages */
> > -   struct mem_cgroup *memcg = page_memcg(head);
> > +   struct mem_cgroup *memcg;
> > pg_data_t *pgdat = page_pgdat(page);
> > struct lruvec *lruvec;
> >
> > +   memcg = PageMemcgKmem(head) ? page_memcg_kmem(head) : 
> > page_memcg(head);
>
> Should page_memcg_check() be used here?

Yeah, page_memcg_check() could be used here.
But page_memcg_check() does a READ_ONCE() internally,
which we do not need here, so I use page_memcg
or page_memcg_kmem directly. Thanks.
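
For anyone skimming the thread, the three meanings of page->memcg_data and the helper dispatch can be modelled in user space roughly like this. It is a sketch only: the sketch_* names, the flag values, and the "non-zero means charged" rule are assumptions for illustration and are not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical low-bit tags stored alongside the pointer. */
#define SKETCH_DATA_OBJCGS	0x1UL	/* slab page: object cgroups vector */
#define SKETCH_DATA_KMEM	0x2UL	/* kmem page: single object cgroup */
#define SKETCH_DATA_MASK	0x3UL

struct sketch_memcg { int id; };
struct sketch_objcg { struct sketch_memcg *memcg; };
struct sketch_page { uintptr_t memcg_data; };

/* Assumed rule: any non-zero memcg_data means the page is charged. */
static int sketch_page_memcg_charged(const struct sketch_page *page)
{
	return page->memcg_data != 0;
}

/* For user (e.g. LRU) pages: the data is a memory cgroup pointer. */
static struct sketch_memcg *sketch_page_memcg(const struct sketch_page *page)
{
	return (struct sketch_memcg *)(page->memcg_data & ~SKETCH_DATA_MASK);
}

/* For non-slab kmem pages: go through the object cgroup. */
static struct sketch_memcg *
sketch_page_memcg_kmem(const struct sketch_page *page)
{
	struct sketch_objcg *objcg =
		(struct sketch_objcg *)(page->memcg_data & ~SKETCH_DATA_MASK);

	return objcg ? objcg->memcg : NULL;
}

/* For callers that do not know the page type: NULL for slab or uncharged. */
static struct sketch_memcg *
sketch_page_memcg_check(const struct sketch_page *page)
{
	if (page->memcg_data & SKETCH_DATA_OBJCGS)
		return NULL;
	if (page->memcg_data & SKETCH_DATA_KMEM)
		return sketch_page_memcg_kmem(page);
	return sketch_page_memcg(page);
}
```

The open-coded PageMemcgKmem(head) ? page_memcg_kmem(head) : page_memcg(head) in the patch corresponds to the last two branches of the check helper in this model.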

>
> > /* Untracked pages have no memcg, no lruvec. Update only the node */
> > if (!memcg) {
> > __mod_node_page_state(pgdat, idx, val);
> > @@ -3170,12 +3171,13 @@ int __memcg_kmem_charge_page(struct page *page, 
> > gfp_t gfp, int order)
> >   */
> >  void __memcg_kmem_uncharge_page(struct page *page, int order)
> >  {
> > -   struct mem_cgroup *memcg = page_memcg(page);
> > +   struct mem_cgroup *memcg;
> > unsigned int nr_pages = 1 << order;
> >
> > -   if (!memcg)
> > +   if (!page_memcg_charged(page))
> > return;
> >
> > +   memcg = page_memcg_kmem(page);
> > VM_BUG_ON_PAGE(mem_cgroup_is_root(memcg), page);
> > __memcg_kmem_uncharge(memcg, nr_pages);
> > page->memcg_data = 0;
> > @@ -6831,24 +6833,25 @@ static void uncharge_batch(const struct 
> > uncharge_gather *ug)
> >  static void uncharge_page(struct page *page, struct uncharge_gather *ug)
> >  {
> > unsigned long nr_pages;
> > +   struct mem_cgroup *memcg;
> >
> > VM_BUG_ON_PAGE(PageLRU(page), page);
> >
> > -   if (!page_memcg(page))
> > +   if (!page_memcg_charged(page))
> > return;
> >
> > /*
> >  * Nobody should be changing or seriously looking at
> > -* page_memcg(page) at this point, we have fully
> > -* exclusive access to the page.
> > +* page memcg at this point, we have fully exclusive
> > +* access to the page.
> >  */
> > -
> > -   if (ug->memcg != page_memcg(page)) {
> > +   memcg = 

Re: [PATCH V8 3/4] of: unittest: Create overlay_common.dtsi and testcases_common.dtsi

2021-03-01 Thread Frank Rowand
Hi Viresh,

I commented on this patch in v7 after you had created v8.  You acknowledged
that comment and said that it will be corrected.

-Frank

On 2/12/21 5:18 AM, Viresh Kumar wrote:
> In order to build-test the same unit-test files using fdtoverlay tool,
> move the device nodes from the existing overlay_base.dts and
> testcases_common.dts files to .dtsi counterparts. The .dts files now
> include the new .dtsi files, resulting in exactly the same behavior as
> earlier.
> 
> The .dtsi files can now be reused for compile time tests using
> fdtoverlay (will be done by a later commit).
> 
> This is required because the base files passed to fdtoverlay tool
> shouldn't be overlays themselves (i.e. shouldn't have the /plugin/;
> tag).
> 
> Note that this commit also moves "testcase-device2" node to
> testcases.dts from tests-interrupts.dtsi, as this node has a deliberate
> error in it and is only relevant for runtime testing done with
> unittest.c.
> 
> Signed-off-by: Viresh Kumar 
> ---
>  drivers/of/unittest-data/overlay_base.dts | 90 +-
>  drivers/of/unittest-data/overlay_common.dtsi  | 91 +++
>  drivers/of/unittest-data/testcases.dts| 18 ++--
>  .../of/unittest-data/testcases_common.dtsi| 19 
>  .../of/unittest-data/tests-interrupts.dtsi|  7 --
>  5 files changed, 118 insertions(+), 107 deletions(-)
>  create mode 100644 drivers/of/unittest-data/overlay_common.dtsi
>  create mode 100644 drivers/of/unittest-data/testcases_common.dtsi
> 
> diff --git a/drivers/of/unittest-data/overlay_base.dts 
> b/drivers/of/unittest-data/overlay_base.dts
> index 99ab9d12d00b..ab9014589c5d 100644
> --- a/drivers/of/unittest-data/overlay_base.dts
> +++ b/drivers/of/unittest-data/overlay_base.dts
> @@ -2,92 +2,4 @@
>  /dts-v1/;
>  /plugin/;
>  
> -/*
> - * Base device tree that overlays will be applied against.
> - *
> - * Do not add any properties in node "/".
> - * Do not add any nodes other than "/testcase-data-2" in node "/".
> - * Do not add anything that would result in dtc creating node "/__fixups__".
> - * dtc will create nodes "/__symbols__" and "/__local_fixups__".
> - */
> -
> -/ {
> - testcase-data-2 {
> - #address-cells = <1>;
> - #size-cells = <1>;
> -
> - electric_1: substation@100 {
> - compatible = "ot,big-volts-control";
> - reg = < 0x0100 0x100 >;
> - status = "disabled";
> -
> - hvac_1: hvac-medium-1 {
> - compatible = "ot,hvac-medium";
> - heat-range = < 50 75 >;
> - cool-range = < 60 80 >;
> - };
> -
> - spin_ctrl_1: motor-1 {
> - compatible = "ot,ferris-wheel-motor";
> - spin = "clockwise";
> - rpm_avail = < 50 >;
> - };
> -
> - spin_ctrl_2: motor-8 {
> - compatible = "ot,roller-coaster-motor";
> - };
> - };
> -
> - rides_1: fairway-1 {
> - #address-cells = <1>;
> - #size-cells = <1>;
> - compatible = "ot,rides";
> - status = "disabled";
> - orientation = < 127 >;
> -
> - ride@100 {
> - #address-cells = <1>;
> - #size-cells = <1>;
> - compatible = "ot,roller-coaster";
> - reg = < 0x0100 0x100 >;
> - hvac-provider = < &hvac_1 >;
> - hvac-thermostat = < 29 > ;
> - hvac-zones = < 14 >;
> - hvac-zone-names = "operator";
> - spin-controller = < &spin_ctrl_2 5 &spin_ctrl_2 7 >;
> - spin-controller-names = "track_1", "track_2";
> - queues = < 2 >;
> -
> - track@30 {
> - reg = < 0x0030 0x10 >;
> - };
> -
> - track@40 {
> - reg = < 0x0040 0x10 >;
> - };
> -
> - };
> - };
> -
> - lights_1: lights@3 {
> - compatible = "ot,work-lights";
> - reg = < 0x0003 0x1000 >;
> - status = "disabled";
> - };
> -
> - lights_2: lights@4 {
> - compatible = "ot,show-lights";
> - reg = < 0x0004 0x1000 >;
> - status = "disabled";
> - rate = < 13 138 >;
> - };
> -
> - retail_1: vending@5 {
> -   

[RESEND][PATCH v2 1/2] dma-buf: dma-heap: Provide accessor to get heap name

2021-03-01 Thread John Stultz
It can be useful to access the name for the heap,
so provide an accessor to do so.

Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Chris Goldsworthy 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-de...@lists.freedesktop.org
Signed-off-by: John Stultz 
---
v2:
* Make sure to use "const char *" as Reported-by: kernel test robot 

---
 drivers/dma-buf/dma-heap.c | 12 
 include/linux/dma-heap.h   |  9 +
 2 files changed, 21 insertions(+)

diff --git a/drivers/dma-buf/dma-heap.c b/drivers/dma-buf/dma-heap.c
index 6b5db954569f..56bf5ad01ad5 100644
--- a/drivers/dma-buf/dma-heap.c
+++ b/drivers/dma-buf/dma-heap.c
@@ -202,6 +202,18 @@ void *dma_heap_get_drvdata(struct dma_heap *heap)
return heap->priv;
 }
 
+/**
+ * dma_heap_get_name() - get heap name
+ * @heap: DMA-Heap to retrieve the name of
+ *
+ * Returns:
+ * The char* for the heap name.
+ */
+const char *dma_heap_get_name(struct dma_heap *heap)
+{
+   return heap->name;
+}
+
 struct dma_heap *dma_heap_add(const struct dma_heap_export_info *exp_info)
 {
struct dma_heap *heap, *h, *err_ret;
diff --git a/include/linux/dma-heap.h b/include/linux/dma-heap.h
index 5bc5c946af58..0c05561cad6e 100644
--- a/include/linux/dma-heap.h
+++ b/include/linux/dma-heap.h
@@ -50,6 +50,15 @@ struct dma_heap_export_info {
  */
 void *dma_heap_get_drvdata(struct dma_heap *heap);
 
+/**
+ * dma_heap_get_name() - get heap name
+ * @heap: DMA-Heap to retrieve the name of
+ *
+ * Returns:
+ * The char* for the heap name.
+ */
+const char *dma_heap_get_name(struct dma_heap *heap);
+
 /**
  * dma_heap_add - adds a heap to dmabuf heaps
  * @exp_info:  information needed to register this heap
-- 
2.25.1



[RESEND][PATCH v2 2/2] dma-buf: heaps: Fix the name used when exporting dmabufs to be the actual heap name

2021-03-01 Thread John Stultz
By default dma_buf_export() sets the exporter name to be
KBUILD_MODNAME. Unfortunately this may not be identical to the
string used as the heap name (ie: "system" vs "system_heap").

This can cause some minor confusion with tooling, and there is
the future potential where multiple heap types may be exported
by the same module (but would all have the same name).

So to avoid all this, set the exporter exp_name to the heap name.

Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Chris Goldsworthy 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-de...@lists.freedesktop.org
Signed-off-by: John Stultz 
---
 drivers/dma-buf/heaps/cma_heap.c| 1 +
 drivers/dma-buf/heaps/system_heap.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/dma-buf/heaps/cma_heap.c b/drivers/dma-buf/heaps/cma_heap.c
index 5d64eccd21d6..0c05b79870f9 100644
--- a/drivers/dma-buf/heaps/cma_heap.c
+++ b/drivers/dma-buf/heaps/cma_heap.c
@@ -339,6 +339,7 @@ static struct dma_buf *cma_heap_allocate(struct dma_heap *heap,
buffer->pagecount = pagecount;
 
/* create the dmabuf */
+   exp_info.exp_name = dma_heap_get_name(heap);
exp_info.ops = &cma_heap_buf_ops;
exp_info.size = buffer->len;
exp_info.flags = fd_flags;
diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
index 29e49ac17251..23a7e74ef966 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -390,6 +390,7 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap,
}
 
/* create the dmabuf */
+   exp_info.exp_name = dma_heap_get_name(heap);
exp_info.ops = &system_heap_buf_ops;
exp_info.size = buffer->len;
exp_info.flags = fd_flags;
-- 
2.25.1
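
For readers following the two patches above, the effect can be sketched in
userspace C. This is a reduced model and not the kernel code: the types
heap_model and export_info_model, the MODNAME macro (standing in for
KBUILD_MODNAME) and all helper names are hypothetical, chosen only to
illustrate the "system" vs "system_heap" mismatch described in the commit
message.

```c
/* Stand-in for KBUILD_MODNAME; in the kernel the build system provides it. */
#define MODNAME "system_heap"

/* Hypothetical, reduced model of struct dma_heap: only the name matters. */
struct heap_model {
	const char *name;
};

/* Hypothetical, reduced model of the dma-buf export info. */
struct export_info_model {
	const char *exp_name;
};

/* Mirrors the accessor added in patch 1/2: hand back the heap's name. */
static const char *heap_model_get_name(const struct heap_model *heap)
{
	return heap->name;
}

/* Models the default: the exporter name falls back to the module name. */
static void export_info_init_default(struct export_info_model *info)
{
	info->exp_name = MODNAME;
}

/* Models patch 2/2: override the exporter name with the heap name. */
static void export_info_use_heap_name(struct export_info_model *info,
				      const struct heap_model *heap)
{
	info->exp_name = heap_model_get_name(heap);
}
```

With a heap named "system", the default path would report "system_heap"
while the second helper reports "system", which is the name tooling sees
for the heap itself.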



Re: [PATCH V8 0/4] dt: Add fdtoverlay rule and statically build unittest

2021-03-01 Thread Frank Rowand
Hi Viresh,

On 3/1/21 12:56 AM, Viresh Kumar wrote:
> On 12-02-21, 16:48, Viresh Kumar wrote:
>> Hi,
>>
>> This patchset adds a generic rule for applying overlays using fdtoverlay
>> tool and then updates unittests to get built statically using the same.
>>
>> V7->V8:
>> - Patch 1 is new.
>> - Platforms need to use dtb-y += foo.dtb instead of overlay-y +=
>>   foo.dtb.
>> - Use multi_depend instead of .SECONDEXPANSION.
>> - Use dtb-y for unittest instead of overlay-y.
>> - Rename the commented dtb files in unittest Makefile as .dtbo.
>> - Improved Makefile code (I am learning a lot every day :)
> 
> Ping!
> 

Please respin on 5.12-rc1, and pull in the change you said
you would make in response to my post v8 comment about the
v7 patches.

-Frank


Re: Why do kprobes and uprobes singlestep?

2021-03-01 Thread Andy Lutomirski
On Mon, Mar 1, 2021 at 6:22 PM Masami Hiramatsu  wrote:
>
> Hi Oleg and Andy,
>
> On Mon, 1 Mar 2021 17:51:31 +0100
> Oleg Nesterov  wrote:
>
> > Hi Andy,
> >
> > sorry for delay.
> >
> > On 02/23, Andy Lutomirski wrote:
> > >
> > > A while back, I let myself be convinced that kprobes genuinely need to
> > > single-step the kernel on occasion, and I decided that this sucked but
> > > I could live with it.  It would, however, be Really Really Nice (tm)
> > > if we could have a rule that anyone running x86 Linux who single-steps
> > > the kernel (e.g. kgdb and nothing else) gets to keep all the pieces
> > > when the system falls apart around them.  Specifically, if we don't
> > > allow kernel single-stepping and if we suitably limit kernel
> > > instruction breakpoints (the latter isn't actually a major problem),
> > > then we don't really really need to use IRET to return to the kernel,
> > > and that means we can avoid some massive NMI nastiness.
> >
> > Not sure I understand you correctly, I know almost nothing about low-level
> > x86  magic.
>
> x86 has normal interrupts and NMIs. When an NMI occurs, the CPU masks
> further NMIs (the mask itself is hidden state) and IRET releases the mask.
> The problem is that if an INT3 is hit in the NMI handler and it has to
> single-step, it has to use IRET to atomically set TF and return.
>
> >
> > But I guess this has nothing to do with uprobes, they do not single-step
> > in kernel mode, right?
>
> Agreed, if the problematic case is IRET from the NMI handler, uprobes doesn't
> hit it because it is only invoked from user space.
> Andy, what would you think?

Indeed, this isn't a problem for uprobes.  The problem for uprobes is
that all the notifiers from #DB are kind of messy, and I would like to
get rid of them if possible.


[PATCH v2] can: c_can: move runtime PM enable/disable to c_can_platform

2021-03-01 Thread Tong Zhang
Currently, doing modprobe c_can_pci makes the kernel complain
"Unbalanced pm_runtime_enable!". This is caused by pm_runtime_enable()
being called before pm is initialized.
This fix is similar to 227619c3ff7c: move the pm_enable/disable code to
c_can_platform.

Signed-off-by: Tong Zhang 
---
 drivers/net/can/c_can/c_can.c  | 24 +---
 drivers/net/can/c_can/c_can_platform.c |  6 +-
 2 files changed, 6 insertions(+), 24 deletions(-)

diff --git a/drivers/net/can/c_can/c_can.c b/drivers/net/can/c_can/c_can.c
index ef474bae47a1..6958830cb983 100644
--- a/drivers/net/can/c_can/c_can.c
+++ b/drivers/net/can/c_can/c_can.c
@@ -212,18 +212,6 @@ static const struct can_bittiming_const c_can_bittiming_const = {
.brp_inc = 1,
 };
 
-static inline void c_can_pm_runtime_enable(const struct c_can_priv *priv)
-{
-   if (priv->device)
-   pm_runtime_enable(priv->device);
-}
-
-static inline void c_can_pm_runtime_disable(const struct c_can_priv *priv)
-{
-   if (priv->device)
-   pm_runtime_disable(priv->device);
-}
-
 static inline void c_can_pm_runtime_get_sync(const struct c_can_priv *priv)
 {
if (priv->device)
@@ -1335,7 +1323,6 @@ static const struct net_device_ops c_can_netdev_ops = {
 
 int register_c_can_dev(struct net_device *dev)
 {
-   struct c_can_priv *priv = netdev_priv(dev);
int err;
 
/* Deactivate pins to prevent DRA7 DCAN IP from being
@@ -1345,28 +1332,19 @@ int register_c_can_dev(struct net_device *dev)
 */
pinctrl_pm_select_sleep_state(dev->dev.parent);
 
-   c_can_pm_runtime_enable(priv);
-
dev->flags |= IFF_ECHO; /* we support local echo */
dev->netdev_ops = &c_can_netdev_ops;
 
err = register_candev(dev);
-   if (err)
-   c_can_pm_runtime_disable(priv);
-   else
+   if (!err)
devm_can_led_init(dev);
-
return err;
 }
 EXPORT_SYMBOL_GPL(register_c_can_dev);
 
 void unregister_c_can_dev(struct net_device *dev)
 {
-   struct c_can_priv *priv = netdev_priv(dev);
-
unregister_candev(dev);
-
-   c_can_pm_runtime_disable(priv);
 }
 EXPORT_SYMBOL_GPL(unregister_c_can_dev);
 
diff --git a/drivers/net/can/c_can/c_can_platform.c b/drivers/net/can/c_can/c_can_platform.c
index 05f425ceb53a..47b251b1607c 100644
--- a/drivers/net/can/c_can/c_can_platform.c
+++ b/drivers/net/can/c_can/c_can_platform.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -386,6 +387,7 @@ static int c_can_plat_probe(struct platform_device *pdev)
platform_set_drvdata(pdev, dev);
SET_NETDEV_DEV(dev, &pdev->dev);
 
+   pm_runtime_enable(priv->device);
ret = register_c_can_dev(dev);
if (ret) {
dev_err(&pdev->dev, "registering %s failed (err=%d)\n",
@@ -398,6 +400,7 @@ static int c_can_plat_probe(struct platform_device *pdev)
return 0;
 
 exit_free_device:
+   pm_runtime_disable(priv->device);
free_c_can_dev(dev);
 exit:
dev_err(&pdev->dev, "probe failed\n");
@@ -408,9 +411,10 @@ static int c_can_plat_probe(struct platform_device *pdev)
 static int c_can_plat_remove(struct platform_device *pdev)
 {
struct net_device *dev = platform_get_drvdata(pdev);
+   struct c_can_priv *priv = netdev_priv(dev);
 
unregister_c_can_dev(dev);
-
+   pm_runtime_disable(priv->device);
free_c_can_dev(dev);
 
return 0;
-- 
2.25.1
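
As a rough illustration of why the warning appears, the enable/disable
balance can be modelled in userspace C. This is only a sketch under stated
assumptions: it assumes runtime PM starts disabled with a depth of 1, that
each enable lowers the depth and each disable raises it, and that an enable
at depth 0 is what triggers the "Unbalanced" message. The type pm_model and
all function names are hypothetical and are not kernel APIs.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of a device's runtime-PM state. */
struct pm_model {
	unsigned int disable_depth;		/* 0 means runtime PM is enabled */
	unsigned int unbalanced_warnings;	/* times the kernel would warn */
};

/* A freshly initialized device starts with runtime PM disabled once. */
static void pm_model_init(struct pm_model *pm)
{
	pm->disable_depth = 1;
	pm->unbalanced_warnings = 0;
}

/* Enabling when nothing is left to undo is the unbalanced case. */
static void pm_model_enable(struct pm_model *pm)
{
	if (pm->disable_depth > 0)
		pm->disable_depth--;
	else
		pm->unbalanced_warnings++;
}

/* Each disable must later be paired with exactly one enable. */
static void pm_model_disable(struct pm_model *pm)
{
	pm->disable_depth++;
}

/*
 * Count warnings for one probe.
 * bus_enables_pm:    the bus already enabled runtime PM (the PCI case).
 * common_enables_pm: the shared registration code also enables it
 *                    (the behaviour before this patch).
 */
static unsigned int probe_warnings(bool bus_enables_pm, bool common_enables_pm)
{
	struct pm_model pm;

	pm_model_init(&pm);
	if (bus_enables_pm)
		pm_model_enable(&pm);
	if (common_enables_pm)
		pm_model_enable(&pm);
	return pm.unbalanced_warnings;
}
```

In this model the warning only shows up when both the bus and the shared
code enable runtime PM, which is why moving the enable/disable pair into the
platform glue leaves the PCI path balanced.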



Re: [PATCH] can: c_can: move runtime PM enable/disable to c_can_platform

2021-03-01 Thread Tong Zhang
On Mon, Mar 1, 2021 at 2:49 PM Marc Kleine-Budde  wrote:
>
> On 28.02.2021 23:15:48, Tong Zhang wrote:
> > Currently doing modprobe c_can_pci will make kernel complain
> > "Unbalanced pm_runtime_enable!", this is caused by pm_runtime_enable()
> > called before pm is initialized in register_candev() and doing so will
>
> I don't see where register_candev() is doing any pm related
> initialization.
>
> > also cause it to enable twice.
>
> > This fix is similar to 227619c3ff7c, move those pm_enable/disable code to
> > c_can_platform.
>
> As I understand 227619c3ff7c ("can: m_can: move runtime PM
> enable/disable to m_can_platform"), PCI devices automatically enable PM,
> when the "PCI device is added".

Hi Marc,
Thanks for the comments. I think you are right -- I was misled by the trace --
I have corrected the commit log along with the indent fix in the v2 patch.
Thanks again for your help,
- Tong

>
> Please clarify the above point, otherwise the code looks OK, small
> nitpick inline:


[PATCH] clocksource: don't run watchdog forever

2021-03-01 Thread Feng Tang
The clocksource watchdog runs every 500ms, which creates some OS noise.
Clocksource wreckage (especially for clocksources that have a per-CPU
read hook) usually happens shortly after a CPU is brought up or
after the system resumes from a sleep state, so add a time limit so that
the clocksource watchdog only runs for a period of time, and make
sure it runs at least twice on each CPU.

Regarding performance data, the micro-benchmarks we have
(hackbench/netperf/fio/will-it-scale etc.) show no improvement. But
the change obviously reduces periodic timer interrupts, and may
help in the following cases:
* When some CPUs are isolated to only run scientific or high
  performance computing tasks on a NOHZ_FULL kernel, where there
  are almost no interrupts, this could make them quieter
* On a cluster which runs a lot of systems in parallel with
  barriers, there are always enough systems which run the watchdog
  and make everyone else wait

Signed-off-by: Feng Tang 
Reviewed-by: Andi Kleen 
---
 include/linux/clocksource.h |  7 +++
 kernel/cpu.c|  3 +++
 kernel/time/clocksource.c   | 31 +--
 3 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 86d143d..cf428a2 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -252,6 +252,13 @@ static inline void __clocksource_update_freq_khz(struct clocksource *cs, u32 khz
__clocksource_update_freq_scale(cs, 1000, khz);
 }
 
+#ifdef CONFIG_CLOCKSOURCE_WATCHDOG
+extern void clocksource_kick_watchdog(void);
+#else
+static inline void clocksource_kick_watchdog(void) { }
+#endif
+
+
 #ifdef CONFIG_ARCH_CLOCKSOURCE_INIT
 extern void clocksource_arch_init(struct clocksource *cs);
 #else
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 1b6302e..fdf3c69 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #define CREATE_TRACE_POINTS
@@ -1286,6 +1287,8 @@ static int cpu_up(unsigned int cpu, enum cpuhp_state target)
}
 
err = _cpu_up(cpu, 0, target);
+
+   clocksource_kick_watchdog();
 out:
cpu_maps_update_done();
return err;
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index cce484a..aba985a 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -104,6 +104,7 @@ static DECLARE_WORK(watchdog_work, clocksource_watchdog_work);
 static DEFINE_SPINLOCK(watchdog_lock);
 static int watchdog_running;
 static atomic_t watchdog_reset_pending;
+static unsigned long watchdog_stop_time;   /* in jiffies */
 
 static inline void clocksource_watchdog_lock(unsigned long *flags)
 {
@@ -295,10 +296,16 @@ static void clocksource_watchdog(struct timer_list *unused)
next_cpu = cpumask_first(cpu_online_mask);
 
/*
-* Arm timer if not already pending: could race with concurrent
-* pair clocksource_stop_watchdog() clocksource_start_watchdog().
+* Arm timer if not already pending or pass the check time window:
+* could race with concurrent pair clocksource_stop_watchdog()
+* clocksource_start_watchdog().
 */
if (!timer_pending(&watchdog_timer)) {
+   if (time_after(jiffies, watchdog_stop_time)) {
+   atomic_inc(&watchdog_reset_pending);
+   watchdog_running = 0;
+   goto out;
+   }
watchdog_timer.expires += WATCHDOG_INTERVAL;
add_timer_on(&watchdog_timer, next_cpu);
}
@@ -308,6 +315,16 @@ static void clocksource_watchdog(struct timer_list *unused)
 
 static inline void clocksource_start_watchdog(void)
 {
+   unsigned long check_ticks;
+
+   /*
+* As all CPUs will be looped to run the timer, make sure each
+* CPU can run the timer twice, and the check run for at least
+* 10 minutes.
+*/
+   check_ticks = max_t(unsigned long, num_possible_cpus(), 600) * HZ;
+   watchdog_stop_time = jiffies + check_ticks;
+
if (watchdog_running || !watchdog || list_empty(&watchdog_list))
return;
timer_setup(&watchdog_timer, clocksource_watchdog, 0);
@@ -324,6 +341,15 @@ static inline void clocksource_stop_watchdog(void)
watchdog_running = 0;
 }
 
+void clocksource_kick_watchdog(void)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(&watchdog_lock, flags);
+   clocksource_start_watchdog();
+   spin_unlock_irqrestore(&watchdog_lock, flags);
+}
+
 static inline void clocksource_reset_watchdog(void)
 {
struct clocksource *cs;
@@ -618,6 +644,7 @@ void clocksource_resume(void)
cs->resume(cs);
 
clocksource_resume_watchdog();
+   clocksource_kick_watchdog();
 }
 
 /**
-- 
2.7.4
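
The window arithmetic in clocksource_start_watchdog() can be checked with a
small userspace sketch. The helper names below are hypothetical and the
tick rate is passed in as a parameter instead of using the kernel's HZ.
Since the watchdog fires every 500ms and moves to the next CPU each time, a
window of N seconds gives 2*N runs, so a window of num_possible_cpus()
seconds visits each CPU about twice, and the 600 floor gives the 10 minute
minimum. ticks_after() follows the same wraparound-safe idea as the kernel's
time_after(), relying on the usual two's-complement conversion.

```c
/* Hypothetical helper: length of the watchdog check window in ticks. */
static unsigned long watchdog_window_ticks(unsigned long ncpus, unsigned long hz)
{
	/* At least 600 seconds, or one second per possible CPU if larger. */
	unsigned long secs = ncpus > 600 ? ncpus : 600;

	return secs * hz;
}

/* Wraparound-safe "a is later than b" for free-running tick counters. */
static int ticks_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* The watchdog stops re-arming once the current time passes the window. */
static int watchdog_should_stop(unsigned long now, unsigned long stop_time)
{
	return ticks_after(now, stop_time);
}
```

For example, 8 CPUs at 250 ticks per second give a window of 150000 ticks
(10 minutes), while 1024 CPUs at 1000 ticks per second give 1024000 ticks.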



Re: [PATCH v2 1/4] KVM: vmx/pmu: Add MSR_ARCH_LBR_DEPTH emulation for Arch LBR

2021-03-01 Thread Like Xu

On 2021/3/2 6:34, Sean Christopherson wrote:
> On Wed, Feb 03, 2021, Like Xu wrote:
>> @@ -348,10 +352,26 @@ static bool intel_pmu_handle_lbr_msrs_access(struct kvm_vcpu *vcpu,
>>     return true;
>>   }
>>
>> +/*
>> + * Check if the requested depth values is supported
>> + * based on the bits [0:7] of the guest cpuid.1c.eax.
>> + */
>> +static bool arch_lbr_depth_is_valid(struct kvm_vcpu *vcpu, u64 depth)
>> +{
>> +   struct kvm_cpuid_entry2 *best;
>> +
>> +   best = kvm_find_cpuid_entry(vcpu, 0x1c, 0);
>> +   if (depth && best)
>> +   return (best->eax & 0xff) & (1ULL << (depth / 8 - 1));
>
> I believe this will generate undefined behavior if depth > 64.  Or if depth < 8.
> And I believe this check also needs to enforce that depth is a multiple of 8.
>
>    For each bit n set in this field, the IA32_LBR_DEPTH.DEPTH value 8*(n+1) is
>    supported.
>
> Thus it's impossible for 0-7, 9-15, etc... to be legal depths.


Thank you! How about:

best = kvm_find_cpuid_entry(vcpu, 0x1c, 0);
if (best && depth && !(depth % 8))
return (best->eax & 0xff) & (1ULL << (depth / 8 - 1));

return false;
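
A standalone sketch of the check discussed above, for illustration only. It
is not the KVM code: the function name lbr_depth_is_valid is hypothetical
and the CPUID.1CH:EAX value is passed in directly instead of being looked up
from the vCPU. It adds an explicit upper bound as well as the multiple-of-8
test, on the assumption that the shift count must be kept in range; with
only the "depth % 8" test, a depth such as 520 would still shift by 64.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * eax:   guest-visible CPUID.1CH:EAX, passed in for this sketch.
 * depth: requested LBR depth.
 *
 * Bit n of EAX[7:0] set means depth 8*(n+1) is supported, so the only
 * candidates are 8, 16, ..., 64.
 */
static bool lbr_depth_is_valid(uint32_t eax, uint64_t depth)
{
	uint64_t n;

	/* Reject 0, non-multiples of 8, and anything beyond bit 7. */
	if (depth == 0 || depth % 8 != 0 || depth > 64)
		return false;

	n = depth / 8 - 1;	/* guaranteed to be in 0..7 here */
	return ((eax & 0xff) & (1u << n)) != 0;
}
```

Because the range is validated before the shift, the shift count can never
reach the width of the operand.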





>> +
>> +   return false;
>> +}
>> +


Re: [PATCH] dma-buf: heaps: Set VM_PFNMAP in mmap for system and cma heaps

2021-03-01 Thread John Stultz
On Sat, Feb 27, 2021 at 1:44 AM Christoph Hellwig  wrote:
>
> On Fri, Feb 26, 2021 at 08:36:55AM +0100, Daniel Vetter wrote:
> > Also given that both deal with struct page there's a ton of divergence
> > between these two that doesn't make much sense. Maybe could even share
> > the code fully, aside from how you allocate the struct pages.
>
> I've been saying that since the code was first submitted.  Once pages
> are allocated from CMA they should be treated no differently from normal
> pages.
>
> Please take a look at how the DMA contiguous allocator manages to share
> all code for handling CMA vs alloc_pages pages.

I'll take a look at that! Thanks for the pointer!
-john


[PATCH] x86/tsc: mark tsc reliable for qualified platforms

2021-03-01 Thread Feng Tang
There are cases where the tsc clocksource is wrongly judged as unstable by
clocksource watchdogs like hpet, acpi_pm or 'refined-jiffies'. Since
there is hardly a general reliable way to check the validity of a
watchdog, and to protect the innocent tsc, Thomas Gleixner proposed [1]:

"I'm inclined to lift that requirement when the CPU has:

1) X86_FEATURE_CONSTANT_TSC
2) X86_FEATURE_NONSTOP_TSC
3) X86_FEATURE_NONSTOP_TSC_S3
4) X86_FEATURE_TSC_ADJUST
5) At max. 4 sockets

 After two decades of horrors we're finally at a point where TSC seems
 to be halfway reliable and less abused by BIOS tinkerers. TSC_ADJUST
 was really key as we can now detect even small modifications reliably
 and the important point is that we can cure them as well (not pretty
 but better than all other options)."

So implement it with a slight change as discussed in the thread, and be
more defensive by using a maximum of 2 sockets.

The check is done inside tsc_init() before registering the 'tsc-early' and
'tsc' clocksources, as there were cases where both of them were
wrongly judged as unreliable.

[1]. https://lore.kernel.org/lkml/87eekfk8bd@nanos.tec.linutronix.de/
Signed-off-by: Feng Tang 
Reviewed-by: Andi Kleen 
---
 arch/x86/kernel/tsc.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index f70dffc..a7e3980 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1193,6 +1193,17 @@ static void __init check_system_tsc_reliable(void)
 #endif
if (boot_cpu_has(X86_FEATURE_TSC_RELIABLE))
tsc_clocksource_reliable = 1;
+
+   /*
+* Ideally the socket number should be checked, but this is called
+* by tsc_init() which is in early boot phase and the socket numbers
+* may not be available. Use 'nr_online_nodes' as a fallback solution
+*/
+   if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC)
+   && boot_cpu_has(X86_FEATURE_NONSTOP_TSC)
+   && boot_cpu_has(X86_FEATURE_TSC_ADJUST)
+   && nr_online_nodes <= 2)
+   tsc_clocksource_reliable = 1;
 }
 
 /*
-- 
2.7.4
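
The condition added by the hunk above can be written as a plain predicate
for quick experimentation. This is a userspace sketch: the struct
tsc_caps_model and the function name are hypothetical, and the feature bits
are passed in as booleans instead of being read with boot_cpu_has(). Note
that, as in the patch, X86_FEATURE_NONSTOP_TSC_S3 from the quoted list is
not part of the check, and the node count stands in for the socket count.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the inputs the patch looks at. */
struct tsc_caps_model {
	bool constant_tsc;		/* models X86_FEATURE_CONSTANT_TSC */
	bool nonstop_tsc;		/* models X86_FEATURE_NONSTOP_TSC */
	bool tsc_adjust;		/* models X86_FEATURE_TSC_ADJUST */
	unsigned int nr_online_nodes;	/* used as a stand-in for sockets */
};

/* True when the platform meets every condition in the patch. */
static bool tsc_qualifies_as_reliable(const struct tsc_caps_model *c)
{
	return c->constant_tsc && c->nonstop_tsc && c->tsc_adjust &&
	       c->nr_online_nodes <= 2;
}
```

A two-node machine with all three features qualifies; the same machine with
four nodes, or with any one feature missing, does not.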



Re: [External] Re: [PATCH 0/5] Use obj_cgroup APIs to change kmem pages

2021-03-01 Thread Muchun Song
On Tue, Mar 2, 2021 at 9:12 AM Roman Gushchin  wrote:
>
> Hi Muchun!
>
> On Mon, Mar 01, 2021 at 02:22:22PM +0800, Muchun Song wrote:
> > Since Roman's series "The new cgroup slab memory controller" was applied, all
> > slab objects are charged via the new obj_cgroup APIs. These new APIs
> > introduce a struct obj_cgroup instead of using struct mem_cgroup directly
> > to charge slab objects. This prevents long-living objects from pinning the
> > original memory cgroup in memory. But there are still some corner-case
> > objects (e.g. allocations larger than an order-1 page on SLUB) which are
> > not charged via the obj_cgroup API. Those objects (including the pages
> > which are allocated from the buddy allocator directly) are charged as kmem
> > pages, which still hold a reference to the memory cgroup.
>
> Yes, this is a good idea, large kmallocs should be treated the same
> way as small ones.
>
> >
> > E.g. we know that the kernel stack is charged as kmem pages because the
> > size of the kernel stack can be greater than 2 pages (e.g. 16KB on x86_64
> > or arm64). Suppose we create a thread (with its stack charged to
> > memory cgroup A) and then move it from memory cgroup A to memory cgroup
> > B. Because the kernel stack of the thread holds a reference to memory
> > cgroup A, the thread can pin memory cgroup A in memory even if
> > we remove cgroup A. We can observe this scenario with the
> > following script, which shows that the system has added 500 dying cgroups.
> >
> >   #!/bin/bash
> >
> >   cat /proc/cgroups | grep memory
> >
> >   cd /sys/fs/cgroup/memory
> >   echo 1 > memory.move_charge_at_immigrate
> >
> >   for i in range{1..500}
> >   do
> >   mkdir kmem_test
> >   echo $$ > kmem_test/cgroup.procs
> >   sleep 3600 &
> >   echo $$ > cgroup.procs
> >   echo `cat kmem_test/cgroup.procs` > cgroup.procs
> >   rmdir kmem_test
> >   done
> >
> >   cat /proc/cgroups | grep memory
>
> Well, moving processes between cgroups always created a lot of issues
> and corner cases and this one is definitely not the worst. So this problem
> looks a bit artificial, unless I'm missing something. But if it doesn't
> introduce any new performance costs and doesn't make the code more complex,
> I have nothing against.

OK. I just want to show that large kmallocs are charged as kmem pages.
So I constructed this test case.

>
> Btw, can you, please, run the spell-checker on commit logs? There are many
> typos (starting from the title of the series, I guess), which make the
> patchset look less appealing.

Sorry for my poor English. I will do that. Thanks for your suggestions.


>
> Thank you!
>
> >
> > This patchset aims to make those kmem pages drop the reference to memory
> > cgroup by using the APIs of obj_cgroup. Finally, we can see that the number
> > of the dying cgroups will not increase if we run the above test script.
> >
> > Patches 1-3 use the obj_cgroup APIs to charge kmem pages. The remote
> > memory cgroup charging APIs are a mechanism to charge kernel memory to a
> > given memory cgroup. So I also make them use the obj_cgroup APIs;
> > patches 4-5 do this.
> >
> > Muchun Song (5):
> >   mm: memcontrol: introduce obj_cgroup_{un}charge_page
> >   mm: memcontrol: make page_memcg{_rcu} only applicable for non-kmem
> > page
> >   mm: memcontrol: reparent the kmem pages on cgroup removal
> >   mm: memcontrol: move remote memcg charging APIs to CONFIG_MEMCG_KMEM
> >   mm: memcontrol: use object cgroup for remote memory cgroup charging
> >
> >  fs/buffer.c  |  10 +-
> >  fs/notify/fanotify/fanotify.c|   6 +-
> >  fs/notify/fanotify/fanotify_user.c   |   2 +-
> >  fs/notify/group.c|   3 +-
> >  fs/notify/inotify/inotify_fsnotify.c |   8 +-
> >  fs/notify/inotify/inotify_user.c |   2 +-
> >  include/linux/bpf.h  |   2 +-
> >  include/linux/fsnotify_backend.h |   2 +-
> >  include/linux/memcontrol.h   | 109 +++---
> >  include/linux/sched.h|   6 +-
> >  include/linux/sched/mm.h |  30 ++--
> >  kernel/bpf/syscall.c |  35 ++---
> >  kernel/fork.c|   4 +-
> >  mm/memcontrol.c  | 276 ++-
> >  mm/page_alloc.c  |   4 +-
> >  15 files changed, 324 insertions(+), 175 deletions(-)
> >
> > --
> > 2.11.0
> >


Re: [PATCH 5/7] printk: Make %pS and friends print module build ID

2021-03-01 Thread Steven Rostedt
On Mon,  1 Mar 2021 09:47:47 -0800
Stephen Boyd  wrote:

> The %pS printk format (among some others) is used to print kernel
> addresses symbolically. When the kernel prints an address inside of a
> module, the kernel prints the addresses' symbol name along with the
> module's name that contains the address. Let's make kernel stacktraces
> easier to identify on KALLSYMS builds by including the build ID of a
> module when we print the address.

Please no!

This kills the output of tracing with offset, and can possibly break
scripts. I don't want to look at traces like this!

  <idle>-0   [004] ..s1   353.842577: ipv4_conntrack_in+0x0/0x10 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_hook_slow+0x40/0xb0
  <idle>-0   [004] ..s1   353.842577: nf_conntrack_in+0x0/0x5c0 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_hook_slow+0x40/0xb0
  <idle>-0   [004] ..s1   353.842577: get_l4proto+0x0/0x190 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_in+0x92/0x5c0 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051)
  <idle>-0   [004] ..s1   353.842577: nf_ct_get_tuple+0x0/0x240 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_in+0xec/0x5c0 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051)
  <idle>-0   [004] ..s1   353.842577: hash_conntrack_raw+0x0/0x170 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_in+0x28c/0x5c0 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051)
  <idle>-0   [004] ..s1   353.842578: __nf_conntrack_find_get.isra.0+0x0/0x2f0 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_in+0x29d/0x5c0 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051)
  <idle>-0   [004] ..s1   353.842578: nf_conntrack_tcp_packet+0x0/0x1760 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_in+0x3c8/0x5c0 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051)
  <idle>-0   [004] ..s2   353.842578: nf_ct_seq_offset+0x0/0x40 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_tcp_packet+0x26d/0x1760 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051)
  <idle>-0   [004] ..s1   353.842578: __nf_ct_refresh_acct+0x0/0x50 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_tcp_packet+0x558/0x1760 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051)

 NACK!

-- Steve


Re: [x86, build] 6dafca9780: WARNING:at_arch/x86/kernel/ftrace.c:#ftrace_verify_code

2021-03-01 Thread Steven Rostedt
On Mon, 1 Mar 2021 18:40:32 -0800
Sami Tolvanen  wrote:

> On Mon, Mar 01, 2021 at 08:15:26PM -0500, Steven Rostedt wrote:
> > On Mon, 1 Mar 2021 16:03:51 -0800
> > Sami Tolvanen  wrote:  
> > >   
> > > > ret = ftrace_verify_code(rec->ip, old);
> > > > +
> > > > +   if (__is_defined(CC_USING_NOP_MCOUNT) && ret && old_nop) {
> > > > +   /* Compiler could have put in P6_NOP5 */
> > > > +   old = P6_NOP5;
> > > > +   ret = ftrace_verify_code(rec->ip, old);
> > > > +   }
> > > > +
> > > 
> > > Wouldn't that still hit WARN(1) in the initial ftrace_verify_code()
> > > call if ideal_nops doesn't match?  
> > 
> > That was too quickly written ;-)
> > 
> > Take 2:
> > 
> > [ with fixes for setting p6_nop ]  
> 
> Thanks, I tested this with the config from the build bot, and I can
> confirm that it fixes the issue for me.
> 
> I also tested a quick patch to disable the __fentry__ conversion in
> objtool, and it seems to work too, but it's probably a good idea to
> fix the issue with CC_USING_NOP_MCOUNT in any case.

Thanks for testing, I'll make this into a proper patch and start
testing it internally. I'm assuming you want this to go into the -rc
release and possibly stable?

-- Steve
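
For anyone skimming the thread, the idea being discussed (accept either the
NOP the kernel considers ideal or the one the compiler may have emitted,
without warning on the first mismatch) can be sketched in userspace C. This
is not the ftrace code: nop_site_ok and site_matches are hypothetical
helpers, the "ideal" NOP is whatever the caller passes in, and the only
assumption about P6_NOP5 is its usual 5-byte encoding (0f 1f 44 00 00).

```c
#include <stdbool.h>
#include <string.h>

#define NOP_SIZE 5

/* 5-byte NOP the compiler may have emitted: nopl 0x0(%rax,%rax,1). */
static const unsigned char p6_nop5[NOP_SIZE] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/* Quiet comparison: reports a mismatch without warning about it. */
static bool site_matches(const unsigned char *site, const unsigned char *expected)
{
	return memcmp(site, expected, NOP_SIZE) == 0;
}

/*
 * Accept the call site if it holds the ideal NOP, or, when the compiler
 * generates the NOPs itself, the P6 NOP. Only a failure of both checks
 * should be treated as an error by the caller.
 */
static bool nop_site_ok(const unsigned char *site,
			const unsigned char *ideal_nop,
			bool compiler_emits_nops)
{
	if (site_matches(site, ideal_nop))
		return true;
	return compiler_emits_nops && site_matches(site, p6_nop5);
}
```

The point raised in the quoted exchange is visible here: the first
comparison has to be silent, otherwise the fallback never gets a chance to
run before a warning is printed.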


RE: [PATCH] exfat: fix erroneous discard when clear cluster bit

2021-03-01 Thread Sungjong Seo
> Subject: [PATCH] exfat: fix erroneous discard when clear cluster bit
> 
> If mounted with the discard option, exFAT issues a discard command when
> clearing a cluster bit to remove a file. But the input parameter of the
> cluster-to-sector calculation wrongly adds the reserved cluster count,
> which is 2, leading to a discard of unrelated sectors in the target+2 cluster.
> 
> Fixes: 1e49a94cf707 ("exfat: add bitmap operations")
> Signed-off-by: Hyeongseok Kim 
> ---
>  fs/exfat/balloc.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)

Looks good.

Acked-by: Sungjong Seo 

Thanks for your work!
Could you also remove the wrong comments above the set/clear/find bitmap
functions?
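
To make the off-by-two concrete, here is a small userspace sketch of a
cluster-to-sector conversion. It assumes the usual layout where the first
data cluster is number 2, so the conversion subtracts the two reserved
clusters. The struct vol_model and the function name are hypothetical and
this is not the exfat driver code.

```c
#include <stdint.h>

#define RESERVED_CLUSTERS 2	/* cluster numbers 0 and 1 hold no data */

/* Hypothetical, reduced view of the volume geometry. */
struct vol_model {
	uint64_t data_start_sector;		/* first sector of cluster 2 */
	unsigned int sect_per_clus_bits;	/* log2(sectors per cluster) */
};

/* Convert a cluster number to the first sector of that cluster. */
static uint64_t cluster_to_sector(const struct vol_model *v, uint32_t clu)
{
	return ((uint64_t)(clu - RESERVED_CLUSTERS) << v->sect_per_clus_bits) +
	       v->data_start_sector;
}
```

With 8 sectors per cluster and a data area starting at sector 1000, cluster
5 starts at sector 1024. Passing 5 + 2 instead yields sector 1040, which
belongs to a different cluster, and that is the kind of unrelated range the
erroneous discard would hit.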



Re: Question about the "EXPERIMENTAL" tag for dax in XFS

2021-03-01 Thread Dave Chinner
On Mon, Mar 01, 2021 at 04:32:36PM -0800, Dan Williams wrote:
> On Mon, Mar 1, 2021 at 2:47 PM Dave Chinner  wrote:
> > Now we have the filesystem people providing a mechanism for the pmem
> > devices to tell the filesystems about physical device failures so
> > they can handle such failures correctly themselves. Having the
> > device go away unexpectedly from underneath a mounted and active
> > filesystem is a *device failure*, not an "unplug event".
> 
> It's the same difference to the physical page, all mappings to that
> page need to be torn down. I'm happy to call an fs callback and let
> each filesystem do what it wants with a "every pfn in this dax device
> needs to be unmapped".

You keep talking like this is something specific to a DAX device.
It isn't - the filesystem needs to take specific actions if any type
of block device reports that it has a corrupted range, not just DAX.
A DAX device simply adds "and invalidate direct mappings" to the
list of stuff that needs to be done.

And as far as a filesystem is concerned, there is no difference
between "this 4kB range is bad" and "the range of this entire device
is bad". We have to do the same things in both situations.

> I'm looking at the ->corrupted_range() patches trying to map it to
> this use case and I don't see how, for example a realtime-xfs over DM
> over multiple PMEM gets the notification to the right place.
> bd_corrupted_range() uses get_super() which gets the wrong answer for
> both realtime-xfs and DM.

I'm not sure I follow your logic. What is generating the wrong
answer?

We already have infrastructure for the block device to look up the
superblock mounted on top of it, and DM already uses that for things
like "dmsetup suspend" to freeze the filesystem before it does
something.  This "superblock lookup" only occurs for the top level
DM device, not for the component pmem devices that make up the DM
device.


IOWs, if there's a DM device that maps multiple pmem devices, then
it should be stacking the bd_corrupted_range() callbacks to map the
physical device range to the range in the higher level DM device
that it belongs to. This mapping of ranges is what DM exists to do -
the filesystem has no clue about what devices make up a DM device,
so the DM device *must* translate ranges for component devices into
the ranges that it maps that device into the LBA range it exposes to
the filesystem.
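
The range translation described above can be illustrated with a userspace
sketch of a linear mapping table. Everything here is hypothetical (the
seg_model type, the integer component ids, the function name) and none of
it is device-mapper code. It only shows the arithmetic a stacking layer
would need in order to turn "sectors X..Y on a component device are bad"
into a range in the address space the filesystem actually sees.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One linear segment: component device sectors -> stacked device sectors. */
struct seg_model {
	int component_id;		/* which underlying device */
	uint64_t component_start;	/* first sector used on that device */
	uint64_t stacked_start;		/* where it begins in the stacked device */
	uint64_t len;			/* segment length in sectors */
};

/*
 * Translate a bad range [start, start + len) on one component device into
 * the stacked device's address space. Returns true and fills out_start and
 * out_len for the first overlapping segment, false if nothing overlaps.
 */
static bool translate_range(const struct seg_model *segs, size_t nsegs,
			    int component_id, uint64_t start, uint64_t len,
			    uint64_t *out_start, uint64_t *out_len)
{
	for (size_t i = 0; i < nsegs; i++) {
		const struct seg_model *s = &segs[i];
		uint64_t seg_end = s->component_start + s->len;
		uint64_t end = start + len;
		uint64_t lo, hi;

		if (s->component_id != component_id)
			continue;

		/* Clamp the bad range to the part this segment maps. */
		lo = start > s->component_start ? start : s->component_start;
		hi = end < seg_end ? end : seg_end;
		if (lo >= hi)
			continue;

		*out_start = s->stacked_start + (lo - s->component_start);
		*out_len = hi - lo;
		return true;
	}
	return false;
}
```

With two 1000-sector components concatenated, a bad range at sectors 10..17
of the second component translates to sectors 1010..1017 of the stacked
device, and the loss of the whole second component translates to the range
1000..1999. The filesystem only ever needs to hear about the translated
range.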

> I'd flip that arrangement around and have the FS tell the block device
> "if something happens to you, here is the super_block to notify".

We already have a mechanism for this that the block device calls:
get_active_super(bdev). There can be only one superblock per block
device - the superblock has exclusive ownership of the block device
while the filesystem is mounted.

get_active_super() returns the superblock that sits on top of the
bdev with an active reference, allowing the caller to safely access
and operate on the superblock without having to worry about the
superblock going away in the middle of whatever operation the block
device needs to perform.

If this isn't working, then existing storage stack functionality
doesn't work as it should and this needs fixing independently of
the PMEM/DAX stuff we are talking about here.

> So
> to me this looks like a fs_dax_register_super() helper that plumbs the
> superblock through an arbitrary stack of block devices to the leaf
> block-device that might want to send a notification up when a global
> unmap operation needs to be performed.

No, this is just wrong. The filesystem has no clue what block device
is at the leaf level of a block device stack, nor what LBA block
range represents that device within the address space the stacked
block devices present to the filesystem.

> > Please listen when we say "that is
> > not sufficient" because we don't want to be backed into a corner
> > that we have to fix ourselves again before we can enable some basic
> > filesystem functionality that we should have been able to support on
> > DAX from the start...
> 
> That's some revisionist interpretation of how the discovery of the
> reflink+dax+memory-error-handling collision went down.
> 
> The whole point of this discussion is to determine if
> ->corrupted_range() is suitable for this notification, and looking at
> the code as is, it isn't. Perhaps you have a different implementation
> of ->corrupted_range() in mind that allows this to be plumbed
> correctly?

So rather than try to make the generic ->corrupted_range
infrastructure be able to report "DAX range is invalid" (which is
the very definition of a corrupted block device range!), you want
to introduce a DAX specific notification to tell us that a range in
the block device contains invalid/corrupt data?

We're talking about a patchset that is in development. The proposed
notification path is supposed to be generic and *not PMEM specific*,
and is intended to handle exactly the use case that you raised.
The implementation may not be perfect yet, so rather 

Re: [x86, build] 6dafca9780: WARNING:at_arch/x86/kernel/ftrace.c:#ftrace_verify_code

2021-03-01 Thread Sami Tolvanen
On Mon, Mar 01, 2021 at 08:15:26PM -0500, Steven Rostedt wrote:
> On Mon, 1 Mar 2021 16:03:51 -0800
> Sami Tolvanen  wrote:
> > 
> > > ret = ftrace_verify_code(rec->ip, old);
> > > +
> > > +   if (__is_defined(CC_USING_NOP_MCOUNT) && ret && old_nop) {
> > > +   /* Compiler could have put in P6_NOP5 */
> > > +   old = P6_NOP5;
> > > +   ret = ftrace_verify_code(rec->ip, old);
> > > +   }
> > > +  
> > 
> > Wouldn't that still hit WARN(1) in the initial ftrace_verify_code()
> > call if ideal_nops doesn't match?
> 
> That was too quickly written ;-)
> 
> Take 2:
> 
> [ with fixes for setting p6_nop ]

Thanks, I tested this with the config from the build bot, and I can
confirm that it fixes the issue for me.

I also tested a quick patch to disable the __fentry__ conversion in
objtool, and it seems to work too, but it's probably a good idea to
fix the issue with CC_USING_NOP_MCOUNT in any case.

diff --git a/Makefile b/Makefile
index f9b54da2fca0..536fea073d5b 100644
--- a/Makefile
+++ b/Makefile
@@ -863,9 +863,6 @@ ifdef CONFIG_FTRACE_MCOUNT_USE_CC
 endif
   endif
 endif
-ifdef CONFIG_FTRACE_MCOUNT_USE_OBJTOOL
-  CC_FLAGS_USING   += -DCC_USING_NOP_MCOUNT
-endif
 ifdef CONFIG_FTRACE_MCOUNT_USE_RECORDMCOUNT
   ifdef CONFIG_HAVE_C_RECORDMCOUNT
 BUILD_C_RECORDMCOUNT := y
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 068cdb41f76f..497e00c1cb69 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1047,21 +1047,9 @@ static int add_call_destinations(struct objtool_file 
*file)
insn->type = INSN_NOP;
}
 
-   if (mcount && !strcmp(insn->call_dest->name, "__fentry__")) {
-   if (reloc) {
-   reloc->type = R_NONE;
-   elf_write_reloc(file->elf, reloc);
-   }
-
-   elf_write_insn(file->elf, insn->sec,
-  insn->offset, insn->len,
-  arch_nop_insn(insn->len));
-
-   insn->type = INSN_NOP;
-
+   if (mcount && !strcmp(insn->call_dest->name, "__fentry__"))
list_add_tail(&insn->mcount_loc_node,
  &file->mcount_loc_list);
-   }
 
/*
 * Whatever stack impact regular CALLs have, should be undone

Sami
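
For readers following the CC_USING_NOP_MCOUNT discussion, the idea of
accepting either the ideal NOP or the compiler-emitted P6_NOP5 without
warning on the first mismatch can be sketched in user-space C. This is
not the kernel's ftrace_verify_code(); verify_code() and
verify_nop_site() are hypothetical helpers, and the "ideal" NOP is
whatever byte sequence the caller passes in.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define INSN_SIZE 5

/* The 5-byte NOP a compiler may emit: 0f 1f 44 00 00. */
static const unsigned char p6_nop5[INSN_SIZE] = {
	0x0f, 0x1f, 0x44, 0x00, 0x00
};

/* Return 0 if the bytes at ip match expected, -1 otherwise. No warning. */
static int verify_code(const unsigned char *ip, const unsigned char *expected)
{
	assert(ip != NULL && expected != NULL);
	return memcmp(ip, expected, INSN_SIZE) == 0 ? 0 : -1;
}

/*
 * A call site that should hold a NOP is accepted if it contains either
 * the caller's ideal NOP or P6_NOP5. Only when both comparisons fail
 * would a real implementation complain.
 */
static int verify_nop_site(const unsigned char *ip,
			   const unsigned char *ideal_nop)
{
	if (verify_code(ip, ideal_nop) == 0)
		return 0;
	return verify_code(ip, p6_nop5);
}
```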


Re: [PATCH 1/2] MIPS: Remove KVM_GUEST support

2021-03-01 Thread Jiaxun Yang




On 2021/3/1 11:29 PM, Thomas Bogendoerfer wrote:

KVM_GUEST is broken and unmaintained, so let's remove it.

Signed-off-by: Thomas Bogendoerfer 


Reviewed-by: Jiaxun Yang 

I'll prepare a patch for KVM side removal.

Thanks.

- Jiaxun


[PATCH] x86/sgx: Fix a resource leak in sgx_init()

2021-03-01 Thread Jarkko Sakkinen
If sgx_page_cache_init() fails in the middle, the plain return statement
leaves the memory and virtual address space already reserved for the EPC
sections unfreed. Fix this by using the same rollback path as when
sgx_page_reclaimer_init() fails.

Cc: sta...@vger.kernel.org # 5.11
Fixes: e7e0545299d8 ("x86/sgx: Initialize metadata for Enclave Page Cache (EPC) 
sections")
Signed-off-by: Jarkko Sakkinen 
---
 arch/x86/kernel/cpu/sgx/main.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 8df81a3ed945..52d070fb4c9a 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -708,8 +708,10 @@ static int __init sgx_init(void)
if (!cpu_feature_enabled(X86_FEATURE_SGX))
return -ENODEV;
 
-   if (!sgx_page_cache_init())
-   return -ENOMEM;
+   if (!sgx_page_cache_init()) {
+   ret = -ENOMEM;
+   goto err_page_cache;
+   }
 
if (!sgx_page_reclaimer_init()) {
ret = -ENOMEM;
-- 
2.30.1
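
The rollback pattern used by the patch above, where every failure after
the first reservation jumps to a shared cleanup label, can be sketched
in plain C. This is a user-space illustration only: example_init(),
reserve_sections() and release_sections() are hypothetical stand-ins,
not the SGX functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical resource state, standing in for the reserved EPC sections. */
static bool sections_reserved;
static bool reclaimer_running;

static void reserve_sections(void) { sections_reserved = true; }
static void release_sections(void) { sections_reserved = false; }

/*
 * step_to_fail: 0 = no failure, 1 = page cache init fails midway,
 * 2 = reclaimer init fails.
 */
static int example_init(int step_to_fail)
{
	int ret;

	assert(step_to_fail >= 0 && step_to_fail <= 2);

	/* The first step reserves resources before it can fail. */
	reserve_sections();
	if (step_to_fail == 1) {
		ret = -ENOMEM;
		goto err_page_cache;	/* a bare return would leak here */
	}

	if (step_to_fail == 2) {
		ret = -ENOMEM;
		goto err_page_cache;
	}
	reclaimer_running = true;

	return 0;

err_page_cache:
	/* One shared rollback for every failure after the reservation. */
	release_sections();
	return ret;
}
```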



[tip:master] BUILD SUCCESS 87246c319da5db38d74d239c3cffce49942dd5a8

2021-03-01 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master
branch HEAD: 87246c319da5db38d74d239c3cffce49942dd5a8  Merge branch 
'locking/core'

elapsed time: 723m

configs tested: 103
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm       defconfig
arm64     allyesconfig
arm64     defconfig
arm       allyesconfig
arm       allmodconfig
sh        se7724_defconfig
mips      xway_defconfig
arm       realview_defconfig
mips      vocore2_defconfig
s390      zfcpdump_defconfig
xtensa    xip_kc705_defconfig
powerpc   katmai_defconfig
sparc     sparc32_defconfig
arm       zeus_defconfig
arc       haps_hs_defconfig
arm       cns3420vb_defconfig
powerpc   tqm8555_defconfig
mips      fuloong2e_defconfig
sh        edosk7760_defconfig
riscv     nommu_virt_defconfig
mips      rt305x_defconfig
arm       axm55xx_defconfig
sparc     alldefconfig
arm       hisi_defconfig
m68k      stmark2_defconfig
sh        se7750_defconfig
ia64      defconfig
sh        sh7770_generic_defconfig
arm       alldefconfig
sparc     defconfig
sparc64   defconfig
sh        apsh4ad0a_defconfig
powerpc   canyonlands_defconfig
sh        sh7710voipgw_defconfig
mips      decstation_r4k_defconfig
ia64      allmodconfig
ia64      allyesconfig
m68k      allmodconfig
m68k      defconfig
m68k      allyesconfig
nds32     defconfig
nios2     allyesconfig
csky      defconfig
alpha     defconfig
alpha     allyesconfig
xtensa    allyesconfig
h8300     allyesconfig
arc       defconfig
sh        allmodconfig
nios2     defconfig
arc       allyesconfig
nds32     allnoconfig
c6x       allyesconfig
parisc    defconfig
s390      allyesconfig
s390      allmodconfig
parisc    allyesconfig
s390      defconfig
i386      allyesconfig
sparc     allyesconfig
i386      tinyconfig
i386      defconfig
mips      allyesconfig
mips      allmodconfig
powerpc   allyesconfig
powerpc   allmodconfig
powerpc   allnoconfig
i386      randconfig-a005-20210301
i386      randconfig-a003-20210301
i386      randconfig-a002-20210301
i386      randconfig-a004-20210301
i386      randconfig-a006-20210301
i386      randconfig-a001-20210301
x86_64    randconfig-a013-20210301
x86_64    randconfig-a016-20210301
x86_64    randconfig-a015-20210301
x86_64    randconfig-a014-20210301
x86_64    randconfig-a012-20210301
x86_64    randconfig-a011-20210301
i386      randconfig-a016-20210301
i386      randconfig-a012-20210301
i386      randconfig-a014-20210301
i386      randconfig-a013-20210301
i386      randconfig-a011-20210301
i386      randconfig-a015-20210301
riscv     nommu_k210_defconfig
riscv     allyesconfig
riscv     allnoconfig
riscv     defconfig
riscv     rv32_defconfig
riscv     allmodconfig
x86_64    allyesconfig
x86_64    rhel-7.6-kselftests
x86_64    defconfig
x86_64    rhel-8.3
x86_64    rhel-8.3-kbuiltin
x86_64    kexec

clang tested configs:
x86_64   randconfig-a006-20210301
x86_64   randconfig-a001-20210301
x86_64   randconfig-a004-20210301
x86_64

Re: [PATCH 4.19 000/247] 4.19.178-rc1 review

2021-03-01 Thread Hanjun Guo

On 2021/3/2 0:10, Greg Kroah-Hartman wrote:

This is the start of the stable review cycle for the 4.19.178 release.
There are 247 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Wed, 03 Mar 2021 16:09:49 +.
Anything received after that time might be too late.


Tested both on x86 [1] and ARM64 [2] server,

Tested-by: Hulk Robot 

Thanks
Hanjun

[1]:
Arch: x86 (confirmed that no kernel failures)

Testcase Result Summary:
total_num: 4681
succeed_num: 4677
failed_num: 4
timeout_num: 1


[2]:
Arch: arm64

Testcase Result Summary:
total_num: 4675
succeed_num: 4675
failed_num: 0
timeout_num: 0





[PATCH] perf diff: Don't crash on freeing errno-session

2021-03-01 Thread Dmitry Safonov
__cmd_diff() assigns the result of perf_session__new() to d->session.
On failure that is an ERR_PTR()-encoded errno, and perf-diff may crash with:
failed to open perf.data: Permission denied
Failed to open perf.data
Segmentation fault (core dumped)

From the coredump:
0  0x5569a62b5955 in auxtrace__free (session=0x)
at util/auxtrace.c:2681
1  0x5569a626b37d in perf_session__delete (session=0x)
at util/session.c:295
2  perf_session__delete (session=0x) at util/session.c:291
3  0x5569a618008a in __cmd_diff () at builtin-diff.c:1239
4  cmd_diff (argc=, argv=) at builtin-diff.c:2011
[..]

Funnily enough, it won't always crash. For me it crashes only if the
failing file is second on the command line: the reason is that cmd_diff()
checks files for branch stacks [in check_file_brstack()] and, if the first
file doesn't have brstacks, it doesn't go on to open the other files.

Check d->session before calling perf_session__delete().

Another solution would be assigning to a temporary variable and checking
it, but I find it easier to follow with the IS_ERR() check in the same
function. After some time it's still obvious why the check is needed, and
with a temp variable it's possible to make the same mistake again.

Cc: Alexander Shishkin 
Cc: Arnaldo Carvalho de Melo 
Cc: Ingo Molnar 
Cc: Jiri Olsa 
Cc: Mark Rutland 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Signed-off-by: Dmitry Safonov 
---
 tools/perf/builtin-diff.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index cefc71506409..b0c57e55052d 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -1236,7 +1236,8 @@ static int __cmd_diff(void)
 
  out_delete:
data__for_each_file(i, d) {
-   perf_session__delete(d->session);
+   if (!IS_ERR(d->session))
+   perf_session__delete(d->session);
data__free(d);
}
 
-- 
2.30.1
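
The failure mode fixed above, an error-encoded pointer reaching a delete
routine that dereferences it, can be reproduced in a small user-space
sketch. The ERR_PTR()/PTR_ERR()/IS_ERR() helpers below are local
re-implementations modeled on the kernel-style helpers, and
session_new(), session_delete() and cleanup() are hypothetical names for
this example, not perf functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Local stand-ins: encode a negative errno in the top of the address space. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct session { int fd; };

static int sessions_deleted;

/* Returns a valid session, or an error-encoded pointer (never NULL). */
static struct session *session_new(int fail)
{
	struct session *s;

	if (fail)
		return ERR_PTR(-EACCES);
	s = calloc(1, sizeof(*s));
	return s ? s : ERR_PTR(-ENOMEM);
}

static void session_delete(struct session *s)
{
	/* Touching an error-encoded pointer here is the crash in question. */
	assert(!IS_ERR(s));
	free(s);
	sessions_deleted++;
}

/* Cleanup loop with the guard: skip entries that hold an encoded errno. */
static void cleanup(struct session **sessions, int nr)
{
	for (int i = 0; i < nr; i++) {
		if (!IS_ERR(sessions[i]))
			session_delete(sessions[i]);
	}
}
```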



[tip:locking/core] BUILD SUCCESS 6bf3195fdbab92b57f3167101a0b651b93dbeae7

2021-03-01 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
locking/core
branch HEAD: 6bf3195fdbab92b57f3167101a0b651b93dbeae7  locking/csd_lock: Add 
more data to CSD lock debugging

elapsed time: 723m

configs tested: 96
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm       defconfig
arm64     allyesconfig
arm64     defconfig
arm       allyesconfig
arm       allmodconfig
sh        se7724_defconfig
mips      xway_defconfig
arm       realview_defconfig
mips      vocore2_defconfig
s390      zfcpdump_defconfig
xtensa    xip_kc705_defconfig
powerpc   katmai_defconfig
sparc     sparc32_defconfig
mips      rt305x_defconfig
arm       axm55xx_defconfig
sparc     alldefconfig
arm       hisi_defconfig
m68k      stmark2_defconfig
sh        se7750_defconfig
ia64      defconfig
sh        sh7770_generic_defconfig
arm       alldefconfig
nios2     alldefconfig
powerpc   ebony_defconfig
powerpc   mpc8313_rdb_defconfig
riscv     nommu_virt_defconfig
powerpc   mpc834x_mds_defconfig
ia64      allmodconfig
ia64      allyesconfig
m68k      allmodconfig
m68k      defconfig
m68k      allyesconfig
nds32     defconfig
nios2     allyesconfig
csky      defconfig
alpha     defconfig
alpha     allyesconfig
xtensa    allyesconfig
h8300     allyesconfig
arc       defconfig
sh        allmodconfig
nios2     defconfig
arc       allyesconfig
nds32     allnoconfig
c6x       allyesconfig
parisc    defconfig
s390      allyesconfig
s390      allmodconfig
parisc    allyesconfig
s390      defconfig
i386      allyesconfig
sparc     allyesconfig
sparc     defconfig
i386      tinyconfig
i386      defconfig
mips      allyesconfig
mips      allmodconfig
powerpc   allyesconfig
powerpc   allmodconfig
powerpc   allnoconfig
i386      randconfig-a005-20210301
i386      randconfig-a003-20210301
i386      randconfig-a002-20210301
i386      randconfig-a004-20210301
i386      randconfig-a006-20210301
i386      randconfig-a001-20210301
x86_64    randconfig-a013-20210301
x86_64    randconfig-a016-20210301
x86_64    randconfig-a015-20210301
x86_64    randconfig-a014-20210301
x86_64    randconfig-a012-20210301
x86_64    randconfig-a011-20210301
i386      randconfig-a016-20210301
i386      randconfig-a012-20210301
i386      randconfig-a014-20210301
i386      randconfig-a013-20210301
i386      randconfig-a011-20210301
i386      randconfig-a015-20210301
riscv     nommu_k210_defconfig
riscv     allyesconfig
riscv     allnoconfig
riscv     defconfig
riscv     rv32_defconfig
riscv     allmodconfig
x86_64    allyesconfig
x86_64    rhel-7.6-kselftests
x86_64    defconfig
x86_64    rhel-8.3
x86_64    rhel-8.3-kbuiltin
x86_64    kexec

clang tested configs:
x86_64   randconfig-a006-20210301
x86_64   randconfig-a001-20210301
x86_64   randconfig-a004-20210301
x86_64   randconfig-a002-20210301
x86_64   randconfig-a005-20210301
x86_64   randconfig-a003-20210301

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


RE: [PATCH net v4 1/1] can: can_skb_set_owner(): fix ref counting if socket was closed before setting skb ownership

2021-03-01 Thread Joakim Zhang

> -Original Message-
> From: Joakim Zhang 
> Sent: March 1, 2021 18:57
> To: Oleksij Rempel ; m...@pengutronix.de; David S.
> Miller ; Jakub Kicinski ; Oliver
> Hartkopp ; Robin van der Gracht
> 
> Cc: Andre Naujoks ; Eric Dumazet
> ; ker...@pengutronix.de; linux-...@vger.kernel.org;
> net...@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: RE: [PATCH net v4 1/1] can: can_skb_set_owner(): fix ref counting if
> socket was closed before setting skb ownership
> 
> 
> > -Original Message-
> > From: Oleksij Rempel 
> > Sent: February 26, 2021 17:25
> > To: m...@pengutronix.de; David S. Miller ; Jakub
> > Kicinski ; Oliver Hartkopp ;
> > Robin van der Gracht 
> > Cc: Oleksij Rempel ; Andre Naujoks
> > ; Eric Dumazet ;
> > ker...@pengutronix.de; linux-...@vger.kernel.org;
> > net...@vger.kernel.org; linux-kernel@vger.kernel.org
> > Subject: [PATCH net v4 1/1] can: can_skb_set_owner(): fix ref counting
> > if socket was closed before setting skb ownership
> >
> > There are two ref count variables controlling the free()ing of a socket:
> > - struct sock::sk_refcnt - which is changed by sock_hold()/sock_put()
> > - struct sock::sk_wmem_alloc - which accounts the memory allocated by
> >   the skbs in the send path.
> >
> > In case there are still TX skbs on the fly and the socket() is closed,
> > the struct sock::sk_refcnt reaches 0. In the TX-path the CAN stack
> > clones an "echo" skb, calls sock_hold() on the original socket and
> > references it. This produces the following back trace:
> >
> > | WARNING: CPU: 0 PID: 280 at lib/refcount.c:25
> > | refcount_warn_saturate+0x114/0x134
> > | refcount_t: addition on 0; use-after-free.
> > | Modules linked in: coda_vpu(E) v4l2_jpeg(E) videobuf2_vmalloc(E)
> > imx_vdoa(E)
> > | CPU: 0 PID: 280 Comm: test_can.sh Tainted: GE
> > 5.11.0-04577-gf8ff6603c617 #203
> > | Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> > | Backtrace:
> > | [<80bafea4>] (dump_backtrace) from [<80bb0280>]
> > | (show_stack+0x20/0x24)
> > | r7: r6:600f0113 r5: r4:81441220 [<80bb0260>]
> > | (show_stack) from [<80bb593c>] (dump_stack+0xa0/0xc8) [<80bb589c>]
> > | (dump_stack) from [<8012b268>] (__warn+0xd4/0x114) r9:0019
> > | r8:80f4a8c2 r7:83e4150c r6: r5:0009 r4:80528f90
> > | [<8012b194>] (__warn) from [<80bb09c4>]
> > | (warn_slowpath_fmt+0x88/0xc8)
> > | r9:83f26400 r8:80f4a8d1 r7:0009 r6:80528f90 r5:0019
> > | r4:80f4a8c2 [<80bb0940>] (warn_slowpath_fmt) from [<80528f90>]
> > | (refcount_warn_saturate+0x114/0x134) r8: r7:
> > | r6:82b44000 r5:834e5600 r4:83f4d540 [<80528e7c>]
> > | (refcount_warn_saturate) from [<8079a4c8>]
> > | (__refcount_add.constprop.0+0x4c/0x50)
> > | [<8079a47c>] (__refcount_add.constprop.0) from [<8079a57c>]
> > | (can_put_echo_skb+0xb0/0x13c) [<8079a4cc>] (can_put_echo_skb) from
> > | [<8079ba98>] (flexcan_start_xmit+0x1c4/0x230) r9:0010
> > | r8:83f48610
> > | r7:0fdc r6:0c08 r5:82b44000 r4:834e5600 [<8079b8d4>]
> > | (flexcan_start_xmit) from [<80969078>] (netdev_start_xmit+0x44/0x70)
> > | r9:814c0ba0 r8:80c8790c r7: r6:834e5600 r5:82b44000
> > | r4:82ab1f00 [<80969034>] (netdev_start_xmit) from [<809725a4>]
> > | (dev_hard_start_xmit+0x19c/0x318) r9:814c0ba0 r8:
> > | r7:82ab1f00
> > | r6:82b44000 r5: r4:834e5600 [<80972408>]
> > | (dev_hard_start_xmit) from [<809c6584>] (sch_direct_xmit+0xcc/0x264)
> > | r10:834e5600
> > | r9: r8: r7:82b44000 r6:82ab1f00 r5:834e5600
> > | r4:83f27400 [<809c64b8>] (sch_direct_xmit) from [<809c6c0c>]
> > | (__qdisc_run+0x4f0/0x534)
> >
> > To fix this problem, only set skb ownership to sockets which have
> > still a ref count > 0.
> >
> > Cc: Oliver Hartkopp 
> > Cc: Andre Naujoks 
> > Suggested-by: Eric Dumazet 
> > Fixes: 0ae89beb283a ("can: add destructor for self generated skbs")
> > Signed-off-by: Oleksij Rempel 
> 
> I will give out a test result tomorrow when the board is available. 

I also hit this issue in the past, and this patch indeed fixes it. Thanks,
Oleksij Rempel.

Tested-by: Joakim Zhang 

Best Regards,
Joakim Zhang
> Best Regards,
> Joakim Zhang
> > ---
> >  include/linux/can/skb.h | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/linux/can/skb.h b/include/linux/can/skb.h index
> > 685f34cfba20..d82018cc0d0b 100644
> > --- a/include/linux/can/skb.h
> > +++ b/include/linux/can/skb.h
> > @@ -65,8 +65,12 @@ static inline void can_skb_reserve(struct sk_buff
> > *skb)
> >
> >  static inline void can_skb_set_owner(struct sk_buff *skb, struct sock *sk)
> {
> > -   if (sk) {
> > -   sock_hold(sk);
> > +   /*
> > +* If the socket has already been closed by user space, the refcount may
> > +* already be 0 (and the socket will be freed after the last TX skb has
> > +* been freed). So only increase socket refcount if the refcount is > 0.
> > +*/
> > +   if (sk && refcount_inc_not_zero(&sk->sk_refcnt)) {
> >
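
The "only take a reference if the count is still above zero" rule from
the patch quoted above can be sketched with C11 atomics. This is a
user-space illustration, not the kernel's refcount_t implementation:
ref_inc_not_zero(), set_owner() and the two structs are hypothetical
stand-ins.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct sock and struct sk_buff. */
struct sock_like { atomic_int refcnt; };
struct skb_like  { struct sock_like *sk; };

/* Take a reference only if the count has not already dropped to zero. */
static bool ref_inc_not_zero(atomic_int *r)
{
	int old = atomic_load(r);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(r, &old, old + 1))
			return true;
	}
	return false;
}

/* Attach the socket to the buffer only if a reference could be taken. */
static void set_owner(struct skb_like *skb, struct sock_like *sk)
{
	assert(skb != NULL);

	if (sk && ref_inc_not_zero(&sk->refcnt))
		skb->sk = sk;
}
```

A plain increment would resurrect a count that had already reached zero,
which is the situation the refcount warning in the trace above reports.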

Re: Why do kprobes and uprobes singlestep?

2021-03-01 Thread Masami Hiramatsu
Hi Oleg and Andy,

On Mon, 1 Mar 2021 17:51:31 +0100
Oleg Nesterov  wrote:

> Hi Andy,
> 
> sorry for delay.
> 
> On 02/23, Andy Lutomirski wrote:
> >
> > A while back, I let myself be convinced that kprobes genuinely need to
> > single-step the kernel on occasion, and I decided that this sucked but
> > I could live with it.  it would, however, be Really Really Nice (tm)
> > if we could have a rule that anyone running x86 Linux who single-steps
> > the kernel (e.g. kgdb and nothing else) gets to keep all the pieces
> > when the system falls apart around them.  Specifically, if we don't
> > allow kernel single-stepping and if we suitably limit kernel
> > instruction breakpoints (the latter isn't actually a major problem),
> > then we don't really really need to use IRET to return to the kernel,
> > and that means we can avoid some massive NMI nastiness.
> 
> Not sure I understand you correctly, I know almost nothing about low-level
> x86  magic.

x86 has normal interrupts and NMIs. When an NMI occurs, the CPU masks
further NMIs (the mask itself is hidden state) and IRET releases the mask.
The problem is that if an INT3 is hit in the NMI handler and single-stepping
is needed, it has to use IRET to set TF and return atomically.

> 
> But I guess this has nothing to do with uprobes, they do not single-step
> in kernel mode, right?

Agreed, if the problematic case is IRET from the NMI handler, uprobes
doesn't hit it because it is only invoked from user space.
Andy, what do you think?

> > Uprobes seem to single-step user code for no discernable reason.
> > (They want to trap after executing an out of line instruction, AFAICT.
> > Surely INT3 or even CALL after the out-of-line insn would work as well
> > or better.)
> 
> Uprobes use single-step from the very beginning, probably because this
> is the most simple and "standard" way to implement xol.
> 
> And please note that CALL/JMP/etc emulation was added much later to fix the
> problems with non-canonical addresses, and this emulation is still incomplete.

Yeah, I found another implementation of the emulation afterwards. Of course,
since uprobes only handles user space, it may need more care.

Thank you,

-- 
Masami Hiramatsu 


Re: [PATCH v5] i2c: virtio: add a virtio i2c frontend driver

2021-03-01 Thread Jie Deng



On 2021/3/1 19:54, Viresh Kumar wrote:

On 01-03-21, 14:41, Jie Deng wrote:

+/**
+ * struct virtio_i2c_req - the virtio I2C request structure
+ * @out_hdr: the OUT header of the virtio I2C message
+ * @write_buf: contains one I2C segment being written to the device
+ * @read_buf: contains one I2C segment being read from the device
+ * @in_hdr: the IN header of the virtio I2C message
+ */
+struct virtio_i2c_req {
+   struct virtio_i2c_out_hdr out_hdr;
+   u8 *write_buf;
+   u8 *read_buf;
+   struct virtio_i2c_in_hdr in_hdr;
+};

I am not able to appreciate the use of write/read bufs here as we
aren't trying to read/write data in the same transaction. Why do we
have two bufs here as well as in specs ?

What about this on top of your patch ?

---
  drivers/i2c/busses/i2c-virtio.c | 31 +++++++++++--------------------
  include/uapi/linux/virtio_i2c.h |  3 +--
  2 files changed, 12 insertions(+), 22 deletions(-)

diff --git a/drivers/i2c/busses/i2c-virtio.c b/drivers/i2c/busses/i2c-virtio.c
index 8c8bc9545418..e71ab1d2c83f 100644
--- a/drivers/i2c/busses/i2c-virtio.c
+++ b/drivers/i2c/busses/i2c-virtio.c
@@ -67,14 +67,13 @@ static int virtio_i2c_send_reqs(struct virtqueue *vq,
if (!buf)
break;
  
+		reqs[i].buf = buf;

+   sg_init_one(&msg_buf, reqs[i].buf, msgs[i].len);
+
if (msgs[i].flags & I2C_M_RD) {
-   reqs[i].read_buf = buf;
-   sg_init_one(&msg_buf, reqs[i].read_buf, msgs[i].len);
sgs[outcnt + incnt++] = &msg_buf;
} else {
-   reqs[i].write_buf = buf;
-   memcpy(reqs[i].write_buf, msgs[i].buf, msgs[i].len);
-   sg_init_one(&msg_buf, reqs[i].write_buf, msgs[i].len);
+   memcpy(reqs[i].buf, msgs[i].buf, msgs[i].len);
sgs[outcnt++] = &msg_buf;
}
  
@@ -84,13 +83,8 @@ static int virtio_i2c_send_reqs(struct virtqueue *vq,

err = virtqueue_add_sgs(vq, sgs, outcnt, incnt, &reqs[i], 
GFP_KERNEL);
if (err < 0) {
pr_err("failed to add msg[%d] to virtqueue.\n", i);
-   if (msgs[i].flags & I2C_M_RD) {
-   kfree(reqs[i].read_buf);
-   reqs[i].read_buf = NULL;
-   } else {
-   kfree(reqs[i].write_buf);
-   reqs[i].write_buf = NULL;
-   }
+   kfree(reqs[i].buf);
+   reqs[i].buf = NULL;
break;
}
}
@@ -118,14 +112,11 @@ static int virtio_i2c_complete_reqs(struct virtqueue *vq,
break;
}
  
-		if (msgs[i].flags & I2C_M_RD) {

-   memcpy(msgs[i].buf, req->read_buf, msgs[i].len);
-   kfree(req->read_buf);
-   req->read_buf = NULL;
-   } else {
-   kfree(req->write_buf);
-   req->write_buf = NULL;
-   }
+   if (msgs[i].flags & I2C_M_RD)
+   memcpy(msgs[i].buf, req->buf, msgs[i].len);
+
+   kfree(req->buf);
+   req->buf = NULL;
}
  
  	return i;

diff --git a/include/uapi/linux/virtio_i2c.h b/include/uapi/linux/virtio_i2c.h
index 92febf0c527e..61f0086ac75b 100644
--- a/include/uapi/linux/virtio_i2c.h
+++ b/include/uapi/linux/virtio_i2c.h
@@ -48,8 +48,7 @@ struct virtio_i2c_in_hdr {
   */
  struct virtio_i2c_req {
struct virtio_i2c_out_hdr out_hdr;
-   u8 *write_buf;
-   u8 *read_buf;
+   u8 *buf;
struct virtio_i2c_in_hdr in_hdr;
  };
  


That was my original proposal. I used to mirror this interface on
"struct i2c_msg".

But the design philosophy of the virtio TC is that VIRTIO devices are not
specific to Linux, so the spec design should avoid the limitations of the
current Linux driver behavior.

We had some discussion about this. You may check these links to learn
the story.

https://lists.oasis-open.org/archives/virtio-comment/202010/msg00016.html
https://lists.oasis-open.org/archives/virtio-comment/202010/msg00033.html
https://lists.oasis-open.org/archives/virtio-comment/202011/msg00025.html
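
For reference, the single-buffer request handling proposed in the quoted
diff can be sketched in user-space C. This only illustrates that
driver-side variant and does not settle the spec question discussed in
the links above; struct msg, struct req, req_prepare() and
req_complete() are hypothetical names, and the sketch assumes every
message has a non-zero length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one I2C message: direction, data, length. */
struct msg {
	bool read;
	unsigned char *buf;
	size_t len;
};

/* One bounce buffer per request, used for either direction. */
struct req {
	unsigned char *buf;
};

/* Allocate the bounce buffer; for writes, copy the payload into it. */
static int req_prepare(struct req *r, const struct msg *m)
{
	assert(r != NULL && m != NULL && m->len > 0);

	r->buf = malloc(m->len);
	if (!r->buf)
		return -1;
	if (!m->read)
		memcpy(r->buf, m->buf, m->len);
	return 0;
}

/* For reads, copy the device data back; then free in one place. */
static void req_complete(struct req *r, struct msg *m)
{
	if (m->read)
		memcpy(m->buf, r->buf, m->len);
	free(r->buf);
	r->buf = NULL;
}
```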


[PATCH] powercap: Add Hygon Fam18h RAPL support

2021-03-01 Thread Pu Wen
Enable Hygon Fam18h RAPL support for the power capping framework.

Signed-off-by: Pu Wen 
---
 drivers/powercap/intel_rapl_common.c | 1 +
 drivers/powercap/intel_rapl_msr.c| 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/powercap/intel_rapl_common.c 
b/drivers/powercap/intel_rapl_common.c
index fdda2a737186..73cf68af9770 100644
--- a/drivers/powercap/intel_rapl_common.c
+++ b/drivers/powercap/intel_rapl_common.c
@@ -1069,6 +1069,7 @@ static const struct x86_cpu_id rapl_ids[] __initconst = {
 
X86_MATCH_VENDOR_FAM(AMD, 0x17, &rapl_defaults_amd),
X86_MATCH_VENDOR_FAM(AMD, 0x19, &rapl_defaults_amd),
+   X86_MATCH_VENDOR_FAM(HYGON, 0x18, &rapl_defaults_amd),
{}
 };
 MODULE_DEVICE_TABLE(x86cpu, rapl_ids);
diff --git a/drivers/powercap/intel_rapl_msr.c 
b/drivers/powercap/intel_rapl_msr.c
index 78213d4b5b16..cc3b22881bfe 100644
--- a/drivers/powercap/intel_rapl_msr.c
+++ b/drivers/powercap/intel_rapl_msr.c
@@ -150,6 +150,7 @@ static int rapl_msr_probe(struct platform_device *pdev)
case X86_VENDOR_INTEL:
rapl_msr_priv = &rapl_msr_priv_intel;
break;
+   case X86_VENDOR_HYGON:
case X86_VENDOR_AMD:
rapl_msr_priv = &rapl_msr_priv_amd;
break;
-- 
2.23.0
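
The patch above works by adding one more row to a vendor/family match
table that reuses the AMD defaults. A user-space sketch of that kind of
table lookup follows; it is an illustration only, and enum vendor,
struct cpu_match, example_ids and match_cpu() are hypothetical names,
not the kernel's x86_cpu_id machinery.

```c
#include <assert.h>
#include <stddef.h>

enum vendor { VENDOR_INTEL, VENDOR_AMD, VENDOR_HYGON };

/* Hypothetical defaults object shared by several table rows. */
struct defaults { const char *name; };
static const struct defaults defaults_amd = { "amd" };

struct cpu_match {
	enum vendor vendor;
	unsigned int family;
	const struct defaults *data;
};

/* Supporting a new CPU family is one more row in this table. */
static const struct cpu_match example_ids[] = {
	{ VENDOR_AMD,   0x17, &defaults_amd },
	{ VENDOR_AMD,   0x19, &defaults_amd },
	{ VENDOR_HYGON, 0x18, &defaults_amd },	/* reuses the AMD defaults */
};

/* Return the defaults for a vendor/family pair, or NULL if unsupported. */
static const struct defaults *match_cpu(enum vendor v, unsigned int family)
{
	size_t n = sizeof(example_ids) / sizeof(example_ids[0]);

	assert(v == VENDOR_INTEL || v == VENDOR_AMD || v == VENDOR_HYGON);

	for (size_t i = 0; i < n; i++) {
		if (example_ids[i].vendor == v &&
		    example_ids[i].family == family)
			return example_ids[i].data;
	}
	return NULL;
}
```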



Re: [RFC PATCH 3/5] ASoC: audio-graph-card: Add bindings for sysclk and pll

2021-03-01 Thread Rob Herring
On Thu, Feb 25, 2021 at 12:06 PM Sameer Pujar  wrote:
>
> ASoC core provides callbacks snd_soc_dai_set_sysclk() and
> snd_soc_dai_set_pll() for system clock (sysclk) and pll configurations
> respectively. Add bindings for flexible sysclk or pll configurations
> which can be driven from CPU/Codec DAI or endpoint subnode from DT.
> This in turn helps to avoid hard codings in driver and makes it more
> generic.
>
> Also add system-clock related bindings, "system-clock-direction-out"
> and "system-clock-frequency", which are already supported.

This all looks like duplication of what the clock binding can provide.
We don't need 2 ways to describe clocks in DT.

Rob


Re: [PATCH v1 5/7] net: ipa: Add support for IPA on MSM8998

2021-03-01 Thread Alex Elder
On 2/11/21 11:50 AM, AngeloGioacchino Del Regno wrote:
> MSM8998 features IPA v3.1 (GSI v1.0): add the required configuration
> data for it.
> 
> Signed-off-by: AngeloGioacchino Del Regno 
> 

As of today, I have not looked at this file in detail.  You
probably see that the intent is to have this file define
pretty much everything that varies across platforms.  A lot
of this is found in "ipa_utils.c" in downstream code, and
it is organized differently there.

I have reworked the way resources are represented (also
not yet posted for review, but "soon").  I see you included
the additional ones but I'm not completely sure they'll
get programmed properly (but again, I haven't looked very
closely yet).

Interconnects are done differently upstream than downstream,
and to be honest I'm not completely on top of which platforms
require which interconnects.  I'm gathering information about
them as I can.

Jakub pointed out a compile problem, so you should definitely
avoid ever having those in your patches, but sometimes it
happens.

When I'm ready to post my IPA v3.1 data file for review
I'll take another, closer look at what you have here.

-Alex
> ---
>  drivers/net/ipa/Makefile   |   3 +-
>  drivers/net/ipa/ipa_data-msm8998.c | 407 +
>  drivers/net/ipa/ipa_data.h |   5 +
>  drivers/net/ipa/ipa_main.c |   4 +
>  4 files changed, 418 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/net/ipa/ipa_data-msm8998.c
> 
> diff --git a/drivers/net/ipa/Makefile b/drivers/net/ipa/Makefile
> index afe5df1e6eee..4a6f4053dce2 100644
> --- a/drivers/net/ipa/Makefile
> +++ b/drivers/net/ipa/Makefile
> @@ -9,4 +9,5 @@ ipa-y :=  ipa_main.o ipa_clock.o 
> ipa_reg.o ipa_mem.o \
>   ipa_endpoint.o ipa_cmd.o ipa_modem.o \
>   ipa_qmi.o ipa_qmi_msg.o
>  
> -ipa-y+=  ipa_data-sdm845.o ipa_data-sc7180.o
> +ipa-y+=  ipa_data-msm8998.o ipa_data-sdm845.o \
> + ipa_data-sc7180.o
> diff --git a/drivers/net/ipa/ipa_data-msm8998.c 
> b/drivers/net/ipa/ipa_data-msm8998.c
> new file mode 100644
> index ..90e724468e40
> --- /dev/null
> +++ b/drivers/net/ipa/ipa_data-msm8998.c
> @@ -0,0 +1,407 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
> + * Copyright (C) 2019-2020 Linaro Ltd.
> + * Copyright (C) 2021, AngeloGioacchino Del Regno
> + * 
> + */
> +
> +#include 
> +
> +#include "gsi.h"
> +#include "ipa_data.h"
> +#include "ipa_endpoint.h"
> +#include "ipa_mem.h"
> +
> +/* Endpoint configuration for the MSM8998 SoC. */
> +static const struct ipa_gsi_endpoint_data ipa_gsi_endpoint_data[] = {
> + [IPA_ENDPOINT_AP_COMMAND_TX] = {
> + .ee_id  = GSI_EE_AP,
> + .channel_id = 6,
> + .endpoint_id= 22,
> + .toward_ipa = true,
> + .channel = {
> + .tre_count  = 256,
> + .event_count= 256,
> + .tlv_count  = 18,
> + },
> + .endpoint = {
> + .seq_type   = IPA_SEQ_DMA_ONLY,
> + .config = {
> + .resource_group = 1,
> + .dma_mode   = true,
> + .dma_endpoint   = IPA_ENDPOINT_AP_LAN_RX,
> + },
> + },
> + },
> + [IPA_ENDPOINT_AP_LAN_RX] = {
> + .ee_id  = GSI_EE_AP,
> + .channel_id = 7,
> + .endpoint_id= 15,
> + .toward_ipa = false,
> + .channel = {
> + .tre_count  = 256,
> + .event_count= 256,
> + .tlv_count  = 8,
> + },
> + .endpoint = {
> + .seq_type   = IPA_SEQ_INVALID,
> + .config = {
> + .resource_group = 1,
> + .aggregation= true,
> + .status_enable  = true,
> + .rx = {
> + .pad_align  = ilog2(sizeof(u32)),
> + },
> + },
> + },
> + },
> + [IPA_ENDPOINT_AP_MODEM_TX] = {
> + .ee_id  = GSI_EE_AP,
> + .channel_id = 5,
> + .endpoint_id= 3,
> + .toward_ipa = true,
> + .channel = {
> + .tre_count  = 512,
> + .event_count= 512,
> + .tlv_count  = 16,
> + },
> + .endpoint = {
> + .filter_support = true,
> + .seq_type

Re: [PATCH v1 6/7] dt-bindings: net: qcom-ipa: Document qcom,sc7180-ipa compatible

2021-03-01 Thread Alex Elder
On 2/11/21 11:50 AM, AngeloGioacchino Del Regno wrote:
> The driver supports SC7180, but the binding was not documented.
> Just add it.

I hadn't noticed that!

I'm trying to get through reviewing your series
today and this will take another hour or so to
go validate to my satisfaction.

Would you be willing to submit just this patch
as a fix, and when you do I will give it a proper
review?

-Alex

> Signed-off-by: AngeloGioacchino Del Regno 
> 
> ---
>  Documentation/devicetree/bindings/net/qcom,ipa.yaml | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/devicetree/bindings/net/qcom,ipa.yaml 
> b/Documentation/devicetree/bindings/net/qcom,ipa.yaml
> index 8a2d12644675..b063c6c1077a 100644
> --- a/Documentation/devicetree/bindings/net/qcom,ipa.yaml
> +++ b/Documentation/devicetree/bindings/net/qcom,ipa.yaml
> @@ -43,7 +43,11 @@ description:
>  
>  properties:
>compatible:
> -const: "qcom,sdm845-ipa"
> +oneOf:
> +  - items:
> +  - enum:
> +  - "qcom,sdm845-ipa"
> +  - "qcom,sc7180-ipa"
>  
>reg:
>  items:
> 



Re: [PATCH v1 7/7] dt-bindings: net: qcom-ipa: Document qcom,msm8998-ipa compatible

2021-03-01 Thread Alex Elder
On 2/11/21 11:50 AM, AngeloGioacchino Del Regno wrote:
> MSM8998 support has been added: document the new compatible.
> 
> Signed-off-by: AngeloGioacchino Del Regno 
> 

With the previous patch in place, this becomes almost
automatic.

But I don't want to claim support for a platform
until things actually *work*.  I don't just mean
we can compile and load (and load firmware), I
want to be able to say we can actually carry LTE
data over IPA before we advertise the compatible
string here.

Maybe I'm being picky, but that's my preference.
It adds some motivation for getting the user space
tools squared away.

Thank you again very much for your patches.

-Alex
> ---
>  Documentation/devicetree/bindings/net/qcom,ipa.yaml | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/Documentation/devicetree/bindings/net/qcom,ipa.yaml 
> b/Documentation/devicetree/bindings/net/qcom,ipa.yaml
> index b063c6c1077a..9dacd224b606 100644
> --- a/Documentation/devicetree/bindings/net/qcom,ipa.yaml
> +++ b/Documentation/devicetree/bindings/net/qcom,ipa.yaml
> @@ -46,6 +46,7 @@ properties:
>  oneOf:
>- items:
>- enum:
> +  - "qcom,msm8998-ipa"
>- "qcom,sdm845-ipa"
>- "qcom,sc7180-ipa"
>  
> 



Re: [PATCH v1 0/7] Add support for IPA v3.1, GSI v1.0, MSM8998 IPA

2021-03-01 Thread Alex Elder
On 2/11/21 11:50 AM, AngeloGioacchino Del Regno wrote:
> Hey all!
> 
> This time around I thought that it would be nice to get some modem
> action going on. We have it, it's working (ish), so just.. why not.
> 
> This series adds support for IPA v3.1 (featuring GSI v1.0) and also
> takes account for some bits that are shared with other unimplemented
> IPA v3 variants and it is specifically targeting MSM8998, for which
> support is added.

It was more like "next month" rather than "next week," but I
finally took some more time to look at this today.

Again I think it's surprising how little code you had
to implement to get something that seems to be at least
modestly functional.

FYI I have undertaken an effort to make the upstream code
suitable for use for any IPA version (3.0-4.11) in the
past few months.  Most of what I've done is in line with
the things you found were necessary for IPA v3.1 support.
Early on I got most of the support for IPA v4.5 upstream,
and have been holding off trying to get other similar
changes out for review for other versions until I've had
more of a chance to test some of what's new in IPA v4.5.

In the coming weeks I will start posting more of this
work for review.  You'll see that I'm modifying many
things you do in your series (such as making version
checks not assume only v3.5.1 and v4.2 are supported).
My priority is on newer versions, but I want the code
to be (at least) correct for IPA v3.0, v3.1, and v3.5
as well.

What might be best is for you to consider using the
patches when I send them out.  I'll gladly give you some
credit when I do if you like (suggested-by, reviewed-by,
tested-by, whatever you feel is appropriate).  Please
let me know if you would like to be on the Cc list for
this sort of change.

> Since the userspace isn't entirely ready (as far as I can see) for
> data connection (3g/lte/whatever) through the modem, it was possible
> to only partially test this series.

Yes we're still figuring out how the upstream tools need
to interact with the kernel for configuration.

> Specifically, loading the IPA firmware and setting up the interface
> went just fine, along with a basic setup of the network interface
> that got exposed by this driver.

This is great to hear.

> With this series, the benefits that I see are:
>  1. The modem doesn't crash anymore when trying to setup a data
> connection, as now the modem firmware seems to be happy with
> having IPA initialized and ready;
>  2. Other random modem crashes while picking up LTE home network
> signal (even just for calling, nothing fancy) seem to be gone.
> 
> These are the reasons why I think that this series is ready for
> upstream action. It's *at least* stabilizing the platform when
> the modem is up.
> 
> This was tested on the F(x)Tec Pro 1 (MSM8998) smartphone.

I unfortunately can't promise you you'll have the full
connection up and running, but we can probably get very
close.

It would be very helpful for you (someone other than me,
that is) to participate in validating the changes I am
now finalizing.  I hope you're willing.

I'll offer a few more specific comments on each of your
patches.

-Alex


> AngeloGioacchino Del Regno (7):
>   net: ipa: Add support for IPA v3.1 with GSI v1.0
>   net: ipa: endpoint: Don't read unexistant register on IPAv3.1
>   net: ipa: gsi: Avoid some writes during irq setup for older IPA
>   net: ipa: gsi: Use right masks for GSI v1.0 channels hw param
>   net: ipa: Add support for IPA on MSM8998
>   dt-bindings: net: qcom-ipa: Document qcom,sc7180-ipa compatible
>   dt-bindings: net: qcom-ipa: Document qcom,msm8998-ipa compatible
> 
>  .../devicetree/bindings/net/qcom,ipa.yaml |   7 +-
>  drivers/net/ipa/Makefile  |   3 +-
>  drivers/net/ipa/gsi.c |  33 +-
>  drivers/net/ipa/gsi_reg.h |   5 +
>  drivers/net/ipa/ipa_data-msm8998.c| 407 ++
>  drivers/net/ipa/ipa_data.h|   5 +
>  drivers/net/ipa/ipa_endpoint.c|  26 +-
>  drivers/net/ipa/ipa_main.c|  12 +-
>  drivers/net/ipa/ipa_reg.h |   3 +
>  drivers/net/ipa/ipa_version.h |   1 +
>  10 files changed, 480 insertions(+), 22 deletions(-)
>  create mode 100644 drivers/net/ipa/ipa_data-msm8998.c
> 



Re: [PATCH v1 3/7] net: ipa: gsi: Avoid some writes during irq setup for older IPA

2021-03-01 Thread Alex Elder
On 2/11/21 11:50 AM, AngeloGioacchino Del Regno wrote:
> On some IPA versions (v3.1 and older), writing to registers
> GSI_INTER_EE_SRC_CH_IRQ_OFFSET and GSI_INTER_EE_SRC_EV_CH_IRQ_OFFSET
> will generate a fault and the SoC will lockup.
> 
> Avoid clearing CH and EV_CH interrupts on GSI probe to fix this bad
> behavior: we are anyway not going to get spurious interrupts.

I think the reason for this might be that these registers
are located at a different offset for IPA v3.1.

I'd rather get it right and actively disable these
interrupts rather than assume they won't fire.

Also...  you included an extra blank line; avoid that.

-Alex

> Signed-off-by: AngeloGioacchino Del Regno 
> 
> ---
>  drivers/net/ipa/gsi.c | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ipa/gsi.c b/drivers/net/ipa/gsi.c
> index 6315336b3ca8..b5460cbb085c 100644
> --- a/drivers/net/ipa/gsi.c
> +++ b/drivers/net/ipa/gsi.c
> @@ -207,11 +207,14 @@ static void gsi_irq_setup(struct gsi *gsi)
>   iowrite32(0, gsi->virt + GSI_CNTXT_SRC_IEOB_IRQ_MSK_OFFSET);
>  
>   /* Reverse the offset adjustment for inter-EE register offsets */
> - adjust = gsi->version < IPA_VERSION_4_5 ? 0 : GSI_EE_REG_ADJUST;
> - iowrite32(0, gsi->virt + adjust + GSI_INTER_EE_SRC_CH_IRQ_OFFSET);
> - iowrite32(0, gsi->virt + adjust + GSI_INTER_EE_SRC_EV_CH_IRQ_OFFSET);
> + if (gsi->version > IPA_VERSION_3_1) {
> + adjust = gsi->version < IPA_VERSION_4_5 ? 0 : GSI_EE_REG_ADJUST;
> + iowrite32(0, gsi->virt + adjust + 
> GSI_INTER_EE_SRC_CH_IRQ_OFFSET);
> + iowrite32(0, gsi->virt + adjust + 
> GSI_INTER_EE_SRC_EV_CH_IRQ_OFFSET);
> + }
>  
>   iowrite32(0, gsi->virt + GSI_CNTXT_GSI_IRQ_EN_OFFSET);
> +
>  }
>  
>  /* Turn off all GSI interrupts when we're all done */
> 



Re: [PATCH v1 4/7] net: ipa: gsi: Use right masks for GSI v1.0 channels hw param

2021-03-01 Thread Alex Elder
On 2/11/21 11:50 AM, AngeloGioacchino Del Regno wrote:
> In GSI v1.0 the register GSI_HW_PARAM_2_OFFSET has different layout
> so the number of channels and events per EE are, of course, laid out
> in 8 bits each (0-7, 8-15 respectively).
> 
> Signed-off-by: AngeloGioacchino Del Regno 
> 
> ---
>  drivers/net/ipa/gsi.c | 16 +---
>  drivers/net/ipa/gsi_reg.h |  5 +
>  2 files changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ipa/gsi.c b/drivers/net/ipa/gsi.c
> index b5460cbb085c..3311ffe514c9 100644
> --- a/drivers/net/ipa/gsi.c
> +++ b/drivers/net/ipa/gsi.c
> @@ -1790,7 +1790,7 @@ static void gsi_channel_teardown(struct gsi *gsi)
>  int gsi_setup(struct gsi *gsi)
>  {
>   struct device *dev = gsi->dev;
> - u32 val;
> + u32 val, mask;
>   int ret;
>  
>   /* Here is where we first touch the GSI hardware */
> @@ -1804,7 +1804,12 @@ int gsi_setup(struct gsi *gsi)
>  
>   val = ioread32(gsi->virt + GSI_GSI_HW_PARAM_2_OFFSET);
>  
> - gsi->channel_count = u32_get_bits(val, NUM_CH_PER_EE_FMASK);
> + if (gsi->version == IPA_VERSION_3_1)
> + mask = GSIV1_NUM_CH_PER_EE_FMASK;
> + else
> + mask = NUM_CH_PER_EE_FMASK;
> +
> + gsi->channel_count = u32_get_bits(val, mask);

I have a different way of doing this, at least for
encoding, and I'd rather use a similar convention in
this case.  At some point it might become obvious
that "there's got to be a better way" and I might have
to consider something else, but for now I've been
doing what I describe below.

Anyway, what I'd ask for here is to create a static
inline function in "ipa_reg.h" (or "gsi_reg.h") to
extract these values.  In this case it might look
like this:

	static inline u32 num_ev_per_ee_get(enum ipa_version version,
					    u32 val)
	{
		if (version == IPA_VERSION_3_0 || version == IPA_VERSION_3_1)
			return u32_get_bits(val, GENMASK(8, 0));

		return u32_get_bits(val, GENMASK(7, 3));
	}

(I'm not sure if the above is correct for all versions...)

Then the caller would do:
	gsi->evt_ring_count = num_ev_per_ee_get(ipa->version, val);

I'd want the same general thing for the channel count.
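
For illustration only, the channel-count counterpart could be sketched as below. This is a userspace approximation, not the driver code: GENMASK() and u32_get_bits() are re-implemented locally, the enum is a reduced stand-in for the driver's version enum, and the masks are the ones the quoted patch defines (GSIV1_NUM_CH_PER_EE_FMASK and NUM_CH_PER_EE_FMASK). Treating IPA v3.0 the same as v3.1 is an assumption carried over from the example above.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's GENMASK(): bits h..l set */
#define GENMASK(h, l)	((~0u << (l)) & (~0u >> (31 - (h))))

/* Reduced, illustrative version enum (not the driver's full list) */
enum ipa_version {
	IPA_VERSION_3_0,
	IPA_VERSION_3_1,
	IPA_VERSION_3_5_1,
	IPA_VERSION_4_2,
};

/* Userspace stand-in for u32_get_bits(): extract the field under mask */
static inline uint32_t u32_get_bits(uint32_t val, uint32_t mask)
{
	/* Dividing by the mask's lowest set bit shifts the field down */
	return (val & mask) / (mask & (~mask + 1));
}

/* Number of channels per EE, as encoded in GSI_HW_PARAM_2 */
static inline uint32_t num_ch_per_ee_get(enum ipa_version version,
					 uint32_t val)
{
	/* Assumption: v3.0 uses the same GSI v1.0 layout as v3.1 */
	if (version == IPA_VERSION_3_0 || version == IPA_VERSION_3_1)
		return u32_get_bits(val, GENMASK(15, 8));

	return u32_get_bits(val, GENMASK(7, 3));
}
```

The caller would then read the register once and pass the raw value to both helpers, keeping the version test out of gsi_setup().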

-Alex

>   if (!gsi->channel_count) {
>   dev_err(dev, "GSI reports zero channels supported\n");
>   return -EINVAL;
> @@ -1816,7 +1821,12 @@ int gsi_setup(struct gsi *gsi)
>   gsi->channel_count = GSI_CHANNEL_COUNT_MAX;
>   }
>  
> - gsi->evt_ring_count = u32_get_bits(val, NUM_EV_PER_EE_FMASK);
> + if (gsi->version == IPA_VERSION_3_1)
> + mask = GSIV1_NUM_EV_PER_EE_FMASK;
> + else
> + mask = NUM_EV_PER_EE_FMASK;
> +
> + gsi->evt_ring_count = u32_get_bits(val, mask);
>   if (!gsi->evt_ring_count) {
>   dev_err(dev, "GSI reports zero event rings supported\n");
>   return -EINVAL;
> diff --git a/drivers/net/ipa/gsi_reg.h b/drivers/net/ipa/gsi_reg.h
> index 0e138bbd8205..4ba579fa21c2 100644
> --- a/drivers/net/ipa/gsi_reg.h
> +++ b/drivers/net/ipa/gsi_reg.h
> @@ -287,6 +287,11 @@ enum gsi_generic_cmd_opcode {
>   GSI_EE_N_GSI_HW_PARAM_2_OFFSET(GSI_EE_AP)
>  #define GSI_EE_N_GSI_HW_PARAM_2_OFFSET(ee) \
>   (0x0001f040 + 0x4000 * (ee))
> +
> +/* Fields below are present for IPA v3.1 with GSI version 1 */
> +#define GSIV1_NUM_EV_PER_EE_FMASKGENMASK(8, 0)
> +#define GSIV1_NUM_CH_PER_EE_FMASKGENMASK(15, 8)
> +/* Fields below are present for IPA v3.5.1 and above */
>  #define IRAM_SIZE_FMASK  GENMASK(2, 0)
>  #define NUM_CH_PER_EE_FMASK  GENMASK(7, 3)
>  #define NUM_EV_PER_EE_FMASK  GENMASK(12, 8)
> 



Re: linux-next: build failure after merge of the powerpc-fixes tree

2021-03-01 Thread Michael Ellerman
Stephen Rothwell  writes:
> Hi all,
>
> After merging the powerpc-fixes tree, today's linux-next build (powerpc
> allyesconfig) failed like this:
>
> drivers/net/ethernet/ibm/ibmvnic.c:5399:13: error: conflicting types for 
> 'ibmvnic_remove'
>  5399 | static void ibmvnic_remove(struct vio_dev *dev)
>   | ^~
> drivers/net/ethernet/ibm/ibmvnic.c:81:12: note: previous declaration of 
> 'ibmvnic_remove' was here
>81 | static int ibmvnic_remove(struct vio_dev *);
>   |^~
>
> Caused by commit
>
>   1bdd1e6f9320 ("vio: make remove callback return void")

Gah, is IBMVNIC in any of our defconfigs?! ... no it's not.

> I have applied the following patch for today:

Thanks, I'll squash it in.

cheers

> From: Stephen Rothwell 
> Date: Tue, 2 Mar 2021 11:06:37 +1100
> Subject: [PATCH] vio: fix for make remove callback return void
>
> Signed-off-by: Stephen Rothwell 
> ---
>  drivers/net/ethernet/ibm/ibmvnic.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
> b/drivers/net/ethernet/ibm/ibmvnic.c
> index eb39318766f6..fe3201ba2034 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -78,7 +78,6 @@ MODULE_LICENSE("GPL");
>  MODULE_VERSION(IBMVNIC_DRIVER_VERSION);
>  
>  static int ibmvnic_version = IBMVNIC_INITIAL_VERSION;
> -static int ibmvnic_remove(struct vio_dev *);
>  static void release_sub_crqs(struct ibmvnic_adapter *, bool);
>  static int ibmvnic_reset_crq(struct ibmvnic_adapter *);
>  static int ibmvnic_send_crq_init(struct ibmvnic_adapter *);
> -- 
> 2.30.0
>
> -- 
> Cheers,
> Stephen Rothwell


Re: [RFC PATCH] powercap: Add Hygon Fam18h RAPL support

2021-03-01 Thread Wen Pu
On 2021/3/1 22:20, Rafael J. Wysocki wrote:
> On Mon, Mar 1, 2021 at 3:18 AM Wen Pu  wrote:
>>
>> On 2021/2/28 23:42, Srinivas Pandruvada wrote:
>>> On Thu, 2021-02-25 at 21:01 +0800, Pu Wen wrote:
>>>> Enable Hygon Fam18h RAPL support for the power capping framework.
>>>>
>>> If this patch is tested and works on this processor, not sure why this
>>> is RFC?
>>
>> This patch is tested and works on Hygon processor. The 'RFC' is automated
>> generated by my script ;)
> 
> Well, care to resend as non-RFC, then?

OK, already resent. Thanks!

-- 
Regards,
Pu Wen

Re: [PATCH] perf stat: improve readability of shadow stats

2021-03-01 Thread Namhyung Kim
On Tue, Mar 2, 2021 at 4:19 AM Jiri Olsa  wrote:
>
> On Tue, Mar 02, 2021 at 01:24:02AM +0800, Changbin Du wrote:
> > This does follow two changes:
> >   1) Select appropriate unit between K/M/G.
> >   2) Use 'cpu-sec' instead of 'sec' to state this is not the wall-time.
> >
> > $ sudo ./perf stat -a -- sleep 1
> >
> > Before: Unit 'M' is selected even the number is very small.
> >  Performance counter stats for 'system wide':
> >
> >   4,003.06 msec cpu-clock #3.998 CPUs utilized
> > 16,179  context-switches  #0.004 M/sec
> >161  cpu-migrations#0.040 K/sec
> >  4,699  page-faults   #0.001 M/sec
> >  6,135,801,925  cycles#1.533 GHz
> >   (83.21%)
> >  5,783,308,491  stalled-cycles-frontend   #   94.26% frontend 
> > cycles idle (83.21%)
> >  4,543,694,050  stalled-cycles-backend#   74.05% backend cycles 
> > idle  (66.49%)
> >  4,720,130,587  instructions  #0.77  insn per cycle
> >   #1.23  stalled cycles 
> > per insn  (83.28%)
> >753,848,078  branches  #  188.318 M/sec  
> >   (83.61%)
> > 37,457,747  branch-misses #4.97% of all 
> > branches  (83.48%)
> >
> >1.001283725 seconds time elapsed
> >
> > After:
> > $ sudo ./perf stat -a -- sleep 2
> >
> >  Performance counter stats for 'system wide':
> >
> >   8,003.20 msec cpu-clock #3.998 CPUs utilized
> >  9,768  context-switches  #1.221 K/cpu-sec
> >164  cpu-migrations#   20.492  /cpu-sec
>
> should you remove also the leading '/' in ' /cpu-sec' ?

The change looks good.  And I think we should keep '/' otherwise it'd be
more confusing.

>
>
> SNIP
>
> > @@ -1270,18 +1271,14 @@ void perf_stat__print_shadow_stats(struct 
> > perf_stat_config *config,
> >   generic_metric(config, evsel->metric_expr, 
> > evsel->metric_events, NULL,
> >   evsel->name, evsel->metric_name, NULL, 1, 
> > cpu, out, st);
> >   } else if (runtime_stat_n(st, STAT_NSECS, cpu, ) != 0) {
> > - char unit = 'M';
> > + char unit = ' ';
> >   char unit_buf[10];
> >
> >   total = runtime_stat_avg(st, STAT_NSECS, cpu, );
> > -
> >   if (total)
> > - ratio = 1000.0 * avg / total;
> > - if (ratio < 0.001) {
> > - ratio *= 1000;
> > - unit = 'K';
> > - }
> > - snprintf(unit_buf, sizeof(unit_buf), "%c/sec", unit);
> > + ratio = convert_unit_double(10.0 * avg / 
> > total, );
> > +
> > + snprintf(unit_buf, sizeof(unit_buf), "%c/cpu-sec", unit);
> >   print_metric(config, ctxp, NULL, "%8.3f", unit_buf, ratio);
>
> hum this will change -x output that people parse, so I don't think we can do 
> that

Agreed.

>
> >   } else if (perf_stat_evsel__is(evsel, SMI_NUM)) {
> >   print_smi_cost(config, cpu, out, st, );
> > diff --git a/tools/perf/util/units.c b/tools/perf/util/units.c
> > index a46762aec4c9..ac13b5ecde31 100644
> > --- a/tools/perf/util/units.c
> > +++ b/tools/perf/util/units.c
> > @@ -55,6 +55,28 @@ unsigned long convert_unit(unsigned long value, char 
> > *unit)
> >   return value;
> >  }
> >
> > +double convert_unit_double(double value, char *unit)
> > +{
> > + *unit = ' ';
> > +
> > + if (value > 1000.0) {
> > + value /= 1000.0;
> > + *unit = 'K';
> > + }
> > +
> > + if (value > 1000.0) {
> > + value /= 1000.0;
> > + *unit = 'M';
> > + }
> > +
> > + if (value > 1000.0) {
> > + value /= 1000.0;
> > + *unit = 'G';
> > + }
> > +
> > + return value;
> > +}
>
> we have convert_unit function just above doing the same only with
> unsigned long.. let's have one base function with double values and
> another one casting the result to unsigned long

Sounds good.
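
A rough sketch of that arrangement, assuming the function names from the quoted patch and a simple loop in place of the three repeated tests. Note that because truncation happens only once at the end, the integer wrapper may not match the existing integer-only convert_unit() for every input.

```c
/* Base helper: divide by 1000 until the value fits, picking K, M or G */
double convert_unit_double(double value, char *unit)
{
	static const char prefixes[] = { 'K', 'M', 'G' };
	unsigned int i;

	*unit = ' ';
	for (i = 0; i < sizeof(prefixes) && value > 1000.0; i++) {
		value /= 1000.0;
		*unit = prefixes[i];
	}

	return value;
}

/* Integer variant: same scaling, with the fraction truncated at the end */
unsigned long convert_unit(unsigned long value, char *unit)
{
	return (unsigned long)convert_unit_double((double)value, unit);
}
```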

Thanks,
Namhyung


Re: [PATCH v1 2/7] net: ipa: endpoint: Don't read unexistant register on IPAv3.1

2021-03-01 Thread Alex Elder
On 2/11/21 11:50 AM, AngeloGioacchino Del Regno wrote:
> On IPAv3.1 there is no such FLAVOR_0 register so it is impossible
> to read tx/rx channel masks and we have to rely on the correctness
> on the provided configuration.

This works, and is simple.

I think I would rather populate the available mask here
with a mask containing the actual version-specific available
endpoints.  On the other hand, looking at the downstream code,
it looks like almost any of these endpoints could be used.

So, while I don't know for sure the all-1's value here is
*correct*, it's more of a validation check anyway, so it's
probably fine.
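
To make the "validation check" point concrete, here is a minimal sketch. The helper name and the example masks are hypothetical; only the idea (configured endpoints must be a subset of the available mask) comes from the discussion above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when every configured endpoint ID is also
 * set in the mask of endpoints reported (or assumed) to be available.
 */
bool endpoints_valid(uint32_t available, uint32_t configured)
{
	return (configured & ~available) == 0;
}
```

With available set to all ones, as the patch does for IPA v3.1, this test can never fail, so correctness rests entirely on the platform configuration data.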

-Alex

> Signed-off-by: AngeloGioacchino Del Regno 
> 
> ---
>  drivers/net/ipa/ipa_endpoint.c | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/net/ipa/ipa_endpoint.c b/drivers/net/ipa/ipa_endpoint.c
> index 06d8aa34276e..10c477e1bb90 100644
> --- a/drivers/net/ipa/ipa_endpoint.c
> +++ b/drivers/net/ipa/ipa_endpoint.c
> @@ -1659,6 +1659,15 @@ int ipa_endpoint_config(struct ipa *ipa)
>   u32 max;
>   u32 val;
>  
> + /* Some IPA versions don't provide a FLAVOR register and we cannot
> +  * check the rx/tx masks hence we have to rely on the correctness
> +  * of the provided configuration.
> +  */
> + if (ipa->version == IPA_VERSION_3_1) {
> + ipa->available = U32_MAX;
> + return 0;
> + }
> +
>   /* Find out about the endpoints supplied by the hardware, and ensure
>* the highest one doesn't exceed the number we support.
>*/
> 



Re: [PATCH v1 1/7] net: ipa: Add support for IPA v3.1 with GSI v1.0

2021-03-01 Thread Alex Elder
On 2/11/21 11:50 AM, AngeloGioacchino Del Regno wrote:
> In preparation for adding support for the MSM8998 SoC's IPA,
> add the necessary bits for IPA version 3.1 featuring GSI 1.0,
> found on at least MSM8998.
> 
> Signed-off-by: AngeloGioacchino Del Regno 
> 

Overall, this looks good.  As I mentioned, I've
implemented a very similar set of changes in my
private development tree.  It's part of a much
larger set of changes intended to allow many
IPA versions to be supported.

A few minor comments, below.

-Alex

> ---
>  drivers/net/ipa/gsi.c  |  8 
>  drivers/net/ipa/ipa_endpoint.c | 17 +
>  drivers/net/ipa/ipa_main.c |  8 ++--
>  drivers/net/ipa/ipa_reg.h  |  3 +++
>  drivers/net/ipa/ipa_version.h  |  1 +
>  5 files changed, 23 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/net/ipa/gsi.c b/drivers/net/ipa/gsi.c
> index 14d9a791924b..6315336b3ca8 100644
> --- a/drivers/net/ipa/gsi.c
> +++ b/drivers/net/ipa/gsi.c
> @@ -794,14 +794,14 @@ static void gsi_channel_program(struct gsi_channel 
> *channel, bool doorbell)
>  
>   /* Max prefetch is 1 segment (do not set MAX_PREFETCH_FMASK) */
>  
> - /* We enable the doorbell engine for IPA v3.5.1 */
> - if (gsi->version == IPA_VERSION_3_5_1 && doorbell)
> + /* We enable the doorbell engine for IPA v3.x */
> + if (gsi->version < IPA_VERSION_4_0 && doorbell)

My version:
if (gsi->version < IPA_VERSION_4_0 && doorbell)

So... You're doing the right thing.  Almost all changes I made
like this were identical to yours; others were (I think all)
equivalent.
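
As a small illustration of why range comparisons scale better than equality tests here, the sketch below assumes the version enum is declared oldest first. The enum contents and both helper names are hypothetical; the two conditions are the ones from the quoted hunk.

```c
#include <stdbool.h>

/* Assumed ordering: oldest hardware first, so "<" means "older than" */
enum ipa_version {
	IPA_VERSION_3_0,
	IPA_VERSION_3_1,
	IPA_VERSION_3_5,
	IPA_VERSION_3_5_1,
	IPA_VERSION_4_0,
	IPA_VERSION_4_1,
	IPA_VERSION_4_2,
	IPA_VERSION_4_5,
};

/* Doorbell engine is used on every IPA v3.x, not just v3.5.1 */
bool gsi_use_doorbell_engine(enum ipa_version version, bool doorbell)
{
	return version < IPA_VERSION_4_0 && doorbell;
}

/* Prefetch/escape buffer setup applies from v4.0 on, except for the
 * AP command channel.
 */
bool gsi_prefetch_config_applies(enum ipa_version version, bool command)
{
	return version >= IPA_VERSION_4_0 && !command;
}
```

Adding a new v3.x entry to the enum then requires no change to either test.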

>   val |= USE_DB_ENG_FMASK;
>  
>   /* v4.0 introduces an escape buffer for prefetch.  We use it
>* on all but the AP command channel.
>*/
> - if (gsi->version != IPA_VERSION_3_5_1 && !channel->command) {
> + if (gsi->version >= IPA_VERSION_4_0 && !channel->command) {
>   /* If not otherwise set, prefetch buffers are used */
>   if (gsi->version < IPA_VERSION_4_5)
>   val |= USE_ESCAPE_BUF_ONLY_FMASK;

. . .

> diff --git a/drivers/net/ipa/ipa_main.c b/drivers/net/ipa/ipa_main.c
> index 84bb8ae92725..be191993fbec 100644
> --- a/drivers/net/ipa/ipa_main.c
> +++ b/drivers/net/ipa/ipa_main.c

. . .

> @@ -276,6 +276,7 @@ static void ipa_hardware_config_qsb(struct ipa *ipa)
>  
>   max1 = 12;
>   switch (version) {
> + case IPA_VERSION_3_1:

I do this a little differently now.  These values will be
found in the "ipa_data" file for the platform.

Also I think you'd need different values for IPA v3.1 than
for IPA v3.5.1.

>   case IPA_VERSION_3_5_1:
>   max0 = 8;
>   break;
> @@ -404,6 +405,9 @@ static void ipa_hardware_config(struct ipa *ipa)
>   /* Enable open global clocks (not needed for IPA v4.5) */
>   val = GLOBAL_FMASK;
>   val |= GLOBAL_2X_CLK_FMASK;
> + if (version == IPA_VERSION_3_1)
> + val |= MISC_FMASK;

I see this being set for a workaround for IPA v3.1 in the
msm-4.4 tree, but the other two flags aren't set in that
case.  So this might not be quite right.

> +
>   iowrite32(val, ipa->reg_virt + IPA_REG_CLKON_CFG_OFFSET);
>  
>   /* Disable PA mask to allow HOLB drop */

. . .


[PATCH] x86/cpu/hygon: Set __max_die_per_package on Hygon

2021-03-01 Thread Pu Wen
Set the maximum DIE per package variable on Hygon using the
nodes_per_socket value in order to do per-DIE manipulations
by driver such as powercap.

Signed-off-by: Pu Wen 
---
 arch/x86/kernel/cpu/hygon.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/hygon.c b/arch/x86/kernel/cpu/hygon.c
index ae59115d18f9..0bd6c74e3ba1 100644
--- a/arch/x86/kernel/cpu/hygon.c
+++ b/arch/x86/kernel/cpu/hygon.c
@@ -215,12 +215,12 @@ static void bsp_init_hygon(struct cpuinfo_x86 *c)
u32 ecx;
 
ecx = cpuid_ecx(0x801e);
-   nodes_per_socket = ((ecx >> 8) & 7) + 1;
+   __max_die_per_package = nodes_per_socket = ((ecx >> 8) & 7) + 1;
} else if (boot_cpu_has(X86_FEATURE_NODEID_MSR)) {
u64 value;
 
rdmsrl(MSR_FAM10H_NODE_ID, value);
-   nodes_per_socket = ((value >> 3) & 7) + 1;
+   __max_die_per_package = nodes_per_socket = ((value >> 3) & 7) + 
1;
}
 
if (!boot_cpu_has(X86_FEATURE_AMD_SSBD) &&
-- 
2.23.0
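
For readers unfamiliar with the field layout, the two expressions in the patch can be isolated as below. This is a sketch only: the function names are made up for illustration, and the bit positions are taken directly from the shifts and masks in the diff.

```c
#include <stdint.h>

/* Bits 10:8 of the CPUID leaf's ECX hold (nodes per socket - 1) */
unsigned int nodes_from_cpuid_ecx(uint32_t ecx)
{
	return ((ecx >> 8) & 7) + 1;
}

/* Bits 5:3 of the node ID MSR value hold (nodes per socket - 1) */
unsigned int nodes_from_node_id_msr(uint64_t value)
{
	return ((value >> 3) & 7) + 1;
}
```

The patch assigns the same result to both nodes_per_socket and __max_die_per_package, so a 3-bit field value of 3 yields four dies per package.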



Re: [PATCH] mm/memcg: set memcg when split pages

2021-03-01 Thread Zi Yan
On 1 Mar 2021, at 20:34, Zhou Guanghui wrote:

> When split page, the memory cgroup info recorded in first page is
> not copied to tail pages. In this case, when the tail pages are
> freed, the uncharge operation is not performed. As a result, the
> usage of this memcg keeps increasing, and the OOM may occur.
>
> So, the copying of first page's memory cgroup info to tail pages
> is needed when split page.
>
> Signed-off-by: Zhou Guanghui 
> ---
>  include/linux/memcontrol.h | 10 ++
>  mm/page_alloc.c|  4 +++-
>  2 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index e6dc793d587d..c7e2b4421dc1 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -867,6 +867,12 @@ void mem_cgroup_print_oom_group(struct mem_cgroup 
> *memcg);
>  extern bool cgroup_memory_noswap;
>  #endif
>
> +static inline void copy_page_memcg(struct page *dst, struct page *src)
> +{
> + if (src->memcg_data)
> + dst->memcg_data = src->memcg_data;
> +}
> +
>  struct mem_cgroup *lock_page_memcg(struct page *page);
>  void __unlock_page_memcg(struct mem_cgroup *memcg);
>  void unlock_page_memcg(struct page *page);
> @@ -1291,6 +1297,10 @@ mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg)
>  {
>  }
>
> +static inline void copy_page_memcg(struct page *dst, struct page *src)
> +{
> +}
> +
>  static inline struct mem_cgroup *lock_page_memcg(struct page *page)
>  {
>   return NULL;
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 3e4b29ee2b1e..ee0a63dc1c9b 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3307,8 +3307,10 @@ void split_page(struct page *page, unsigned int order)
>   VM_BUG_ON_PAGE(PageCompound(page), page);
>   VM_BUG_ON_PAGE(!page_count(page), page);
>
> - for (i = 1; i < (1 << order); i++)
> + for (i = 1; i < (1 << order); i++) {
>   set_page_refcounted(page + i);
> + copy_page_memcg(page + i, page);
> + }
>   split_page_owner(page, 1 << order);
>  }
>  EXPORT_SYMBOL_GPL(split_page);
> -- 
> 2.25.0

+memcg maintainers

split_page() is used for non-compound higher-order pages. I am not sure
whether any such pages are monitored by memcg. Please let me know
if I missed anything.
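
To make the proposed behaviour easier to reason about, here is a toy userspace model of the patched loop. struct page is reduced to two fields and split_page_model() is a hypothetical name; only the copy rule (tail pages inherit the head page's memcg_data when it is set) comes from the quoted patch.

```c
/* Toy model: only the fields this example needs */
struct page {
	unsigned long memcg_data;
	int refcount;
};

/* Mirrors the helper added by the patch */
static void copy_page_memcg(struct page *dst, const struct page *src)
{
	if (src->memcg_data)
		dst->memcg_data = src->memcg_data;
}

/* Split a 2^order block: each tail page gets its own reference and
 * inherits the head page's memcg association.
 */
void split_page_model(struct page *page, unsigned int order)
{
	unsigned int i;

	for (i = 1; i < (1u << order); i++) {
		page[i].refcount = 1;	/* stands in for set_page_refcounted() */
		copy_page_memcg(&page[i], page);
	}
}
```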

—
Best Regards,
Yan Zi




Re: [PATCH v3] ath10k: skip the wait for completion to recovery in shutdown path

2021-03-01 Thread Abhishek Kumar
This patch seems to address the comments on v2. Overall this patch LGTM.

Reviewed-by: Abhishek Kumar 

On Tue, Feb 23, 2021 at 6:29 AM Youghandhar Chintala
 wrote:
>
> Currently in the shutdown callback we wait for recovery to complete
> before freeing up the resources. This results in additional two seconds
> delay during the shutdown and thereby increase the shutdown time.
>
> As an attempt to take less time during shutdown, remove the wait for
> recovery completion in the shutdown callback and added an API to freeing
> the reosurces in which they were common for shutdown and removing
> the module.
>
> Tested-on: WCN3990 hw1.0 SNOC WLAN.HL.3.1-01040-QCAHLSWMTPLZ-1
>
> Signed-off-by: Youghandhar Chintala 
> Change-Id: I65bc27b5adae1fedc7f7b367ef13aafbd01f8c0c
> ---
> Changes from v2:
> -Corrected commit text and added common API for freeing the
>  resources for shutdown and unloading the module
> ---
>  drivers/net/wireless/ath/ath10k/snoc.c | 29 ++
>  1 file changed, 20 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath10k/snoc.c 
> b/drivers/net/wireless/ath/ath10k/snoc.c
> index 84666f72bdfa..70b3f2bd1c81 100644
> --- a/drivers/net/wireless/ath/ath10k/snoc.c
> +++ b/drivers/net/wireless/ath/ath10k/snoc.c
> @@ -1781,17 +1781,11 @@ static int ath10k_snoc_probe(struct platform_device 
> *pdev)
> return ret;
>  }
>
> -static int ath10k_snoc_remove(struct platform_device *pdev)
> +static int ath10k_snoc_free_resources(struct ath10k *ar)
>  {
> -   struct ath10k *ar = platform_get_drvdata(pdev);
> struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar);
>
> -   ath10k_dbg(ar, ATH10K_DBG_SNOC, "snoc remove\n");
> -
> -   reinit_completion(>driver_recovery);
> -
> -   if (test_bit(ATH10K_SNOC_FLAG_RECOVERY, _snoc->flags))
> -   wait_for_completion_timeout(>driver_recovery, 3 * HZ);
> +   ath10k_dbg(ar, ATH10K_DBG_SNOC, "snoc free resources\n");
>
> set_bit(ATH10K_SNOC_FLAG_UNREGISTERING, _snoc->flags);
>
> @@ -1805,12 +1799,29 @@ static int ath10k_snoc_remove(struct platform_device 
> *pdev)
> return 0;
>  }
>
> +static int ath10k_snoc_remove(struct platform_device *pdev)
> +{
> +   struct ath10k *ar = platform_get_drvdata(pdev);
> +   struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar);
> +
> +   ath10k_dbg(ar, ATH10K_DBG_SNOC, "snoc remove\n");
> +
> +   reinit_completion(>driver_recovery);
> +
> +   if (test_bit(ATH10K_SNOC_FLAG_RECOVERY, _snoc->flags))
> +   wait_for_completion_timeout(>driver_recovery, 3 * HZ);
> +
> +   ath10k_snoc_free_resources(ar);
> +
> +   return 0;
> +}
> +
>  static void ath10k_snoc_shutdown(struct platform_device *pdev)
>  {
> struct ath10k *ar = platform_get_drvdata(pdev);
>
> ath10k_dbg(ar, ATH10K_DBG_SNOC, "snoc shutdown\n");
> -   ath10k_snoc_remove(pdev);
> +   ath10k_snoc_free_resources(ar);
>  }
>
>  static struct platform_driver ath10k_snoc_driver = {
> --
> 2.29.0
>


Re: [PATCH 04/11] perf test: Fix cpu and thread map leaks in sw_clock_freq test

2021-03-01 Thread Namhyung Kim
Hi Jiri,

On Tue, Mar 2, 2021 at 2:24 AM Jiri Olsa  wrote:
>
> On Mon, Mar 01, 2021 at 11:04:02PM +0900, Namhyung Kim wrote:
> > The evlist has the maps with its own refcounts so we don't need to set
> > the pointers to NULL.  Otherwise following error was reported by Asan.
> >
> > Also change the goto label since it doesn't need to have two.
> >
> >   # perf test -v 25
> >   25: Software clock events period values:
> >   --- start ---
> >   test child forked, pid 149154
> >   mmap size 528384B
> >   mmap size 528384B
> >
> >   =
> >   ==149154==ERROR: LeakSanitizer: detected memory leaks
> >
> >   Direct leak of 32 byte(s) in 1 object(s) allocated from:
> > #0 0x7fef5cd071f8 in __interceptor_realloc 
> > ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:164
> > #1 0x56260d5e8b8e in perf_thread_map__realloc 
> > /home/namhyung/project/linux/tools/lib/perf/threadmap.c:23
> > #2 0x56260d3df7a9 in thread_map__new_by_tid util/thread_map.c:63
> > #3 0x56260d2ac6b2 in __test__sw_clock_freq tests/sw-clock.c:65
> > #4 0x56260d26d8fb in run_test tests/builtin-test.c:428
> > #5 0x56260d26d8fb in test_and_print tests/builtin-test.c:458
> > #6 0x56260d26fa53 in __cmd_test tests/builtin-test.c:679
> > #7 0x56260d26fa53 in cmd_test tests/builtin-test.c:825
> > #8 0x56260d2dbb64 in run_builtin 
> > /home/namhyung/project/linux/tools/perf/perf.c:313
> > #9 0x56260d165a88 in handle_internal_command 
> > /home/namhyung/project/linux/tools/perf/perf.c:365
> > #10 0x56260d165a88 in run_argv 
> > /home/namhyung/project/linux/tools/perf/perf.c:409
> > #11 0x56260d165a88 in main 
> > /home/namhyung/project/linux/tools/perf/perf.c:539
> > #12 0x7fef5c83cd09 in __libc_start_main ../csu/libc-start.c:308
> >
> > ...
> >   test child finished with 1
> >    end 
> >   Software clock events period values  : FAILED!
> >
> > Signed-off-by: Namhyung Kim 
> > ---
> >  tools/perf/tests/sw-clock.c | 12 
> >  1 file changed, 4 insertions(+), 8 deletions(-)
> >
> > diff --git a/tools/perf/tests/sw-clock.c b/tools/perf/tests/sw-clock.c
> > index a49c9e23053b..74988846be1d 100644
> > --- a/tools/perf/tests/sw-clock.c
> > +++ b/tools/perf/tests/sw-clock.c
> > @@ -42,8 +42,8 @@ static int __test__sw_clock_freq(enum perf_sw_ids 
> > clock_id)
> >   .disabled = 1,
> >   .freq = 1,
> >   };
> > - struct perf_cpu_map *cpus;
> > - struct perf_thread_map *threads;
> > + struct perf_cpu_map *cpus = NULL;
> > + struct perf_thread_map *threads = NULL;
> >   struct mmap *md;
> >
> >   attr.sample_freq = 500;
> > @@ -66,14 +66,11 @@ static int __test__sw_clock_freq(enum perf_sw_ids 
> > clock_id)
> >   if (!cpus || !threads) {
> >   err = -ENOMEM;
> >   pr_debug("Not enough memory to create thread/cpu maps\n");
> > - goto out_free_maps;
> > + goto out_delete_evlist;
> >   }
> >
> >   perf_evlist__set_maps(>core, cpus, threads);
> >
> > - cpus= NULL;
> > - threads = NULL;
>
> hum, so IIUC we added these and the other you remove in your patches long 
> time ago,
> because there was no refcounting at that time, right?

It seems my original patch just set the maps directly.

  bc96b361cbf9 perf tests: Add a test case for checking sw clock event frequency

And after that Adrian changed it to use the set_maps() helper.

  c5e6bd2ed3e8 perf tests: Fix software clock events test setting maps

It seems we already had the refcounting at the moment.  And then the libperf
renaming happened later.
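
The ownership rule behind the fix can be modelled in a few lines. This is a toy sketch with hypothetical types and names, not the libperf API: the list takes its own reference in set_maps(), so the caller must still drop the reference it got at creation time.

```c
#include <stdlib.h>

/* Toy stand-in for perf_cpu_map / perf_thread_map */
struct map {
	int refcnt;
};

struct map *map_new(void)
{
	struct map *m = calloc(1, sizeof(*m));

	if (m)
		m->refcnt = 1;	/* the creator holds the first reference */
	return m;
}

struct map *map_get(struct map *m)
{
	if (m)
		m->refcnt++;
	return m;
}

/* Returns 1 when the last reference was dropped and the map was freed */
int map_put(struct map *m)
{
	if (m && --m->refcnt == 0) {
		free(m);
		return 1;
	}
	return 0;
}

struct evlist {
	struct map *cpus;
	struct map *threads;
};

/* The list takes its own references; the caller keeps its own too */
void evlist_set_maps(struct evlist *e, struct map *cpus, struct map *threads)
{
	e->cpus = map_get(cpus);
	e->threads = map_get(threads);
}

/* Drops the list's references; returns how many maps were freed */
int evlist_delete(struct evlist *e)
{
	return map_put(e->cpus) + map_put(e->threads);
}
```

If the caller instead sets its pointers to NULL after set_maps() and never calls the put function, each map keeps a count of one after the list is deleted, which matches the kind of leak the sanitizer reported.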

Thanks,
Namhyung


drivers/soc/samsung/s3c-pm-debug.c:30:2: warning: function 's3c_pm_dbg' might be a candidate for 'gnu_printf' format attribute

2021-03-01 Thread kernel test robot
Hi Arnd,

First bad commit (maybe != root cause):

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
head:   7a7fd0de4a9804299793e564a555a49c1fc924cb
commit: 17132da70eb766785b9b4677bacce18cc11ea442 ARM: samsung: move pm check code to drivers/soc
date:   6 months ago
config: arm-randconfig-r023-20210302 (attached as .config)
compiler: arm-linux-gnueabi-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=17132da70eb766785b9b4677bacce18cc11ea442
        git remote add linus https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
        git fetch --no-tags linus master
        git checkout 17132da70eb766785b9b4677bacce18cc11ea442
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

   drivers/soc/samsung/s3c-pm-debug.c: In function 's3c_pm_dbg':
>> drivers/soc/samsung/s3c-pm-debug.c:30:2: warning: function 's3c_pm_dbg' might be a candidate for 'gnu_printf' format attribute [-Wsuggest-attribute=format]
      30 |  vsnprintf(buff, sizeof(buff), fmt, va);
         |  ^


vim +30 drivers/soc/samsung/s3c-pm-debug.c

72551f6cf13e2f3a arch/arm/plat-samsung/pm-debug.c Tomasz Figa 2014-03-18  23  
72551f6cf13e2f3a arch/arm/plat-samsung/pm-debug.c Tomasz Figa 2014-03-18  24  void s3c_pm_dbg(const char *fmt, ...)
72551f6cf13e2f3a arch/arm/plat-samsung/pm-debug.c Tomasz Figa 2014-03-18  25  {
72551f6cf13e2f3a arch/arm/plat-samsung/pm-debug.c Tomasz Figa 2014-03-18  26  	va_list va;
72551f6cf13e2f3a arch/arm/plat-samsung/pm-debug.c Tomasz Figa 2014-03-18  27  	char buff[256];
72551f6cf13e2f3a arch/arm/plat-samsung/pm-debug.c Tomasz Figa 2014-03-18  28  
72551f6cf13e2f3a arch/arm/plat-samsung/pm-debug.c Tomasz Figa 2014-03-18  29  	va_start(va, fmt);
72551f6cf13e2f3a arch/arm/plat-samsung/pm-debug.c Tomasz Figa 2014-03-18 @30  	vsnprintf(buff, sizeof(buff), fmt, va);
72551f6cf13e2f3a arch/arm/plat-samsung/pm-debug.c Tomasz Figa 2014-03-18  31  	va_end(va);
72551f6cf13e2f3a arch/arm/plat-samsung/pm-debug.c Tomasz Figa 2014-03-18  32  
72551f6cf13e2f3a arch/arm/plat-samsung/pm-debug.c Tomasz Figa 2014-03-18  33  	printascii(buff);
72551f6cf13e2f3a arch/arm/plat-samsung/pm-debug.c Tomasz Figa 2014-03-18  34  }
72551f6cf13e2f3a arch/arm/plat-samsung/pm-debug.c Tomasz Figa 2014-03-18  35  

:: The code at line 30 was first introduced by commit
:: 72551f6cf13e2f3a1d273b7007b5d7d7fd69c554 ARM: SAMSUNG: Move Samsung PM debug code into separate file

:: TO: Tomasz Figa 
:: CC: Kukjin Kim 
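
The warning above asks for a printf-style format annotation on the variadic wrapper, so that the compiler can check callers' arguments against the format string. In the kernel this is normally spelled __printf(1, 2); the user-space sketch below uses the underlying GCC attribute directly. The names toy_pm_dbg and toy_last_msg are hypothetical, and the wrapper stores the message in a buffer instead of calling printascii().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical sink standing in for the debug console output. */
static char toy_last_msg[256];

/*
 * format(printf, 1, 2): argument 1 is the format string and the
 * variadic arguments to be checked start at argument 2.
 */
__attribute__((format(printf, 1, 2)))
static int toy_pm_dbg(const char *fmt, ...)
{
	va_list va;
	int len;

	assert(fmt != NULL);
	va_start(va, fmt);
	len = vsnprintf(toy_last_msg, sizeof(toy_last_msg), fmt, va);
	va_end(va);
	return len;
}
```

With the attribute in place, a call such as toy_pm_dbg("%s", 42) is diagnosed at compile time under -Wformat instead of misbehaving at run time.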

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org




Re: [PATCH 01/13] rcu/nocb: Fix potential missed nocb_timer rearm

2021-03-01 Thread Paul E. McKenney
On Wed, Feb 24, 2021 at 11:06:06PM +0100, Frederic Weisbecker wrote:
> On Wed, Feb 24, 2021 at 10:37:09AM -0800, Paul E. McKenney wrote:
> > On Tue, Feb 23, 2021 at 01:09:59AM +0100, Frederic Weisbecker wrote:
> > > Two situations can cause a missed nocb timer rearm:
> > > 
> > > 1) rdp(CPU A) queues its nocb timer. The grace period elapses before
> > >    the timer gets a chance to fire. The nocb_gp kthread is awakened by
> > >    rdp(CPU B). The nocb_cb kthread for rdp(CPU A) is awakened and
> > >    processes the callbacks, again before the nocb_timer for CPU A gets a
> > >    chance to fire. rdp(CPU A) queues a callback and wakes up the nocb_gp
> > >    kthread, cancelling the pending nocb_timer without resetting the
> > >    corresponding nocb_defer_wakeup.
> > 
> > As discussed offlist, expanding the above scenario results in this
> > sequence of steps:

I renumbered the CPUs, since the ->nocb_gp_kthread would normally be
associated with CPU 0.  If the first CPU to enqueue a callback was also
CPU 0, nocb_gp_wait() might clear that CPU's ->nocb_defer_wakeup, which
would prevent this scenario from playing out.  (But admittedly only if
some other CPU handled by this same ->nocb_gp_kthread used its bypass.)

> > 1.  There are no callbacks queued for any CPU covered by CPU 0-2's
> > ->nocb_gp_kthread.

And ->nocb_gp_kthread is associated with CPU 0.

> > 2.  CPU 1 enqueues its first callback with interrupts disabled, and
> > thus must defer awakening its ->nocb_gp_kthread.  It therefore
> > queues its rcu_data structure's ->nocb_timer.

At this point, CPU 1's rdp->nocb_defer_wakeup is RCU_NOCB_WAKE.

> > 3.  CPU 2, which shares the same ->nocb_gp_kthread, also enqueues a
> > callback, but with interrupts enabled, allowing it to directly
> > awaken the ->nocb_gp_kthread.
> > 
> > 4.  The newly awakened ->nocb_gp_kthread associates both CPU 1's
> > and CPU 2's callbacks with a future grace period and arranges
> > for that grace period to be started.
> > 
> > 5.  This ->nocb_gp_kthread goes to sleep waiting for the end of this
> > future grace period.
> > 
> > 6.  This grace period elapses before CPU 1's timer fires.
> > This is normally improbable given that the timer is set for only
> > one jiffy, but timers can be delayed.  Besides, it is possible
> > that the kernel was built with CONFIG_RCU_STRICT_GRACE_PERIOD=y.
> > 
> > 7.  The grace period ends, so rcu_gp_kthread awakens the
> > ->nocb_gp_kthread, which in turn awakens both CPU 1's and
> > CPU 2's ->nocb_cb_kthread.

And then ->nocb_cb_kthread sleeps waiting for more callbacks.

> > 8.  CPU 1's ->nocb_cb_kthread invokes its callback.
> > 
> > 9.  Note that neither kthread updated any ->nocb_timer state,
> > so CPU 1's ->nocb_defer_wakeup is still set to RCU_NOCB_WAKE.
> > 
> > 10. CPU 1 enqueues its second callback, again with interrupts
> > disabled, and thus must again defer awakening its
> > ->nocb_gp_kthread.  However, ->nocb_defer_wakeup prevents
> > CPU 1 from queueing the timer.
> 
> I managed to recollect some pieces of my brain. So keep the above but
> let's change the point 10:
> 
> 10.   CPU 1 enqueues its second callback, this time with interrupts
>   enabled so it can wake directly ->nocb_gp_kthread.
>   It does so by calling __wake_nocb_gp(), which also cancels the

wake_nocb_gp() in current -rcu, correct?

>   pending timer that got queued in step 2. But that doesn't reset
>   CPU 1's ->nocb_defer_wakeup which is still set to RCU_NOCB_WAKE.
>   So CPU 1's ->nocb_defer_wakeup and CPU 1's ->nocb_timer are now
>   desynchronized.

Agreed, and agreed that this is a bug.  Thank you for persisting on
this one!

> 11.   ->nocb_gp_kthread associates the callback queued in 10 with a new
>   grace period, arranges for it to start, and sleeps on it.
> 
> 12.   The grace period ends, ->nocb_gp_kthread awakens and wakes up
>   CPU 1's ->nocb_cb_kthread which invokes the callback queued in 10.
> 
> 13.   CPU 1 enqueues its third callback, this time with interrupts
>   disabled so it tries to queue a deferred wakeup. However
>   ->nocb_defer_wakeup has a stale RCU_NOCB_WAKE value which prevents
>   CPU 1's ->nocb_timer, which got cancelled in 10, from being armed.
> 
> 14.   CPU 1 has its pending callback and it may go unnoticed until
>   some other CPU wakes up ->nocb_gp_kthread or CPU 1 calls
>   an explicit deferred wakeup caller such as idle entry.
> 
> I hope I'm not missing something this time...

If you are missing something, then so am I!  ;-)
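
To make steps 2, 10 and 13 concrete, here is a small user-space model of the flag/timer desynchronization. It is only a sketch: the toy_* names, the two-value enum and the reset_flag switch are hypothetical, and resetting the flag whenever the timer is cancelled is an assumed remedy used for illustration, not a quote of the eventual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for "no deferred wakeup" and RCU_NOCB_WAKE. */
enum toy_defer_state { TOY_WAKE_NOT, TOY_WAKE };

/* Toy per-CPU state: the deferred-wakeup flag and its timer. */
struct toy_rdp {
	enum toy_defer_state defer_wakeup;
	bool timer_pending;
};

/*
 * Deferred wakeup path (callback enqueued with interrupts disabled).
 * Returns true if this call armed the timer.
 */
static bool toy_defer_wakeup(struct toy_rdp *rdp)
{
	assert(rdp != NULL);
	if (rdp->defer_wakeup != TOY_WAKE_NOT)
		return false;	/* flag says a timer is already pending */
	rdp->defer_wakeup = TOY_WAKE;
	rdp->timer_pending = true;
	return true;
}

/*
 * Direct wakeup path (interrupts enabled): cancels the pending timer.
 * With reset_flag == false this models the behavior described in
 * step 10; with reset_flag == true it models the assumed remedy.
 */
static void toy_direct_wakeup(struct toy_rdp *rdp, bool reset_flag)
{
	assert(rdp != NULL);
	rdp->timer_pending = false;
	if (reset_flag)
		rdp->defer_wakeup = TOY_WAKE_NOT;
}

/* The flag and the timer agree when both are set or both are clear. */
static bool toy_state_consistent(const struct toy_rdp *rdp)
{
	return (rdp->defer_wakeup == TOY_WAKE) == rdp->timer_pending;
}
```

Running step 2, then step 10 without the reset, then step 13 leaves the flag set with no timer pending, which is the missed rearm described above.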

> > So far so good, but why isn't the timer still queued from back in step 2?
> > What am I missing here?  Either way, could you please update the commit
> > logs to tell the full story?  At some later time, you might be very
> > happy that you did.  ;-)
> > 
> > > 2) The "nocb_bypass_timer" ends up calling wake_nocb_gp() which deletes
> > >the pending "nocb_timer" (note they are 

RE: [PATCH v2 10/10] clocksource/drivers/hyper-v: Move handling of STIMER0 interrupts

2021-03-01 Thread Michael Kelley
From: Daniel Lezcano  Sent: Monday, March 1, 2021 10:45 AM
> 
> On 01/03/2021 02:15, Michael Kelley wrote:
> > STIMER0 interrupts are most naturally modeled as per-cpu IRQs. But
> > because x86/x64 doesn't have per-cpu IRQs, the core STIMER0 interrupt
> > handling machinery is done in code under arch/x86 and Linux IRQs are
> > not used. Adding support for ARM64 means adding equivalent code
> > using per-cpu IRQs under arch/arm64.
> >
> > A better model is to treat per-cpu IRQs as the normal path (which it is
> > for modern architectures), and the x86/x64 path as the exception. Do this
> > by incorporating standard Linux per-cpu IRQ allocation into the main
> > STIMER0 driver code, and bypass it in the x86/x64 exception case. For
> > x86/x64, special case code is retained under arch/x86, but no STIMER0
> > interrupt handling code is needed under arch/arm64.
> >
> > No functional change.
> >
> > Signed-off-by: Michael Kelley 
> > ---
> >  arch/x86/hyperv/hv_init.c  |   2 +-
> >  arch/x86/include/asm/mshyperv.h|   4 -
> >  arch/x86/kernel/cpu/mshyperv.c |  10 +--
> >  drivers/clocksource/hyperv_timer.c | 180 ++---
> >  include/asm-generic/mshyperv.h |   5 --
> >  include/clocksource/hyperv_timer.h |   3 +-
> >  6 files changed, 132 insertions(+), 72 deletions(-)
> >
> > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> > index 9af4f8a..9d10025 100644
> > --- a/arch/x86/hyperv/hv_init.c
> > +++ b/arch/x86/hyperv/hv_init.c
> > @@ -327,7 +327,7 @@ static void __init hv_stimer_setup_percpu_clockev(void)
> >  * Ignore any errors in setting up stimer clockevents
> >  * as we can run with the LAPIC timer as a fallback.
> >  */
> > -   (void)hv_stimer_alloc();
> > +   (void)hv_stimer_alloc(false);
> >
> > /*
> >  * Still register the LAPIC timer, because the direct-mode STIMER is
> > diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> > index 5433312..6d4891b 100644
> > --- a/arch/x86/include/asm/mshyperv.h
> > +++ b/arch/x86/include/asm/mshyperv.h
> > @@ -31,10 +31,6 @@ static inline u64 hv_get_register(unsigned int reg)
> >
> >  void hyperv_vector_handler(struct pt_regs *regs);
> >
> > -static inline void hv_enable_stimer0_percpu_irq(int irq) {}
> > -static inline void hv_disable_stimer0_percpu_irq(int irq) {}
> > -
> > -
> >  #if IS_ENABLED(CONFIG_HYPERV)
> >  extern int hyperv_init_cpuhp;
> >
> > diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> > index 41fd84a..cebed53 100644
> > --- a/arch/x86/kernel/cpu/mshyperv.c
> > +++ b/arch/x86/kernel/cpu/mshyperv.c
> > @@ -90,21 +90,17 @@ void hv_remove_vmbus_handler(void)
> > set_irq_regs(old_regs);
> >  }
> >
> > -int hv_setup_stimer0_irq(int *irq, int *vector, void (*handler)(void))
> > +/* For x86/x64, override weak placeholders in hyperv_timer.c */
> > +void hv_setup_stimer0_handler(void (*handler)(void))
> >  {
> > -   *vector = HYPERV_STIMER0_VECTOR;
> > -   *irq = -1;   /* Unused on x86/x64 */
> > hv_stimer0_handler = handler;
> > -   return 0;
> >  }
> > -EXPORT_SYMBOL_GPL(hv_setup_stimer0_irq);
> >
> > -void hv_remove_stimer0_irq(int irq)
> > +void hv_remove_stimer0_handler(void)
> >  {
> > /* We have no way to deallocate the interrupt gate */
> > hv_stimer0_handler = NULL;
> >  }
> > -EXPORT_SYMBOL_GPL(hv_remove_stimer0_irq);
> >
> >  void hv_setup_kexec_handler(void (*handler)(void))
> >  {
> > diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
> > index cdb8e0c..b2bf5e5 100644
> > --- a/drivers/clocksource/hyperv_timer.c
> > +++ b/drivers/clocksource/hyperv_timer.c
> > @@ -18,6 +18,9 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> > +#include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -43,14 +46,13 @@
> >   */
> >  static bool direct_mode_enabled;
> >
> > -static int stimer0_irq;
> > -static int stimer0_vector;
> > +static int stimer0_irq = -1;
> > +static long __percpu *stimer0_evt;
> 
> Why not
> 
> static DEFINE_PER_CPU(long, stimer0_evt);
> 
> no need for allocation/free?
> 

Indeed!  I'll make that simplification in v3 of the patch set.

Michael


RE: [PATCH v2 08/10] clocksource/drivers/hyper-v: Handle sched_clock differences inline

2021-03-01 Thread Michael Kelley
From: Daniel Lezcano  Sent: Monday, March 1, 2021 6:25 AM
> 
> On 01/03/2021 02:15, Michael Kelley wrote:
> > While the Hyper-V Reference TSC code is architecture neutral, the
> > pv_ops.time.sched_clock() function is implemented for x86/x64, but not
> > for ARM64. Current code calls a utility function under arch/x86 (and
> > coming, under arch/arm64) to handle the difference.
> >
> > Change this approach to handle the difference inline based on whether
> > GENERIC_SCHED_CLOCK is present.  The new approach removes code under
> > arch/* since the difference is tied more to the specifics of the Linux
> > implementation than to the architecture.
> >
> > No functional change.
> >
> > Signed-off-by: Michael Kelley 
> > Reviewed-by: Boqun Feng 
> > ---
> 
> [ ... ]
> 
> > +/*
> > + * Reference to pv_ops must be inline so objtool
> > + * detection of noinstr violations can work correctly.
> > + */
> > +static __always_inline void hv_setup_sched_clock(void *sched_clock)
> > +{
> > +#ifdef CONFIG_GENERIC_SCHED_CLOCK
> > +   /*
> > +* We're on an architecture with generic sched clock (not x86/x64).
> > +* The Hyper-V sched clock read function returns nanoseconds, not
> > +* the normal 100ns units of the Hyper-V synthetic clock.
> > +*/
> > +   sched_clock_register(sched_clock, 64, NSEC_PER_SEC);
> > +#else
> > +#ifdef CONFIG_PARAVIRT
> > +   /* We're on x86/x64 *and* using PV ops */
> > +   pv_ops.time.sched_clock = sched_clock;
> > +#endif
> > +#endif
> > +}
> Please refer to:
> 
> Documentation/process/coding-style.rst
> 
> Section 21)
> 
> [ ... ]
> 
> Prefer to compile out entire functions, rather than portions of
> functions or portions of expressions.
> 
> [ ... ]
> 

OK.  I'll rework the #ifdef in v3 of the patch set.  Is the following
the preferred approach?

#ifdef CONFIG_GENERIC_SCHED_CLOCK
static __always_inline void hv_setup_sched_clock(void *sched_clock)
{
	sched_clock_register(sched_clock, 64, NSEC_PER_SEC);
}
#else
#ifdef CONFIG_PARAVIRT
static __always_inline void hv_setup_sched_clock(void *sched_clock)
{
	pv_ops.time.sched_clock = sched_clock;
}
#else
static __always_inline void hv_setup_sched_clock(void *sched_clock) {}
#endif
#endif

Michael
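
As a compilable illustration of the "compile out entire functions" guidance quoted above, the following user-space sketch selects one complete definition per configuration, with an empty stub as the fallback. The DEMO_* macros and demo_* names are hypothetical stand-ins for the Kconfig symbols and kernel hooks, and DEMO_PARAVIRT is defined here only so that one branch is exercised.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PARAVIRT 1	/* pretend configuration, assumption of this sketch */

typedef unsigned long long (*demo_clock_fn)(void);

demo_clock_fn demo_generic_clock;	/* stand-in for the generic sched_clock hook */
demo_clock_fn demo_paravirt_clock;	/* stand-in for pv_ops.time.sched_clock */

/* Example clock read function returning a fixed value. */
static inline unsigned long long demo_read_ns(void)
{
	return 42ULL;
}

/* Exactly one whole definition is compiled; no #ifdef inside a body. */
#if defined(DEMO_GENERIC_SCHED_CLOCK)
static inline void demo_setup_sched_clock(demo_clock_fn fn)
{
	assert(fn != NULL);
	demo_generic_clock = fn;
}
#elif defined(DEMO_PARAVIRT)
static inline void demo_setup_sched_clock(demo_clock_fn fn)
{
	assert(fn != NULL);
	demo_paravirt_clock = fn;
}
#else
static inline void demo_setup_sched_clock(demo_clock_fn fn)
{
	(void)fn;	/* nothing to set up in this configuration */
}
#endif
```

Using #elif instead of a nested #ifdef keeps the three variants at the same level, which some reviewers find easier to scan; whether that is preferred here is a question for the maintainers.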


Re: [PATCH] e1000e: use proper #include guard name in hw.h

2021-03-01 Thread Nguyen, Anthony L
On Sat, 2021-02-27 at 10:58 +0100, Greg Kroah-Hartman wrote:
> The include guards for the e1000e and e1000 hw.h files are the same, so
> add the proper "E" term to the hw.h file for the e1000e driver.

There's a patch in process that addresses this issue [1].

> This resolves some static analyzer warnings, like the one found by the
> "lgtm.com" tool.
> 
> Cc: Jesse Brandeburg 
> Cc: Tony Nguyen 
> Cc: "David S. Miller" 
> Cc: Jakub Kicinski 
> Cc: intel-wired-...@lists.osuosl.org
> Signed-off-by: Greg Kroah-Hartman 

[1] https://patchwork.ozlabs.org/project/intel-wired-lan/patch/20210222040005.20126-1-tseew...@gmail.com/

Thanks,
Tony


Re: [PATCH v4 05/14] drm/bridge: imx: Add i.MX8qm/qxp pixel combiner support

2021-03-01 Thread Liu Ying
On Mon, 2021-03-01 at 11:56 +0100, Robert Foss wrote:
> On Mon, 1 Mar 2021 at 10:07, Liu Ying  wrote:
> > Hi Robert,
> > 
> > On Fri, 2021-02-26 at 14:07 +0100, Robert Foss wrote:
> > > Hey Liu,
> > > 
> > > With the below nit straightened out, feel free to add my r-b.
> > > 
> > > Reviewed-by: Robert Foss 
> > 
> > Thanks for reviewing this patch.
> > 
> > > On Thu, 18 Feb 2021 at 04:58, Liu Ying  wrote:
> > > > This patch adds a drm bridge driver for i.MX8qm/qxp pixel combiner.
> > > > The pixel combiner takes two output streams from a single display
> > > > controller and manipulates the two streams to support a number
> > > > of modes(bypass, pixel combine, YUV444 to YUV422, split_RGB) configured
> > > > as either one screen, two screens, or virtual screens.  The pixel
> > > > combiner is also responsible for generating some of the control signals
> > > > for the pixel link output channel.  For now, the driver only supports
> > > > the bypass mode.
> > > > 
> > > > Signed-off-by: Liu Ying 
> > > > ---
> > > > v3->v4:
> > > > * No change.
> > > > 
> > > > v2->v3:
> > > > * No change.
> > > > 
> > > > v1->v2:
> > > > * No change.
> > > > 
> > > >  drivers/gpu/drm/bridge/Kconfig |   2 +
> > > >  drivers/gpu/drm/bridge/Makefile|   1 +
> > > >  drivers/gpu/drm/bridge/imx/Kconfig |   8 +
> > > >  drivers/gpu/drm/bridge/imx/Makefile|   1 +
> > > >  .../gpu/drm/bridge/imx/imx8qxp-pixel-combiner.c| 452 +
> > > >  5 files changed, 464 insertions(+)
> > > >  create mode 100644 drivers/gpu/drm/bridge/imx/Kconfig
> > > >  create mode 100644 drivers/gpu/drm/bridge/imx/Makefile
> > > >  create mode 100644 drivers/gpu/drm/bridge/imx/imx8qxp-pixel-combiner.c
> > > > 
> > > > diff --git a/drivers/gpu/drm/bridge/Kconfig b/drivers/gpu/drm/bridge/Kconfig
> > > > index e4110d6c..84944e0 100644
> > > > --- a/drivers/gpu/drm/bridge/Kconfig
> > > > +++ b/drivers/gpu/drm/bridge/Kconfig
> > > > @@ -256,6 +256,8 @@ source "drivers/gpu/drm/bridge/adv7511/Kconfig"
> > > > 
> > > >  source "drivers/gpu/drm/bridge/cadence/Kconfig"
> > > > 
> > > > +source "drivers/gpu/drm/bridge/imx/Kconfig"
> > > > +
> > > >  source "drivers/gpu/drm/bridge/synopsys/Kconfig"
> > > > 
> > > >  endmenu
> > > > diff --git a/drivers/gpu/drm/bridge/Makefile b/drivers/gpu/drm/bridge/Makefile
> > > > index 86e7acc..bc80cae 100644
> > > > --- a/drivers/gpu/drm/bridge/Makefile
> > > > +++ b/drivers/gpu/drm/bridge/Makefile
> > > > @@ -27,4 +27,5 @@ obj-$(CONFIG_DRM_NWL_MIPI_DSI) += nwl-dsi.o
> > > > 
> > > >  obj-y += analogix/
> > > >  obj-y += cadence/
> > > > +obj-y += imx/
> > > >  obj-y += synopsys/
> > > > diff --git a/drivers/gpu/drm/bridge/imx/Kconfig b/drivers/gpu/drm/bridge/imx/Kconfig
> > > > new file mode 100644
> > > > index ..f1c91b6
> > > > --- /dev/null
> > > > +++ b/drivers/gpu/drm/bridge/imx/Kconfig
> > > > @@ -0,0 +1,8 @@
> > > > +config DRM_IMX8QXP_PIXEL_COMBINER
> > > > +   tristate "Freescale i.MX8QM/QXP pixel combiner"
> > > > +   depends on OF
> > > > +   depends on COMMON_CLK
> > > > +   select DRM_KMS_HELPER
> > > > +   help
> > > > + Choose this to enable pixel combiner found in
> > > > + Freescale i.MX8qm/qxp processors.
> > > > diff --git a/drivers/gpu/drm/bridge/imx/Makefile b/drivers/gpu/drm/bridge/imx/Makefile
> > > > new file mode 100644
> > > > index ..7d7c8d6
> > > > --- /dev/null
> > > > +++ b/drivers/gpu/drm/bridge/imx/Makefile
> > > > @@ -0,0 +1 @@
> > > > +obj-$(CONFIG_DRM_IMX8QXP_PIXEL_COMBINER) += imx8qxp-pixel-combiner.o
> > > > diff --git a/drivers/gpu/drm/bridge/imx/imx8qxp-pixel-combiner.c b/drivers/gpu/drm/bridge/imx/imx8qxp-pixel-combiner.c
> > > > new file mode 100644
> > > > index ..cd5b1be
> > > > --- /dev/null
> > > > +++ b/drivers/gpu/drm/bridge/imx/imx8qxp-pixel-combiner.c
> > > > @@ -0,0 +1,452 @@
> > > > +// SPDX-License-Identifier: GPL-2.0+
> > > > +
> > > > +/*
> > > > + * Copyright 2020 NXP
> > > > + */
> > > > +
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +
> > > > +#define PC_CTRL_REG                    0x0
> > > > +#define  PC_COMBINE_ENABLE             BIT(0)
> > > > +#define  PC_DISP_BYPASS(n)             BIT(1 + 21 * (n))
> > > > +#define  PC_DISP_HSYNC_POLARITY(n)     BIT(2 + 11 * (n))
> > > > +#define  PC_DISP_HSYNC_POLARITY_POS(n) DISP_HSYNC_POLARITY(n)
> > > > +#define  PC_DISP_VSYNC_POLARITY(n)     BIT(3 + 11 * (n))
> > > > +#define  PC_DISP_VSYNC_POLARITY_POS(n) DISP_VSYNC_POLARITY(n)
> > > > +#define  PC_DISP_DVALID_POLARITY(n)    BIT(4 + 11 * (n))
> > > > +#define  PC_DISP_DVALID_POLARITY_POS(n)        DISP_DVALID_POLARITY(n)
> > > > +#define  PC_VSYNC_MASK_ENABLE          BIT(5)
> > > > +#define  

Re: Why do kprobes and uprobes singlestep?

2021-03-01 Thread Andy Lutomirski
On Mon, Mar 1, 2021 at 8:51 AM Oleg Nesterov  wrote:
>
> Hi Andy,
>
> sorry for delay.
>
> On 02/23, Andy Lutomirski wrote:
> >
> > A while back, I let myself be convinced that kprobes genuinely need to
> > single-step the kernel on occasion, and I decided that this sucked but
> > I could live with it.  It would, however, be Really Really Nice (tm)
> > if we could have a rule that anyone running x86 Linux who single-steps
> > the kernel (e.g. kgdb and nothing else) gets to keep all the pieces
> > when the system falls apart around them.  Specifically, if we don't
> > allow kernel single-stepping and if we suitably limit kernel
> > instruction breakpoints (the latter isn't actually a major problem),
> > then we don't really really need to use IRET to return to the kernel,
> > and that means we can avoid some massive NMI nastiness.
>
> Not sure I understand you correctly, I know almost nothing about low-level
> x86 magic.
>
> But I guess this has nothing to do with uprobes, they do not single-step
> in kernel mode, right?

They single-step user code, though, and the code that makes this work
is quite ugly.  Single-stepping on x86 is a mess.

>
> > Uprobes seem to single-step user code for no discernable reason.
> > (They want to trap after executing an out of line instruction, AFAICT.
> > Surely INT3 or even CALL after the out-of-line insn would work as well
> > or better.)
>
> Uprobes use single-step from the very beginning, probably because this
> is the most simple and "standard" way to implement xol.
>
> And please note that CALL/JMP/etc emulation was added much later to fix the
> problems with non-canonical addresses, and this emulation is still incomplete.

Is there something like a uprobe test suite?  How maintained /
actively used is uprobe?

--Andy


Re: 5.12-rc1 regression: freezing iou-mgr/wrk failed

2021-03-01 Thread Jens Axboe
On 3/1/21 6:25 PM, Jens Axboe wrote:
> On 3/1/21 6:11 PM, Jens Axboe wrote:
>> On 3/1/21 5:57 PM, Alex Xu (Hello71) wrote:
>>> Hi,
>>>
>>> On Linux 5.12-rc1, I am unable to suspend to RAM. The system freezes for 
>>> about 40 seconds and then continues operation. The following messages 
>>> are printed to the kernel log:
>>>
>>> [  240.650300] PM: suspend entry (deep)
>>> [  240.650748] Filesystems sync: 0.000 seconds
>>> [  240.725605] Freezing user space processes ...
>>> [  260.739483] Freezing of tasks failed after 20.013 seconds (3 tasks refusing to freeze, wq_busy=0):
>>> [  260.739497] task:iou-mgr-446 state:S stack:0 pid:  516 ppid:   439 flags:0x4224
>>> [  260.739504] Call Trace:
>>> [  260.739507]  ? sysvec_apic_timer_interrupt+0xb/0x81
>>> [  260.739515]  ? pick_next_task_fair+0x197/0x1cde
>>> [  260.739519]  ? sysvec_reschedule_ipi+0x2f/0x6a
>>> [  260.739522]  ? asm_sysvec_reschedule_ipi+0x12/0x20
>>> [  260.739525]  ? __schedule+0x57/0x6d6
>>> [  260.739529]  ? del_timer_sync+0xb9/0x115
>>> [  260.739533]  ? schedule+0x63/0xd5
>>> [  260.739536]  ? schedule_timeout+0x219/0x356
>>> [  260.739540]  ? __next_timer_interrupt+0xf1/0xf1
>>> [  260.739544]  ? io_wq_manager+0x73/0xb1
>>> [  260.739549]  ? io_wq_create+0x262/0x262
>>> [  260.739553]  ? ret_from_fork+0x22/0x30
>>> [  260.739557] task:iou-mgr-517 state:S stack:0 pid:  522 ppid:   439 flags:0x4224
>>> [  260.739561] Call Trace:
>>> [  260.739563]  ? sysvec_apic_timer_interrupt+0xb/0x81
>>> [  260.739566]  ? pick_next_task_fair+0x16f/0x1cde
>>> [  260.739569]  ? sysvec_apic_timer_interrupt+0xb/0x81
>>> [  260.739571]  ? asm_sysvec_apic_timer_interrupt+0x12/0x20
>>> [  260.739574]  ? __schedule+0x5b7/0x6d6
>>> [  260.739578]  ? del_timer_sync+0x70/0x115
>>> [  260.739581]  ? schedule_timeout+0x211/0x356
>>> [  260.739585]  ? __next_timer_interrupt+0xf1/0xf1
>>> [  260.739588]  ? io_wq_check_workers+0x15/0x11f
>>> [  260.739592]  ? io_wq_manager+0x69/0xb1
>>> [  260.739596]  ? io_wq_create+0x262/0x262
>>> [  260.739600]  ? ret_from_fork+0x22/0x30
>>> [  260.739603] task:iou-wrk-517 state:S stack:0 pid:  523 ppid:   439 flags:0x4224
>>> [  260.739607] Call Trace:
>>> [  260.739609]  ? __schedule+0x5b7/0x6d6
>>> [  260.739614]  ? schedule+0x63/0xd5
>>> [  260.739617]  ? schedule_timeout+0x219/0x356
>>> [  260.739621]  ? __next_timer_interrupt+0xf1/0xf1
>>> [  260.739624]  ? task_thread.isra.0+0x148/0x3af
>>> [  260.739628]  ? task_thread_unbound+0xa/0xa
>>> [  260.739632]  ? task_thread_bound+0x7/0x7
>>> [  260.739636]  ? ret_from_fork+0x22/0x30
>>> [  260.739647] OOM killer enabled.
>>> [  260.739648] Restarting tasks ... done.
>>> [  260.740077] PM: suspend exit
>>>
>>> and then a set of similar messages except with s2idle instead of deep.
>>>
>>> Reverting 5695e51619 ("Merge tag 'io_uring-worker.v3-2021-02-25' of 
>>> git://git.kernel.dk/linux-block") appears to resolve the issue. I have 
>>> not yet bisected further. Let me know which troubleshooting steps I 
>>> should perform next.
>>
>> Can you try and pull in:
>>
>> git://git.kernel.dk/linux-block io_uring-5.12
>>
>> and see if that resolves it? I usually always run -git on my laptop as
>> well, but something broke it in the merge window so I need to figure
>> out what that is first...
>>
>> What distro are you running?
> 
> You probably want this on top...

And if you've verified that that one works OK, can you try this variant
instead?

diff --git a/fs/io-wq.c b/fs/io-wq.c
index 1fdb2b621b51..fe004cf93c4b 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "../kernel/sched/sched.h"
 #include "io-wq.h"
@@ -480,6 +481,7 @@ static int io_wqe_worker(void *data)
io_worker_start(worker);
 
while (!test_bit(IO_WQ_BIT_EXIT, &wq->state)) {
+   try_to_freeze();
set_current_state(TASK_INTERRUPTIBLE);
 loop:
raw_spin_lock_irq(&wqe->lock);
@@ -731,6 +733,7 @@ static int io_wq_manager(void *data)
set_current_state(TASK_INTERRUPTIBLE);
io_wq_check_workers(wq);
schedule_timeout(HZ);
+   try_to_freeze();
if (fatal_signal_pending(current))
set_bit(IO_WQ_BIT_EXIT, &wq->state);
} while (!test_bit(IO_WQ_BIT_EXIT, &wq->state));
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 2757675ab417..03c42f1f9862 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -74,13 +74,11 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
 #include 
-#include 
-#include 
+#include 
 
 #define CREATE_TRACE_POINTS
 #include 
@@ -6744,6 +6748,7 @@ static int io_sq_thread(void *data)
io_ring_set_wakeup_flag(ctx);
 
schedule();
+   try_to_freeze();
list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)

[PATCH] mm/memcg: set memcg when split pages

2021-03-01 Thread Zhou Guanghui
When a page is split, the memory cgroup info recorded in the first page
is not copied to the tail pages. As a result, when the tail pages are
freed, the uncharge operation is not performed, so the usage of this
memcg keeps increasing and an OOM may eventually occur.

So copy the first page's memory cgroup info to the tail pages when
splitting a page.

Signed-off-by: Zhou Guanghui 
---
 include/linux/memcontrol.h | 10 ++
 mm/page_alloc.c|  4 +++-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index e6dc793d587d..c7e2b4421dc1 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -867,6 +867,12 @@ void mem_cgroup_print_oom_group(struct mem_cgroup *memcg);
 extern bool cgroup_memory_noswap;
 #endif
 
+static inline void copy_page_memcg(struct page *dst, struct page *src)
+{
+   if (src->memcg_data)
+   dst->memcg_data = src->memcg_data;
+}
+
 struct mem_cgroup *lock_page_memcg(struct page *page);
 void __unlock_page_memcg(struct mem_cgroup *memcg);
 void unlock_page_memcg(struct page *page);
@@ -1291,6 +1297,10 @@ mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg)
 {
 }
 
+static inline void copy_page_memcg(struct page *dst, struct page *src)
+{
+}
+
 static inline struct mem_cgroup *lock_page_memcg(struct page *page)
 {
return NULL;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3e4b29ee2b1e..ee0a63dc1c9b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3307,8 +3307,10 @@ void split_page(struct page *page, unsigned int order)
VM_BUG_ON_PAGE(PageCompound(page), page);
VM_BUG_ON_PAGE(!page_count(page), page);
 
-   for (i = 1; i < (1 << order); i++)
+   for (i = 1; i < (1 << order); i++) {
set_page_refcounted(page + i);
+   copy_page_memcg(page + i, page);
+   }
split_page_owner(page, 1 << order);
 }
 EXPORT_SYMBOL_GPL(split_page);
-- 
2.25.0



RE: [PATCH v2 07/10] clocksource/drivers/hyper-v: Handle vDSO differences inline

2021-03-01 Thread Michael Kelley
From: Daniel Lezcano  Sent: Monday, March 1, 2021 4:22 AM
> 
> On 01/03/2021 02:15, Michael Kelley wrote:
> > While the driver for the Hyper-V Reference TSC and STIMERs is architecture
> > neutral, vDSO is implemented for x86/x64, but not for ARM64.  Current code
> > calls into utility functions under arch/x86 (and coming, under arch/arm64)
> > to handle the difference.
> >
> > Change this approach to handle the difference inline based on whether
> > VDSO_CLOCK_MODE_HVCLOCK is present.  The new approach removes code under
> > arch/* since the difference is tied more to the specifics of the Linux
> > implementation than to the architecture.
> >
> > No functional change.
> 
> A suggestion below
> 
> 
> > Signed-off-by: Michael Kelley 
> > Reviewed-by: Boqun Feng 
> > ---
> >  arch/x86/include/asm/mshyperv.h|  4 
> >  drivers/clocksource/hyperv_timer.c | 10 --
> >  2 files changed, 8 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
> > index c73c127..5e5e08aa 100644
> > --- a/drivers/clocksource/hyperv_timer.c
> > +++ b/drivers/clocksource/hyperv_timer.c
> > @@ -372,7 +372,9 @@ static void resume_hv_clock_tsc(struct clocksource *arg)
> >
> >  static int hv_cs_enable(struct clocksource *cs)
> 
> static __maybe_unused int hv_cs_enable(struct clocksource *cs)
> 
> >  {
> > -   hv_enable_vdso_clocksource();
> > +#ifdef VDSO_CLOCKMODE_HVCLOCK
> > +   vclocks_set_used(VDSO_CLOCKMODE_HVCLOCK);
> > +#endif
> > return 0;
> >  }
> >
> > @@ -385,6 +387,11 @@ static int hv_cs_enable(struct clocksource *cs)
> > .suspend= suspend_hv_clock_tsc,
> > .resume = resume_hv_clock_tsc,
> > .enable = hv_cs_enable,
> > +#ifdef VDSO_CLOCKMODE_HVCLOCK
> > +   .vdso_clock_mode = VDSO_CLOCKMODE_HVCLOCK,
> > +#else
> > +   .vdso_clock_mode = VDSO_CLOCKMODE_NONE,
> > +#endif
> 
> #ifdef VDSO_CLOCKMODE_HVCLOCK
>   .enable = hv_cs_enable,
>   .vdso_clock_mode = VDSO_CLOCKMODE_HVCLOCK,
> #else
>   .vdso_clock_mode = VDSO_CLOCKMODE_NONE,
> #endif
> 

Is there any particular benefit (that I might not be recognizing)
to having the .enable function be NULL vs. a function that
does nothing?  I can see the handful of places where the
.enable function is invoked, and there doesn't seem to be
much difference.

In any case, I have no problem with making the change in
a v3 of the patch set.

Michael
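
On the question of a NULL .enable hook versus a function that does nothing, the sketch below shows why the two tend to be equivalent from the caller's side, assuming the caller guards the optional hook with a NULL check. The toy_clocksource structure and its helpers are hypothetical and are not the kernel's struct clocksource.

```c
#include <assert.h>
#include <stddef.h>

/* Toy clocksource with an optional enable hook. */
struct toy_clocksource {
	int (*enable)(struct toy_clocksource *cs);	/* may be NULL */
	int enable_calls;				/* bookkeeping for the demo */
};

/* A hook that has nothing to do except report success. */
static int toy_noop_enable(struct toy_clocksource *cs)
{
	cs->enable_calls++;
	return 0;
}

/* Hypothetical core helper: an absent hook is treated as success. */
static int toy_clocksource_enable(struct toy_clocksource *cs)
{
	assert(cs != NULL);
	if (cs->enable)
		return cs->enable(cs);
	return 0;
}
```

Under that assumption the only observable difference is one indirect call, so the choice is mostly about how much conditional compilation ends up in the initializer.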




Re: 5.12-rc1 regression: freezing iou-mgr/wrk failed

2021-03-01 Thread Jens Axboe
On 3/1/21 6:11 PM, Jens Axboe wrote:
> On 3/1/21 5:57 PM, Alex Xu (Hello71) wrote:
>> Hi,
>>
>> On Linux 5.12-rc1, I am unable to suspend to RAM. The system freezes for 
>> about 40 seconds and then continues operation. The following messages 
>> are printed to the kernel log:
>>
>> [  240.650300] PM: suspend entry (deep)
>> [  240.650748] Filesystems sync: 0.000 seconds
>> [  240.725605] Freezing user space processes ...
>> [  260.739483] Freezing of tasks failed after 20.013 seconds (3 tasks refusing to freeze, wq_busy=0):
>> [  260.739497] task:iou-mgr-446 state:S stack:0 pid:  516 ppid:   439 flags:0x4224
>> [  260.739504] Call Trace:
>> [  260.739507]  ? sysvec_apic_timer_interrupt+0xb/0x81
>> [  260.739515]  ? pick_next_task_fair+0x197/0x1cde
>> [  260.739519]  ? sysvec_reschedule_ipi+0x2f/0x6a
>> [  260.739522]  ? asm_sysvec_reschedule_ipi+0x12/0x20
>> [  260.739525]  ? __schedule+0x57/0x6d6
>> [  260.739529]  ? del_timer_sync+0xb9/0x115
>> [  260.739533]  ? schedule+0x63/0xd5
>> [  260.739536]  ? schedule_timeout+0x219/0x356
>> [  260.739540]  ? __next_timer_interrupt+0xf1/0xf1
>> [  260.739544]  ? io_wq_manager+0x73/0xb1
>> [  260.739549]  ? io_wq_create+0x262/0x262
>> [  260.739553]  ? ret_from_fork+0x22/0x30
>> [  260.739557] task:iou-mgr-517 state:S stack:0 pid:  522 ppid:   439 flags:0x4224
>> [  260.739561] Call Trace:
>> [  260.739563]  ? sysvec_apic_timer_interrupt+0xb/0x81
>> [  260.739566]  ? pick_next_task_fair+0x16f/0x1cde
>> [  260.739569]  ? sysvec_apic_timer_interrupt+0xb/0x81
>> [  260.739571]  ? asm_sysvec_apic_timer_interrupt+0x12/0x20
>> [  260.739574]  ? __schedule+0x5b7/0x6d6
>> [  260.739578]  ? del_timer_sync+0x70/0x115
>> [  260.739581]  ? schedule_timeout+0x211/0x356
>> [  260.739585]  ? __next_timer_interrupt+0xf1/0xf1
>> [  260.739588]  ? io_wq_check_workers+0x15/0x11f
>> [  260.739592]  ? io_wq_manager+0x69/0xb1
>> [  260.739596]  ? io_wq_create+0x262/0x262
>> [  260.739600]  ? ret_from_fork+0x22/0x30
>> [  260.739603] task:iou-wrk-517 state:S stack:0 pid:  523 ppid:   
>> 439 flags:0x4224
>> [  260.739607] Call Trace:
>> [  260.739609]  ? __schedule+0x5b7/0x6d6
>> [  260.739614]  ? schedule+0x63/0xd5
>> [  260.739617]  ? schedule_timeout+0x219/0x356
>> [  260.739621]  ? __next_timer_interrupt+0xf1/0xf1
>> [  260.739624]  ? task_thread.isra.0+0x148/0x3af
>> [  260.739628]  ? task_thread_unbound+0xa/0xa
>> [  260.739632]  ? task_thread_bound+0x7/0x7
>> [  260.739636]  ? ret_from_fork+0x22/0x30
>> [  260.739647] OOM killer enabled.
>> [  260.739648] Restarting tasks ... done.
>> [  260.740077] PM: suspend exit
>>
>> and then a set of similar messages except with s2idle instead of deep.
>>
>> Reverting 5695e51619 ("Merge tag 'io_uring-worker.v3-2021-02-25' of 
>> git://git.kernel.dk/linux-block") appears to resolve the issue. I have 
>> not yet bisected further. Let me know which troubleshooting steps I 
>> should perform next.
> 
> Can you try and pull in:
> 
> git://git.kernel.dk/linux-block io_uring-5.12
> 
> and see if that resolves it? I usually always run -git on my laptop as
> well, but something broke it in the merge window so I need to figure
> out what that is first...
> 
> What distro are you running?

You probably want this on top...


diff --git a/fs/io-wq.c b/fs/io-wq.c
index 1fdb2b621b51..a763e1b09073 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -567,7 +567,7 @@ static int task_thread(void *data, int index)
worker->task = current;
 
set_cpus_allowed_ptr(current, cpumask_of_node(wqe->node));
-   current->flags |= PF_NO_SETAFFINITY;
+   current->flags |= PF_NO_SETAFFINITY | PF_NOFREEZE;
 
raw_spin_lock_irq(&wqe->lock);
hlist_nulls_add_head_rcu(&worker->nulls_node, &wqe->free_list);
@@ -722,7 +722,7 @@ static int io_wq_manager(void *data)
 
sprintf(buf, "iou-mgr-%d", wq->task_pid);
set_task_comm(current, buf);
-   current->flags |= PF_IO_WORKER;
+   current->flags |= PF_IO_WORKER | PF_NOFREEZE;
wq->manager = get_task_struct(current);
 
complete(&wq->started);
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 2757675ab417..e7aaf56b4dea 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6679,6 +6685,7 @@ static int io_sq_thread(void *data)
set_task_comm(current, buf);
sqd->thread = current;
current->pf_io_worker = NULL;
+   current->flags |= PF_NOFREEZE;
 
if (sqd->sq_cpu != -1)
set_cpus_allowed_ptr(current, cpumask_of(sqd->sq_cpu));

-- 
Jens Axboe
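
As a reading aid for the diff above, here is a minimal userspace sketch of why tagging these threads with a no-freeze flag stops the suspend path from waiting on them. Everything in it is an assumption made for illustration: the `SK_` flag values, `struct sk_task`, and both helpers are hypothetical stand-ins, not the kernel's definitions or its freezer code.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag bits; the values are assumptions, not the kernel's. */
#define SK_PF_IO_WORKER  0x1u
#define SK_PF_NOFREEZE   0x2u

struct sk_task {
	const char *comm;    /* thread name, e.g. "iou-mgr-446" */
	unsigned int flags;  /* SK_PF_* bits */
	int frozen;          /* nonzero once the task has entered the freezer */
};

/* A task opts out of freezing when it carries the no-freeze flag. */
static int sk_must_freeze(const struct sk_task *t)
{
	return !(t->flags & SK_PF_NOFREEZE);
}

/* Count the tasks the freezer would still be waiting for. */
static size_t sk_count_refusing(const struct sk_task *tasks, size_t n)
{
	size_t todo = 0;

	assert(tasks != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (sk_must_freeze(&tasks[i]) && !tasks[i].frozen)
			todo++;
	return todo;
}
```

With three unfrozen worker threads and no opt-out flag, `sk_count_refusing()` reports 3, which mirrors the "3 tasks refusing to freeze" message in the report; once each thread carries the flag, the count drops to 0.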



Re: [PATCH 5/5] mm: memcontrol: use object cgroup for remote memory cgroup charging

2021-03-01 Thread Roman Gushchin
On Mon, Mar 01, 2021 at 02:22:27PM +0800, Muchun Song wrote:
> We spent a lot of energy making slab accounting not hold a refcount
> on the memory cgroup, so that a dying cgroup can be freed as soon as
> possible after it is offlined.
> 
> But some users of remote memory cgroup charging (e.g. bpf and fsnotify)
> hold a refcount on the memory cgroup in order to charge to it later.
> The slab core actually uses the obj_cgroup APIs for memory cgroup
> charging, so we can hold a refcount on the obj_cgroup instead of the
> memory cgroup. In this case, the remote memory charging infrastructure
> does not hold a refcount on the memory cgroup either.

-cc all except mm folks

Same here, let's not switch the remote charging infra to objcg, so that
we keep the ability to use it for user pages. If we have a real problem
with bpf/..., let's solve it case by case.

Thanks!

> 
> Signed-off-by: Muchun Song 
> ---
>  fs/buffer.c  | 10 --
>  fs/notify/fanotify/fanotify.c|  6 ++--
>  fs/notify/fanotify/fanotify_user.c   |  2 +-
>  fs/notify/group.c|  3 +-
>  fs/notify/inotify/inotify_fsnotify.c |  8 ++---
>  fs/notify/inotify/inotify_user.c |  2 +-
>  include/linux/bpf.h  |  2 +-
>  include/linux/fsnotify_backend.h |  2 +-
>  include/linux/memcontrol.h   | 15 
>  include/linux/sched.h|  4 +--
>  include/linux/sched/mm.h | 28 +++
>  kernel/bpf/syscall.c | 35 +--
>  kernel/fork.c|  2 +-
>  mm/memcontrol.c  | 66 
> 
>  14 files changed, 121 insertions(+), 64 deletions(-)
> 
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 591547779dbd..cc99fcf66368 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -842,14 +842,16 @@ struct buffer_head *alloc_page_buffers(struct page 
> *page, unsigned long size,
>   struct buffer_head *bh, *head;
>   gfp_t gfp = GFP_NOFS | __GFP_ACCOUNT;
>   long offset;
> - struct mem_cgroup *memcg, *old_memcg;
> + struct mem_cgroup *memcg;
> + struct obj_cgroup *objcg, *old_objcg;
>  
>   if (retry)
>   gfp |= __GFP_NOFAIL;
>  
>   /* The page lock pins the memcg */
>   memcg = page_memcg(page);
> - old_memcg = set_active_memcg(memcg);
> + objcg = get_obj_cgroup_from_mem_cgroup(memcg);
> + old_objcg = set_active_obj_cgroup(objcg);
>  
>   head = NULL;
>   offset = PAGE_SIZE;
> @@ -868,7 +870,9 @@ struct buffer_head *alloc_page_buffers(struct page *page, 
> unsigned long size,
>   set_bh_page(bh, page, offset);
>   }
>  out:
> - set_active_memcg(old_memcg);
> + set_active_obj_cgroup(old_objcg);
> + if (objcg)
> + obj_cgroup_put(objcg);
>   return head;
>  /*
>   * In case anything failed, we just free everything we got.
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index 1192c9953620..04d24acfffc7 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -530,7 +530,7 @@ static struct fanotify_event *fanotify_alloc_event(struct 
> fsnotify_group *group,
>   struct inode *dirid = fanotify_dfid_inode(mask, data, data_type, dir);
>   const struct path *path = fsnotify_data_path(data, data_type);
>   unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
> - struct mem_cgroup *old_memcg;
> + struct obj_cgroup *old_objcg;
>   struct inode *child = NULL;
>   bool name_event = false;
>  
> @@ -580,7 +580,7 @@ static struct fanotify_event *fanotify_alloc_event(struct 
> fsnotify_group *group,
>   gfp |= __GFP_RETRY_MAYFAIL;
>  
>   /* Whoever is interested in the event, pays for the allocation. */
> - old_memcg = set_active_memcg(group->memcg);
> + old_objcg = set_active_obj_cgroup(group->objcg);
>  
>   if (fanotify_is_perm_event(mask)) {
>   event = fanotify_alloc_perm_event(path, gfp);
> @@ -608,7 +608,7 @@ static struct fanotify_event *fanotify_alloc_event(struct 
> fsnotify_group *group,
>   event->pid = get_pid(task_tgid(current));
>  
>  out:
> - set_active_memcg(old_memcg);
> + set_active_obj_cgroup(old_objcg);
>   return event;
>  }
>  
> diff --git a/fs/notify/fanotify/fanotify_user.c 
> b/fs/notify/fanotify/fanotify_user.c
> index 9e0c1afac8bd..055ca36d4e0e 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -985,7 +985,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, 
> unsigned int, event_f_flags)
>   group->fanotify_data.user = user;
>   group->fanotify_data.flags = flags;
>   atomic_inc(&user->fanotify_listeners);
> - group->memcg = get_mem_cgroup_from_mm(current->mm);
> + group->objcg = get_obj_cgroup_from_current();
>  
>   group->overflow_event = fanotify_alloc_overflow_event();
>   if (unlikely(!group->overflow_event)) {
> diff --git 

Re: [PATCH 4.4.y] arm: kprobes: Allow to handle reentered kprobe on single-stepping

2021-03-01 Thread Shaobo Huang
On March 1, 2021 at 11:30 AM, Greg KH wrote:
> On Mon, Feb 27, 2021 at 05:17:01PM +0800, huangshaobo wrote:
> > From: Masami Hiramatsu 
> > 
> > commit f3fbd7ec62dec1528fb8044034e2885f2b257941 upstream
> > 
> > This is arm port of commit 6a5022a56ac3 ("kprobes/x86: Allow to handle 
> > reentered kprobe on single-stepping")
> > 
> > Since the FIQ handlers can interrupt in the single stepping (or 
> > preparing the single stepping, do_debug etc.), we should consider a 
> > kprobe is hit in the NMI handler. Even in that case, the kprobe is 
> > allowed to be reentered as same as the kprobes hit in kprobe handlers 
> > (KPROBE_HIT_ACTIVE or KPROBE_HIT_SSDONE).
> > 
> > The real issue will happen when a kprobe hit while another reentered 
> > kprobe is processing (KPROBE_REENTER), because we already consumed a 
> > saved-area for the previous kprobe.
> > 
> > Signed-off-by: Masami Hiramatsu 
> > Signed-off-by: Jon Medhurst 
> > Fixes: 24ba613c9d6c ("ARM kprobes: core code")
> > Cc: sta...@vger.kernel.org #v2.6.25~v4.11
> > Signed-off-by: huangshaobo 
> > ---
> >  arch/arm/probes/kprobes/core.c | 6 ++
> >  1 file changed, 6 insertions(+)
> 
> What about the 4.9.y tree as well?
> 
> thanks,
> 
> greg k-h

Yes, I tested on the 4.4.y tree. From the code analysis, the same problem
exists in the 2.6.25 to 4.11 trees, and of course the 4.9.y tree is also
included.

thanks,
ShaoBo Huang
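
The reentry rule described in the quoted commit message can be sketched as a small decision function. This is only an illustration under stated assumptions: the enum and `sk_reenter_allowed()` are hypothetical stand-ins whose names echo the statuses mentioned in the message, and the single-stepping status is assumed to be the case the backport adds. It is not the code in arch/arm/probes/kprobes/core.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Status names echo the commit message; the enum itself is a stand-in. */
enum sk_kprobe_status {
	SK_KPROBE_HIT_ACTIVE,
	SK_KPROBE_HIT_SS,       /* single-stepping in progress */
	SK_KPROBE_HIT_SSDONE,
	SK_KPROBE_REENTER,      /* already handling a reentered kprobe */
};

/* May a second kprobe be handled while the first one is in state 'cur'? */
static bool sk_reenter_allowed(enum sk_kprobe_status cur)
{
	assert(cur <= SK_KPROBE_REENTER);

	switch (cur) {
	case SK_KPROBE_HIT_ACTIVE:
	case SK_KPROBE_HIT_SSDONE:
	case SK_KPROBE_HIT_SS:      /* assumed to be the newly allowed case */
		return true;
	case SK_KPROBE_REENTER:     /* the saved area is already consumed */
		return false;
	}
	return false;
}
```

The only state that refuses a further hit is the reentered one, matching the message's point that the saved area for the previous kprobe has already been used by then.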


Re: [PATCH] mm, kasan: don't poison boot memory

2021-03-01 Thread George Kennedy




On 3/1/2021 9:29 AM, George Kennedy wrote:
On 2/28/2021 1:08 PM, Mike Rapoport wrote:
On Fri, Feb 26, 2021 at 11:16:06AM -0500, George Kennedy wrote:
On 2/26/2021 6:17 AM, Mike Rapoport wrote:
Hi George,

On Thu, Feb 25, 2021 at 08:19:18PM -0500, George Kennedy wrote:
Not sure if it's the right thing to do, but added
"acpi_tb_find_table_address()" to return the physical address of a table
to use with memblock_reserve().

virt_to_phys(table) does not seem to return the physical address for the
iBFT table (it would be nice if struct acpi_table_header also had an
"address" element for the physical address of the table).

virt_to_phys() does not work that early because then it is mapped with
early_memremap(), which uses a different virtual-to-physical scheme.

I'd say that acpi_tb_find_table_address() makes sense if we'd like to
reserve ACPI tables outside of drivers/acpi.

But probably we should simply reserve all the tables during
acpi_table_init() so that any table that firmware put in the normal
memory will surely be reserved.

Ran 10 successful boots with the above without failure.

That's good news indeed :)

Wondering if we could do something like this instead (trying to keep
changes minimal). Just do the memblock_reserve() for all the standard
tables.

I think something like this should work, but I'm not an ACPI expert to
say if this is the best way to reserve the tables.

Adding ACPI maintainers to the CC list.
diff --git a/drivers/acpi/acpica/tbinstal.c 
b/drivers/acpi/acpica/tbinstal.c

index 0bb15ad..830f82c 100644
--- a/drivers/acpi/acpica/tbinstal.c
+++ b/drivers/acpi/acpica/tbinstal.c
@@ -7,6 +7,7 @@
   *
*/ 



+#include 
  #include 
  #include "accommon.h"
  #include "actables.h"
@@ -14,6 +15,23 @@
  #define _COMPONENT  ACPI_TABLES
  ACPI_MODULE_NAME("tbinstal")

+void
+acpi_tb_reserve_standard_table(acpi_physical_address address,
+               struct acpi_table_header *header)
+{
+    struct acpi_table_header local_header;
+
+    if ((ACPI_COMPARE_NAMESEG(header->signature, ACPI_SIG_FACS)) ||
+        (ACPI_VALIDATE_RSDP_SIG(header->signature))) {
+        return;
+    }
+    /* Standard ACPI table with full common header */
+
+    memcpy(&local_header, header, sizeof(struct acpi_table_header));
+
+    memblock_reserve(address, PAGE_ALIGN(local_header.length));
+}
+
  /*** 


   *
   * FUNCTION:    acpi_tb_install_table_with_override
@@ -58,6 +76,9 @@
                new_table_desc->flags,
                new_table_desc->pointer);

+ acpi_tb_reserve_standard_table(new_table_desc->address,
+                   new_table_desc->pointer);
+
  acpi_tb_print_table_header(new_table_desc->address,
                 new_table_desc->pointer);

There should be no harm in doing the memblock_reserve() for all the
standard tables, right?

It should be ok to memblock_reserve() all the tables very early as long
as we don't run out of static entries in memblock.reserved.

We just need to make sure the tables are reserved before memblock
allocations are possible, so we'd still need to move acpi_table_init()
in x86::setup_arch() before e820__memblock_setup().
Not sure how early ACPI is initialized on arm64.

Thanks Mike. Will try to move the memblock_reserves() before
e820__memblock_setup().

Hi Mike,

Moved acpi_table_init() in x86::setup_arch() before
e820__memblock_setup() as you suggested.

Ran 10 boots with the following without error.

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 740f3bdb..3b1dd24 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1047,6 +1047,7 @@ void __init setup_arch(char **cmdline_p)
 cleanup_highmap();

 memblock_set_current_limit(ISA_END_ADDRESS);
+    acpi_boot_table_init();
 e820__memblock_setup();

 /*
@@ -1140,8 +1141,6 @@ void __init setup_arch(char **cmdline_p)
 /*
  * Parse the ACPI tables for possible boot-time SMP configuration.
  */
-    acpi_boot_table_init();
-
 early_acpi_boot_init();

 initmem_init();
diff --git a/drivers/acpi/acpica/tbinstal.c b/drivers/acpi/acpica/tbinstal.c
index 0bb15ad..7830109 100644
--- a/drivers/acpi/acpica/tbinstal.c
+++ b/drivers/acpi/acpica/tbinstal.c
@@ -7,6 +7,7 @@
  *
*/

+#include 
 #include 
 #include "accommon.h"
 #include "actables.h"
@@ -16,6 +17,33 @@

 
/***
  *
+ * FUNCTION:    acpi_tb_reserve_standard_table
+ *
+ * PARAMETERS:  address - Table physical address
+ *  header  - Table header
+ *
+ * RETURN:  None
+ *
+ * DESCRIPTION: To avoid an acpi table page from being "stolen" by the 
buddy
+ *  allocator run memblock_reserve() on all the standard 

[PATCH] gpio: omap: Honor "aliases" node

2021-03-01 Thread Alexander Sverdlin
Currently the naming of the GPIO chips depends on their order in the DT,
but also on the kernel version (I've noticed the change from v5.10.x to
v5.11). Honor the persistent enumeration in the "aliases" node like other
GPIO drivers do.

Signed-off-by: Alexander Sverdlin 
---
Yes, I noticed checkpatch "WARNING: DT binding docs and includes should be
a separate patch."
However, the parts below are tiny and barely make sense separately.

 Documentation/devicetree/bindings/gpio/gpio-omap.txt | 6 ++
 drivers/gpio/gpio-omap.c | 5 +
 2 files changed, 11 insertions(+)

diff --git a/Documentation/devicetree/bindings/gpio/gpio-omap.txt 
b/Documentation/devicetree/bindings/gpio/gpio-omap.txt
index e57b2cb28f6c..6050db3fd84e 100644
--- a/Documentation/devicetree/bindings/gpio/gpio-omap.txt
+++ b/Documentation/devicetree/bindings/gpio/gpio-omap.txt
@@ -30,9 +30,15 @@ OMAP specific properties:
 - ti,gpio-always-on:   Indicates if a GPIO bank is always powered and
so will never lose its logic state.
 
+Note: GPIO ports can have an alias correctly numbered in "aliases" node for
+persistent enumeration.
 
 Example:
 
+aliases {
+   gpio0 = &gpio0;
+};
+
 gpio0: gpio@44e07000 {
 compatible = "ti,omap4-gpio";
 reg = <0x44e07000 0x1000>;
diff --git a/drivers/gpio/gpio-omap.c b/drivers/gpio/gpio-omap.c
index 41952bb818ad..dd2a8f6d920f 100644
--- a/drivers/gpio/gpio-omap.c
+++ b/drivers/gpio/gpio-omap.c
@@ -1014,6 +1014,11 @@ static int omap_gpio_chip_init(struct gpio_bank *bank, 
struct irq_chip *irqc)
bank->chip.parent = &omap_mpuio_device.dev;
bank->chip.base = OMAP_MPUIO(0);
} else {
+#ifdef CONFIG_OF_GPIO
+   ret = of_alias_get_id(bank->chip.of_node, "gpio");
+   if (ret >= 0)
+   gpio = ret * bank->width;
+#endif
label = devm_kasprintf(bank->chip.parent, GFP_KERNEL, 
"gpio-%d-%d",
   gpio, gpio + bank->width - 1);
if (!label)
-- 
2.30.1
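
To make the numbering rule in the hunk above concrete, here is a small userspace sketch of how an alias id could map to a bank's first GPIO number and label. The helper names `bank_first_gpio()` and `bank_label()` are hypothetical and exist only for this illustration; a negative `alias_id` stands in for "no alias found", in which case the previously computed number is kept.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: pick the first GPIO number of a bank.
 * A negative alias_id models "no alias", so the fallback is kept.
 */
static int bank_first_gpio(int alias_id, int width, int fallback)
{
	assert(width > 0);
	return alias_id >= 0 ? alias_id * width : fallback;
}

/* Build a "gpio-<first>-<last>" label in the same format as the patch. */
static int bank_label(char *buf, size_t len, int first, int width)
{
	return snprintf(buf, len, "gpio-%d-%d", first, first + width - 1);
}
```

For a 32-line bank aliased as gpio2, `bank_first_gpio(2, 32, 0)` yields 64 and the label becomes "gpio-64-95", regardless of where the node sits in the device tree.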



Re: [PATCH v8 2/2] counter: add IRQ or GPIO based counter

2021-03-01 Thread William Breathitt Gray
On Mon, Mar 01, 2021 at 09:04:01AM +0100, Oleksij Rempel wrote:
> Add simple IRQ or GPIO based counter. This device is used to measure
> rotation speed of some agricultural devices, so no high frequency on the
> counter pin is expected.
> 
> The maximal measurement frequency depends on the CPU and system load. On
> the idle iMX6S I was able to measure up to 20kHz without count drops.
> 
> Signed-off-by: Oleksij Rempel 
> Reviewed-by: Ahmad Fatoum 

This version looks acceptable for the Counter subsystem.

Thanks,

Reviewed-by: William Breathitt Gray 

> ---
>  MAINTAINERS |   7 +
>  drivers/counter/Kconfig |  10 ++
>  drivers/counter/Makefile|   1 +
>  drivers/counter/interrupt-cnt.c | 244 
>  4 files changed, 262 insertions(+)
>  create mode 100644 drivers/counter/interrupt-cnt.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index a50a543e3c81..ad0a4455afec 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -9217,6 +9217,13 @@ F: include/dt-bindings/interconnect/
>  F:   include/linux/interconnect-provider.h
>  F:   include/linux/interconnect.h
>  
> +INTERRUPT COUNTER DRIVER
> +M:   Oleksij Rempel 
> +R:   Pengutronix Kernel Team 
> +L:   linux-...@vger.kernel.org
> +F:   Documentation/devicetree/bindings/counter/interrupt-counter.yaml
> +F:   drivers/counter/interrupt-cnt.c
> +
>  INVENSENSE ICM-426xx IMU DRIVER
>  M:   Jean-Baptiste Maneyrol 
>  L:   linux-...@vger.kernel.org
> diff --git a/drivers/counter/Kconfig b/drivers/counter/Kconfig
> index 2de53ab0dd25..dcad13229134 100644
> --- a/drivers/counter/Kconfig
> +++ b/drivers/counter/Kconfig
> @@ -29,6 +29,16 @@ config 104_QUAD_8
> The base port addresses for the devices may be configured via the base
> array module parameter.
>  
> +config INTERRUPT_CNT
> + tristate "Interrupt counter driver"
> + depends on GPIOLIB
> + help
> +   Select this option to enable interrupt counter driver. Any interrupt
> +   source can be used by this driver as the event source.
> +
> +   To compile this driver as a module, choose M here: the
> +   module will be called interrupt-cnt.
> +
>  config STM32_TIMER_CNT
>   tristate "STM32 Timer encoder counter driver"
>   depends on MFD_STM32_TIMERS || COMPILE_TEST
> diff --git a/drivers/counter/Makefile b/drivers/counter/Makefile
> index 0a393f71e481..cb646ed2f039 100644
> --- a/drivers/counter/Makefile
> +++ b/drivers/counter/Makefile
> @@ -6,6 +6,7 @@
>  obj-$(CONFIG_COUNTER) += counter.o
>  
>  obj-$(CONFIG_104_QUAD_8) += 104-quad-8.o
> +obj-$(CONFIG_INTERRUPT_CNT)  += interrupt-cnt.o
>  obj-$(CONFIG_STM32_TIMER_CNT)+= stm32-timer-cnt.o
>  obj-$(CONFIG_STM32_LPTIMER_CNT)  += stm32-lptimer-cnt.o
>  obj-$(CONFIG_TI_EQEP)+= ti-eqep.o
> diff --git a/drivers/counter/interrupt-cnt.c b/drivers/counter/interrupt-cnt.c
> new file mode 100644
> index ..a99ee7996977
> --- /dev/null
> +++ b/drivers/counter/interrupt-cnt.c
> @@ -0,0 +1,244 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2021 Pengutronix, Oleksij Rempel 
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define INTERRUPT_CNT_NAME "interrupt-cnt"
> +
> +struct interrupt_cnt_priv {
> + atomic_t count;
> + struct counter_device counter;
> + struct gpio_desc *gpio;
> + int irq;
> + bool enabled;
> + struct counter_signal signals;
> + struct counter_synapse synapses;
> + struct counter_count cnts;
> +};
> +
> +static irqreturn_t interrupt_cnt_isr(int irq, void *dev_id)
> +{
> + struct interrupt_cnt_priv *priv = dev_id;
> +
> + atomic_inc(&priv->count);
> +
> + return IRQ_HANDLED;
> +}
> +
> +static ssize_t interrupt_cnt_enable_read(struct counter_device *counter,
> +  struct counter_count *count,
> +  void *private, char *buf)
> +{
> + struct interrupt_cnt_priv *priv = counter->priv;
> +
> + return sysfs_emit(buf, "%d\n", priv->enabled);
> +}
> +
> +static ssize_t interrupt_cnt_enable_write(struct counter_device *counter,
> +   struct counter_count *count,
> +   void *private, const char *buf,
> +   size_t len)
> +{
> + struct interrupt_cnt_priv *priv = counter->priv;
> + bool enable;
> + ssize_t ret;
> +
> + ret = kstrtobool(buf, &enable);
> + if (ret)
> + return ret;
> +
> + if (priv->enabled == enable)
> + return len;
> +
> + if (enable) {
> + priv->enabled = true;
> + enable_irq(priv->irq);
> + } else {
> + disable_irq(priv->irq);
> + priv->enabled = false;
> + }
> +
> + return len;
> +}
> +
> +static const struct counter_count_ext interrupt_cnt_ext[] = {
> + {
> + .name = 

[PATCH] iwlwifi: fix ARCH=i386 compilation warnings

2021-03-01 Thread Pierre-Louis Bossart
An unsigned long variable should rely on '%lu' format strings, not '%zd'

Fixes: a1a6a4cf49ece ("iwlwifi: pnvm: implement reading PNVM from UEFI")
Signed-off-by: Pierre-Louis Bossart 
---
warnings found with v5.12-rc1 and next-20210301

 drivers/net/wireless/intel/iwlwifi/fw/pnvm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/intel/iwlwifi/fw/pnvm.c 
b/drivers/net/wireless/intel/iwlwifi/fw/pnvm.c
index fd070ca5e517..40f2109a097f 100644
--- a/drivers/net/wireless/intel/iwlwifi/fw/pnvm.c
+++ b/drivers/net/wireless/intel/iwlwifi/fw/pnvm.c
@@ -271,12 +271,12 @@ static int iwl_pnvm_get_from_efi(struct iwl_trans *trans,
err = efivar_entry_get(pnvm_efivar, NULL, &package_size, package);
if (err) {
IWL_DEBUG_FW(trans,
-"PNVM UEFI variable not found %d (len %zd)\n",
+"PNVM UEFI variable not found %d (len %lu)\n",
 err, package_size);
goto out;
}
 
-   IWL_DEBUG_FW(trans, "Read PNVM fro UEFI with size %zd\n", package_size);
+   IWL_DEBUG_FW(trans, "Read PNVM fro UEFI with size %lu\n", package_size);
 
*data = kmemdup(package->data, *len, GFP_KERNEL);
if (!*data)
-- 
2.25.1
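
The warning fixed above comes from pairing a size_t-style length modifier with an unsigned long argument; on 32-bit x86 those are different types, so the compiler's format checking complains. The sketch below shows the matching pairs in plain userspace C. The helper names `fmt_ulong()` and `fmt_size()` are made up for this illustration and have nothing to do with the driver's debug macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* An unsigned long argument pairs with "%lu" on every ABI. */
static int fmt_ulong(char *buf, size_t len, unsigned long v)
{
	return snprintf(buf, len, "size %lu", v);
}

/* A size_t argument pairs with "%zu" ("%zd" is for the signed counterpart). */
static int fmt_size(char *buf, size_t len, size_t v)
{
	return snprintf(buf, len, "size %zu", v);
}
```

Keeping the specifier tied to the declared type of the variable, rather than to whatever happens to have the same width on one architecture, is what makes the format string portable between 32-bit and 64-bit builds.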



Re: [PATCH 4/5] mm: memcontrol: move remote memcg charging APIs to CONFIG_MEMCG_KMEM

2021-03-01 Thread Roman Gushchin
On Mon, Mar 01, 2021 at 02:22:26PM +0800, Muchun Song wrote:
> The remote memcg charging APIs are a mechanism to charge kernel memory
> to a given memcg. So we can move the infrastructure to the scope of
> the CONFIG_MEMCG_KMEM.

This is not a good idea, because there is nothing kmem-specific
in the idea of remote charging, and we will definitely see cases
where user memory is charged to a process other than the current one.

> 
> As a bonus, on !CONFIG_MEMCG_KMEM build some functions and variables
> can be compiled out.
> 
> Signed-off-by: Muchun Song 
> ---
>  include/linux/sched.h| 2 ++
>  include/linux/sched/mm.h | 2 +-
>  kernel/fork.c| 2 +-
>  mm/memcontrol.c  | 4 
>  4 files changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index ee46f5cab95b..c2d488eddf85 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1314,7 +1314,9 @@ struct task_struct {
>  
>   /* Number of pages to reclaim on returning to userland: */
>   unsigned intmemcg_nr_pages_over_high;
> +#endif
>  
> +#ifdef CONFIG_MEMCG_KMEM
>   /* Used by memcontrol for targeted memcg charge: */
>   struct mem_cgroup   *active_memcg;
>  #endif
> diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
> index 1ae08b8462a4..64a72975270e 100644
> --- a/include/linux/sched/mm.h
> +++ b/include/linux/sched/mm.h
> @@ -294,7 +294,7 @@ static inline void memalloc_nocma_restore(unsigned int 
> flags)
>  }
>  #endif
>  
> -#ifdef CONFIG_MEMCG
> +#ifdef CONFIG_MEMCG_KMEM
>  DECLARE_PER_CPU(struct mem_cgroup *, int_active_memcg);
>  /**
>   * set_active_memcg - Starts the remote memcg charging scope.
> diff --git a/kernel/fork.c b/kernel/fork.c
> index d66cd1014211..d66718bc82d5 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -942,7 +942,7 @@ static struct task_struct *dup_task_struct(struct 
> task_struct *orig, int node)
>   tsk->use_memdelay = 0;
>  #endif
>  
> -#ifdef CONFIG_MEMCG
> +#ifdef CONFIG_MEMCG_KMEM
>   tsk->active_memcg = NULL;
>  #endif
>   return tsk;
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 39cb8c5bf8b2..092dc4588b43 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -76,8 +76,10 @@ EXPORT_SYMBOL(memory_cgrp_subsys);
>  
>  struct mem_cgroup *root_mem_cgroup __read_mostly;
>  
> +#ifdef CONFIG_MEMCG_KMEM
>  /* Active memory cgroup to use from an interrupt context */
>  DEFINE_PER_CPU(struct mem_cgroup *, int_active_memcg);
> +#endif
>  
>  /* Socket memory accounting disabled? */
>  static bool cgroup_memory_nosocket;
> @@ -1054,6 +1056,7 @@ struct mem_cgroup *get_mem_cgroup_from_mm(struct 
> mm_struct *mm)
>  }
>  EXPORT_SYMBOL(get_mem_cgroup_from_mm);
>  
> +#ifdef CONFIG_MEMCG_KMEM
>  static __always_inline struct mem_cgroup *active_memcg(void)
>  {
>   if (in_interrupt())
> @@ -1074,6 +1077,7 @@ static __always_inline bool memcg_kmem_bypass(void)
>  
>   return false;
>  }
> +#endif
>  
>  /**
>   * mem_cgroup_iter - iterate over memory cgroup hierarchy
> -- 
> 2.11.0
> 


Re: [x86, build] 6dafca9780: WARNING:at_arch/x86/kernel/ftrace.c:#ftrace_verify_code

2021-03-01 Thread Steven Rostedt
On Mon, 1 Mar 2021 16:03:51 -0800
Sami Tolvanen  wrote:
> 
> > ret = ftrace_verify_code(rec->ip, old);
> > +
> > +   if (__is_defined(CC_USING_NOP_MCOUNT) && ret && old_nop) {
> > +   /* Compiler could have put in P6_NOP5 */
> > +   old = P6_NOP5;
> > +   ret = ftrace_verify_code(rec->ip, old);
> > +   }
> > +  
> 
> Wouldn't that still hit WARN(1) in the initial ftrace_verify_code()
> call if ideal_nops doesn't match?

That was too quickly written ;-)

Take 2:

[ with fixes for setting p6_nop ]

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 7edbd5ee5ed4..e8afc765e00a 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -36,6 +36,7 @@
 #ifdef CONFIG_DYNAMIC_FTRACE
 
 static int ftrace_poke_late = 0;
+static const char p6_nop[] = { P6_NOP5 };
 
 int ftrace_arch_code_modify_prepare(void)
 __acquires(&text_mutex)
@@ -74,7 +75,7 @@ static const char *ftrace_call_replace(unsigned long ip, 
unsigned long addr)
return text_gen_insn(CALL_INSN_OPCODE, (void *)ip, (void *)addr);
 }
 
-static int ftrace_verify_code(unsigned long ip, const char *old_code)
+static int ftrace_verify_code(unsigned long ip, const char *old_code, bool 
warn)
 {
char cur_code[MCOUNT_INSN_SIZE];
 
@@ -87,13 +88,13 @@ static int ftrace_verify_code(unsigned long ip, const char 
*old_code)
 */
/* read the text we want to modify */
if (copy_from_kernel_nofault(cur_code, (void *)ip, MCOUNT_INSN_SIZE)) {
-   WARN_ON(1);
+   WARN_ON(warn);
return -EFAULT;
}
 
/* Make sure it is what we expect it to be */
if (memcmp(cur_code, old_code, MCOUNT_INSN_SIZE) != 0) {
-   WARN_ON(1);
+   WARN_ON(warn);
return -EINVAL;
}
 
@@ -107,9 +108,9 @@ static int ftrace_verify_code(unsigned long ip, const char 
*old_code)
  */
 static int __ref
 ftrace_modify_code_direct(unsigned long ip, const char *old_code,
- const char *new_code)
+ const char *new_code, bool verify_warn)
 {
-   int ret = ftrace_verify_code(ip, old_code);
+   int ret = ftrace_verify_code(ip, old_code, verify_warn);
if (ret)
return ret;
 
@@ -138,7 +139,7 @@ int ftrace_make_nop(struct module *mod, struct dyn_ftrace 
*rec, unsigned long ad
 * just modify the code directly.
 */
if (addr == MCOUNT_ADDR)
-   return ftrace_modify_code_direct(ip, old, new);
+   return ftrace_modify_code_direct(ip, old, new, true);
 
/*
 * x86 overrides ftrace_replace_code -- this function will never be used
@@ -152,12 +153,20 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned 
long addr)
 {
unsigned long ip = rec->ip;
const char *new, *old;
+   bool verify_warn = !__is_defined(CC_USING_NOP_MCOUNT);
+   int ret;
 
old = ftrace_nop_replace();
new = ftrace_call_replace(ip, addr);
 
/* Should only be called when module is loaded */
-   return ftrace_modify_code_direct(rec->ip, old, new);
+   ret = ftrace_modify_code_direct(rec->ip, old, new, verify_warn);
+   if (ret && !verify_warn) {
+   /* Compiler could have put in P6_NOP5 */
+   old = p6_nop;
+   ret = ftrace_modify_code_direct(rec->ip, old, new, true);
+   }
+   return ret;
 }
 
 /*
@@ -199,6 +208,8 @@ void ftrace_replace_code(int enable)
int ret;
 
for_ftrace_rec_iter(iter) {
+   bool verify_warn = true;
+
rec = ftrace_rec_iter_record(iter);
 
switch (ftrace_test_record(rec, enable)) {
@@ -208,6 +219,7 @@ void ftrace_replace_code(int enable)
 
case FTRACE_UPDATE_MAKE_CALL:
old = ftrace_nop_replace();
+   verify_warn = !__is_defined(CC_USING_NOP_MCOUNT);
break;
 
case FTRACE_UPDATE_MODIFY_CALL:
@@ -216,7 +228,14 @@ void ftrace_replace_code(int enable)
break;
}
 
-   ret = ftrace_verify_code(rec->ip, old);
+   ret = ftrace_verify_code(rec->ip, old, verify_warn);
+
+   if (ret && !verify_warn) {
+   /* Compiler could have put in P6_NOP5 */
+   old = p6_nop;
+   ret = ftrace_verify_code(rec->ip, old, true);
+   }
+
if (ret) {
ftrace_bug(ret, rec);
return;


-- Steve
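
The control flow of "Take 2" (a quiet first comparison, then a warning-enabled retry against the alternate NOP) can be sketched in userspace as follows. All names here are hypothetical, the byte patterns are placeholders rather than authoritative encodings, and a counter stands in for the kernel's warning, so this is only a model of the retry logic, not of the ftrace code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define INSN_SIZE 5

/* Two candidate 5-byte NOP patterns; the byte values are placeholders. */
static const unsigned char sk_ideal_nop[INSN_SIZE] = { 0x66, 0x66, 0x66, 0x66, 0x90 };
static const unsigned char sk_alt_nop[INSN_SIZE]   = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

static int sk_warnings;  /* counts reported mismatches instead of warning */

/* Compare the bytes at the patch site with the expected old instruction. */
static int sk_verify(const unsigned char *site, const unsigned char *expect,
		     bool warn)
{
	assert(site != NULL && expect != NULL);
	if (memcmp(site, expect, INSN_SIZE) != 0) {
		if (warn)
			sk_warnings++;
		return -1;
	}
	return 0;
}

/*
 * When the compiler may have emitted its own NOP, the first check stays
 * quiet and a mismatch triggers one retry against the alternate pattern,
 * this time with reporting enabled.
 */
static int sk_verify_nop(const unsigned char *site, bool compiler_emits_nop)
{
	int ret = sk_verify(site, sk_ideal_nop, !compiler_emits_nop);

	if (ret && compiler_emits_nop)
		ret = sk_verify(site, sk_alt_nop, true);
	return ret;
}
```

The point of threading a `warn` flag through the check is that a first-pass mismatch is expected in the compiler-NOP configuration and should not be reported unless the fallback pattern fails too.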


Re: possible deadlock in sk_clone_lock

2021-03-01 Thread Mike Kravetz
On 3/1/21 9:23 AM, Michal Hocko wrote:
> On Mon 01-03-21 08:39:22, Shakeel Butt wrote:
>> On Mon, Mar 1, 2021 at 7:57 AM Michal Hocko  wrote:
> [...]
>>> Then how come this can ever be a problem? in_task() should exclude soft
>>> irq context unless I am mistaken.
>>>
>>
>> If I take the following example of syzbot's deadlock scenario then
>> CPU1 is the one freeing the hugetlb pages. It is in the process
>> context but has disabled softirqs (see __tcp_close()).
>>
>>        CPU0                    CPU1
>>        ----                    ----
>>   lock(hugetlb_lock);
>>                                local_irq_disable();
>>                                lock(slock-AF_INET);
>>                                lock(hugetlb_lock);
>>   <Interrupt>
>>     lock(slock-AF_INET);
>>
>> So, this deadlock scenario is very much possible.
> 
> OK, I see the point now. I was focusing on the IRQ context and hugetlb
> side too much. We do not need to be freeing from there. All it takes is
> to get a dependency chain over a common lock held here. Thanks for
> bearing with me.
> 
> Let's see whether we can make hugetlb_lock irq safe.

I may be confused, but it seems like we have a general problem with
calling free_huge_page (as a result of put_page) with interrupts
disabled.

Consider the current free_huge_page code.  Today, we drop the lock
when processing gigantic pages because we may need to block on a mutex
in cma code.  If our caller has disabled interrupts, then it doesn't
matter if the hugetlb lock is irq safe: when we drop it, interrupts will
still be disabled and we cannot block.  Right?  If correct, then making
hugetlb_lock irq safe would not help.

Again, I may be missing something.

Note that we also are considering doing more with the hugetlb lock
dropped in this path in the 'free vmemmap of hugetlb pages' series.

Since we need to do some work that could block in this path, it seems
like we really need to use a workqueue.  It is too bad that there is not
an interface to identify all the cases where interrupts are disabled.
-- 
Mike Kravetz
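
The workqueue idea raised above amounts to "queue the page now, do the blocking part later from a context that may sleep". A single-threaded userspace sketch of that shape follows. It is an illustration only: `struct sk_page` and the `sk_` helpers are hypothetical, there is no locking, and the `can_block` parameter is precisely the information the message says is hard to obtain in general, so a real design might always defer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sk_page {
	struct sk_page *next;
	bool freed;
};

static struct sk_page *sk_deferred;  /* pages waiting for the worker */

/* The part of freeing that may sleep; here it only marks the page. */
static void sk_free_may_block(struct sk_page *p)
{
	p->freed = true;
}

/* Free immediately if the caller may sleep, otherwise queue for later. */
static void sk_free_page(struct sk_page *p, bool can_block)
{
	assert(p != NULL);
	if (can_block) {
		sk_free_may_block(p);
		return;
	}
	p->next = sk_deferred;
	sk_deferred = p;
}

/* Worker body: runs where sleeping is allowed and drains the queue. */
static size_t sk_worker_drain(void)
{
	size_t n = 0;

	while (sk_deferred) {
		struct sk_page *p = sk_deferred;

		sk_deferred = p->next;
		p->next = NULL;
		sk_free_may_block(p);
		n++;
	}
	return n;
}
```

Pages queued from a context that cannot block stay untouched until `sk_worker_drain()` runs, which is the property that would let the caller keep interrupts disabled while the expensive part happens elsewhere.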


Re: [PATCH v8 1/2] dt-bindings: counter: add interrupt-counter binding

2021-03-01 Thread William Breathitt Gray
On Mon, Mar 01, 2021 at 09:04:00AM +0100, Oleksij Rempel wrote:
> Add binding for the interrupt counter node
> 
> Signed-off-by: Oleksij Rempel 
> Reviewed-by: Linus Walleij 

Acked-by: William Breathitt Gray 

> ---
>  .../bindings/counter/interrupt-counter.yaml   | 62 +++
>  1 file changed, 62 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/counter/interrupt-counter.yaml
> 
> diff --git a/Documentation/devicetree/bindings/counter/interrupt-counter.yaml 
> b/Documentation/devicetree/bindings/counter/interrupt-counter.yaml
> new file mode 100644
> index ..fd075d104631
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/counter/interrupt-counter.yaml
> @@ -0,0 +1,62 @@
> +# SPDX-License-Identifier: GPL-2.0
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/counter/interrupt-counter.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Interrupt counter
> +
> +maintainers:
> +  - Oleksij Rempel 
> +
> +description: |
> +  A generic interrupt counter to measure interrupt frequency. It was 
> developed
> +  and used for agricultural devices to measure rotation speed of wheels or
> +  other tools. Since the direction of rotation is not important, only one
> +  signal line is needed.
> +  Interrupts or gpios are required. If both are defined, the interrupt will
> +  take precedence for counting interrupts.
> +
> +properties:
> +  compatible:
> +const: interrupt-counter
> +
> +  interrupts:
> +maxItems: 1
> +
> +  gpios:
> +maxItems: 1
> +
> +required:
> +  - compatible
> +
> +anyOf:
> +  - required: [ interrupts-extended ]
> +  - required: [ interrupts ]
> +  - required: [ gpios ]
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +
> +#include 
> +#include 
> +
> +counter-0 {
> +compatible = "interrupt-counter";
> +interrupts-extended = < 0 IRQ_TYPE_EDGE_RISING>;
> +};
> +
> +counter-1 {
> +compatible = "interrupt-counter";
> +gpios = < 2 GPIO_ACTIVE_HIGH>;
> +};
> +
> +counter-2 {
> +compatible = "interrupt-counter";
> +interrupts-extended = < 2 IRQ_TYPE_EDGE_RISING>;
> +gpios = < 2 GPIO_ACTIVE_HIGH>;
> +};
> +
> +...
> -- 
> 2.29.2
> 




Re: [PATCH 0/5] Use obj_cgroup APIs to change kmem pages

2021-03-01 Thread Roman Gushchin
Hi Muchun!

On Mon, Mar 01, 2021 at 02:22:22PM +0800, Muchun Song wrote:
> Since Roman series "The new cgroup slab memory controller" applied. All
> slab objects are changed via the new APIs of obj_cgroup. This new APIs
> introduce a struct obj_cgroup instead of using struct mem_cgroup directly
> to charge slab objects. It prevents long-living objects from pinning the
> original memory cgroup in the memory. But there are still some corner
> objects (e.g. allocations larger than order-1 page on SLUB) which are
> not charged via the API of obj_cgroup. Those objects (include the pages
> which are allocated from buddy allocator directly) are charged as kmem
> pages which still hold a reference to the memory cgroup.

Yes, this is a good idea, large kmallocs should be treated the same
way as small ones.

> 
> E.g. We know that the kernel stack is charged as kmem pages because the
> size of the kernel stack can be greater than 2 pages (e.g. 16KB on x86_64
> or arm64). If we create a thread (suppose the thread stack is charged to
> memory cgroup A) and then move it from memory cgroup A to memory cgroup
> B. Because the kernel stack of the thread hold a reference to the memory
> cgroup A. The thread can pin the memory cgroup A in the memory even if
> we remove the cgroup A. If we want to see this scenario by using the
> following script. We can see that the system has added 500 dying cgroups.
> 
>   #!/bin/bash
> 
>   cat /proc/cgroups | grep memory
> 
>   cd /sys/fs/cgroup/memory
>   echo 1 > memory.move_charge_at_immigrate
> 
>   for i in range{1..500}
>   do
>   mkdir kmem_test
>   echo $$ > kmem_test/cgroup.procs
>   sleep 3600 &
>   echo $$ > cgroup.procs
>   echo `cat kmem_test/cgroup.procs` > cgroup.procs
>   rmdir kmem_test
>   done
> 
>   cat /proc/cgroups | grep memory

Well, moving processes between cgroups always created a lot of issues
and corner cases and this one is definitely not the worst. So this problem
looks a bit artificial, unless I'm missing something. But if it doesn't
introduce any new performance costs and doesn't make the code more complex,
I have nothing against it.

Btw, can you, please, run the spell-checker on commit logs? There are many
typos (starting from the title of the series, I guess), which make the patchset
look less appealing.

Thank you!

> 
> This patchset aims to make those kmem pages drop the reference to memory
> cgroup by using the APIs of obj_cgroup. Finally, we can see that the number
> of the dying cgroups will not increase if we run the above test script.
> 
> Patch 1-3 are using obj_cgroup APIs to charge kmem pages. The remote
> memory cgroup charing APIs is a mechanism to charge kernel memory to a
> given memory cgroup. So I also make it use the APIs of obj_cgroup.
> Patch 4-5 are doing this.
> 
> Muchun Song (5):
>   mm: memcontrol: introduce obj_cgroup_{un}charge_page
>   mm: memcontrol: make page_memcg{_rcu} only applicable for non-kmem
> page
>   mm: memcontrol: reparent the kmem pages on cgroup removal
>   mm: memcontrol: move remote memcg charging APIs to CONFIG_MEMCG_KMEM
>   mm: memcontrol: use object cgroup for remote memory cgroup charging
> 
>  fs/buffer.c  |  10 +-
>  fs/notify/fanotify/fanotify.c|   6 +-
>  fs/notify/fanotify/fanotify_user.c   |   2 +-
>  fs/notify/group.c|   3 +-
>  fs/notify/inotify/inotify_fsnotify.c |   8 +-
>  fs/notify/inotify/inotify_user.c |   2 +-
>  include/linux/bpf.h  |   2 +-
>  include/linux/fsnotify_backend.h |   2 +-
>  include/linux/memcontrol.h   | 109 +++---
>  include/linux/sched.h|   6 +-
>  include/linux/sched/mm.h |  30 ++--
>  kernel/bpf/syscall.c |  35 ++---
>  kernel/fork.c|   4 +-
>  mm/memcontrol.c  | 276 ++-
>  mm/page_alloc.c  |   4 +-
>  15 files changed, 324 insertions(+), 175 deletions(-)
> 
> -- 
> 2.11.0
> 


Re: 5.12-rc1 regression: freezing iou-mgr/wrk failed

2021-03-01 Thread Jens Axboe
On 3/1/21 5:57 PM, Alex Xu (Hello71) wrote:
> Hi,
> 
> On Linux 5.12-rc1, I am unable to suspend to RAM. The system freezes for 
> about 40 seconds and then continues operation. The following messages 
> are printed to the kernel log:
> 
> [  240.650300] PM: suspend entry (deep)
> [  240.650748] Filesystems sync: 0.000 seconds
> [  240.725605] Freezing user space processes ...
> [  260.739483] Freezing of tasks failed after 20.013 seconds (3 tasks 
> refusing to freeze, wq_busy=0):
> [  260.739497] task:iou-mgr-446 state:S stack:0 pid:  516 ppid:   439 
> flags:0x4224
> [  260.739504] Call Trace:
> [  260.739507]  ? sysvec_apic_timer_interrupt+0xb/0x81
> [  260.739515]  ? pick_next_task_fair+0x197/0x1cde
> [  260.739519]  ? sysvec_reschedule_ipi+0x2f/0x6a
> [  260.739522]  ? asm_sysvec_reschedule_ipi+0x12/0x20
> [  260.739525]  ? __schedule+0x57/0x6d6
> [  260.739529]  ? del_timer_sync+0xb9/0x115
> [  260.739533]  ? schedule+0x63/0xd5
> [  260.739536]  ? schedule_timeout+0x219/0x356
> [  260.739540]  ? __next_timer_interrupt+0xf1/0xf1
> [  260.739544]  ? io_wq_manager+0x73/0xb1
> [  260.739549]  ? io_wq_create+0x262/0x262
> [  260.739553]  ? ret_from_fork+0x22/0x30
> [  260.739557] task:iou-mgr-517 state:S stack:0 pid:  522 ppid:   439 
> flags:0x4224
> [  260.739561] Call Trace:
> [  260.739563]  ? sysvec_apic_timer_interrupt+0xb/0x81
> [  260.739566]  ? pick_next_task_fair+0x16f/0x1cde
> [  260.739569]  ? sysvec_apic_timer_interrupt+0xb/0x81
> [  260.739571]  ? asm_sysvec_apic_timer_interrupt+0x12/0x20
> [  260.739574]  ? __schedule+0x5b7/0x6d6
> [  260.739578]  ? del_timer_sync+0x70/0x115
> [  260.739581]  ? schedule_timeout+0x211/0x356
> [  260.739585]  ? __next_timer_interrupt+0xf1/0xf1
> [  260.739588]  ? io_wq_check_workers+0x15/0x11f
> [  260.739592]  ? io_wq_manager+0x69/0xb1
> [  260.739596]  ? io_wq_create+0x262/0x262
> [  260.739600]  ? ret_from_fork+0x22/0x30
> [  260.739603] task:iou-wrk-517 state:S stack:0 pid:  523 ppid:   439 
> flags:0x4224
> [  260.739607] Call Trace:
> [  260.739609]  ? __schedule+0x5b7/0x6d6
> [  260.739614]  ? schedule+0x63/0xd5
> [  260.739617]  ? schedule_timeout+0x219/0x356
> [  260.739621]  ? __next_timer_interrupt+0xf1/0xf1
> [  260.739624]  ? task_thread.isra.0+0x148/0x3af
> [  260.739628]  ? task_thread_unbound+0xa/0xa
> [  260.739632]  ? task_thread_bound+0x7/0x7
> [  260.739636]  ? ret_from_fork+0x22/0x30
> [  260.739647] OOM killer enabled.
> [  260.739648] Restarting tasks ... done.
> [  260.740077] PM: suspend exit
> 
> and then a set of similar messages except with s2idle instead of deep.
> 
> Reverting 5695e51619 ("Merge tag 'io_uring-worker.v3-2021-02-25' of 
> git://git.kernel.dk/linux-block") appears to resolve the issue. I have 
> not yet bisected further. Let me know which troubleshooting steps I 
> should perform next.

Can you try and pull in:

git://git.kernel.dk/linux-block io_uring-5.12

and see if that resolves it? I usually run -git on my laptop as
well, but something broke it in the merge window so I need to figure
out what that is first...

What distro are you running?

-- 
Jens Axboe



linux-next: Tree for Mar 2

2021-03-01 Thread Stephen Rothwell
Hi all,

Changes since 20210301:

The powerpc-fixes tree gained a build failure for which I applied a patch.

Non-merge commits (relative to Linus' tree): 745
 769 files changed, 20792 insertions(+), 8201 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig and htmldocs. And finally, a simple boot test
of the powerpc pseries_le_defconfig kernel in qemu (with and without
kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 333 trees (counting Linus' and 87 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (7a7fd0de4a98 Merge branch 'kmap-conversion-for-5.12' of 
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux)
Merging fixes/fixes (e71ba9452f0b Linux 5.11-rc2)
Merging kbuild-current/fixes (207da4c82ade kbuild: Fix  for 
empty SUBLEVEL or PATCHLEVEL again)
Merging arc-current/for-curr (7c53f6b671f4 Linux 5.11-rc3)
Merging arm-current/fixes (4d62e81b60d4 ARM: kexec: fix oops after TLB are 
invalidated)
Merging arm64-fixes/for-next/fixes (fe07bfda2fb9 Linux 5.12-rc1)
Merging arm-soc-fixes/arm/fixes (090e502e4e63 Merge tag 
'socfpga_dts_fix_for_v5.12' of 
git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux into arm/fixes)
Merging drivers-memory-fixes/fixes (5c8fe583cce5 Linux 5.11-rc1)
Merging m68k-current/for-linus (5d2b62832c2e m68k: Fix virt_addr_valid() W=1 
compiler warnings)
Merging powerpc-fixes/fixes (e3d773ddb5a1 powerpc/sstep: Fix VSX instruction 
emulation)
Merging s390-fixes/fixes (f40ddce88593 Linux 5.11)
Merging sparc/master (cf64c2a905e0 Merge branch 'work.sparc32' of 
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs)
Merging fscrypt-current/for-stable (d19d8d345eec fscrypt: fix inline encryption 
not used on new files)
Merging net/master (a2bd45834e83 atm: lanai: dont run lanai_dev_close if not 
open)
Merging bpf/master (447621e373bd Merge branch 'net-hns3-fixes-fot-net')
Merging ipsec/master (8fc0e3b6a866 xfrm: interface: fix ipv4 pmtu check to 
honor ip header df)
Merging netfilter/master (8e24edddad15 netfilter: x_tables: gpf inside 
xt_find_revision())
Merging ipvs/master (44a674d6f798 Merge tag 'mlx5-fixes-2021-01-26' of 
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux)
Merging wireless-drivers/master (c490492f15f6 mt76: mt7915: fix unused 'mode' 
variable)
Merging mac80211/master (826d82170b53 xen-netback: use local var in 
xenvif_tx_check_gop() instead of re-calculating)
Merging rdma-fixes/for-rc (fe07bfda2fb9 Linux 5.12-rc1)
Merging sound-current/for-linus (2c9119001dcb ALSA: usb-audio: Fix Pioneer DJM 
devices URB_CONTROL request direction to set samplerate)
Merging sound-asoc-fixes/for-linus (cf421c5a84ca Merge remote-tracking branch 
'asoc/for-5.12' into asoc-linus)
Merging regmap-fixes/for-linus (19c329f68089 Linux 5.11-rc4)
Merging regulator-fixes/for-linus (4a8a7d251201 Merge remote-tracking branch 
'regulator/for-5.12' into regulator-linus)
Merging spi-fixes/for-linus (21b49223c0f5 Merge remote-tracking branch 
'spi/for-5.12' into spi-linus)
Merging pci-current/for-linus (7e69d07d7c3c Revert "PCI/ASPM: Save/restore L1SS 
Capability for suspend/resume")
Merging driver-core.current/driver-core-linus (fe07bfda2fb9 Linux 5.12-rc1)
Merging tty.current/tty-linus (fe07bfda2fb9 Linux 5.12-rc1)
Merging usb.current/usb-linus (fe07bfda2fb9 Linux 5.12-rc1)
Merging usb-gadget-fixes/fixes (129aa9734559 usb: raw-gadget: fix memory leak 
in gadget_setup)
Merging usb-serial-fixes/usb-linus (9a9

[tip:x86/seves] BUILD SUCCESS bb8dc26937d51b3421b26d9d91cdad3484c34b7e

2021-03-01 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
x86/seves
branch HEAD: bb8dc26937d51b3421b26d9d91cdad3484c34b7e  x86/sev-es: Remove 
subtraction of res variable

elapsed time: 722m

configs tested: 76
configs skipped: 64

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
powerpc   ge_imp3a_defconfig
mips      bmips_stb_defconfig
mips      malta_defconfig
powerpc   katmai_defconfig
arm       pxa255-idp_defconfig
mips      nlm_xlr_defconfig
powerpc   maple_defconfig
sh        alldefconfig
sh        kfr2r09_defconfig
powerpc   mpc834x_itxgp_defconfig
sh        titan_defconfig
m68k      sun3x_defconfig
sparc     sparc64_defconfig
powerpc   currituck_defconfig
powerpc   iss476-smp_defconfig
ia64      allmodconfig
ia64      defconfig
ia64      allyesconfig
nds32     defconfig
nios2     allyesconfig
csky      defconfig
alpha     defconfig
alpha     allyesconfig
xtensa    allyesconfig
h8300     allyesconfig
arc       defconfig
sh        allmodconfig
parisc    defconfig
s390      allyesconfig
parisc    allyesconfig
s390      defconfig
i386      allyesconfig
sparc     allyesconfig
sparc     defconfig
i386      tinyconfig
i386      defconfig
powerpc   allyesconfig
powerpc   allmodconfig
powerpc   allnoconfig
i386      randconfig-a005-20210301
i386      randconfig-a003-20210301
i386      randconfig-a002-20210301
i386      randconfig-a004-20210301
i386      randconfig-a006-20210301
i386      randconfig-a001-20210301
x86_64    randconfig-a013-20210301
x86_64    randconfig-a016-20210301
x86_64    randconfig-a015-20210301
x86_64    randconfig-a014-20210301
x86_64    randconfig-a012-20210301
x86_64    randconfig-a011-20210301
i386      randconfig-a016-20210301
i386      randconfig-a012-20210301
i386      randconfig-a014-20210301
i386      randconfig-a013-20210301
i386      randconfig-a011-20210301
i386      randconfig-a015-20210301
riscv     nommu_k210_defconfig
riscv     allyesconfig
riscv     nommu_virt_defconfig
riscv     allnoconfig
riscv     defconfig
riscv     rv32_defconfig
riscv     allmodconfig
x86_64    allyesconfig
x86_64    rhel-7.6-kselftests
x86_64    defconfig
x86_64    rhel-8.3
x86_64    rhel-8.3-kbuiltin
x86_64    kexec

clang tested configs:
x86_64   randconfig-a006-20210301
x86_64   randconfig-a001-20210301
x86_64   randconfig-a004-20210301
x86_64   randconfig-a002-20210301
x86_64   randconfig-a005-20210301
x86_64   randconfig-a003-20210301

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


[PATCH v9 9/9] certs: Add support for using elliptic curve keys for signing modules

2021-03-01 Thread yumeng




On 2021/3/1 21:11, Mimi Zohar wrote:

On Sat, 2021-02-27 at 11:35 +0800, yumeng wrote:

On 2021/2/26 0:08, Stefan Berger wrote:

From: Stefan Berger 




diff --git a/certs/Makefile b/certs/Makefile
index 3fe6b73786fa..c487d7021c54 100644
--- a/certs/Makefile
+++ b/certs/Makefile
@@ -69,6 +69,18 @@ else
   SIGNER = -signkey $(obj)/signing_key.key
   endif # CONFIG_IMA_APPRAISE_MODSIG
   


Is there anything wrong in this patch?
I can't apply it when I use 'git am '.
errors like below:

error: certs/Kconfig: does not match index
error: patch failed: certs/Makefile:69
error: certs/Makefile: patch does not apply

Thanks


Nothing wrong with the patch, just a dependency.  From the Change log:
- This patch builds on top of Nayna's series for 'kernel build support
for loading the kernel module signing key'.
- https://lkml.org/lkml/2021/2/18/856

thanks,

Mimi




OK, thank you. Sorry for the noise.


Re: [PATCH] Input: elan_i2c - Reduce the resume time for new devices

2021-03-01 Thread jingle.wu
HI Dmitry:

So data->ops->initialize(client) essentially performs reset of the
controller (we may want to rename it even) and as far as I understand
you would want to avoid resetting the controller on newer devices,
right?

-> YES

My question is how the behavior of older devices differs from the new ones
(do they stay in an "undefined" state at power up) and whether it is
possible to determine if controller is in operating mode. For example,
what would happen on older devices if we call elan_query_product() below
without resetting the controller?

-> There may be other problems, because ELAN cannot test all the older
-> devices, so we use a quirk to separate out the new ones.

I also think that while I can see us skipping reset in resume paths we
probably want to keep it in probe as we really do not know the state of
the device (was it powered up properly earlier, etc).

-> In the probe path, data->ops->initialize(client) is still called,
-> because the quirk is only set up (data->quirks =
-> elan_i2c_lookup_quirk(data->ic_type, data->product_id);) after
-> data->ops->initialize(client) and elan_query_product() have run.

THANKS
JINGLE
-Original message-
From:Dmitry Torokhov 
To:jingle.wu 
Cc:linux-kernel@vger.kernel.org,linux-in...@vger.kernel.org,phoe...@emc.com.tw,dave.w...@emc.com.tw,josh.c...@emc.com.tw
Date:Mon, 01 Mar 2021 13:31:31
Subject:Re: [PATCH] Input: elan_i2c - Reduce the resume time for new devices

Hi Jingle,

On Fri, Feb 26, 2021 at 03:35:37PM +0800, jingle.wu wrote:
> @@ -273,10 +318,12 @@ static int __elan_initialize(struct elan_tp_data *data)
>   bool woken_up = false;
>   int error;
>  
> - error = data->ops->initialize(client);
> - if (error) {
> - dev_err(>dev, "device initialize failed: %d\n", error);
> - return error;
> + if (!(data->quirks & ETP_QUIRK_SET_QUICK_WAKEUP_DEV)) {
> + error = data->ops->initialize(client);
> + if (error) {
> + dev_err(>dev, "device initialize failed: %d\n", 
> error);
> + return error;
> + }

So data->ops->initialize(client) essentially performs reset of the
controller (we may want to rename it even) and as far as I understand
you would want to avoid resetting the controller on newer devices,
right?

My question is how the behavior of older devices differs from the new ones
(do they stay in an "undefined" state at power up) and whether it is
possible to determine if controller is in operating mode. For example,
what would happen on older devices if we call elan_query_product() below
without resetting the controller?

I also think that while I can see us skipping reset in resume paths we
probably want to keep it in probe as we really do not know the state of
the device (was it powered up properly earlier, etc).

>   }
>  
>   error = elan_query_product(data);
> @@ -366,6 +413,8 @@ static int elan_query_device_info(struct elan_tp_data 
> *data)
>   if (error)
>   return error;
>  
> + data->quirks = elan_i2c_lookup_quirk(data->ic_type, data->product_id);
> +
>   error = elan_get_fwinfo(data->ic_type, data->iap_version,
>   >fw_validpage_count,
>   >fw_signature_address,
> -- 
> 2.17.1
> 

Thanks.

-- 
Dmitry


Re: [PATCH] KVM: nSVM: Optimize L12 to L2 vmcb.save copies

2021-03-01 Thread Sean Christopherson
On Mon, Mar 01, 2021, Cathy Avery wrote:
>   kvm_set_rflags(>vcpu, vmcb12->save.rflags | X86_EFLAGS_FIXED);
>   svm_set_efer(>vcpu, vmcb12->save.efer);
>   svm_set_cr0(>vcpu, vmcb12->save.cr0);
>   svm_set_cr4(>vcpu, vmcb12->save.cr4);

Why not utilize VMCB_CR?

> - svm->vcpu.arch.cr2 = vmcb12->save.cr2;
> + svm->vmcb->save.cr2 = svm->vcpu.arch.cr2 = vmcb12->save.cr2;

Same question for VMCB_CR2.

Also, isn't writing svm->vmcb->save.cr2 unnecessary since svm_vcpu_run()
unconditionally writes it?

Alternatively, it shouldn't be too much work to add proper dirty tracking for
CR2.  VMX has to write the real CR2 every time because there's no VMCS field,
but I assume SVM can avoid the write and dirty update on the majority of VMRUNs.
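
For readers following along, the clean-bit convention being discussed can be
sketched in plain userspace C as below. The struct and the bit positions are
illustrative stand-ins (names prefixed sketch_/SKETCH_ are hypothetical);
only the logic mirrors the helpers quoted in the patch: a set bit in "clean"
means the field group is unchanged, and marking it dirty clears the bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions; the real enum lives in arch/x86/kvm/svm/svm.h. */
enum { SKETCH_VMCB_CR = 5, SKETCH_VMCB_DR = 6 };

/* Hypothetical stand-in for the VMCB control area. */
struct sketch_vmcb {
	uint32_t clean;	/* bit set => field group unchanged */
};

static void sketch_mark_all_clean(struct sketch_vmcb *vmcb)
{
	vmcb->clean = UINT32_MAX;
}

static void sketch_mark_dirty(struct sketch_vmcb *vmcb, int bit)
{
	assert(bit >= 0 && bit < 32);
	vmcb->clean &= ~(UINT32_C(1) << bit);
}

static bool sketch_is_dirty(const struct sketch_vmcb *vmcb, int bit)
{
	assert(bit >= 0 && bit < 32);
	return !(vmcb->clean & (UINT32_C(1) << bit));
}

/*
 * Copy-on-dirty: only touch (and dirty) the DR group in vmcb02 when vmcb12
 * is new or its DR group is marked dirty. Returns true if a copy was needed.
 */
static bool sketch_copy_dr(struct sketch_vmcb *vmcb02,
			   const struct sketch_vmcb *vmcb12, bool new_vmcb12)
{
	if (new_vmcb12 || sketch_is_dirty(vmcb12, SKETCH_VMCB_DR)) {
		/* the dr6/dr7 copies would go here */
		sketch_mark_dirty(vmcb02, SKETCH_VMCB_DR);
		return true;
	}
	return false;
}
```

The same pattern would apply to a CR or CR2 group if proper dirty tracking
were added for them.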

> +
>   kvm_rax_write(>vcpu, vmcb12->save.rax);
>   kvm_rsp_write(>vcpu, vmcb12->save.rsp);
>   kvm_rip_write(>vcpu, vmcb12->save.rip);
>  
>   /* In case we don't even reach vcpu_run, the fields are not updated */
> - svm->vmcb->save.cr2 = svm->vcpu.arch.cr2;
>   svm->vmcb->save.rax = vmcb12->save.rax;
>   svm->vmcb->save.rsp = vmcb12->save.rsp;
>   svm->vmcb->save.rip = vmcb12->save.rip;
>  
> - svm->vmcb->save.dr7 = vmcb12->save.dr7 | DR7_FIXED_1;
> - svm->vcpu.arch.dr6  = vmcb12->save.dr6 | DR6_ACTIVE_LOW;
> - vmcb_mark_dirty(svm->vmcb, VMCB_DR);
> + /* These bits will be set properly on the first execution when 
> new_vmc12 is true */
> + if (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_DR))) {
> + svm->vmcb->save.dr7 = vmcb12->save.dr7 | DR7_FIXED_1;
> + svm->vcpu.arch.dr6  = vmcb12->save.dr6 | DR6_ACTIVE_LOW;
> + vmcb_mark_dirty(svm->vmcb, VMCB_DR);
> + }
>  }
>  
>  static void nested_vmcb02_prepare_control(struct vcpu_svm *svm)
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 54610270f66a..9761a7ca8100 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -1232,6 +1232,7 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
>   svm->asid = 0;
>  
>   svm->nested.vmcb12_gpa = 0;
> + svm->nested.last_vmcb12_gpa = 0;

We should use INVALID_PAGE, '0' is a legal physical address and could
theoretically get a false negative on the "new_vmcb12" check.

>   vcpu->arch.hflags = 0;
>  
>   if (!kvm_pause_in_guest(vcpu->kvm)) {
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index fbbb26dd0f73..911868d4584c 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -93,6 +93,7 @@ struct svm_nested_state {
>   u64 hsave_msr;
>   u64 vm_cr_msr;
>   u64 vmcb12_gpa;
> + u64 last_vmcb12_gpa;
>  
>   /* These are the merged vectors */
>   u32 *msrpm;
> @@ -247,6 +248,11 @@ static inline void vmcb_mark_dirty(struct vmcb *vmcb, 
> int bit)
>   vmcb->control.clean &= ~(1 << bit);
>  }
>  
> +static inline bool vmcb_is_dirty(struct vmcb *vmcb, int bit)
> +{
> +return !test_bit(bit, (unsigned long *)>control.clean);
> +}
> +
>  static inline struct vcpu_svm *to_svm(struct kvm_vcpu *vcpu)
>  {
>   return container_of(vcpu, struct vcpu_svm, vcpu);
> -- 
> 2.26.2
> 


Re: [PATCH v5 0/5] mm/hugetlb: Early cow on fork, and a few cleanups

2021-03-01 Thread Zhang, Wei

Yes, such users include libfabric (https://ofiwg.github.io/libfabric/), which
uses hugetlb pages.
 
On 3/1/21, 4:30 PM, "Jason Gunthorpe"  wrote:



On Mon, Mar 01, 2021 at 04:28:46PM -0800, Andrew Morton wrote:
> On Mon, 1 Mar 2021 09:11:51 -0500 Peter Xu  wrote:
>
> > On Wed, Feb 17, 2021 at 06:35:42PM -0500, Peter Xu wrote:
> > > v5:
> > > - patch 4: change "int cow" into "bool cow"
> > > - collect r-bs for Jason
> >
> > Andrew,
> >
> > I just noticed 5.12-rc1 has released; is this series still possible to make it
> > for 5.12, or needs to wait for 5.13?
> >
>
> It has taken a while to settle down.  What is the case for
> fast-tracking it into 5.12?

IIRC hugetlb users and fork and DMA will get the unexpected VA
corruption that triggered all this work.

Jason



5.12-rc1 regression: freezing iou-mgr/wrk failed

2021-03-01 Thread Alex Xu (Hello71)
Hi,

On Linux 5.12-rc1, I am unable to suspend to RAM. The system freezes for 
about 40 seconds and then continues operation. The following messages 
are printed to the kernel log:

[  240.650300] PM: suspend entry (deep)
[  240.650748] Filesystems sync: 0.000 seconds
[  240.725605] Freezing user space processes ...
[  260.739483] Freezing of tasks failed after 20.013 seconds (3 tasks refusing 
to freeze, wq_busy=0):
[  260.739497] task:iou-mgr-446 state:S stack:0 pid:  516 ppid:   439 
flags:0x4224
[  260.739504] Call Trace:
[  260.739507]  ? sysvec_apic_timer_interrupt+0xb/0x81
[  260.739515]  ? pick_next_task_fair+0x197/0x1cde
[  260.739519]  ? sysvec_reschedule_ipi+0x2f/0x6a
[  260.739522]  ? asm_sysvec_reschedule_ipi+0x12/0x20
[  260.739525]  ? __schedule+0x57/0x6d6
[  260.739529]  ? del_timer_sync+0xb9/0x115
[  260.739533]  ? schedule+0x63/0xd5
[  260.739536]  ? schedule_timeout+0x219/0x356
[  260.739540]  ? __next_timer_interrupt+0xf1/0xf1
[  260.739544]  ? io_wq_manager+0x73/0xb1
[  260.739549]  ? io_wq_create+0x262/0x262
[  260.739553]  ? ret_from_fork+0x22/0x30
[  260.739557] task:iou-mgr-517 state:S stack:0 pid:  522 ppid:   439 
flags:0x4224
[  260.739561] Call Trace:
[  260.739563]  ? sysvec_apic_timer_interrupt+0xb/0x81
[  260.739566]  ? pick_next_task_fair+0x16f/0x1cde
[  260.739569]  ? sysvec_apic_timer_interrupt+0xb/0x81
[  260.739571]  ? asm_sysvec_apic_timer_interrupt+0x12/0x20
[  260.739574]  ? __schedule+0x5b7/0x6d6
[  260.739578]  ? del_timer_sync+0x70/0x115
[  260.739581]  ? schedule_timeout+0x211/0x356
[  260.739585]  ? __next_timer_interrupt+0xf1/0xf1
[  260.739588]  ? io_wq_check_workers+0x15/0x11f
[  260.739592]  ? io_wq_manager+0x69/0xb1
[  260.739596]  ? io_wq_create+0x262/0x262
[  260.739600]  ? ret_from_fork+0x22/0x30
[  260.739603] task:iou-wrk-517 state:S stack:0 pid:  523 ppid:   439 
flags:0x4224
[  260.739607] Call Trace:
[  260.739609]  ? __schedule+0x5b7/0x6d6
[  260.739614]  ? schedule+0x63/0xd5
[  260.739617]  ? schedule_timeout+0x219/0x356
[  260.739621]  ? __next_timer_interrupt+0xf1/0xf1
[  260.739624]  ? task_thread.isra.0+0x148/0x3af
[  260.739628]  ? task_thread_unbound+0xa/0xa
[  260.739632]  ? task_thread_bound+0x7/0x7
[  260.739636]  ? ret_from_fork+0x22/0x30
[  260.739647] OOM killer enabled.
[  260.739648] Restarting tasks ... done.
[  260.740077] PM: suspend exit

and then a set of similar messages except with s2idle instead of deep.

Reverting 5695e51619 ("Merge tag 'io_uring-worker.v3-2021-02-25' of 
git://git.kernel.dk/linux-block") appears to resolve the issue. I have 
not yet bisected further. Let me know which troubleshooting steps I 
should perform next.

Thanks,
Alex.


Re: [PATCH 21/25] KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions

2021-03-01 Thread Kai Huang
On Mon, 1 Mar 2021 09:20:15 -0800 Sean Christopherson wrote:
> On Mon, Mar 01, 2021, Kai Huang wrote:
> > +static int handle_encls_ecreate(struct kvm_vcpu *vcpu)
> > +{
> > +   struct kvm_cpuid_entry2 *sgx_12_0, *sgx_12_1;
> > +   gva_t pageinfo_gva, secs_gva;
> > +   gva_t metadata_gva, contents_gva;
> > +   gpa_t metadata_gpa, contents_gpa, secs_gpa;
> > +   unsigned long metadata_hva, contents_hva, secs_hva;
> > +   struct sgx_pageinfo pageinfo;
> > +   struct sgx_secs *contents;
> > +   u64 attributes, xfrm, size;
> > +   u32 miscselect;
> > +   struct x86_exception ex;
> > +   u8 max_size_log2;
> > +   int trapnr, r;
> > +
> 
> (see below)
> 
> --- cut here --- >8
> 
> > +   sgx_12_0 = kvm_find_cpuid_entry(vcpu, 0x12, 0);
> > +   sgx_12_1 = kvm_find_cpuid_entry(vcpu, 0x12, 1);
> > +   if (!sgx_12_0 || !sgx_12_1) {
> > +   kvm_inject_gp(vcpu, 0);
> 
> This should probably be an emulation failure.  This code is reached iff SGX1 is
> enabled in the guest, userspace done messed up if they enabled SGX1 without
> defining CPUID.0x12.1.  That also makes it more obvious that burying this in a
> helper after a bunch of other checks isn't wrong, i.e. KVM has already verified
> that SGX1 is enabled in the guest.

Agreed.

> 
> > +   return 1;
> > +   }
> 
> ---
> 
> > +
> > +   if (sgx_get_encls_gva(vcpu, kvm_rbx_read(vcpu), 32, 32, _gva) 
> > ||
> > +   sgx_get_encls_gva(vcpu, kvm_rcx_read(vcpu), 4096, 4096, _gva))
> > +   return 1;
> > +
> > +   /*
> > +* Copy the PAGEINFO to local memory, its pointers need to be
> > +* translated, i.e. we need to do a deep copy/translate.
> > +*/
> > +   r = kvm_read_guest_virt(vcpu, pageinfo_gva, ,
> > +   sizeof(pageinfo), );
> > +   if (r == X86EMUL_PROPAGATE_FAULT) {
> > +   kvm_inject_emulated_page_fault(vcpu, );
> > +   return 1;
> > +   } else if (r != X86EMUL_CONTINUE) {
> > +   sgx_handle_emulation_failure(vcpu, pageinfo_gva, size);
> > +   return 0;
> > +   }
> > +
> > +   if (sgx_get_encls_gva(vcpu, pageinfo.metadata, 64, 64, _gva) ||
> > +   sgx_get_encls_gva(vcpu, pageinfo.contents, 4096, 4096,
> > + _gva))
> > +   return 1;
> > +
> > +   /*
> > +* Translate the SECINFO, SOURCE and SECS pointers from GVA to GPA.
> > +* Resume the guest on failure to inject a #PF.
> > +*/
> > +   if (sgx_gva_to_gpa(vcpu, metadata_gva, false, _gpa) ||
> > +   sgx_gva_to_gpa(vcpu, contents_gva, false, _gpa) ||
> > +   sgx_gva_to_gpa(vcpu, secs_gva, true, _gpa))
> > +   return 1;
> > +
> > +   /*
> > +* ...and then to HVA.  The order of accesses isn't architectural, i.e.
> > +* KVM doesn't have to fully process one address at a time.  Exit to
> > +* userspace if a GPA is invalid.
> > +*/
> > +   if (sgx_gpa_to_hva(vcpu, metadata_gpa, _hva) ||
> > +   sgx_gpa_to_hva(vcpu, contents_gpa, _hva) ||
> > +   sgx_gpa_to_hva(vcpu, secs_gpa, _hva))
> > +   return 0;
> > +   /*
> > +* Copy contents into kernel memory to prevent TOCTOU attack. E.g. the
> > +* guest could do ECREATE w/ SECS.SGX_ATTR_PROVISIONKEY=0, and
> > +* simultaneously set SGX_ATTR_PROVISIONKEY to bypass the check to
> > +* enforce restriction of access to the PROVISIONKEY.
> > +*/
> > +   contents = (struct sgx_secs *)__get_free_page(GFP_KERNEL);
> > +   if (!contents)
> > +   return -ENOMEM;
> 
> --- cut here --- >8
> 
> > +
> > +   /* Exit to userspace if copying from a host userspace address fails. */
> > +   if (sgx_read_hva(vcpu, contents_hva, (void *)contents, PAGE_SIZE))
> 
> This, and every failure path below, will leak 'contents'.  The easiest thing is
> probably to wrap everything in "cut here" in a separate helper.  The CPUID
> lookups can be moved into the helper as well, e.g.
> 
>   contents = (struct sgx_secs *)__get_free_page(GFP_KERNEL);
>   if (!contents)
>   return -ENOMEM;
> 
>   r = __handle_encls_ecreate(vcpu, , secs);
> 
>   free_page((unsigned long)contents);
>   return r;
> 
> And then the helper can be everything below, plus the CPUID lookup:
> 
>   sgx_12_0 = kvm_find_cpuid_entry(vcpu, 0x12, 0);
>   sgx_12_1 = kvm_find_cpuid_entry(vcpu, 0x12, 1);
>   if (!sgx_12_0 || !sgx_12_1) {
>   vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
>   vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
>   vcpu->run->internal.ndata = 0;
>   return 0;
>   }
> 

Hmm.. Obviously I wasn't careful enough. Thanks for pointing that out.

Will do it your way.
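
To spell out the shape of that fix: the outer function owns the allocation
and frees it on exactly one path, while the inner helper may return early as
often as it likes. The userspace sketch below uses malloc()/free() in place
of __get_free_page()/free_page(); all sketch_ names and the live-page counter
are hypothetical and exist only to make the leak (or its absence) observable.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096

/* Outstanding allocations, tracked only so the demo can check for leaks. */
static int sketch_live_pages;

static void *sketch_get_page(void)
{
	void *p = malloc(SKETCH_PAGE_SIZE);

	if (p)
		sketch_live_pages++;
	return p;
}

static void sketch_free_page(void *p)
{
	if (p) {
		free(p);
		sketch_live_pages--;
	}
}

/*
 * Inner helper: works on a private copy and can bail out anywhere without
 * worrying about the buffer, because it does not own it.
 */
static int sketch_do_ecreate(const void *src, size_t len, void *contents)
{
	assert(contents != NULL);

	if (!src || len != SKETCH_PAGE_SIZE)
		return 0;		/* early failure path */

	memcpy(contents, src, len);	/* snapshot before checking it */
	/* attribute checks on the snapshot would go here */
	return 1;
}

/* Outer wrapper: single allocation, single free, regardless of result. */
static int sketch_handle_ecreate(const void *src, size_t len)
{
	void *contents = sketch_get_page();
	int r;

	if (!contents)
		return -ENOMEM;

	r = sketch_do_ecreate(src, len, contents);

	sketch_free_page(contents);
	return r;
}
```

Copying into the private buffer before validating it is also what closes the
TOCTOU window described in the quoted comment.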

> 
> > +   return 0;
> > +
> > +   miscselect = contents->miscselect;
> > +   attributes = contents->attributes;
> > +   xfrm = contents->xfrm;
> > +   size = contents->size;
> > +
> > +   /* Enforce restriction of access to the PROVISIONKEY. */
> > +   if (!vcpu->kvm->arch.sgx_provisioning_allowed &&
> > +   (attributes & SGX_ATTR_PROVISIONKEY)) {
> > + 

Re: [PATCH] w1: ds2708 and ds2781 use the new API kobj_to_dev()

2021-03-01 Thread tiantao (H)

Hi:

On 2021/3/1 21:09, Greg KH wrote:

On Mon, Mar 01, 2021 at 08:58:55PM +0800, Tian Tao wrote:

Fix the warning below:
/drivers/w1/slaves/w1_ds2780.c:93:60-61: WARNING opportunity for
kobj_to_dev()

What creates that warning?

This is reported by coccicheck.




Signed-off-by: Tian Tao 
---
  drivers/w1/slaves/w1_ds2780.c | 3 ++-
  drivers/w1/slaves/w1_ds2781.c | 2 +-
  2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/w1/slaves/w1_ds2780.c b/drivers/w1/slaves/w1_ds2780.c
index c281fe5..3cde1bb 100644
--- a/drivers/w1/slaves/w1_ds2780.c
+++ b/drivers/w1/slaves/w1_ds2780.c
@@ -90,7 +90,8 @@ static ssize_t w1_slave_read(struct file *filp, struct 
kobject *kobj,
 struct bin_attribute *bin_attr, char *buf,
 loff_t off, size_t count)
  {
-   struct device *dev = container_of(kobj, struct device, kobj);
+   struct device *dev = kobj_to_dev(kobj);
+

Why the extra line here, but not in the other chunk?


This is reported by checkpatch.

tiantao@ubuntu:~/mailline/linux-next$ ./scripts/checkpatch.pl 
drivers/w1/slaves/w1_ds2780.c


WARNING: Missing a blank line after declarations
#94: FILE: drivers/w1/slaves/w1_ds2780.c:94:
+   struct device *dev = kobj_to_dev(kobj);
+   return w1_ds2780_io(dev, buf, off, count, 0);



Consistency is key :)

Please fix up.

thanks,

greg k-h





Re: [PATCH v5 05/14] vfio/mdev: idxd: add basic mdev registration and helper functions

2021-03-01 Thread Jason Gunthorpe
On Mon, Mar 01, 2021 at 05:48:00PM -0700, Dave Jiang wrote:
> 
> On 3/1/2021 5:29 PM, Jason Gunthorpe wrote:
> > On Mon, Mar 01, 2021 at 05:23:47PM -0700, Dave Jiang wrote:
> > > So after looking at the code in vfio_pci_intrs.c, I agree that the 
> > > set_irqs
> > > code between VFIO_PCI and this driver can be made in common. Given that 
> > > Alex
> > > doesn't want a vfio_pci device embedded in the driver,
> > idxd isn't a vfio_pci so it would be improper to do something like
> > that here anyhow.
> > 
> > > I think we'll need some sort of generic VFIO device that can be used
> > > from the vfio_pci side and vfio_mdev side to pass down in order to
> > > have common support library functions.
> > Why do you need more layers?
> > 
> > > Just make some helper functions to manage this and build them into
> > > their own struct and function family. All this needs is some callback
> > > for the end driver to hook in the raw device programming and some
> > > entry points to direct the emulation access to the module.
> > 
> > It should be fully self contained and completely unrelated to vfio_pci
> > 
> > Maybe I'm looking at this wrong. I see some code in vfio_pci_intrs.c that
> > we can reuse with some changes here and there. But I think I see what you
> > are getting at with just common functions for the mdev side. Let me create it
> > just for IMS emulation and then we can go from there, trying to figure out if
> > that's the right path to go down or if we need to share code with vfio_pci.

If it really is very common it could all be consolidated in a
vfio_utils.c kind of thing that all the places can use.

There is nothing wrong with splitting pieces of vfio_pci out.

Jason
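
As a rough illustration of the "own struct and function family" shape suggested above, here is a userspace sketch. Every name in it (ims_emu, ims_emu_ops, program_entry, ims_emu_set_irq) is hypothetical and does not come from the idxd or vfio code; it only shows a helper that is self-contained and calls back into the end driver for raw device programming.

```c
#include <stddef.h>

#define IMS_EMU_MAX_VECTORS 4	/* hypothetical limit for this sketch */

struct ims_emu;

/* Callback supplied by the end driver to program the raw device. */
struct ims_emu_ops {
	int (*program_entry)(struct ims_emu *emu, unsigned int vector,
			     int enable);
};

/* Self-contained helper state; no dependency on vfio_pci. */
struct ims_emu {
	const struct ims_emu_ops *ops;
	void *drvdata;			/* owned by the end driver */
	int enabled[IMS_EMU_MAX_VECTORS];
};

static void ims_emu_init(struct ims_emu *emu, const struct ims_emu_ops *ops,
			 void *drvdata)
{
	unsigned int i;

	emu->ops = ops;
	emu->drvdata = drvdata;
	for (i = 0; i < IMS_EMU_MAX_VECTORS; i++)
		emu->enabled[i] = 0;
}

/* Entry point for the emulation path; returns 0 or a negative value. */
static int ims_emu_set_irq(struct ims_emu *emu, unsigned int vector,
			   int enable)
{
	int ret;

	if (vector >= IMS_EMU_MAX_VECTORS)
		return -1;
	ret = emu->ops->program_entry(emu, vector, enable);
	if (!ret)
		emu->enabled[vector] = enable;
	return ret;
}

/* Example end-driver callback: counts how often the device is programmed. */
static int demo_program_entry(struct ims_emu *emu, unsigned int vector,
			      int enable)
{
	(void)vector;
	(void)enable;
	(*(int *)emu->drvdata)++;
	return 0;
}

static const struct ims_emu_ops demo_ops = {
	.program_entry = demo_program_entry,
};
```

The helper tracks emulated state itself and only reaches into the driver through the ops table, so a vfio_pci user and an mdev user could each supply their own callback.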


Re: [PATCH 19/25] KVM: VMX: Add basic handling of VM-Exit from SGX enclave

2021-03-01 Thread Kai Huang
On Mon, 1 Mar 2021 08:52:13 -0800 Sean Christopherson wrote:
> On Mon, Mar 01, 2021, Kai Huang wrote:
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 50810d471462..df8e338267aa 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -1570,12 +1570,18 @@ static int vmx_rtit_ctl_check(struct kvm_vcpu 
> > *vcpu, u64 data)
> >  
> >  static bool vmx_can_emulate_instruction(struct kvm_vcpu *vcpu, void *insn, 
> > int insn_len)
> >  {
> > +   if (to_vmx(vcpu)->exit_reason.enclave_mode) {
> > +   kvm_queue_exception(vcpu, UD_VECTOR);
> 
> Rereading my own code, I think it would be a good idea to add a comment here
> explaining that injecting #UD is technically wrong, but avoids giving guest
> userspace an easy way to DoS the guest.  The EPT misconfig is a good example;
> guest userspace could have executed a simple MOV ,  instruction, in
> which case injecting a #UD is bizarre behavior.  But, the alternative is 
> exiting
> to userspace with KVM_INTERNAL_ERROR_EMULATION, which is all but guaranteed to
> kill the guest.
> 
> If KVM, specifically handle_emulation_failure(), ever gains a more 
> sophisticated
> mechanism for handling userspace emulation errors, this should be updated too.
> 
>   /*
>* Emulation of instructions in SGX enclaves is impossible as RIP does
>* not point to the failing instruction, and even if it did, the code
>* stream is inaccessible.  Inject #UD instead of exiting to userspace
>* so that guest userspace can't DoS the guest simply by triggering
>* emulation (enclaves are CPL3 only).
>*/

Agreed. Will add above comment.
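
The control flow discussed above can be modeled in userspace roughly as follows. This is only a sketch: struct vcpu_model and handle_exit() are invented for illustration, and the return convention of handle_exit() (1 means resume the guest, 0 means emulation would proceed) is an assumption of this model, not KVM's actual handler.

```c
#include <stdbool.h>

#define UD_VECTOR 6	/* x86 #UD exception vector */

/* Stand-in for the small part of vCPU state this sketch needs. */
struct vcpu_model {
	bool enclave_mode;	/* the exit happened inside an SGX enclave */
	int pending_vector;	/* -1 when no exception is queued */
};

static bool can_emulate_instruction(struct vcpu_model *vcpu)
{
	if (vcpu->enclave_mode) {
		/*
		 * Queue #UD instead of reporting an emulation error to
		 * userspace, so guest userspace cannot kill the guest.
		 */
		vcpu->pending_vector = UD_VECTOR;
		return false;
	}
	return true;
}

/* Model of an exit handler that bails out early for enclave exits. */
static int handle_exit(struct vcpu_model *vcpu)
{
	if (!can_emulate_instruction(vcpu))
		return 1;	/* resume the guest with #UD pending */
	return 0;		/* emulation would proceed here */
}
```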

> 
> > +   return false;
> > +   }
> > return true;
> >  }
> 
> ...
> 
> > @@ -5384,6 +5415,9 @@ static int handle_ept_misconfig(struct kvm_vcpu *vcpu)
> >  {
> > gpa_t gpa;
> >  
> > +   if (!vmx_can_emulate_instruction(vcpu, NULL, 0))
> > +   return 1;
> > +
> > /*
> >  * A nested guest cannot optimize MMIO vmexits, because we have an
> >  * nGPA here instead of the required GPA.
> > -- 
> > 2.29.2
> > 


[PATCH] v4l-compliance: re-introduce NON_COHERENT and cache hints tests

2021-03-01 Thread Sergey Senozhatsky
This returns back non-coherent (previously known as NON_COHERENT)
memory flag and buffer cache management hints testing (for VB2_MEMORY_MMAP
buffers).

Signed-off-by: Sergey Senozhatsky 
---
 utils/common/cv4l-helpers.h |  8 +--
 utils/common/v4l-helpers.h  |  8 ++-
 utils/v4l2-compliance/v4l2-test-buffers.cpp | 65 ++---
 3 files changed, 66 insertions(+), 15 deletions(-)

diff --git a/utils/common/cv4l-helpers.h b/utils/common/cv4l-helpers.h
index 712efde6..3cee372b 100644
--- a/utils/common/cv4l-helpers.h
+++ b/utils/common/cv4l-helpers.h
@@ -754,17 +754,17 @@ public:
int g_fd(unsigned index, unsigned plane) const { return 
v4l_queue_g_fd(this, index, plane); }
void s_fd(unsigned index, unsigned plane, int fd) { 
v4l_queue_s_fd(this, index, plane, fd); }
 
-   int reqbufs(cv4l_fd *fd, unsigned count = 0)
+   int reqbufs(cv4l_fd *fd, unsigned count = 0, unsigned int flags = 0)
{
-   return v4l_queue_reqbufs(fd->g_v4l_fd(), this, count);
+   return v4l_queue_reqbufs(fd->g_v4l_fd(), this, count, flags);
}
bool has_create_bufs(cv4l_fd *fd) const
{
return v4l_queue_has_create_bufs(fd->g_v4l_fd(), this);
}
-   int create_bufs(cv4l_fd *fd, unsigned count, const v4l2_format *fmt = 
NULL)
+   int create_bufs(cv4l_fd *fd, unsigned count, const v4l2_format *fmt = 
NULL, unsigned int flags = 0)
{
-   return v4l_queue_create_bufs(fd->g_v4l_fd(), this, count, fmt);
+   return v4l_queue_create_bufs(fd->g_v4l_fd(), this, count, fmt, 
flags);
}
int mmap_bufs(cv4l_fd *fd, unsigned from = 0)
{
diff --git a/utils/common/v4l-helpers.h b/utils/common/v4l-helpers.h
index f96b3c38..c09cd987 100644
--- a/utils/common/v4l-helpers.h
+++ b/utils/common/v4l-helpers.h
@@ -1515,7 +1515,7 @@ static inline int v4l_queue_querybufs(struct v4l_fd *f, 
struct v4l_queue *q, uns
 }
 
 static inline int v4l_queue_reqbufs(struct v4l_fd *f,
-   struct v4l_queue *q, unsigned count)
+   struct v4l_queue *q, unsigned count, unsigned int flags = 0)
 {
struct v4l2_requestbuffers reqbufs;
int ret;
@@ -1523,6 +1523,7 @@ static inline int v4l_queue_reqbufs(struct v4l_fd *f,
reqbufs.type = q->type;
reqbufs.memory = q->memory;
reqbufs.count = count;
+   reqbufs.flags = flags;
/*
 * Problem: if REQBUFS returns an error, did it free any old
 * buffers or not?
@@ -1547,7 +1548,7 @@ static inline bool v4l_queue_has_create_bufs(struct 
v4l_fd *f, const struct v4l_
 
 static inline int v4l_queue_create_bufs(struct v4l_fd *f,
struct v4l_queue *q, unsigned count,
-   const struct v4l2_format *fmt)
+   const struct v4l2_format *fmt, unsigned int flags = 0)
 {
struct v4l2_create_buffers createbufs;
int ret;
@@ -1555,6 +1556,7 @@ static inline int v4l_queue_create_bufs(struct v4l_fd *f,
createbufs.format.type = q->type;
createbufs.memory = q->memory;
createbufs.count = count;
+   createbufs.flags = flags;
if (fmt) {
createbufs.format = *fmt;
} else {
@@ -1733,7 +1735,7 @@ static inline void v4l_queue_free(struct v4l_fd *f, 
struct v4l_queue *q)
v4l_ioctl(f, VIDIOC_STREAMOFF, &q->type);
v4l_queue_release_bufs(f, q, 0);
v4l_queue_close_exported_fds(q);
-   v4l_queue_reqbufs(f, q, 0);
+   v4l_queue_reqbufs(f, q, 0, 0);
 }
 
 static inline void v4l_queue_buffer_update(const struct v4l_queue *q,
diff --git a/utils/v4l2-compliance/v4l2-test-buffers.cpp 
b/utils/v4l2-compliance/v4l2-test-buffers.cpp
index e40461bd..6555c0cb 100644
--- a/utils/v4l2-compliance/v4l2-test-buffers.cpp
+++ b/utils/v4l2-compliance/v4l2-test-buffers.cpp
@@ -663,6 +663,10 @@ int testReqBufs(struct node *node)
fail_on_test(q.reqbufs(node, 0));
 
for (m = V4L2_MEMORY_MMAP; m <= V4L2_MEMORY_DMABUF; m++) {
+   bool cache_hints_cap = false;
+   bool consistent;
+
+   cache_hints_cap = q.g_capabilities() & 
V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS;
if (!(node->valid_memorytype & (1 << m)))
continue;
cv4l_queue q2(i, m);
@@ -678,8 +682,17 @@ int testReqBufs(struct node *node)
reqbufs.count = 1;
reqbufs.type = i;
reqbufs.memory = m;
+   reqbufs.flags = V4L2_FLAG_MEMORY_NON_COHERENT;
fail_on_test(doioctl(node, VIDIOC_REQBUFS, &reqbufs));
-   fail_on_test(check_0(reqbufs.reserved, 
sizeof(reqbufs.reserved)));
+   consistent = reqbufs.flags & 
V4L2_FLAG_MEMORY_NON_COHERENT;
+   if (!cache_hints_cap) {
+   

Re: [PATCH] v4l-compliance: re-introduce NON_COHERENT and cache hints tests

2021-03-01 Thread Sergey Senozhatsky
On (21/03/02 09:49), Sergey Senozhatsky wrote:
> This returns back non-coherent (previously known as NON_COHERENT)
^^^
NON_CONSISTENT...

> memory flag and buffer cache management hints testing (for VB2_MEMORY_MMAP
> buffers).


[PATCH 8/8] videobuf2: handle non-contiguous DMA allocations

2021-03-01 Thread Sergey Senozhatsky
This adds support for the new noncontiguous DMA API, which
requires allocators to have two execution branches: one
for the current API, and one for the new one.
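
A simplified model of the sync decision this patch adds to the prepare/finish paths is sketched below. The types and helper names (buf_model, needs_sgt_sync, needs_vmap_sync) are hypothetical and exist only for this illustration; the conditions follow the diff further down.

```c
#include <stdbool.h>

/* Stand-ins for the V4L2_MEMORY_* types. */
enum mem_type { MEM_MMAP, MEM_USERPTR, MEM_DMABUF };

struct buf_model {
	enum mem_type memory;
	bool skip_cache_sync;	/* DMABUF or user-supplied cache hint */
	bool non_coherent;
	bool has_sgt;
	bool has_vaddr;
};

/* True when the scatterlist sync (dma_sync_sgtable_for_*) would run. */
static bool needs_sgt_sync(const struct buf_model *b)
{
	if (b->skip_cache_sync)
		return false;
	if (b->memory == MEM_MMAP && !b->non_coherent)
		return false;	/* coherent MMAP: nothing to sync */
	return b->has_sgt;
}

/* True when the kernel vmap range would also be flushed or invalidated. */
static bool needs_vmap_sync(const struct buf_model *b)
{
	return needs_sgt_sync(b) && b->non_coherent && b->has_vaddr;
}
```

In this model only non-coherent MMAP buffers that already have a kernel mapping take the extra vmap flush/invalidate step.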

Signed-off-by: Sergey Senozhatsky 
[hch: untested conversion to the new API]
Signed-off-by: Christoph Hellwig 
---
 .../common/videobuf2/videobuf2-dma-contig.c   | 141 +++---
 1 file changed, 117 insertions(+), 24 deletions(-)

diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c 
b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
index 1e218bc440c6..d6a9f7b682f3 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -42,8 +43,14 @@ struct vb2_dc_buf {
struct dma_buf_attachment   *db_attach;
 
struct vb2_buffer   *vb;
+   unsigned intnon_coherent_mem:1;
 };
 
+static bool vb2_dc_is_coherent(struct vb2_dc_buf *buf)
+{
+   return !buf->non_coherent_mem;
+}
+
 /*/
 /*scatterlist table functions*/
 /*/
@@ -78,12 +85,21 @@ static void *vb2_dc_cookie(struct vb2_buffer *vb, void 
*buf_priv)
 static void *vb2_dc_vaddr(struct vb2_buffer *vb, void *buf_priv)
 {
struct vb2_dc_buf *buf = buf_priv;
-   struct dma_buf_map map;
-   int ret;
 
-   if (!buf->vaddr && buf->db_attach) {
-   ret = dma_buf_vmap(buf->db_attach->dmabuf, &map);
-   buf->vaddr = ret ? NULL : map.vaddr;
+   if (buf->vaddr)
+   return buf->vaddr;
+
+   if (buf->db_attach) {
+   struct dma_buf_map map;
+
+   if (!dma_buf_vmap(buf->db_attach->dmabuf, &map))
+   buf->vaddr = map.vaddr;
+   }
+
+   if (!vb2_dc_is_coherent(buf)) {
+   buf->vaddr = dma_vmap_noncontiguous(buf->dev,
+   buf->size,
+   buf->dma_sgt);
}
 
return buf->vaddr;
@@ -101,13 +117,26 @@ static void vb2_dc_prepare(void *buf_priv)
struct vb2_dc_buf *buf = buf_priv;
struct sg_table *sgt = buf->dma_sgt;
 
+   /* This takes care of DMABUF and user-enforced cache sync hint */
if (buf->vb->skip_cache_sync_on_prepare)
return;
 
+   /*
+* Coherent MMAP buffers do not need to be synced, unlike coherent
+* USERPTR and non-coherent MMAP buffers.
+*/
+   if (buf->vb->memory == V4L2_MEMORY_MMAP && vb2_dc_is_coherent(buf))
+   return;
+
if (!sgt)
return;
 
+   /* For both USERPTR and non-coherent MMAP */
dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir);
+
+   /* Non-coherent MMAP only */
+   if (!vb2_dc_is_coherent(buf) && buf->vaddr)
+   flush_kernel_vmap_range(buf->vaddr, buf->size);
 }
 
 static void vb2_dc_finish(void *buf_priv)
@@ -115,19 +144,46 @@ static void vb2_dc_finish(void *buf_priv)
struct vb2_dc_buf *buf = buf_priv;
struct sg_table *sgt = buf->dma_sgt;
 
+   /* This takes care of DMABUF and user-enforced cache sync hint */
if (buf->vb->skip_cache_sync_on_finish)
return;
 
+   /*
+* Coherent MMAP buffers do not need to be synced, unlike coherent
+* USERPTR and non-coherent MMAP buffers.
+*/
+   if (buf->vb->memory == V4L2_MEMORY_MMAP && vb2_dc_is_coherent(buf))
+   return;
+
if (!sgt)
return;
 
+   /* For both USERPTR and non-coherent MMAP */
dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir);
+
+   /* Non-coherent MMAP only */
+   if (!vb2_dc_is_coherent(buf) && buf->vaddr)
+   invalidate_kernel_vmap_range(buf->vaddr, buf->size);
 }
 
 /*/
 /*callbacks for MMAP buffers */
 /*/
 
+static void __vb2_dc_put(struct vb2_dc_buf *buf)
+{
+   if (vb2_dc_is_coherent(buf)) {
+   dma_free_attrs(buf->dev, buf->size, buf->cookie,
+  buf->dma_addr, buf->attrs);
+   return;
+   }
+
+   if (buf->vaddr)
+   dma_vunmap_noncontiguous(buf->dev, buf->vaddr);
+   dma_free_noncontiguous(buf->dev, buf->size,
+  buf->dma_sgt, buf->dma_addr);
+}
+
 static void vb2_dc_put(void *buf_priv)
 {
struct vb2_dc_buf *buf = buf_priv;
@@ -139,17 +195,47 @@ static void vb2_dc_put(void *buf_priv)
sg_free_table(buf->sgt_base);
kfree(buf->sgt_base);
}
-   dma_free_attrs(buf->dev, buf->size, buf->cookie, buf->dma_addr,
-  buf->attrs);
+   __vb2_dc_put(buf);
put_device(buf->dev);
kfree(buf);
 }
 

Re: [PATCH v5 05/14] vfio/mdev: idxd: add basic mdev registration and helper functions

2021-03-01 Thread Dave Jiang



On 3/1/2021 5:29 PM, Jason Gunthorpe wrote:

On Mon, Mar 01, 2021 at 05:23:47PM -0700, Dave Jiang wrote:

So after looking at the code in vfio_pci_intrs.c, I agree that the set_irqs
code between VFIO_PCI and this driver can be made in common. Given that Alex
doesn't want a vfio_pci device embedded in the driver,

idxd isn't a vfio_pci so it would be improper to do something like
that here anyhow.


I think we'll need some sort of generic VFIO device that can be used
from the vfio_pci side and vfio_mdev side to pass down in order to
have common support library functions.

Why do you need more layers?

Just make some helper functions to manage this and build them into
their own struct and function family. All this needs is some callback
for the end driver to hook in the raw device programming and some
entry points to direct the emulation access to the module.

It should be fully self contained and completely unrelated to vfio_pci

Maybe I'm looking at this wrong. I see some code in vfio_pci_intrs.c 
that we can reuse with some changes here and there. But I think I see 
what you are getting at with just common functions for the mdev side. Let 
me create it just for IMS emulation and then we can go from there, trying 
to figure out if that's the right path to go down or if we need to share 
code with vfio_pci.


[PATCH 7/8] videobuf2: handle V4L2_FLAG_MEMORY_NON_COHERENT flag

2021-03-01 Thread Sergey Senozhatsky
This patch lets user-space request a non-coherent memory
allocation during CREATE_BUFS and REQBUFS ioctl calls.

= CREATE_BUFS

  struct v4l2_create_buffers has seven 4-byte reserved areas,
  so reserved[0] is renamed to ->flags. The struct, thus, now
  has six reserved 4-byte regions.

= CREATE_BUFS32

  struct v4l2_create_buffers32 has seven 4-byte reserved areas,
  so reserved[0] is renamed to ->flags. The struct, thus, now
  has six reserved 4-byte regions.

= REQBUFS

 We use one bit of the ->reserved[1] member of struct v4l2_requestbuffers,
 which is now renamed to ->flags. Unlike v4l2_create_buffers, struct
 v4l2_requestbuffers does not have enough reserved room. Therefore, for
 backward compatibility, ->reserved and ->flags were put into an anonymous
 union.
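
The REQBUFS layout described above can be sketched as follows. This is a trimmed userspace model: struct requestbuffers_model and FLAG_MEMORY_NON_COHERENT are stand-ins (assumptions) for the uapi struct and V4L2_FLAG_MEMORY_NON_COHERENT, and validate_coherency_flags() takes plain ints instead of a queue pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors V4L2_FLAG_MEMORY_NON_COHERENT from this series. */
#define FLAG_MEMORY_NON_COHERENT (1u << 0)

/* Trimmed model of struct v4l2_requestbuffers after the change. */
struct requestbuffers_model {
	uint32_t count;
	uint32_t type;
	uint32_t memory;
	uint32_t capabilities;
	union {
		uint32_t flags;
		uint32_t reserved[1];	/* kept so existing users still build */
	};
};

/*
 * Model of the flag validation: the non-coherent hint is dropped unless
 * the queue allows cache hints and the memory type is MMAP.
 */
static void validate_coherency_flags(int allow_cache_hints, int is_mmap,
				     uint32_t *flags)
{
	if (!allow_cache_hints || !is_mmap)
		*flags &= ~FLAG_MEMORY_NON_COHERENT;
}
```

Because flags and reserved[0] alias the same four bytes, the struct size and the offsets seen by older user-space do not change.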

Signed-off-by: Sergey Senozhatsky 
---
 .../media/v4l/vidioc-create-bufs.rst  |  7 ++-
 .../media/v4l/vidioc-reqbufs.rst  | 11 +--
 .../media/common/videobuf2/videobuf2-core.c   |  6 ++
 .../media/common/videobuf2/videobuf2-v4l2.c   | 19 ---
 drivers/media/v4l2-core/v4l2-compat-ioctl32.c |  9 -
 drivers/media/v4l2-core/v4l2-ioctl.c  |  5 +
 include/uapi/linux/videodev2.h| 11 +--
 7 files changed, 55 insertions(+), 13 deletions(-)

diff --git a/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst 
b/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst
index b06e5b528e11..132c8b612a94 100644
--- a/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst
+++ b/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst
@@ -113,7 +113,12 @@ than the number requested.
``V4L2_MEMORY_MMAP`` and ``format.type`` to the buffer type.
 
 * - __u32
-  - ``reserved``\ [7]
+  - ``flags``
+  - Specifies additional buffer management attributes.
+   See :ref:`memory-flags`.
+
+* - __u32
+  - ``reserved``\ [6]
   - A place holder for future extensions. Drivers and applications
must set the array to zero.
 
diff --git a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst 
b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
index 950e7ec1aac5..80ea48acea84 100644
--- a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
+++ b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
@@ -104,10 +104,17 @@ aborting or finishing any DMA in progress, an implicit
``V4L2_MEMORY_MMAP`` and ``type`` set to the buffer type. This will
free any previously allocated buffers, so this is typically something
that will be done at the start of the application.
+* - union {
+  - (anonymous)
+* - __u32
+  - ``flags``
+  - Specifies additional buffer management attributes.
+   See :ref:`memory-flags`.
 * - __u32
   - ``reserved``\ [1]
-  - A place holder for future extensions. Drivers and applications
-   must set the array to zero.
+  - Kept for backwards compatibility. Use ``flags`` instead.
+* - }
+  -
 
 .. tabularcolumns:: |p{6.1cm}|p{2.2cm}|p{8.7cm}|
 
diff --git a/drivers/media/common/videobuf2/videobuf2-core.c 
b/drivers/media/common/videobuf2/videobuf2-core.c
index 7040b7f47133..5906a48e7757 100644
--- a/drivers/media/common/videobuf2/videobuf2-core.c
+++ b/drivers/media/common/videobuf2/videobuf2-core.c
@@ -768,6 +768,9 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory 
memory,
unsigned int i;
int ret;
 
+   if (flags & V4L2_FLAG_MEMORY_NON_COHERENT)
+   coherent_mem = false;
+
if (q->streaming) {
dprintk(q, 1, "streaming active\n");
return -EBUSY;
@@ -911,6 +914,9 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum 
vb2_memory memory,
bool coherent_mem = true;
int ret;
 
+   if (flags & V4L2_FLAG_MEMORY_NON_COHERENT)
+   coherent_mem = false;
+
if (q->num_buffers == VB2_MAX_FRAME) {
dprintk(q, 1, "maximum number of buffers already allocated\n");
return -ENOBUFS;
diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c 
b/drivers/media/common/videobuf2/videobuf2-v4l2.c
index 1166d5a9291a..f6a8dcc1b5c6 100644
--- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
+++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
@@ -692,12 +692,22 @@ static void fill_buf_caps(struct vb2_queue *q, u32 *caps)
 #endif
 }
 
+static void validate_coherency_flags(struct vb2_queue *q,
+int memory,
+unsigned int *flags)
+{
+   if (!q->allow_cache_hints || memory != V4L2_MEMORY_MMAP)
+   *flags &= ~V4L2_FLAG_MEMORY_NON_COHERENT;
+}
+
 int vb2_reqbufs(struct vb2_queue *q, struct v4l2_requestbuffers *req)
 {
int ret = vb2_verify_memory_type(q, req->memory, req->type);
 
fill_buf_caps(q, &req->capabilities);
-   return ret ? ret : vb2_core_reqbufs(q, req->memory, 0, &req->count);
+   

[PATCH 6/8] videobuf2: add queue memory coherency parameter

2021-03-01 Thread Sergey Senozhatsky
Preparations for future V4L2_FLAG_MEMORY_NON_COHERENT support.

Extend vb2_core_reqbufs() parameters list to accept requests'
->flags, which will be used for memory coherency configuration.

An attempt to allocate a buffer with coherency requirements
which don't match queue's consistency model will fail.
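
A userspace model of the rule above is sketched here. struct queue_model and create_bufs() are simplified stand-ins invented for illustration; the two helpers follow the logic of set_queue_coherency() and verify_coherency_flags() in the diff below. Note that in this patch coherent_mem is always true; the request flag is only wired up by the following patch.

```c
#include <stdbool.h>

struct queue_model {
	bool allows_cache_hints;
	bool non_coherent_mem;
	unsigned int num_buffers;
};

static void set_queue_coherency(struct queue_model *q, bool coherent_mem)
{
	q->non_coherent_mem = false;
	if (!q->allows_cache_hints)
		return;
	if (!coherent_mem)
		q->non_coherent_mem = true;
}

static bool verify_coherency_flags(const struct queue_model *q,
				   bool coherent_mem)
{
	return coherent_mem == !q->non_coherent_mem;
}

/*
 * CREATE_BUFS-like path: the first allocation fixes the queue's model,
 * later allocations must match it.
 */
static int create_bufs(struct queue_model *q, bool coherent_mem,
		       unsigned int count)
{
	if (q->num_buffers == 0)
		set_queue_coherency(q, coherent_mem);
	else if (!verify_coherency_flags(q, coherent_mem))
		return -1;	/* the patch returns -EINVAL here */
	q->num_buffers += count;
	return 0;
}
```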

Signed-off-by: Sergey Senozhatsky 
---
 .../media/common/videobuf2/videobuf2-core.c   | 40 ---
 .../media/common/videobuf2/videobuf2-v4l2.c   |  5 ++-
 drivers/media/dvb-core/dvb_vb2.c  |  2 +-
 include/media/videobuf2-core.h|  8 +++-
 4 files changed, 44 insertions(+), 11 deletions(-)

diff --git a/drivers/media/common/videobuf2/videobuf2-core.c 
b/drivers/media/common/videobuf2/videobuf2-core.c
index 55af63d54f23..7040b7f47133 100644
--- a/drivers/media/common/videobuf2/videobuf2-core.c
+++ b/drivers/media/common/videobuf2/videobuf2-core.c
@@ -738,11 +738,33 @@ int vb2_verify_memory_type(struct vb2_queue *q,
 }
 EXPORT_SYMBOL(vb2_verify_memory_type);
 
+static void set_queue_coherency(struct vb2_queue *q, bool coherent_mem)
+{
+   q->non_coherent_mem = 0;
+
+   if (!vb2_queue_allows_cache_hints(q))
+   return;
+   if (!coherent_mem)
+   q->non_coherent_mem = 1;
+}
+
+static bool verify_coherency_flags(struct vb2_queue *q, bool coherent_mem)
+{
+   bool queue_is_coherent = !q->non_coherent_mem;
+
+   if (coherent_mem != queue_is_coherent) {
+   dprintk(q, 1, "memory coherency model mismatch\n");
+   return false;
+   }
+   return true;
+}
+
 int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
-unsigned int *count)
+unsigned int flags, unsigned int *count)
 {
unsigned int num_buffers, allocated_buffers, num_planes = 0;
unsigned plane_sizes[VB2_MAX_PLANES] = { };
+   bool coherent_mem = true;
unsigned int i;
int ret;
 
@@ -757,7 +779,8 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory 
memory,
}
 
if (*count == 0 || q->num_buffers != 0 ||
-   (q->memory != VB2_MEMORY_UNKNOWN && q->memory != memory)) {
+   (q->memory != VB2_MEMORY_UNKNOWN && q->memory != memory) ||
+   !verify_coherency_flags(q, coherent_mem)) {
/*
 * We already have buffers allocated, so first check if they
 * are not in use and can be freed.
@@ -794,6 +817,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory 
memory,
num_buffers = min_t(unsigned int, num_buffers, VB2_MAX_FRAME);
memset(q->alloc_devs, 0, sizeof(q->alloc_devs));
q->memory = memory;
+   set_queue_coherency(q, coherent_mem);
 
/*
 * Ask the driver how many buffers and planes per buffer it requires.
@@ -878,12 +902,13 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory 
memory,
 EXPORT_SYMBOL_GPL(vb2_core_reqbufs);
 
 int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
-unsigned int *count,
+unsigned int flags, unsigned int *count,
 unsigned int requested_planes,
 const unsigned int requested_sizes[])
 {
unsigned int num_planes = 0, num_buffers, allocated_buffers;
unsigned plane_sizes[VB2_MAX_PLANES] = { };
+   bool coherent_mem = true;
int ret;
 
if (q->num_buffers == VB2_MAX_FRAME) {
@@ -899,11 +924,14 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum 
vb2_memory memory,
memset(q->alloc_devs, 0, sizeof(q->alloc_devs));
q->memory = memory;
q->waiting_for_buffers = !q->is_output;
+   set_queue_coherency(q, coherent_mem);
} else {
if (q->memory != memory) {
dprintk(q, 1, "memory model mismatch\n");
return -EINVAL;
}
+   if (!verify_coherency_flags(q, coherent_mem))
+   return -EINVAL;
}
 
num_buffers = min(*count, VB2_MAX_FRAME - q->num_buffers);
@@ -2576,7 +2604,7 @@ static int __vb2_init_fileio(struct vb2_queue *q, int 
read)
fileio->memory = VB2_MEMORY_MMAP;
fileio->type = q->type;
q->fileio = fileio;
-   ret = vb2_core_reqbufs(q, fileio->memory, &fileio->count);
+   ret = vb2_core_reqbufs(q, fileio->memory, 0, &fileio->count);
if (ret)
goto err_kfree;
 
@@ -2633,7 +2661,7 @@ static int __vb2_init_fileio(struct vb2_queue *q, int 
read)
 
 err_reqbufs:
fileio->count = 0;
-   vb2_core_reqbufs(q, fileio->memory, &fileio->count);
+   vb2_core_reqbufs(q, fileio->memory, 0, &fileio->count);
 
 err_kfree:
q->fileio = NULL;
@@ -2653,7 +2681,7 @@ static int __vb2_cleanup_fileio(struct vb2_queue *q)
vb2_core_streamoff(q, q->type);
q->fileio = NULL;
fileio->count = 0;
-  

[PATCH 4/8] videobuf2: move cache_hints handling to allocators

2021-03-01 Thread Sergey Senozhatsky
This moves cache hints handling from the videobuf2 core down
to the allocator level, because allocators eventually perform
the cache sync/flush and can make better decisions. Besides,
allocators already decide whether a cache sync/flush should
be done or can be skipped. This patch moves the scattered
buffer cache sync logic to one common place.

Signed-off-by: Sergey Senozhatsky 
---
 drivers/media/common/videobuf2/videobuf2-core.c   | 6 --
 drivers/media/common/videobuf2/videobuf2-dma-contig.c | 6 ++
 drivers/media/common/videobuf2/videobuf2-dma-sg.c | 6 ++
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/media/common/videobuf2/videobuf2-core.c 
b/drivers/media/common/videobuf2/videobuf2-core.c
index 76210c006958..55af63d54f23 100644
--- a/drivers/media/common/videobuf2/videobuf2-core.c
+++ b/drivers/media/common/videobuf2/videobuf2-core.c
@@ -328,9 +328,6 @@ static void __vb2_buf_mem_prepare(struct vb2_buffer *vb)
return;
 
vb->synced = 1;
-   if (vb->skip_cache_sync_on_prepare)
-   return;
-
for (plane = 0; plane < vb->num_planes; ++plane)
call_void_memop(vb, prepare, vb->planes[plane].mem_priv);
 }
@@ -347,9 +344,6 @@ static void __vb2_buf_mem_finish(struct vb2_buffer *vb)
return;
 
vb->synced = 0;
-   if (vb->skip_cache_sync_on_finish)
-   return;
-
for (plane = 0; plane < vb->num_planes; ++plane)
call_void_memop(vb, finish, vb->planes[plane].mem_priv);
 }
diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c 
b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
index 019c3843dc6d..1e218bc440c6 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
@@ -101,6 +101,9 @@ static void vb2_dc_prepare(void *buf_priv)
struct vb2_dc_buf *buf = buf_priv;
struct sg_table *sgt = buf->dma_sgt;
 
+   if (buf->vb->skip_cache_sync_on_prepare)
+   return;
+
if (!sgt)
return;
 
@@ -112,6 +115,9 @@ static void vb2_dc_finish(void *buf_priv)
struct vb2_dc_buf *buf = buf_priv;
struct sg_table *sgt = buf->dma_sgt;
 
+   if (buf->vb->skip_cache_sync_on_finish)
+   return;
+
if (!sgt)
return;
 
diff --git a/drivers/media/common/videobuf2/videobuf2-dma-sg.c 
b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
index 71094cb5c5d7..cb587c5a345b 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-sg.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
@@ -204,6 +204,9 @@ static void vb2_dma_sg_prepare(void *buf_priv)
struct vb2_dma_sg_buf *buf = buf_priv;
struct sg_table *sgt = buf->dma_sgt;
 
+   if (buf->vb->skip_cache_sync_on_prepare)
+   return;
+
dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir);
 }
 
@@ -212,6 +215,9 @@ static void vb2_dma_sg_finish(void *buf_priv)
struct vb2_dma_sg_buf *buf = buf_priv;
struct sg_table *sgt = buf->dma_sgt;
 
+   if (buf->vb->skip_cache_sync_on_finish)
+   return;
+
dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir);
 }
 
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH 5/8] videobuf2: add V4L2_FLAG_MEMORY_NON_COHERENT flag

2021-03-01 Thread Sergey Senozhatsky
By setting or clearing the V4L2_FLAG_MEMORY_NON_COHERENT flag,
user-space should be able to hint to vb2 that either non-coherent
(if supported) or coherent memory should be used for the buffer
allocation.

The patch set also adds a corresponding capability flag:
fill_buf_caps() reports V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS
when queue supports user-space cache management hints.

Signed-off-by: Sergey Senozhatsky 
---
 .../userspace-api/media/v4l/buffer.rst| 40 ++-
 .../media/v4l/vidioc-reqbufs.rst  |  5 ++-
 include/uapi/linux/videodev2.h|  2 +
 3 files changed, 43 insertions(+), 4 deletions(-)

diff --git a/Documentation/userspace-api/media/v4l/buffer.rst 
b/Documentation/userspace-api/media/v4l/buffer.rst
index 1b0fdc160533..a39852d6174f 100644
--- a/Documentation/userspace-api/media/v4l/buffer.rst
+++ b/Documentation/userspace-api/media/v4l/buffer.rst
@@ -676,8 +676,6 @@ Buffer Flags
 
 \normalsize
 
-.. _memory-flags:
-
 enum v4l2_memory
 
 
@@ -701,6 +699,44 @@ enum v4l2_memory
   - 4
   - The buffer is used for :ref:`DMA shared buffer ` I/O.
 
+.. _memory-flags:
+
+Memory Consistency Flags
+
+
+.. raw:: latex
+
+\small
+
+.. tabularcolumns:: |p{7.0cm}|p{2.1cm}|p{8.4cm}|
+
+.. cssclass:: longtable
+
+.. flat-table::
+:header-rows:  0
+:stub-columns: 0
+:widths:   3 1 4
+
+* .. _`V4L2-FLAG-MEMORY-NON-COHERENT`:
+
+  - ``V4L2_FLAG_MEMORY_NON_COHERENT``
+  - 0x0001
+  - A buffer is allocated either in coherent (it will be automatically
+   coherent between the CPU and the bus) or non-coherent memory. The
+   latter can provide performance gains, for instance the CPU cache
+   sync/flush operations can be avoided if the buffer is accessed by the
+   corresponding device only and the CPU does not read/write to/from that
+   buffer. However, this requires extra care from the driver -- it must
+   guarantee memory consistency by issuing a cache flush/sync when
+   consistency is needed. If this flag is set V4L2 will attempt to
+   allocate the buffer in non-coherent memory. The flag takes effect
+   only if the buffer is used for :ref:`memory mapping ` I/O and the
+   queue reports the :ref:`V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS
+   ` capability.
+
+.. raw:: latex
+
+\normalsize
 
 Timecodes
 =
diff --git a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst 
b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
index c1c88e00b106..950e7ec1aac5 100644
--- a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
+++ b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
@@ -154,8 +154,9 @@ aborting or finishing any DMA in progress, an implicit
   - This capability is set by the driver to indicate that the queue 
supports
 cache and memory management hints. However, it's only valid when the
 queue is used for :ref:`memory mapping ` streaming I/O. See
-:ref:`V4L2_BUF_FLAG_NO_CACHE_INVALIDATE 
` and
-:ref:`V4L2_BUF_FLAG_NO_CACHE_CLEAN `.
+:ref:`V4L2_BUF_FLAG_NO_CACHE_INVALIDATE 
`,
+:ref:`V4L2_BUF_FLAG_NO_CACHE_CLEAN ` and
+:ref:`V4L2_FLAG_MEMORY_NON_COHERENT `.
 
 Return Value
 
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 79dbde3bcf8d..b1d4171fe50b 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -954,6 +954,8 @@ struct v4l2_requestbuffers {
__u32   reserved[1];
 };
 
+#define V4L2_FLAG_MEMORY_NON_COHERENT  (1 << 0)
+
 /* capabilities for struct v4l2_requestbuffers and v4l2_create_buffers */
 #define V4L2_BUF_CAP_SUPPORTS_MMAP (1 << 0)
 #define V4L2_BUF_CAP_SUPPORTS_USERPTR  (1 << 1)
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH 3/8] videobuf2: split buffer cache_hints initialisation

2021-03-01 Thread Sergey Senozhatsky
V4L2 is not the perfect place to manage vb2 buffer cache hints.
It works for V4L2 users, but there are backends that use vb2 core
and don't use V4L2. Factor out buffer cache hints init and call it
when we allocate a vb2 buffer.

Signed-off-by: Sergey Senozhatsky 
---
 .../media/common/videobuf2/videobuf2-core.c   | 22 +++
 .../media/common/videobuf2/videobuf2-v4l2.c   | 18 ---
 2 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/drivers/media/common/videobuf2/videobuf2-core.c 
b/drivers/media/common/videobuf2/videobuf2-core.c
index 23e41fec9880..76210c006958 100644
--- a/drivers/media/common/videobuf2/videobuf2-core.c
+++ b/drivers/media/common/videobuf2/videobuf2-core.c
@@ -382,6 +382,27 @@ static void __setup_offsets(struct vb2_buffer *vb)
}
 }
 
+static void init_buffer_cache_hints(struct vb2_queue *q, struct vb2_buffer *vb)
+{
+   /*
+* DMA exporter should take care of cache syncs, so we can avoid
+* explicit ->prepare()/->finish() syncs. For other ->memory types
+* we always need ->prepare() or/and ->finish() cache sync.
+*/
+   if (q->memory == VB2_MEMORY_DMABUF) {
+   vb->skip_cache_sync_on_finish = 1;
+   vb->skip_cache_sync_on_prepare = 1;
+   return;
+   }
+
+   /*
+* ->finish() cache sync can be avoided when queue direction is
+* TO_DEVICE.
+*/
+   if (q->dma_dir == DMA_TO_DEVICE)
+   vb->skip_cache_sync_on_finish = 1;
+}
+
 /*
  * __vb2_queue_alloc() - allocate videobuf buffer structures and (for MMAP 
type)
  * video buffer memory for all buffers/planes on the queue and initializes the
@@ -415,6 +436,7 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum 
vb2_memory memory,
vb->index = q->num_buffers + buffer;
vb->type = q->type;
vb->memory = memory;
+   init_buffer_cache_hints(q, vb);
for (plane = 0; plane < num_planes; ++plane) {
vb->planes[plane].length = plane_sizes[plane];
vb->planes[plane].min_length = plane_sizes[plane];
diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
index db93678860bd..a02f365bbe60 100644
--- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
+++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
@@ -345,17 +345,6 @@ static void set_buffer_cache_hints(struct vb2_queue *q,
   struct vb2_buffer *vb,
   struct v4l2_buffer *b)
 {
-   /*
-* DMA exporter should take care of cache syncs, so we can avoid
-* explicit ->prepare()/->finish() syncs. For other ->memory types
-* we always need ->prepare() or/and ->finish() cache sync.
-*/
-   if (q->memory == VB2_MEMORY_DMABUF) {
-   vb->skip_cache_sync_on_finish = 1;
-   vb->skip_cache_sync_on_prepare = 1;
-   return;
-   }
-
if (!vb2_queue_allows_cache_hints(q)) {
/*
 * Clear buffer cache flags if queue does not support user
@@ -367,13 +356,6 @@ static void set_buffer_cache_hints(struct vb2_queue *q,
return;
}
 
-   /*
-* ->finish() cache sync can be avoided when queue direction is
-* TO_DEVICE.
-*/
-   if (q->dma_dir == DMA_TO_DEVICE)
-   vb->skip_cache_sync_on_finish = 1;
-
if (b->flags & V4L2_BUF_FLAG_NO_CACHE_INVALIDATE)
vb->skip_cache_sync_on_finish = 1;
 
-- 
2.30.1.766.gb4fecdf3b7-goog
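
Not part of the patch: a minimal user-space sketch of the decision logic that init_buffer_cache_hints() implements, for readers who want to try it outside the kernel. The `sketch_*` types and enum values are hypothetical stand-ins for struct vb2_queue, struct vb2_buffer and the kernel constants; only the two skip flags and the order of the checks are taken from the diff above.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the kernel enums. */
enum sketch_memory { SKETCH_MEMORY_MMAP, SKETCH_MEMORY_USERPTR, SKETCH_MEMORY_DMABUF };
enum sketch_dma_dir { SKETCH_DMA_BIDIRECTIONAL, SKETCH_DMA_TO_DEVICE, SKETCH_DMA_FROM_DEVICE };

/* Only the fields the hint logic reads (assumed layout, not the real struct). */
struct sketch_queue {
	enum sketch_memory memory;
	enum sketch_dma_dir dma_dir;
};

/* Only the two hint flags; zero means "do the cache sync". */
struct sketch_buffer {
	bool skip_cache_sync_on_prepare;
	bool skip_cache_sync_on_finish;
};

static void sketch_init_cache_hints(const struct sketch_queue *q,
				    struct sketch_buffer *vb)
{
	/* The DMA exporter handles cache syncs for DMABUF: skip both. */
	if (q->memory == SKETCH_MEMORY_DMABUF) {
		vb->skip_cache_sync_on_finish = true;
		vb->skip_cache_sync_on_prepare = true;
		return;
	}

	/* TO_DEVICE queues do not need the ->finish() sync. */
	if (q->dma_dir == SKETCH_DMA_TO_DEVICE)
		vb->skip_cache_sync_on_finish = true;
}
```

Because the function only ever sets flags, it relies on the buffer having been zero-initialised first, which matches how the buffers are allocated in the core.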



[PATCH 2/8] videobuf2: inverse buffer cache_hints flags

2021-03-01 Thread Sergey Senozhatsky
It would be less error prone if the default cache hints value
(we kzalloc() the structs, so they are zeroed out by default) meant
"always sync/flush" caches. Invert and rename the cache hints
flags.

Signed-off-by: Sergey Senozhatsky 
---
 .../media/common/videobuf2/videobuf2-core.c   | 31 ++-
 .../media/common/videobuf2/videobuf2-v4l2.c   | 17 +++---
 include/media/videobuf2-core.h| 12 +++
 3 files changed, 21 insertions(+), 39 deletions(-)

diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
index 9a5cc3e63439..23e41fec9880 100644
--- a/drivers/media/common/videobuf2/videobuf2-core.c
+++ b/drivers/media/common/videobuf2/videobuf2-core.c
@@ -327,12 +327,12 @@ static void __vb2_buf_mem_prepare(struct vb2_buffer *vb)
if (vb->synced)
return;
 
-   if (vb->need_cache_sync_on_prepare) {
-   for (plane = 0; plane < vb->num_planes; ++plane)
-   call_void_memop(vb, prepare,
-   vb->planes[plane].mem_priv);
-   }
vb->synced = 1;
+   if (vb->skip_cache_sync_on_prepare)
+   return;
+
+   for (plane = 0; plane < vb->num_planes; ++plane)
+   call_void_memop(vb, prepare, vb->planes[plane].mem_priv);
 }
 
 /*
@@ -346,12 +346,12 @@ static void __vb2_buf_mem_finish(struct vb2_buffer *vb)
if (!vb->synced)
return;
 
-   if (vb->need_cache_sync_on_finish) {
-   for (plane = 0; plane < vb->num_planes; ++plane)
-   call_void_memop(vb, finish,
-   vb->planes[plane].mem_priv);
-   }
vb->synced = 0;
+   if (vb->skip_cache_sync_on_finish)
+   return;
+
+   for (plane = 0; plane < vb->num_planes; ++plane)
+   call_void_memop(vb, finish, vb->planes[plane].mem_priv);
 }
 
 /*
@@ -415,17 +415,6 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
vb->index = q->num_buffers + buffer;
vb->type = q->type;
vb->memory = memory;
-   /*
-* We need to set these flags here so that the videobuf2 core
-* will call ->prepare()/->finish() cache sync/flush on vb2
-* buffers when appropriate. However, we can avoid explicit
-* ->prepare() and ->finish() cache sync for DMABUF buffers,
-* because DMA exporter takes care of it.
-*/
-   if (q->memory != VB2_MEMORY_DMABUF) {
-   vb->need_cache_sync_on_prepare = 1;
-   vb->need_cache_sync_on_finish = 1;
-   }
for (plane = 0; plane < num_planes; ++plane) {
vb->planes[plane].length = plane_sizes[plane];
vb->planes[plane].min_length = plane_sizes[plane];
diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
index 7e96f67c60ba..db93678860bd 100644
--- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
+++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
@@ -351,18 +351,11 @@ static void set_buffer_cache_hints(struct vb2_queue *q,
 * we always need ->prepare() or/and ->finish() cache sync.
 */
if (q->memory == VB2_MEMORY_DMABUF) {
-   vb->need_cache_sync_on_finish = 0;
-   vb->need_cache_sync_on_prepare = 0;
+   vb->skip_cache_sync_on_finish = 1;
+   vb->skip_cache_sync_on_prepare = 1;
return;
}
 
-   /*
-* Cache sync/invalidation flags are set by default in order to
-* preserve existing behaviour for old apps/drivers.
-*/
-   vb->need_cache_sync_on_prepare = 1;
-   vb->need_cache_sync_on_finish = 1;
-
if (!vb2_queue_allows_cache_hints(q)) {
/*
 * Clear buffer cache flags if queue does not support user
@@ -379,13 +372,13 @@ static void set_buffer_cache_hints(struct vb2_queue *q,
 * TO_DEVICE.
 */
if (q->dma_dir == DMA_TO_DEVICE)
-   vb->need_cache_sync_on_finish = 0;
+   vb->skip_cache_sync_on_finish = 1;
 
if (b->flags & V4L2_BUF_FLAG_NO_CACHE_INVALIDATE)
-   vb->need_cache_sync_on_finish = 0;
+   vb->skip_cache_sync_on_finish = 1;
 
if (b->flags & V4L2_BUF_FLAG_NO_CACHE_CLEAN)
-   vb->need_cache_sync_on_prepare = 0;
+   vb->skip_cache_sync_on_prepare = 1;
 }
 
 static int vb2_queue_or_prepare_buf(struct vb2_queue *q, struct media_device *mdev,
diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
index d0d85be4809b..48f57a54ddb1 100644
--- a/include/media/videobuf2-core.h
+++ b/include/media/videobuf2-core.h
@@ -265,10 +265,10 @@ struct vb2_buffer {
 *
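
Not part of the patch: a small user-space sketch of why "skip" flags make a safer default than "need" flags. The `demo_*` names and the `prepare_syncs` counter are hypothetical; the control flow of the prepare path mirrors the reworked __vb2_buf_mem_prepare() in the diff above, with the per-plane ->prepare() call replaced by a counter.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct vb2_buffer. */
struct demo_buffer {
	bool synced;
	bool skip_cache_sync_on_prepare;
	int prepare_syncs;	/* counts simulated ->prepare() cache syncs */
};

/* calloc() plays the role of kzalloc(): every field starts out as zero. */
static struct demo_buffer *demo_buffer_alloc(void)
{
	return calloc(1, sizeof(struct demo_buffer));
}

static void demo_buf_mem_prepare(struct demo_buffer *vb)
{
	if (vb->synced)
		return;

	vb->synced = true;
	if (vb->skip_cache_sync_on_prepare)
		return;

	/* Stand-in for calling ->prepare() on every plane. */
	vb->prepare_syncs++;
}
```

A buffer that nobody initialised beyond the zeroing allocation still gets its caches synced; with the old need_cache_sync_* flags a forgotten initialisation would have silently skipped the sync instead.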

[PATCH 1/8] videobuf2: rework vb2_mem_ops API

2021-03-01 Thread Sergey Senozhatsky
With the new DMA API we need an extension of the videobuf2 API. Previously,
the videobuf2 core would set the non-coherent DMA bit in the vb2 queue
dma_attrs (if user-space passed a corresponding memory hint); the vb2 core
would then pass the vb2 queue dma_attrs to the vb2 allocators.
The vb2 allocator would use the queue's dma_attrs and the DMA API would
allocate either coherent or non-coherent memory.

But we cannot do this anymore, since there is no corresponding DMA
attr flag and, hence, there is no way for the allocator to become
aware of what type of allocation user-space has requested. So we
need to pass more context from the videobuf2 core to the allocators.

Fix this by changing the call_ptr_memop() macro to pass the vb2 buffer
pointer to the corresponding op callbacks.

Signed-off-by: Sergey Senozhatsky 
---
 .../media/common/videobuf2/videobuf2-core.c   | 42 +++
 .../common/videobuf2/videobuf2-dma-contig.c   | 36 +---
 .../media/common/videobuf2/videobuf2-dma-sg.c | 33 ---
 .../common/videobuf2/videobuf2-vmalloc.c  | 30 ++---
 include/media/videobuf2-core.h| 37 
 5 files changed, 98 insertions(+), 80 deletions(-)

diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
index 02281d13505f..9a5cc3e63439 100644
--- a/drivers/media/common/videobuf2/videobuf2-core.c
+++ b/drivers/media/common/videobuf2/videobuf2-core.c
@@ -68,13 +68,13 @@ module_param(debug, int, 0644);
err;\
 })
 
-#define call_ptr_memop(vb, op, args...)				\
+#define call_ptr_memop(op, vb, args...)				\
 ({ \
struct vb2_queue *_q = (vb)->vb2_queue; \
void *ptr;  \
\
log_memop(vb, op);  \
-   ptr = _q->mem_ops->op ? _q->mem_ops->op(args) : NULL;   \
+   ptr = _q->mem_ops->op ? _q->mem_ops->op(vb, args) : NULL;   \
if (!IS_ERR_OR_NULL(ptr))   \
(vb)->cnt_mem_ ## op++; \
ptr;\
@@ -144,9 +144,9 @@ module_param(debug, int, 0644);
((vb)->vb2_queue->mem_ops->op ? \
(vb)->vb2_queue->mem_ops->op(args) : 0)
 
-#define call_ptr_memop(vb, op, args...)				\
+#define call_ptr_memop(op, vb, args...)				\
((vb)->vb2_queue->mem_ops->op ? \
-   (vb)->vb2_queue->mem_ops->op(args) : NULL)
+   (vb)->vb2_queue->mem_ops->op(vb, args) : NULL)
 
 #define call_void_memop(vb, op, args...)   \
do {\
@@ -230,9 +230,10 @@ static int __vb2_buf_mem_alloc(struct vb2_buffer *vb)
if (size < vb->planes[plane].length)
goto free;
 
-   mem_priv = call_ptr_memop(vb, alloc,
-   q->alloc_devs[plane] ? : q->dev,
-   q->dma_attrs, size, q->dma_dir, q->gfp_flags);
+   mem_priv = call_ptr_memop(alloc,
+ vb,
+ q->alloc_devs[plane] ? : q->dev,
+ size);
if (IS_ERR_OR_NULL(mem_priv)) {
if (mem_priv)
ret = PTR_ERR(mem_priv);
@@ -975,7 +976,7 @@ void *vb2_plane_vaddr(struct vb2_buffer *vb, unsigned int plane_no)
if (plane_no >= vb->num_planes || !vb->planes[plane_no].mem_priv)
return NULL;
 
-   return call_ptr_memop(vb, vaddr, vb->planes[plane_no].mem_priv);
+   return call_ptr_memop(vaddr, vb, vb->planes[plane_no].mem_priv);
 
 }
 EXPORT_SYMBOL_GPL(vb2_plane_vaddr);
@@ -985,7 +986,7 @@ void *vb2_plane_cookie(struct vb2_buffer *vb, unsigned int plane_no)
if (plane_no >= vb->num_planes || !vb->planes[plane_no].mem_priv)
return NULL;
 
-   return call_ptr_memop(vb, cookie, vb->planes[plane_no].mem_priv);
+   return call_ptr_memop(cookie, vb, vb->planes[plane_no].mem_priv);
 }
 EXPORT_SYMBOL_GPL(vb2_plane_cookie);
 
@@ -1125,10 +1126,11 @@ static int __prepare_userptr(struct vb2_buffer *vb)
vb->planes[plane].data_offset = 0;
 
/* Acquire each plane's memory */
-   mem_priv = call_ptr_memop(vb, get_userptr,
-   q->alloc_devs[plane] ? : q->dev,
-   
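
Not part of the patch: a user-space sketch of the calling convention the reworked call_ptr_memop() introduces, where the buffer itself becomes the first argument of every pointer-returning memory op. All `sketch_*` names are hypothetical, the ops table hangs directly off the buffer rather than off a queue, and the macro uses standard `__VA_ARGS__` instead of the GNU `args...` form used in the kernel.

```c
#include <stddef.h>

struct sketch_buf;

/* Hypothetical ops table; each callback now receives the buffer first. */
struct sketch_mem_ops {
	void *(*vaddr)(struct sketch_buf *vb, void *mem_priv);
	void *(*cookie)(struct sketch_buf *vb, void *mem_priv);
};

/* Simplified buffer: in the kernel the ops are reached via vb->vb2_queue. */
struct sketch_buf {
	const struct sketch_mem_ops *mem_ops;
	int calls;	/* per-buffer state the callback can now reach */
};

/* Call an optional op with the buffer prepended; NULL if the op is unset. */
#define sketch_call_ptr_memop(op, vb, ...) \
	((vb)->mem_ops->op ? (vb)->mem_ops->op((vb), __VA_ARGS__) : NULL)

/* Example callback: uses the buffer context, then returns the plane memory. */
static void *sketch_vaddr(struct sketch_buf *vb, void *mem_priv)
{
	vb->calls++;
	return mem_priv;
}
```

With the buffer available in the callback, an allocator can look at per-buffer or per-queue state directly instead of having each piece of context passed as a separate argument.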

[PATCH 0/8] videobuf2: support new noncontiguous DMA API

2021-03-01 Thread Sergey Senozhatsky
Hello,

RFC

The series adds support for the new noncontiguous DMA API [0] and
adds the V4L2_FLAG_MEMORY_NON_COHERENT UAPI. This is similar to the previous
V4L2_FLAG_MEMORY_NON_CONSISTENT (which was renamed), but the patch set
goes a bit further this time and also does some videobuf2 API
refactorings along the way.

A corresponding v4l2-compliance patch will be posted shortly.

[0] https://lore.kernel.org/lkml/20210301085236.947011-2-...@lst.de/

Sergey Senozhatsky (8):
  videobuf2: rework vb2_mem_ops API
  videobuf2: inverse buffer cache_hints flags
  videobuf2: split buffer cache_hints initialisation
  videobuf2: move cache_hints handling to allocators
  videobuf2: add V4L2_FLAG_MEMORY_NON_COHERENT flag
  videobuf2: add queue memory coherency parameter
  videobuf2: handle V4L2_FLAG_MEMORY_NON_COHERENT flag
  videobuf2: handle non-contiguous DMA allocations

 .../userspace-api/media/v4l/buffer.rst|  40 +++-
 .../media/v4l/vidioc-create-bufs.rst  |   7 +-
 .../media/v4l/vidioc-reqbufs.rst  |  16 +-
 .../media/common/videobuf2/videobuf2-core.c   | 135 +-
 .../common/videobuf2/videobuf2-dma-contig.c   | 175 ++
 .../media/common/videobuf2/videobuf2-dma-sg.c |  39 ++--
 .../media/common/videobuf2/videobuf2-v4l2.c   |  47 ++---
 .../common/videobuf2/videobuf2-vmalloc.c  |  30 +--
 drivers/media/dvb-core/dvb_vb2.c  |   2 +-
 drivers/media/v4l2-core/v4l2-compat-ioctl32.c |   9 +-
 drivers/media/v4l2-core/v4l2-ioctl.c  |   5 +-
 include/media/videobuf2-core.h|  57 +++---
 include/uapi/linux/videodev2.h|  13 +-
 13 files changed, 396 insertions(+), 179 deletions(-)

-- 
2.30.1.766.gb4fecdf3b7-goog



Re: [PATCH] c6x: Remove stale symlink 'scripts/dtc/include-prefixes/c6x'

2021-03-01 Thread Rob Herring
On Mon, Mar 1, 2021 at 6:14 PM Victor Erminpour
 wrote:
>
> Remove stale symlink 'scripts/dtc/include-prefixes/c6x'
>
> Signed-off-by: Victor Erminpour 
> ---
>  scripts/dtc/include-prefixes/c6x | 1 -
>  1 file changed, 1 deletion(-)
>  delete mode 120000 scripts/dtc/include-prefixes/c6x

Yours is the 3rd fix for this. There's already one pending in linux-next
that I'll be sending to Linus this week.

Rob


Re: [PATCH 09/25] x86/sgx: Move ENCLS leaf definitions to sgx.h

2021-03-01 Thread Kai Huang
On Mon, 2021-03-01 at 08:25 -0800, Sean Christopherson wrote:
> On Mon, Mar 01, 2021, Kai Huang wrote:
> > And because they're architectural.
> 
> Heh, this snarky sentence can be dropped, it was a lot more clever when these
> were being moved to sgx_arch.h.

Sure. Reasonable to me.

> 
> > Signed-off-by: Sean Christopherson 
> > Acked-by: Dave Hansen 
> > Acked-by: Jarkko Sakkinen 
> > Signed-off-by: Kai Huang 
> > ---
> >  arch/x86/include/asm/sgx.h  | 15 +++
> >  arch/x86/kernel/cpu/sgx/encls.h | 15 ---
> >  2 files changed, 15 insertions(+), 15 deletions(-)




Re: [PATCH 12/25] x86/sgx: Add helper to update SGX_LEPUBKEYHASHn MSRs

2021-03-01 Thread Kai Huang
On Mon, 2021-03-01 at 08:57 -0800, Sean Christopherson wrote:
> On Mon, Mar 01, 2021, Kai Huang wrote:
> > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> > index 8c922e68274d..276220d0e4b5 100644
> > --- a/arch/x86/kernel/cpu/sgx/main.c
> > +++ b/arch/x86/kernel/cpu/sgx/main.c
> > @@ -696,6 +696,21 @@ static bool __init sgx_page_cache_init(void)
> >     return true;
> >  }
> >  
> > 
> > 
> > 
> > +
> > +/*
> > + * Update the SGX_LEPUBKEYHASH MSRs to the values specified by caller.
> > + * Bare-metal driver requires to update them to hash of enclave's signer
> > + * before EINIT. KVM needs to update them to guest's virtual MSR values
> > + * before doing EINIT from guest.
> > + */
> > +void sgx_update_lepubkeyhash(u64 *lepubkeyhash)
> > +{
> > +   int i;
> 
> Probably worth adding:
> 
>   WARN_ON_ONCE(preemptible());

Agreed. Will do.

> 
> > +
> > +   for (i = 0; i < 4; i++)
> > +   wrmsrl(MSR_IA32_SGXLEPUBKEYHASH0 + i, lepubkeyhash[i]);
> > +}
> > +
> >  static int __init sgx_init(void)
> >  {
> >     int ret;



