
Re: [PATCH v5] docs/zh_CN: add translations in zh_CN/dev-tools/gcov

2021-04-14 Thread Alex Shi

Reviewed-by: Alex Shi 

On 2021/4/14 9:21 PM, Wu XiangCheng wrote:
> From: Bernard Zhao 
> 
> Add new zh translations
> * zh_CN/dev-tools/gcov.rst
> * zh_CN/dev-tools/index.rst
> and link them to zh_CN/index.rst
> 
> Signed-off-by: Bernard Zhao 
> Reviewed-by: Wu XiangCheng 
> Signed-off-by: Wu XiangCheng 
> ---
> base: linux-next
> commit 269dd42f4776 ("docs/zh_CN: add riscv to zh_CN index")
> 
> Changes since V4:
> * modified some words following Alex Shi's advice
> 
> Changes since V3:
> * update to newest linux-next
> * fix ``
> * fix tags
> * fix list indent
> 
> Changes since V2:
> * fix some inaccurate translation
> 
> Changes since V1:
> * add index.rst in dev-tools and link to zh_CN/index.rst
> * fix some inaccurate translation
> 
>  .../translations/zh_CN/dev-tools/gcov.rst | 265 ++
>  .../translations/zh_CN/dev-tools/index.rst|  35 +++
>  Documentation/translations/zh_CN/index.rst|   1 +
>  3 files changed, 301 insertions(+)
>  create mode 100644 Documentation/translations/zh_CN/dev-tools/gcov.rst
>  create mode 100644 Documentation/translations/zh_CN/dev-tools/index.rst
> 
> diff --git a/Documentation/translations/zh_CN/dev-tools/gcov.rst 
> b/Documentation/translations/zh_CN/dev-tools/gcov.rst
> new file mode 100644
> index ..7515b488bc4e
> --- /dev/null
> +++ b/Documentation/translations/zh_CN/dev-tools/gcov.rst
> @@ -0,0 +1,265 @@
> +.. include:: ../disclaimer-zh_CN.rst
> +
> +:Original: Documentation/dev-tools/gcov.rst
> +:Translator: 赵军奎 Bernard Zhao 
> +
> +在Linux内核里使用gcov做代码覆盖率检查
> +=
> +
> +gcov是linux中已经集成的一个分析模块，该模块在内核中对GCC的代码覆盖率统
> +计提供了支持。
> +linux内核运行时的代码覆盖率数据会以gcov兼容的格式存储在debug-fs中，可
> +以通过gcov的 ``-o`` 选项（如下示例）获得指定文件的代码运行覆盖率统计数据
> +（需要跳转到内核编译路径下并且要有root权限）::
> +
> +# cd /tmp/linux-out
> +# gcov -o /sys/kernel/debug/gcov/tmp/linux-out/kernel spinlock.c
> +
> +这将在当前目录中创建带有执行计数注释的源代码文件。
> +在获得这些统计文件后，可以使用图形化的 gcov_ 前端工具（比如 lcov_ ），来实现
> +自动化处理linux内核的覆盖率运行数据，同时生成易于阅读的HTML格式文件。
> +
> +可能的用途:
> +
> +* 调试（用来判断每一行的代码是否已经运行过）
> +* 测试改进（如何修改测试代码，尽可能地覆盖到没有运行过的代码）
> +* 内核配置优化（对于某一个选项配置，如果关联的代码从来没有运行过，是
> +  否还需要这个配置）
> +
> +.. _gcov: https://gcc.gnu.org/onlinedocs/gcc/Gcov.html
> +.. _lcov: http://ltp.sourceforge.net/coverage/lcov.php
> +
> +
> +准备
> +
> +
> +内核打开如下配置::
> +
> +CONFIG_DEBUG_FS=y
> +CONFIG_GCOV_KERNEL=y
> +
> +获取整个内核的覆盖率数据，还需要打开::
> +
> +CONFIG_GCOV_PROFILE_ALL=y
> +
> +需要注意的是，整个内核开启覆盖率统计会造成内核镜像文件尺寸的增大，
> +同时内核运行的也会变慢一些。
> +另外，并不是所有的架构都支持整个内核开启覆盖率统计。
> +
> +代码运行覆盖率数据只在debugfs挂载完成后才可以访问::
> +
> +mount -t debugfs none /sys/kernel/debug
> +
> +
> +定制化
> +--
> +
> +如果要单独针对某一个路径或者文件进行代码覆盖率统计，可以在内核相应路
> +径的Makefile中增加如下的配置:
> +
> +- 单独统计单个文件（例如main.o）::
> +
> +GCOV_PROFILE_main.o := y
> +
> +- 单独统计某一个路径::
> +
> +GCOV_PROFILE := y
> +
> +如果要在整个内核的覆盖率统计（开启CONFIG_GCOV_PROFILE_ALL）中单独排除
> +某一个文件或者路径，可以使用如下的方法::
> +
> +GCOV_PROFILE_main.o := n
> +
> +和::
> +
> +GCOV_PROFILE := n
> +
> +此机制仅支持链接到内核镜像或编译为内核模块的文件。
> +
> +
> +相关文件
> +
> +
> +gcov功能需要在debugfs中创建如下文件:
> +
> +``/sys/kernel/debug/gcov``
> +gcov相关功能的根路径
> +
> +``/sys/kernel/debug/gcov/reset``
> +全局复位文件:向该文件写入数据后会将所有的gcov统计数据清0
> +
> +``/sys/kernel/debug/gcov/path/to/compile/dir/file.gcda``
> +gcov工具可以识别的覆盖率统计数据文件，向该文件写入数据后
> +   会将本文件的gcov统计数据清0
> +
> +``/sys/kernel/debug/gcov/path/to/compile/dir/file.gcno``
> +gcov工具需要的软连接文件（指向编译时生成的信息统计文件），这个文件是
> +在gcc编译时如果配置了选项 ``-ftest-coverage`` 时生成的。
> +
> +
> +针对模块的统计
> +--
> +
> +内核中的模块会动态的加载和卸载，模块卸载时对应的数据会被清除掉。
> +gcov提供了一种机制，通过保留相关数据的副本来收集这部分卸载模块的覆盖率数据。
> +模块卸载后这些备份数据在debugfs中会继续存在。
> +一旦这个模块重新加载，模块关联的运行统计会被初始化成debugfs中备份的数据。
> +
> +可以通过对内核参数gcov_persist的修改来停用gcov对模块的备份机制::
> +
> +gcov_persist = 0
> +
> +在运行时，用户还可以通过写入模块的数据文件或者写入gcov复位文件来丢弃已卸
> +载模块的数据。
> +
> +
> +编译机和测试机分离
> +--
> +
> +gcov的内核分析架构支持内核的编译和运行是在同一台机器上，也可以编译和运
> +行是在不同的机器上。
> +如果内核编译和运行是不同的机器，那么需要额外的准备工作，这取决于gcov工具
> +是在哪里使用的:
> +
> +.. _gcov-test_zh:
> +
> +a) 若gcov运行在测试机上
> +
> +测试机上面gcov工具的版本必须要跟内核编译机器使用的gcc版本相兼容，
> +同时下面的文件要从编译机拷贝到测试机上:
> +
> +从源代码中:
> +  - 所有的C文件和头文件
> +
> +从编译目录中:
> +  - 所有的C文件和头文件
> +  - 所有的.gcda文件和.gcno文件
> +  - 所有目录的链接
> +
> +特别需要注意，测试机器上面的目录结构跟编译机器上面的目录机构必须
> +完全一致。
> +如果文件是软链接，需要替换成真正的目录文件（这是由make的当前工作
> +目录变量CURDIR引起的）。
> +
> +.. _gcov-build_zh:
> +
> +b) 若gcov运行在编译机上
>

Re: [PATCH v4] docs/zh_CN: add translations in zh_CN/dev-tools/gcov

2021-04-14 Thread Alex Shi




On 2021/4/14 7:24 PM, Wu XiangCheng wrote:
> From: Bernard Zhao 
> 
> Add new zh translations
> * zh_CN/dev-tools/gcov.rst
> * zh_CN/dev-tools/index.rst
> and link them to zh_CN/index.rst
> 
> Signed-off-by: Bernard Zhao 
> Reviewed-by: Wu Xiangcheng 
> Signed-off-by: Wu XiangCheng 
> ---
> base: linux-next
> commit 269dd42f4776 ("docs/zh_CN: add riscv to zh_CN index")
> 
> Changes since V3:
> * update to newest linux-next
> * fix ``
> * fix tags
> * fix list indent
> 
> Changes since V2:
> * fix some inaccurate translation
> 
> Changes since V1:
> * add index.rst in dev-tools and link to zh_CN/index.rst
> * fix some inaccurate translation
> 
>  .../translations/zh_CN/dev-tools/gcov.rst | 265 ++
>  .../translations/zh_CN/dev-tools/index.rst|  35 +++
>  Documentation/translations/zh_CN/index.rst|   1 +
>  3 files changed, 301 insertions(+)
>  create mode 100644 Documentation/translations/zh_CN/dev-tools/gcov.rst
>  create mode 100644 Documentation/translations/zh_CN/dev-tools/index.rst
> 
> diff --git a/Documentation/translations/zh_CN/dev-tools/gcov.rst 
> b/Documentation/translations/zh_CN/dev-tools/gcov.rst
> new file mode 100644
> index ..7515b488bc4e
> --- /dev/null
> +++ b/Documentation/translations/zh_CN/dev-tools/gcov.rst
> @@ -0,0 +1,265 @@
> +.. include:: ../disclaimer-zh_CN.rst
> +
> +:Original: Documentation/dev-tools/gcov.rst
> +:Translator: 赵军奎 Bernard Zhao 
> +
> +在Linux内核里使用gcov做代码覆盖率检查
> +=
> +
> +gcov是linux中已经集成的一个分析模块，该模块在内核中对GCC的代码覆盖率统
> +计提供了支持。
> +linux内核运行时的代码覆盖率数据会以gcov兼容的格式存储在debug-fs中，可
> +以通过gcov的 ``-o`` 选项（如下示例）获得指定文件的代码运行覆盖率统计数据
> +（需要跳转到内核编译路径下并且要有root权限）::
> +
> +# cd /tmp/linux-out
> +# gcov -o /sys/kernel/debug/gcov/tmp/linux-out/kernel spinlock.c
> +
> +这将在当前目录中创建带有执行计数注释的源代码文件。
> +在获得这些统计文件后，可以使用图形化的 gcov_ 前端工具（比如 lcov_ ），来实现
> +自动化处理linux内核的覆盖率运行数据，同时生成易于阅读的HTML格式文件。
> +
> +可能的用途:
> +
> +* 调试（用来判断每一行的代码是否已经运行过）
> +* 测试改进（如何修改测试代码，尽可能地覆盖到没有运行过的代码）
> +* 内核配置优化（对于某一个选项配置，如果关联的代码从来没有运行过，是
> +  否还需要这个配置）
> +
> +.. _gcov: https://gcc.gnu.org/onlinedocs/gcc/Gcov.html
> +.. _lcov: http://ltp.sourceforge.net/coverage/lcov.php
> +
> +
> +准备
> +
> +
> +内核打开如下配置::
> +
> +CONFIG_DEBUG_FS=y
> +CONFIG_GCOV_KERNEL=y
> +
> +获取整个内核的覆盖率数据，还需要打开::
> +
> +CONFIG_GCOV_PROFILE_ALL=y
> +
> +需要注意的是，整个内核开启覆盖率统计会造成内核镜像文件尺寸的增大，
> +同时内核运行的也会变慢一些。
> +另外，并不是所有的架构都支持整个内核开启覆盖率统计。
> +
> +代码运行覆盖率数据只在debugfs挂载完成后才可以访问::
> +
> +mount -t debugfs none /sys/kernel/debug
> +
> +
> +客制化

The usual term is '定制化'.

> +--
> +
> +如果要单独针对某一个路径或者文件进行代码覆盖率统计，可以在内核相应路
> +径的Makefile中增加如下的配置:
> +
> +- 单独统计单个文件（例如main.o）::
> +
> +GCOV_PROFILE_main.o := y
> +
> +- 单独统计某一个路径::
> +
> +GCOV_PROFILE := y
> +
> +如果要在整个内核的覆盖率统计（开启CONFIG_GCOV_PROFILE_ALL）中单独排除
> +某一个文件或者路径，可以使用如下的方法::
> +
> +GCOV_PROFILE_main.o := n
> +
> +和::
> +
> +GCOV_PROFILE := n
> +
> +此机制仅支持链接到内核镜像或编译为内核模块的文件。
> +
> +
> +相关文件
> +
> +
> +gcov功能需要在debugfs中创建如下文件:
> +
> +``/sys/kernel/debug/gcov``
> +gcov相关功能的根路径
> +
> +``/sys/kernel/debug/gcov/reset``
> +全局复位文件:向该文件写入数据后会将所有的gcov统计数据清0
> +
> +``/sys/kernel/debug/gcov/path/to/compile/dir/file.gcda``
> +gcov工具可以识别的覆盖率统计数据文件，向该文件写入数据后
> +   会将本文件的gcov统计数据清0
> +
> +``/sys/kernel/debug/gcov/path/to/compile/dir/file.gcno``
> +gcov工具需要的软连接文件（指向编译时生成的信息统计文件），这个文件是
> +在gcc编译时如果配置了选项 ``-ftest-coverage`` 时生成的。
> +
> +
> +针对模块的统计
> +--
> +
> +内核中的模块会动态的加载和卸载，模块卸载时对应的数据会被清除掉。
> +gcov提供了一种机制，通过保留相关数据的副本来收集这部分卸载模块的覆盖率数据。
> +模块卸载后这些备份数据在debugfs中会继续存在。
> +一旦这个模块重新加载，模块关联的运行统计会被初始化成debugfs中备份的数据。
> +
> +可以通过对内核参数gcov_persist的修改来停用gcov对模块的备份机制::
> +
> +gcov_persist = 0
> +
> +在运行时，用户还可以通过写入模块的数据文件或者写入gcov复位文件来丢弃已卸
> +载模块的数据。
> +
> +
> +分离的编译和运行设备

How about 编译和运行机分离?

"Machine" means computer here. Translating it as 计算机 or 机器 may
be better than 设备.

Everything else looks fine to me.

Thanks
Alex

> +
> +
> +gcov的内核分析架构支持内核的编译和分析是在同一台设备上，也可以编译和运
> +行是在不同的设备上。
> +如果内核编译和运行是不同的设备，那么需要额外的准备工作，这取决于gcov工具
> +是在哪里使用的:
> +
> +.. _gcov-test_zh:
> +
> +a) 若gcov运行在测试设备上
> +
> +测试设备上面gcov工具的版本必须要跟设备内核编译使用的gcc版本相兼容，
> +同时下面的文件要从编译设备拷贝到测试设备上:
> +
> +从源代码中:
> +  - 所有的C文件和头文件
> +
> +从编译目录中:
> +  - 所有的C文件和头文件
> +  - 所有的.gcda文件和.gcno文件
> +  - 所有目录的链接
> +
> +特别需要注意，测试机器上面的目录结构跟编译机器上面的目录机构必须
> +完全一致。
> +如果文件是软链接，需要替换成真正的目录文件（这是由make的当前工作
> +目录变量CURDIR引起的）。
> +
> +.. _gcov-build_zh:
> +
> +b) 若gcov运行在编译设备上
> +
> +测试用例运行结束后，如下的文件需要从测试设备中拷贝到编译设备上:
> +
> +从sysfs中的gcov目录中:
> +  - 所有的.gcda文件
> +  - 所有的.gcno文件软链接
> +
> +这些文件可以拷贝到编译设备的任意目录下，gcov使用-o选项指定拷贝的
> +目录。
> +
> +比如一个是示例的目录结构如下::
> +
> +  /tmp/linux:内核源码目录
> +  /tmp/out:  内核编译文件路径（make O=指定）
> +  /tmp/coverage: 从测试机器上面拷贝的数据文件路径
> +
> +  [user@build] cd /tmp/out
> +  [user@build] gcov -o /tmp/coverage/tmp/out/init main.c
> +
> +
> +关于编译器的注意事项
> +
> +
> +GCC和LLVM gcov工具不一定兼容。
> +如果编译器是GCC，使用 gcov_

Re: [PATCH v1 1/4] docs: make reporting-issues.rst official and delete reporting-bugs.rst

2021-03-31 Thread Alex Shi




On 2021/3/31 4:33 PM, Wu X.C. wrote:
> Cc Alex Shi's new email 
> 
> On Tue, Mar 30, 2021 at 04:13:04PM +0200, Thorsten Leemhuis wrote:
>> Removing Documentation/admin-guide/reporting-bugs.rst will break links
>> in some of the translations. I was unsure if simply changing them to
>> Documentation/admin-guide/reporting-issue.rst was wise, so I didn't

Links that are out of date for a short while won't hurt anything; people
will update them soon if they care about them.

>> touch anything for now and CCed the maintainers for the Chinese and
>> Italian translation. I couldn't find one for the Japanese translation.
>>
>> Please advise. For completeness, these are the places where things will
>> break afaics:
>>
>> $ grep -ri 'reporting-bugs.rst' Documentation/
>> Documentation/translations/zh_CN/SecurityBugs:是有帮助的信息，那就请重温一下admin-guide/reporting-bugs.rst文件中的概述过程。任
>> Documentation/translations/zh_CN/process/howto.rst:内核源码主目录中的:ref:`admin-guide/reporting-bugs.rst
>>  `
>> Documentation/translations/zh_CN/admin-guide/reporting-issues.rst:   
>> 本文档将取代“Documentation/admin-guide/reporting-bugs.rst”。主要的工作
>> Documentation/translations/zh_CN/admin-guide/reporting-issues.rst:   
>> “Documentation/admin-guide/reporting-bugs.rst”中的旧文字非常相似。它和它
> 
> Yeah, as Greg said, we will solve that after your patches are merged into the
> next tree. Since I have translated the zh reporting-issues.rst in the next
> tree, I will correct the links when I sync it with your new version. This may
> cause warnings for some days, but don't worry about it.

Yes, and thanks for the generous commitment!

thanks
Alex

[PATCH 1/2] mailmap: update email address for Alex Shi

2021-03-26 Thread Alex Shi

Add my kernel.org address and map my old email addresses to it.

Signed-off-by: Alex Shi 
Cc: Andrew Morton  
Cc: Jonathan Corbet  
Cc: Kees Cook  
Cc: Leon Romanovsky  
Cc: Thomas Bogendoerfer  
Cc: Alexander Lobakin  
Cc: linux-kernel@vger.kernel.org 
---
 .mailmap | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/.mailmap b/.mailmap
index 85b93cdefc87..608ce1e5bef7 100644
--- a/.mailmap
+++ b/.mailmap
@@ -25,8 +25,9 @@ Alexandre Belloni  
 
 Alexei Starovoitov  
 Alexei Starovoitov  
-Alex Shi  
-Alex Shi  
+Alex Shi  
+Alex Shi  
+Alex Shi  
 Al Viro 
 Al Viro 
 Andi Kleen  
-- 
2.29.GIT

[PATCH 2/2] Docs/zh_CN: update Alex Shi new email address

2021-03-26 Thread Alex Shi

I am leaving Alibaba; update the old email address to the new one.

Signed-off-by: Alex Shi 
Cc: Harry Wei  
Cc: Alex Shi  
Cc: Jonathan Corbet  
Cc: linux-...@vger.kernel.org 
Cc: linux-kernel@vger.kernel.org 
---
 Documentation/translations/zh_CN/disclaimer-zh_CN.rst | 2 +-
 MAINTAINERS   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/translations/zh_CN/disclaimer-zh_CN.rst 
b/Documentation/translations/zh_CN/disclaimer-zh_CN.rst
index dcf803ede85a..3c6db094a63c 100644
--- a/Documentation/translations/zh_CN/disclaimer-zh_CN.rst
+++ b/Documentation/translations/zh_CN/disclaimer-zh_CN.rst
@@ -6,4 +6,4 @@
 
 .. note::
  如果您发现本文档与原始文件有任何不同或者有翻译问题，请联系该文件的译者，
- 或者请求时奎亮的帮助：。
+ 或者请求时奎亮的帮助：。
diff --git a/MAINTAINERS b/MAINTAINERS
index 728216e3919c..f4b60255f011 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4211,7 +4211,7 @@ F:Documentation/dev-tools/checkpatch.rst
 
 CHINESE DOCUMENTATION
 M: Harry Wei 
-M: Alex Shi 
+M: Alex Shi 
 L: xiyoulinuxkernelgr...@googlegroups.com (subscribers-only)
 S: Maintained
 F: Documentation/translations/zh_CN/
-- 
2.29.GIT

Re: [PATCH v1 14/14] mm: multigenerational lru: documentation

2021-03-19 Thread Alex Shi




On 2021/3/13 3:57 PM, Yu Zhao wrote:
> +Recipes
> +---
> +:Android on ARMv8.1+: ``X=4``, ``N=0``
> +
> +:Android on pre-ARMv8.1 CPUs: Not recommended due to the lack of
> + ``ARM64_HW_AFDBM``
> +
> +:Laptops running Chrome on x86_64: ``X=7``, ``N=2``
> +
> +:Working set estimation: Write ``+ memcg_id node_id gen [swappiness]``
> + to ``/sys/kernel/debug/lru_gen`` to account referenced pages to
> + generation ``max_gen`` and create the next generation ``max_gen+1``.
> + ``gen`` must be equal to ``max_gen`` in order to avoid races. A swap
> + file and a non-zero swappiness value are required to scan anon pages.
> + If swapping is not desired, set ``vm.swappiness`` to ``0`` and
> + overwrite it with a non-zero ``swappiness``.
> +
> +:Proactive reclaim: Write ``- memcg_id node_id gen [swappiness]
> + [nr_to_reclaim]`` to ``/sys/kernel/debug/lru_gen`` to evict
> + generations less than or equal to ``gen``. ``gen`` must be less than
> + ``max_gen-1`` as ``max_gen`` and ``max_gen-1`` are active generations
> + and therefore protected from the eviction. ``nr_to_reclaim`` can be
> + used to limit the number of pages to be evicted. Multiple command
> + lines are supported, so does concatenation with delimiters ``,`` and
> + ``;``.
> +


These options are difficult for users, especially where 'races' are involved.
Is it possible to simplify them for end users?

Thanks
Alex
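
One way to hide some of this from end users might be a small wrapper that
builds the command strings and rejects generation numbers that the quoted text
describes as invalid. The sketch below is only an illustration under that
reading of the patch: the function names are invented here, a negative
swappiness or nr_to_reclaim means "omit the optional field", and the caller is
assumed to have already read max_gen from /sys/kernel/debug/lru_gen.

```c
#include <stddef.h>
#include <stdio.h>

/* Return the formatted length, or -1 if snprintf failed or truncated. */
static int checked_len(int n, size_t len)
{
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Working set estimation: "+ memcg_id node_id gen [swappiness]".
 * The quoted text requires gen == max_gen, so max_gen is used directly.
 */
static int lru_gen_aging_cmd(char *buf, size_t len, unsigned int memcg_id,
			     unsigned int node_id, unsigned long max_gen,
			     int swappiness)
{
	if (swappiness < 0)
		return checked_len(snprintf(buf, len, "+ %u %u %lu",
					    memcg_id, node_id, max_gen), len);

	return checked_len(snprintf(buf, len, "+ %u %u %lu %d",
				    memcg_id, node_id, max_gen, swappiness), len);
}

/*
 * Proactive reclaim: "- memcg_id node_id gen [swappiness] [nr_to_reclaim]".
 * The quoted text requires gen < max_gen - 1.
 */
static int lru_gen_reclaim_cmd(char *buf, size_t len, unsigned int memcg_id,
			       unsigned int node_id, unsigned long gen,
			       unsigned long max_gen, int swappiness,
			       long nr_to_reclaim)
{
	/* max_gen and max_gen - 1 are active generations and are protected. */
	if (max_gen < 2 || gen >= max_gen - 1)
		return -1;

	/* The fields are positional: nr_to_reclaim needs swappiness before it. */
	if (swappiness < 0 && nr_to_reclaim >= 0)
		return -1;

	if (swappiness < 0)
		return checked_len(snprintf(buf, len, "- %u %u %lu",
					    memcg_id, node_id, gen), len);

	if (nr_to_reclaim < 0)
		return checked_len(snprintf(buf, len, "- %u %u %lu %d",
					    memcg_id, node_id, gen, swappiness), len);

	return checked_len(snprintf(buf, len, "- %u %u %lu %d %ld",
				    memcg_id, node_id, gen, swappiness,
				    nr_to_reclaim), len);
}
```

A caller would write the resulting string to /sys/kernel/debug/lru_gen. The
helpers only format and validate; they do not address the race between reading
max_gen and writing the command.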

Re: [PATCH] PATCH Documentation translations:translate sound/hd-audio/controls to chinese

2021-03-04 Thread Alex Shi

Reviewed-by: Alex Shi 

On 2021/3/4 5:45 PM, hjh wrote:
> Signed-off-by: hjh 
> ---
>  Documentation/translations/zh_CN/index.rst|   1 +
>  .../zh_CN/sound/hd-audio/controls.rst | 102 ++
>  .../zh_CN/sound/hd-audio/index.rst|  14 +++
>  .../translations/zh_CN/sound/index.rst|  22 
>  4 files changed, 139 insertions(+)
>  create mode 100644 
> Documentation/translations/zh_CN/sound/hd-audio/controls.rst
>  create mode 100644 Documentation/translations/zh_CN/sound/hd-audio/index.rst
>  create mode 100644 Documentation/translations/zh_CN/sound/index.rst
> 
> diff --git a/Documentation/translations/zh_CN/index.rst 
> b/Documentation/translations/zh_CN/index.rst
> index be6f11176200..2767dacfe86d 100644
> --- a/Documentation/translations/zh_CN/index.rst
> +++ b/Documentation/translations/zh_CN/index.rst
> @@ -20,6 +20,7 @@
> process/index
> filesystems/index
> arm64/index
> +   sound/index
>  
>  目录和表格
>  --
> diff --git a/Documentation/translations/zh_CN/sound/hd-audio/controls.rst 
> b/Documentation/translations/zh_CN/sound/hd-audio/controls.rst
> new file mode 100644
> index ..54c028ab9a40
> --- /dev/null
> +++ b/Documentation/translations/zh_CN/sound/hd-audio/controls.rst
> @@ -0,0 +1,102 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +Chinese translator: Huang Jianghui 
> +-
> +.. include:: ../../disclaimer-zh_CN.rst
> +以下为正文
> +-
> +==
> +高清音频编解码器特定混音器控件
> +==
> +
> +
> +此文件解释特定于编解码器的混音器控件.
> +
> +瑞昱编解码器
> +
> +
> +声道模式
> +  这是一个用于更改环绕声道设置的枚举控件,仅在环绕声道打开时显示出现。
> +  它给出要使用的通道数:"2ch","4ch","6ch"，和"8ch"。根据配置，这还控
> +  制多I/O插孔的插孔重分配。
> +
> +自动静音模式
> +  这是一个枚举控件，用于更改耳机和线路输出插孔的自动静音行为。如果内
> +  置扬声器、耳机和/或线路输出插孔在机器上可用，则显示该控件。当只有
> +  耳机或者线路输出的时候，它给出”禁用“和”启用“状态。当启用后，插孔插
> +  入后扬声器会自动静音。
> +
> +  当耳机和线路输出插孔都存在时，它给出”禁用“、”仅扬声器“和”线路输出+扬
> +  声器“。当”仅扬声器“被选择，插入耳机或者线路输出插孔可使扬声器静音，
> +  但不会使线路输出静音。当线路输出+扬声器被选择，插入耳机插孔会同时使扬
> +  声器和线路输出静音。
> +
> +
> +矽玛特编解码器
> +--
> +
> +模拟环回
> +   此控件启用/禁用模拟环回电路。只有在编解码器提示中将”lookback“设置为真
> +   时才会出现(见HD-Audio.txt)。请注意，在某些编解码器上，模拟环回和正常
> +   PCM播放是独占的,即当此选项打开时，您将听不到任何PCM流。
> +
> +交换中置/低频
> +   交换中置和低频通道顺序，通常情况下，左侧对应中置，右侧对应低频,启动此
> +   项后，左边低频，右边中置。
> +
> +耳机作为线路输出
> +   当此控制开启时，将耳机视为线路输出插孔。也就是说，耳机不会自动静音其他
> +   线路输出，没有耳机放大器被设置到引脚上。
> +
> +麦克风插口模式、线路插孔模式等
> +   这些枚举控制输入插孔引脚的方向和偏置。根据插孔类型，它可以设置为”麦克风
> +   输入“和”线路输入“以确定输入偏置,或者当引脚是环绕声道的多I/O插孔时，它
> +   可以设置为”线路输出“。
> +
> +
> +威盛编解码器
> +
> +
> +智能5.1
> +   一个枚举控件，用于为环绕输出重新分配多个I/O插孔的任务。当它打开时，相应
> +   的输入插孔（通常是线路输入和麦克风输入）被切换为环绕和中央低频输出插孔。
> +
> +独立耳机
> +   启用此枚举控制时，耳机输出从单个流（第三个PCM，如hw:0,2）而不是主流路由。
> +   如果耳机DAC与侧边或中央低频通道DAC共享，则DAC将自动切换到耳机。
> +
> +环回混合
> +   一个用于确定是否启动了模拟环回路由的枚举控件。当它启用后，模拟环回路由到
> +   前置通道。同样，耳机与扬声器输出也采用相同的路径。作为一个副作用，当设置
> +   此模式后，单个音量控制将不再适用于耳机和扬声器，因为只有一个DAC连接到混
> +   音器小部件。
> +
> +动态电源控制
> +   此控件决定是否启动每个插孔的动态电源控制检测。启用时，根据插孔的插入情况
> +   动态更改组件的电源状态（D0/D3）以节省电量消耗。但是，如果您的系统没有提
> +   供正确的插孔检测，这将无法工作;在这种情况下，请关闭此控件。
> +
> +插孔检测
> +   此控件仅为VT1708编解码器提供，它不会为每个插孔插拔提供适当的未请求事件。
> +   当此控件打开，驱动将轮询插孔检测，以便耳机自动静音可以工作，而关闭此控
> +   件将降低功耗。
> +
> +
> +科胜讯编解码器
> +--
> +
> +自动静音模式
> +   见瑞昱解码器
> +
> +
> +
> +模拟编解码器
> +
> +
> +通道模式
> +   这是一个用于更改环绕声道设置的枚举控件,仅在环绕声道可用时显示。它提供了能
> +   被使用的通道数:”2ch“、”4ch“和”6ch“。根据配置，这还控制多I/O插孔的插孔重
> +   分配。
> +
> +独立耳机
> +   启动此枚举控制后，耳机输出从单个流（第三个PCM，如hw:0,2）而不是主流路由。
> diff --git a/Documentation/translations/zh_CN/sound/hd-audio/index.rst 
> b/Documentation/translations/zh_CN/sound/hd-audio/index.rst
> new file mode 100644
> index ..d9885d53b069
> --- /dev/null
> +++ b/Documentation/translations/zh_CN/sound/hd-audio/index.rst
> @@ -0,0 +1,14 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +.. include:: ../../disclaimer-zh_CN.rst
> +
> +:Original: :doc:`../../../../sound/hd-audio/index`
> +:Translator: Huang Jianghui 
> +
> +
> +高清音频
> +
> +
> +.. toctree::
> +   :maxdepth: 2
> +
> +   controls
> diff --git a/Documentation/translations/zh_CN/sound/index.rst 
> b/Documentation/translations/zh_CN/sound/index.rst
> new file mode 100644
> index ..28d5dca34a63
> --- /dev/null
> +++ b/Documentation/translations/zh_CN/sound/index.rst
> @@ -0,0 +1,22 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +.. include:: ../disclaimer-zh_CN.rst
> +
> +:Original: :doc:`../../../sound/index`
> +:Translator: Huang Jianghui 
> +
> +
> +
> +Linux 声音子系统文档
> +
> +
> +.. toctree::
> +   :maxdepth: 2
> +
> +   hd-audio/index
> +
> +.. only::  subproject and html
> +
> +   Indices
> +   ===
> +
> +   * :ref:`genindex`
>

Re: [PATCH] PATCH Documentation translations:translate sound/hd-audio/controls to chinese

2021-03-03 Thread Alex Shi




On 2021/3/2 5:13 PM, huangjianghui wrote:
>> We usually include the patch inline in the email instead of attaching it.
>> You can try using 'git send-email' to post your patches.
>>
>> Thanks
>> Alex
>>
>>
> I am sorry to do those, my patch is shown below:

Hi Jianghui,

I can't apply your patch:

$ g am ../Re_\ \[PATCH\]\ PATCH\ Documentation\ translations_translate\ sound_hd-audio_controls\ to\ chinese.eml
Applying: PATCH Documentation translations:translate sound/hd-audio/controls to chinese
error: patch failed: Documentation/translations/zh_CN/index.rst:20
error: Documentation/translations/zh_CN/index.rst: patch does not apply
Patch failed at 0001 PATCH Documentation translations:translate sound/hd-audio/controls to chinese
hint: Use 'git am --show-current-patch=diff' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


You'd better use 'git am' to apply the patch as saved from your email client
before sending it to linux-doc.

There are some process docs on how the community works:
https://landlock.io/linux-doc/landlock-v27/translations/zh_CN/process/

Thanks
Alex

Re: [PATCH] PATCH Documentation translations:translate sound/hd-audio/controls to chinese

2021-03-02 Thread Alex Shi




On 2021/3/2 11:22 AM, huangjianghui wrote:
>>
> In the next patch, I deleted the index entries for the untranslated files,
> and I used checkpatch.pl to detect doc errors and tried to build the htmldocs
> on my PC.
> 
> Thanks,
> 
> Huang Jianghui

Hi Jianghui,

We usually include the patch inline in the email instead of attaching it.
You can try using 'git send-email' to post your patches.

Thanks
Alex

Re: [PATCH] PATCH Documentation translations:translate sound/hd-audio/controls to chinese

2021-03-01 Thread Alex Shi




On 2021/3/1 11:03 AM, hjh wrote:
> Signed-off-by: hjh 
> ---
>  Documentation/translations/zh_CN/index.rst|   1 +
>  .../zh_CN/sound/hd-audio/controls.rst | 109 ++
>  .../zh_CN/sound/hd-audio/index.rst|  17 +++
>  .../translations/zh_CN/sound/index.rst|  26 +
>  4 files changed, 153 insertions(+)
>  create mode 100644 
> Documentation/translations/zh_CN/sound/hd-audio/controls.rst
>  create mode 100644 Documentation/translations/zh_CN/sound/hd-audio/index.rst
>  create mode 100644 Documentation/translations/zh_CN/sound/index.rst
> 
> diff --git a/Documentation/translations/zh_CN/index.rst 
> b/Documentation/translations/zh_CN/index.rst
> index be6f11176200..2767dacfe86d 100644
> --- a/Documentation/translations/zh_CN/index.rst
> +++ b/Documentation/translations/zh_CN/index.rst
> @@ -20,6 +20,7 @@
> process/index
> filesystems/index
> arm64/index
> +   sound/index
>  
>  目录和表格
>  --
> diff --git a/Documentation/translations/zh_CN/sound/hd-audio/controls.rst 
> b/Documentation/translations/zh_CN/sound/hd-audio/controls.rst
> new file mode 100644
> index ..662bacc5a45f
> --- /dev/null
> +++ b/Documentation/translations/zh_CN/sound/hd-audio/controls.rst
> @@ -0,0 +1,109 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +Chinese translated version of Documentation/sound/hd-audio/controls.rst
> +
> +If you have any comment or update to the content, please contact the
> +original document maintainer directly.  However, if you have a problem
> +communicating in English you can also ask the Chinese maintainer for
> +help.  Contact the Chinese maintainer if this translation is outdated
> +or if there is a problem with the translation.

With the disclaimer file included, this info could be removed.

> +
> +Chinese maintainer: Huang Jianghui 

We call ourselves translators (译者).

Everything else looks fine to me.

Reviewed-by: Alex Shi 

Thanks
Alex

> +-
> +.. include:: ../../disclaimer-zh_CN.rst
> +以下为正文
> +-
> +==
> +高清音频编解码器特定混音器控件
> +==
> +
> +
> +此文件解释特定于编解码器的混音器控件.
> +
> +瑞昱编解码器
> +
> +
> +声道模式
> +  这是一个用于更改环绕声道设置的枚举控件,仅在环绕声道打开时显示出现。
> +  它给出要使用的通道数:"2ch","4ch","6ch"，和"8ch"。根据配置，这还控
> +  制多I/O插孔的插孔重分配。
> +
> +自动静音模式
> +  这是一个枚举控件，用于更改耳机和线路输出插孔的自动静音行为。如果内
> +  置扬声器、耳机和/或线路输出插孔在机器上可用，则显示该控件。当只有
> +  耳机或者线路输出的时候，它给出”禁用“和”启用“状态。当启用后，插孔插
> +  入后扬声器会自动静音。
> +
> +  当耳机和线路输出插孔都存在时，它给出”禁用“、”仅扬声器“和”线路输出+扬
> +  声器“。当”仅扬声器“被选择，插入耳机或者线路输出插孔可使扬声器静音，
> +  但不会使线路输出静音。当线路输出+扬声器被选择，插入耳机插孔会同时使扬
> +  声器和线路输出静音。
> +
> +
> +矽玛特编解码器
> +--
> +
> +模拟环回
> +   此控件启用/禁用模拟环回电路。只有在编解码器提示中将”lookback“设置为真
> +   时才会出现(见HD-Audio.txt)。请注意，在某些编解码器上，模拟环回和正常
> +   PCM播放是独占的,即当此选项打开时，您将听不到任何PCM流。
> +
> +交换中置/低频
> +   交换中置和低频通道顺序，通常情况下，左侧对应中置，右侧对应低频,启动此
> +   项后，左边低频，右边中置。
> +
> +耳机作为线路输出
> +   当此控制开启时，将耳机视为线路输出插孔。也就是说，耳机不会自动静音其他
> +   线路输出，没有耳机放大器被设置到引脚上。
> +
> +麦克风插口模式、线路插孔模式等
> +   这些枚举控制输入插孔引脚的方向和偏置。根据插孔类型，它可以设置为”麦克风
> +   输入“和”线路输入“以确定输入偏置,或者当引脚是环绕声道的多I/O插孔时，它
> +   可以设置为”线路输出“。
> +
> +
> +威盛编解码器
> +
> +
> +智能5.1
> +   一个枚举控件，用于为环绕输出重新分配多个I/O插孔的任务。当它打开时，相应
> +   的输入插孔（通常是线路输入和麦克风输入）被切换为环绕和中央低频输出插孔。
> +
> +独立耳机
> +   启用此枚举控制时，耳机输出从单个流（第三个PCM，如hw:0,2）而不是主流路由。
> +   如果耳机DAC与侧边或中央低频通道DAC共享，则DAC将自动切换到耳机。
> +
> +环回混合
> +   一个用于确定是否启动了模拟环回路由的枚举控件。当它启用后，模拟环回路由到
> +   前置通道。同样，耳机与扬声器输出也采用相同的路径。作为一个副作用，当设置
> +   此模式后，单个音量控制将不再适用于耳机和扬声器，因为只有一个DAC连接到混
> +   音器小部件。
> +
> +动态电源控制
> +   此控件决定是否启动每个插孔的动态电源控制检测。启用时，根据插孔的插入情况
> +   动态更改组件的电源状态（D0/D3）以节省电量消耗。但是，如果您的系统没有提
> +   供正确的插孔检测，这将无法工作;在这种情况下，请关闭此控件。
> +
> +插孔检测
> +   此控件仅为VT1708编解码器提供，它不会为每个插孔插拔提供适当的未请求事件。
> +   当此控件打开，驱动将轮询插孔检测，以便耳机自动静音可以工作，而关闭此控
> +   件将降低功耗。
> +
> +
> +科胜讯编解码器
> +--
> +
> +自动静音模式
> +   见瑞昱解码器
> +
> +
> +
> +模拟编解码器
> +
> +
> +通道模式
> +   这是一个用于更改环绕声道设置的枚举控件,仅在环绕声道可用时显示。它提供了能
> +   被使用的通道数:”2ch“、”4ch“和”6ch“。根据配置，这还控制多I/O插孔的插孔重
> +   分配。
> +
> +独立耳机
> +   启动此枚举控制后，耳机输出从单个流（第三个PCM，如hw:0,2）而不是主流路由。
> diff --git a/Documentation/translations/zh_CN/sound/hd-audio/index.rst 
> b/Documentation/translations/zh_CN/sound/hd-audio/index.rst
> new file mode 100644
> index ..c287aad51066
> --- /dev/null
> +++ b/Documentation/translations/zh_CN/sound/hd-audio/index.rst
> @@ -0,0 +1,17 @@
> +.. include:: ../../disclaimer-zh_CN.rst
> +
> +:Original: :doc:`../../../../sound/hd-audio/index`
> +:Tr

Re: [PATCH v2 03/10] mm: don't pass "enum lru_list" to lru list addition functions

2021-02-24 Thread Alex Shi




On 2021/2/24 4:37 PM, Yu Zhao wrote:
>>> @@ -65,18 +63,12 @@ static __always_inline void 
>>> __clear_page_lru_flags(struct page *page)
>>>   */
>>>  static __always_inline enum lru_list page_lru(struct page *page)
>>>  {
>>> -   enum lru_list lru;
>>> +   unsigned long flags = READ_ONCE(page->flags);
>>>  
>>> VM_BUG_ON_PAGE(PageActive(page) && PageUnevictable(page), page);
>>>  
>>> -   if (PageUnevictable(page))
>>> -   return LRU_UNEVICTABLE;
>>> -
>>> -   lru = page_is_file_lru(page) ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON;
>>> -   if (PageActive(page))
>>> -   lru += LRU_ACTIVE;
>>> -
>>> -   return lru;
>>> +   return (flags & BIT(PG_unevictable)) ? LRU_UNEVICTABLE :
>>> +  (LRU_FILE * !(flags & BIT(PG_swapbacked)) + !!(flags & 
>>> BIT(PG_active)));
>> Currently each page flag uses a different flags policy. Does this mean the
>> flags above could be changed to the PF_ANY policy?
> That's a good question. Semantically, no because
> PG_{active,unevictable} only apply to head pages. But practically,
> I think the answer is yes, and the only place that needs to
> explicitly call compound_head() is gather_stats() in
> fs/proc/task_mmu.c, IIRC.
> 


A quick test in response to your testing request:

# ll vmlinux vmlinux.new
-rwxr-xr-x 1 root root 62245304 Feb 24 16:57 vmlinux
-rwxr-xr-x 1 root root 62245280 Feb 24 16:55 vmlinux.new
# gcc --version
gcc (GCC) 8.3.1 20190311 (Red Hat 8.3.1-3)
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

# scripts/bloat-o-meter vmlinux vmlinux.new
add/remove: 0/0 grow/shrink: 1/15 up/down: 1/-2008 (-2007)
Function                                     old     new   delta
vermagic                                      37      38      +1
trace_event_raw_event_mm_lru_insertion       471     418     -53
perf_trace_mm_lru_insertion                  526     473     -53
__munlock_pagevec                           1134    1069     -65
isolate_migratepages_block                  2623    2547     -76
isolate_lru_page                             384     303     -81
__pagevec_lru_add                            753     652    -101
release_pages                                780     667    -113
__page_cache_release                         429     276    -153
move_pages_to_lru                            871     702    -169
lru_lazyfree_fn                              712     539    -173
check_move_unevictable_pages                 938     763    -175
__activate_page                              665     488    -177
lru_deactivate_fn                            636     452    -184
pagevec_move_tail_fn                         597     411    -186
lru_deactivate_file_fn                      1000     751    -249
Total: Before=17029652, After=17027645, chg -0.01%
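
For readers following the page_lru() rewrite quoted above, a small userspace
model may help show why the single expression returns the same list index as
the original if-chain. This is only a sketch under stated assumptions: the enum
values mirror the kernel's lru_list layout (LRU_ACTIVE == 1, LRU_FILE == 2),
the PG_* bit positions are made up for illustration, and page_is_file_lru() is
modelled as "PG_swapbacked is clear".

```c
#include <assert.h>

/* Userspace stand-ins; the PG_* bit positions here are hypothetical. */
#define BIT(n) (1UL << (n))
enum { PG_active, PG_unevictable, PG_swapbacked };

#define LRU_ACTIVE 1
#define LRU_FILE   2
enum lru_list {
	LRU_INACTIVE_ANON = 0,
	LRU_ACTIVE_ANON   = LRU_ACTIVE,
	LRU_INACTIVE_FILE = LRU_FILE,
	LRU_ACTIVE_FILE   = LRU_FILE + LRU_ACTIVE,
	LRU_UNEVICTABLE,
};

/* Model of the removed if-chain. */
static enum lru_list page_lru_branchy(unsigned long flags)
{
	int lru;

	if (flags & BIT(PG_unevictable))
		return LRU_UNEVICTABLE;

	lru = (flags & BIT(PG_swapbacked)) ? LRU_INACTIVE_ANON : LRU_INACTIVE_FILE;
	if (flags & BIT(PG_active))
		lru += LRU_ACTIVE;

	return (enum lru_list)lru;
}

/* Model of the single-expression replacement. */
static enum lru_list page_lru_flat(unsigned long flags)
{
	return (flags & BIT(PG_unevictable)) ? LRU_UNEVICTABLE :
	       (enum lru_list)(LRU_FILE * !(flags & BIT(PG_swapbacked)) +
			       !!(flags & BIT(PG_active)));
}

/* Compare both models over every combination of the three flag bits. */
static void check_page_lru_models(void)
{
	unsigned long flags;

	for (flags = 0; flags < BIT(3); flags++)
		assert(page_lru_branchy(flags) == page_lru_flat(flags));
}
```

Calling check_page_lru_models() from a test main() exercises all eight
combinations. The active-and-unevictable combination is the one the kernel
rejects with VM_BUG_ON_PAGE(), but the two models agree on it as well.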

Re: [PATCH v2 03/10] mm: don't pass "enum lru_list" to lru list addition functions

2021-02-24 Thread Alex Shi




On 2021/2/24 1:29 PM, Yu Zhao wrote:
> On Tue, Feb 23, 2021 at 02:50:11PM -0800, Andrew Morton wrote:
>> On Tue, 26 Jan 2021 15:14:38 -0700 Yu Zhao  wrote:
>>
>>> On Tue, Jan 26, 2021 at 10:01:11PM +, Matthew Wilcox wrote:
>>>> On Fri, Jan 22, 2021 at 03:05:53PM -0700, Yu Zhao wrote:
>>>>> +++ b/mm/swap.c
>>>>> @@ -231,7 +231,7 @@ static void pagevec_move_tail_fn(struct page *page, struct lruvec *lruvec)
>>>>>   if (!PageUnevictable(page)) {
>>>>>           del_page_from_lru_list(page, lruvec, page_lru(page));
>>>>>           ClearPageActive(page);
>>>>> -         add_page_to_lru_list_tail(page, lruvec, page_lru(page));
>>>>> +         add_page_to_lru_list_tail(page, lruvec);
>>>>>           __count_vm_events(PGROTATED, thp_nr_pages(page));
>>>>>   }
>>>>
>>>> Is it profitable to do ...
>>>>
>>>> -  del_page_from_lru_list(page, lruvec, page_lru(page));
>>>> +  enum lru_list lru = page_lru(page);
>>>> +  del_page_from_lru_list(page, lruvec, lru);
>>>>    ClearPageActive(page);
>>>> -  add_page_to_lru_list_tail(page, lruvec, page_lru(page));
>>>> +  lru &= ~LRU_ACTIVE;
>>>> +  add_page_to_lru_list_tail(page, lruvec, lru);
>>>
>>> Ok, now we want to trade readability for size. Sure, I'll see how
>>> much we could squeeze.
>>
>> As nothing has happened here and the code bloat issue remains, I'll
>> hold this series out of 5.12-rc1.
> 
> Sorry for the slow response. I was trying to ascertain why
> page_lru(), a tiny helper, could bloat vmlinux by O(KB). It turned out
> compound_head() included in Page{Active,Unevictable} is a nuisance in
> our case. Testing PG_{active,unevictable} against
> compound_head(page)->flags is really unnecessary because all lru
> operations are eventually done on page->lru not
> compound_head(page)->lru. With the following change, which sacrifices
> the readability a bit, we gain 998 bytes with Clang but lose 227 bytes
> with GCC, which IMO is a win. (We use Clang by default.)
> 
> 
> diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
> index 355ea1ee32bd..ec0878a3cdfe 100644
> --- a/include/linux/mm_inline.h
> +++ b/include/linux/mm_inline.h
> @@ -46,14 +46,12 @@ static __always_inline void __clear_page_lru_flags(struct 
> page *page)
>  {
>   VM_BUG_ON_PAGE(!PageLRU(page), page);
>  
> - __ClearPageLRU(page);
> -
>   /* this shouldn't happen, so leave the flags to bad_page() */
> - if (PageActive(page) && PageUnevictable(page))
> + if ((page->flags & (BIT(PG_active) | BIT(PG_unevictable))) ==
> + (BIT(PG_active) | BIT(PG_unevictable)))
>   return;
>  
> - __ClearPageActive(page);
> - __ClearPageUnevictable(page);
> + page->flags &= ~(BIT(PG_lru) | BIT(PG_active) | BIT(PG_unevictable));
>  }
>  
>  /**
> @@ -65,18 +63,12 @@ static __always_inline void __clear_page_lru_flags(struct 
> page *page)
>   */
>  static __always_inline enum lru_list page_lru(struct page *page)
>  {
> - enum lru_list lru;
> + unsigned long flags = READ_ONCE(page->flags);
>  
>   VM_BUG_ON_PAGE(PageActive(page) && PageUnevictable(page), page);
>  
> - if (PageUnevictable(page))
> - return LRU_UNEVICTABLE;
> -
> - lru = page_is_file_lru(page) ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON;
> - if (PageActive(page))
> - lru += LRU_ACTIVE;
> -
> - return lru;
> + return (flags & BIT(PG_unevictable)) ? LRU_UNEVICTABLE :
> +(LRU_FILE * !(flags & BIT(PG_swapbacked)) + !!(flags & 
> BIT(PG_active)));

Currently each page flag uses a different flags policy. Does this mean the
flags above could be changed to the PF_ANY policy?

Thanks
Alex

>  }
>  
>  static __always_inline void add_page_to_lru_list(struct page *page,
> 
> 
> I'll post this as a separate patch. Below the bloat-o-meter collected
> on top of c03c21ba6f4e.
> 
> $ ./scripts/bloat-o-meter ../vmlinux.clang.orig ../vmlinux.clang
> add/remove: 0/1 grow/shrink: 7/10 up/down: 191/-1189 (-998)
> Function                                     old     new   delta
> lru_lazyfree_fn                              848     893     +45
> lru_deactivate_file_fn                      1037    1075     +38
> perf_trace_mm_lru_insertion                  515     548     +33
> check_move_unevictable_pages                 983    1006     +23
> __activate_page                              706     729     +23
> trace_event_raw_event_mm_lru_insertion       476     497     +21
> lru_deactivate_fn                            691     699      +8
> __bpf_trace_mm_lru_insertion                  13      11      -2
> __traceiter_mm_lru_insertion                  67      62      -5
> move_pages_to_lru                            964     881     -83
> __pagevec_lru_add_fn                         665     581     -84
> isolate_lru_page                             524     419    -105
> __munlock_pagevec                           1609    1481    -128
> isolate_migratepages_block                  3370    3237
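
The __clear_page_lru_flags() hunk quoted above can be modelled in userspace as
well. The sketch below is an approximation with made-up PG_* bit positions, not
kernel code. It suggests that the two versions agree whenever PG_active and
PG_unevictable are not both set, and that, as quoted, the rewritten version
returns before clearing PG_lru in the "shouldn't happen" case, whereas the
original cleared PG_lru first.

```c
#include <assert.h>

/* Userspace stand-ins; the PG_* bit positions here are hypothetical. */
#define BIT(n) (1UL << (n))
enum { PG_lru, PG_active, PG_unevictable, PG_swapbacked };

#define BOTH (BIT(PG_active) | BIT(PG_unevictable))

/* Model of the original: clear PG_lru, bail out if both set, then clear each. */
static unsigned long clear_lru_flags_stepwise(unsigned long flags)
{
	flags &= ~BIT(PG_lru);

	if ((flags & BIT(PG_active)) && (flags & BIT(PG_unevictable)))
		return flags;

	flags &= ~BIT(PG_active);
	flags &= ~BIT(PG_unevictable);
	return flags;
}

/* Model of the rewrite: one mask test, then one combined clear. */
static unsigned long clear_lru_flags_masked(unsigned long flags)
{
	if ((flags & BOTH) == BOTH)
		return flags;

	return flags & ~(BIT(PG_lru) | BOTH);
}

/* Compare both models over every combination of the four flag bits. */
static void check_clear_lru_flags_models(void)
{
	unsigned long flags;

	for (flags = 0; flags < BIT(4); flags++) {
		if ((flags & BOTH) == BOTH)
			/* Invalid state: the models differ only in PG_lru. */
			assert(clear_lru_flags_stepwise(flags) ==
			       (clear_lru_flags_masked(flags) & ~BIT(PG_lru)));
		else
			assert(clear_lru_flags_stepwise(flags) ==
			       clear_lru_flags_masked(flags));
	}
}
```

Whether the PG_lru difference matters in practice depends on what bad_page()
expects in that state, which this model does not try to answer.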

Re: [PATCH] doc: use KCFLAGS instead of EXTRA_CFLAGS to pass flags from command line

2021-02-21 Thread Alex Shi

Reviewed-by: Alex Shi 


On 2021/2/21 11:25 PM, Masahiro Yamada wrote:
> You should use KCFLAGS to pass additional compiler flags from the
> command line. Using EXTRA_CFLAGS is wrong.
> 
> EXTRA_CFLAGS is supposed to specify flags applied only to the current
> Makefile (and now deprecated in favor of ccflags-y).
> 
> It is still used in arch/mips/kvm/Makefile (and possibly in external
> modules too). Passing EXTRA_CFLAGS from the command line overwrites
> it and breaks the build.
> 
> I also fixed drivers/gpu/drm/tilcdc/Makefile because commit 816175dd1fd7
> ("drivers/gpu/drm/tilcdc: Makefile, only -Werror when no -W* in
> EXTRA_CFLAGS") was based on the same misunderstanding.
> 
> Signed-off-by: Masahiro Yamada 
> ---
> 
>  Documentation/process/4.Coding.rst| 2 +-
>  Documentation/process/submit-checklist.rst| 2 +-
>  Documentation/translations/it_IT/process/4.Coding.rst | 2 +-
>  Documentation/translations/it_IT/process/submit-checklist.rst | 2 +-
>  Documentation/translations/zh_CN/process/4.Coding.rst | 2 +-
>  drivers/gpu/drm/tilcdc/Makefile   | 2 +-
>  6 files changed, 6 insertions(+), 6 deletions(-)
>

Re: [PATCH 1/1] [PATCH] Documentation/translations: Translate sound/hd-audio/controls.rst into Chinese

2021-02-19 Thread Alex Shi




On 2021/2/19 at 10:48 PM, hjh wrote:
> +Documentation/sound/hd-audio/controls.rst 的中文翻译
> +
> +如果想评论或更新本文的内容，请直接联系原文档的维护者。如果你使用英文
> +交流有困难的话，也可以向中文版维护者求助。如果本翻译更新不及时或者翻
> +译存在问题，请联系中文版维护者。
> +
> +中文版维护者： 黄江慧  Huang Jianghui 
> +中文版翻译者： 黄江慧  Huang Jianghui 
> +
> +

It would be better to reuse disclaimer-zh_CN.rst here.

Thanks

[tip: locking/core] locking/rtmutex: Add missing kernel-doc markup

2021-01-28 Thread tip-bot2 for Alex Shi

The following commit has been merged into the locking/core branch of tip:

Commit-ID: bf594bf400016a1ac58c753bcc0393a39c36f669
Gitweb:        https://git.kernel.org/tip/bf594bf400016a1ac58c753bcc0393a39c36f669
Author:        Alex Shi 
AuthorDate:    Fri, 13 Nov 2020 16:58:11 +08:00
Committer:     Thomas Gleixner 
CommitterDate: Thu, 28 Jan 2021 13:20:18 +01:00

locking/rtmutex: Add missing kernel-doc markup

To fix the following issues:
kernel/locking/rtmutex.c:1612: warning: Function parameter or member
'lock' not described in '__rt_mutex_futex_unlock'
kernel/locking/rtmutex.c:1612: warning: Function parameter or member
'wake_q' not described in '__rt_mutex_futex_unlock'
kernel/locking/rtmutex.c:1675: warning: Function parameter or member
'name' not described in '__rt_mutex_init'
kernel/locking/rtmutex.c:1675: warning: Function parameter or member
'key' not described in '__rt_mutex_init'

[ tglx: Change rt lock to rt_mutex for consistency sake ]

Signed-off-by: Alex Shi 
Signed-off-by: Thomas Gleixner 
Acked-by: Will Deacon 
Link: https://lore.kernel.org/r/1605257895-5536-2-git-send-email-alex@linux.alibaba.com


---
 kernel/locking/rtmutex.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index cfdd5b9..a201e5e 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1604,8 +1604,11 @@ void __sched rt_mutex_unlock(struct rt_mutex *lock)
 EXPORT_SYMBOL_GPL(rt_mutex_unlock);
 
 /**
- * Futex variant, that since futex variants do not use the fast-path, can be
- * simple and will not need to retry.
+ * __rt_mutex_futex_unlock - Futex variant, that since futex variants
+ * do not use the fast-path, can be simple and will not need to retry.
+ *
+ * @lock:  The rt_mutex to be unlocked
+ * @wake_q:The wake queue head from which to get the next lock waiter
  */
 bool __sched __rt_mutex_futex_unlock(struct rt_mutex *lock,
struct wake_q_head *wake_q)
@@ -1662,13 +1665,15 @@ void rt_mutex_destroy(struct rt_mutex *lock)
 EXPORT_SYMBOL_GPL(rt_mutex_destroy);
 
 /**
- * __rt_mutex_init - initialize the rt lock
+ * __rt_mutex_init - initialize the rt_mutex
  *
- * @lock: the rt lock to be initialized
+ * @lock:  The rt_mutex to be initialized
+ * @name:  The lock name used for debugging
+ * @key:   The lock class key used for debugging
  *
- * Initialize the rt lock to unlocked state.
+ * Initialize the rt_mutex to unlocked state.
  *
- * Initializing of a locked rt lock is not allowed
+ * Initializing of a locked rt_mutex is not allowed
  */
 void __rt_mutex_init(struct rt_mutex *lock, const char *name,
 struct lock_class_key *key)

[tip: locking/core] locking/rtmutex: Add missing kernel-doc markup

2021-01-27 Thread tip-bot2 for Alex Shi

The following commit has been merged into the locking/core branch of tip:

Commit-ID: 59ea5f1508e15cecddd8e2ca828f7962ea37adab
Gitweb:        https://git.kernel.org/tip/59ea5f1508e15cecddd8e2ca828f7962ea37adab
Author:        Alex Shi 
AuthorDate:    Fri, 13 Nov 2020 16:58:11 +08:00
Committer:     Thomas Gleixner 
CommitterDate: Wed, 27 Jan 2021 12:44:52 +01:00

locking/rtmutex: Add missing kernel-doc markup

To fix the following issues:
kernel/locking/rtmutex.c:1612: warning: Function parameter or member
'lock' not described in '__rt_mutex_futex_unlock'
kernel/locking/rtmutex.c:1612: warning: Function parameter or member
'wake_q' not described in '__rt_mutex_futex_unlock'
kernel/locking/rtmutex.c:1675: warning: Function parameter or member
'name' not described in '__rt_mutex_init'
kernel/locking/rtmutex.c:1675: warning: Function parameter or member
'key' not described in '__rt_mutex_init'

[ tglx: Change rt lock to rt_mutex for consistency sake ]

Signed-off-by: Alex Shi 
Signed-off-by: Thomas Gleixner 
Acked-by: Will Deacon 
Link: https://lore.kernel.org/r/1605257895-5536-2-git-send-email-alex@linux.alibaba.com

---
 kernel/locking/rtmutex.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index cfdd5b9..a201e5e 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1604,8 +1604,11 @@ void __sched rt_mutex_unlock(struct rt_mutex *lock)
 EXPORT_SYMBOL_GPL(rt_mutex_unlock);
 
 /**
- * Futex variant, that since futex variants do not use the fast-path, can be
- * simple and will not need to retry.
+ * __rt_mutex_futex_unlock - Futex variant, that since futex variants
+ * do not use the fast-path, can be simple and will not need to retry.
+ *
+ * @lock:  The rt_mutex to be unlocked
+ * @wake_q:The wake queue head from which to get the next lock waiter
  */
 bool __sched __rt_mutex_futex_unlock(struct rt_mutex *lock,
struct wake_q_head *wake_q)
@@ -1662,13 +1665,15 @@ void rt_mutex_destroy(struct rt_mutex *lock)
 EXPORT_SYMBOL_GPL(rt_mutex_destroy);
 
 /**
- * __rt_mutex_init - initialize the rt lock
+ * __rt_mutex_init - initialize the rt_mutex
  *
- * @lock: the rt lock to be initialized
+ * @lock:  The rt_mutex to be initialized
+ * @name:  The lock name used for debugging
+ * @key:   The lock class key used for debugging
  *
- * Initialize the rt lock to unlocked state.
+ * Initialize the rt_mutex to unlocked state.
  *
- * Initializing of a locked rt lock is not allowed
+ * Initializing of a locked rt_mutex is not allowed
  */
 void __rt_mutex_init(struct rt_mutex *lock, const char *name,
 struct lock_class_key *key)

Re: [PATCH] mm/filemap: Adding missing mem_cgroup_uncharge() to __add_to_page_cache_locked()

2021-01-24 Thread Alex Shi

Reviewed-by: Alex Shi 

On 2021/1/25 at 12:24 PM, Waiman Long wrote:
> The commit 3fea5a499d57 ("mm: memcontrol: convert page
> cache to a new mem_cgroup_charge() API") introduced a bug in
> __add_to_page_cache_locked() causing the following splat:
> 
>  [ 1570.068330] page dumped because: VM_BUG_ON_PAGE(page_memcg(page))
>  [ 1570.068333] pages's memcg:8889a4116000
>  [ 1570.068343] [ cut here ]
>  [ 1570.068346] kernel BUG at mm/memcontrol.c:2924!
>  [ 1570.068355] invalid opcode:  [#1] SMP KASAN PTI
>  [ 1570.068359] CPU: 35 PID: 12345 Comm: cat Tainted: G S  W I   
> 5.11.0-rc4-debug+ #1
>  [ 1570.068363] Hardware name: HP HP Z8 G4 Workstation/81C7, BIOS P60 v01.25 
> 12/06/2017
>  [ 1570.068365] RIP: 0010:commit_charge+0xf4/0x130
>:
>  [ 1570.068375] RSP: 0018:8881b38d70e8 EFLAGS: 00010286
>  [ 1570.068379] RAX:  RBX: ea00260ddd00 RCX: 
> 0027
>  [ 1570.068382] RDX:  RSI: 0004 RDI: 
> 88907ebe05a8
>  [ 1570.068384] RBP: ea00260ddd00 R08: ed120fd7c0b6 R09: 
> ed120fd7c0b6
>  [ 1570.068386] R10: 88907ebe05ab R11: ed120fd7c0b5 R12: 
> ea00260ddd38
>  [ 1570.068389] R13: 8889a4116000 R14: 8889a4116000 R15: 
> 0001
>  [ 1570.068391] FS:  7ff039638680() GS:88907ea0() 
> knlGS:
>  [ 1570.068394] CS:  0010 DS:  ES:  CR0: 80050033
>  [ 1570.068396] CR2: 7f36f354cc20 CR3: 0008a0126006 CR4: 
> 007706e0
>  [ 1570.068398] DR0:  DR1:  DR2: 
> 
>  [ 1570.068400] DR3:  DR6: fffe0ff0 DR7: 
> 0400
>  [ 1570.068402] PKRU: 5554
>  [ 1570.068404] Call Trace:
>  [ 1570.068407]  mem_cgroup_charge+0x175/0x770
>  [ 1570.068413]  __add_to_page_cache_locked+0x712/0xad0
>  [ 1570.068439]  add_to_page_cache_lru+0xc5/0x1f0
>  [ 1570.068461]  cachefiles_read_or_alloc_pages+0x895/0x2e10 [cachefiles]
>  [ 1570.068524]  __fscache_read_or_alloc_pages+0x6c0/0xa00 [fscache]
>  [ 1570.068540]  __nfs_readpages_from_fscache+0x16d/0x630 [nfs]
>  [ 1570.068585]  nfs_readpages+0x24e/0x540 [nfs]
>  [ 1570.068693]  read_pages+0x5b1/0xc40
>  [ 1570.068711]  page_cache_ra_unbounded+0x460/0x750
>  [ 1570.068729]  generic_file_buffered_read_get_pages+0x290/0x1710
>  [ 1570.068756]  generic_file_buffered_read+0x2a9/0xc30
>  [ 1570.068832]  nfs_file_read+0x13f/0x230 [nfs]
>  [ 1570.068872]  new_sync_read+0x3af/0x610
>  [ 1570.068901]  vfs_read+0x339/0x4b0
>  [ 1570.068909]  ksys_read+0xf1/0x1c0
>  [ 1570.068920]  do_syscall_64+0x33/0x40
>  [ 1570.068926]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>  [ 1570.068930] RIP: 0033:0x7ff039135595
> 
> Before that commit, there was a try_charge() and commit_charge()
> in __add_to_page_cache_locked(). These 2 separated charge functions
> were replaced by a single mem_cgroup_charge(). However, it forgot
> to add a matching mem_cgroup_uncharge() when the xarray insertion
> failed with the page released back to the pool. Fix this by adding a
> mem_cgroup_uncharge() call when insertion error happens.
> 
> Fixes: 3fea5a499d57 ("mm: memcontrol: convert page cache to a new 
> mem_cgroup_charge() API")
> Signed-off-by: Waiman Long 
> ---
>  mm/filemap.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 5c9d564317a5..aa0e0fb04670 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -835,6 +835,7 @@ noinline int __add_to_page_cache_locked(struct page *page,
>   XA_STATE(xas, &mapping->i_pages, offset);
>   int huge = PageHuge(page);
>   int error;
> + bool charged = false;
>  
>   VM_BUG_ON_PAGE(!PageLocked(page), page);
>   VM_BUG_ON_PAGE(PageSwapBacked(page), page);
> @@ -848,6 +849,7 @@ noinline int __add_to_page_cache_locked(struct page *page,
>   error = mem_cgroup_charge(page, current->mm, gfp);
>   if (error)
>   goto error;
> + charged = true;
>   }
>  
>   gfp &= GFP_RECLAIM_MASK;
> @@ -896,6 +898,8 @@ noinline int __add_to_page_cache_locked(struct page *page,
>  
>   if (xas_error(&xas)) {
>   error = xas_error(&xas);
> + if (charged)
> + mem_cgroup_uncharge(page);
>   goto error;
>   }
>  
>

Re: [RESEND v13 00/10] KVM: x86/pmu: Guest Last Branch Recording Enabling

2021-01-15 Thread Alex Shi




On 2021/1/8 at 9:36 AM, Like Xu wrote:
> Because saving/restoring tens of LBR MSRs (e.g. 32 LBR stack entries) in
> VMX transition brings too excessive overhead to frequent vmx transition
> itself, the guest LBR event would help save/restore the LBR stack msrs
> during the context switching with the help of native LBR event callstack
> mechanism, including LBR_SELECT msr.
> 

Sounds like this feature will be very helpful for tuning VMM guest performance.
Good job!

Thanks
Alex

Re: [PATCH for doc-next] doc/zh_CN: adjust table markup in mips/ingenic-tcu.rst

2021-01-12 Thread Alex Shi

Reviewed-by: Alex Shi 

On 2021/1/13 at 3:00 PM, Lukas Bulwahn wrote:
> Commit 419b1d4ed1cb ("doc/zh_CN: add mips ingenic-tcu.rst translation")
> introduces a warning with make htmldocs:
> 
>   ./Documentation/translations/zh_CN/mips/ingenic-tcu.rst:
> 61: WARNING: Malformed table. Text in column margin in table line 6.
> 
> Adjust the table markup to address this warning.
> 
> Signed-off-by: Lukas Bulwahn 
> ---
> applies cleanly on next-20210113
> 
> Yanteng, please ack.
> 
> Jonathan, please pick this doc warning fixup on your -next tree. 
> 
>  Documentation/translations/zh_CN/mips/ingenic-tcu.rst | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/translations/zh_CN/mips/ingenic-tcu.rst 
> b/Documentation/translations/zh_CN/mips/ingenic-tcu.rst
> index 72b5d409ed89..9324a0a26430 100644
> --- a/Documentation/translations/zh_CN/mips/ingenic-tcu.rst
> +++ b/Documentation/translations/zh_CN/mips/ingenic-tcu.rst
> @@ -53,14 +53,14 @@
>  
>  TCU硬件的功能分布在多个驱动程序：
>  
> -=== =
> +==  ===
>  时钟drivers/clk/ingenic/tcu.c
>  中断drivers/irqchip/irq-ingenic-tcu.c
>  定时器  drivers/clocksource/ingenic-timer.c
>  OST drivers/clocksource/ingenic-ost.c
>  脉冲宽度调制器  drivers/pwm/pwm-jz4740.c
>  看门狗  drivers/watchdog/jz4740_wdt.c
> -=== =
> +==  ===
>  
>  因为可以从相同的寄存器控制属于不同驱动程序和框架的TCU的各种功能，所以
>  所有这些驱动程序都通过相同的控制总线通用接口访问它们的寄存器。
>

Re: [PATCH] mm/mmap: replace if (cond) BUG() with BUG_ON()

2021-01-06 Thread Alex Shi




On 2021/1/6 at 12:28 PM, Hugh Dickins wrote:
> On Sat, 12 Dec 2020, Alex Shi wrote:
>>
>> I'm very sorry, a typo here. the patch should be updated:
>>
>> From ed4fa1c6d5bed5766c5f0c35af0c597855d7be06 Mon Sep 17 00:00:00 2001
>> From: Alex Shi 
>> Date: Fri, 11 Dec 2020 21:26:46 +0800
>> Subject: [PATCH] mm/mmap: replace if (cond) BUG() with BUG_ON()
>>
>> coccinelle reports some warnings:
>> WARNING: Use BUG_ON instead of if condition followed by BUG.
>>
>> It could be fixed by BUG_ON().
>>
>> Reported-by: ab...@linux.alibaba.com
>> Signed-off-by: Alex Shi 
> 
> When diffing mmotm just now, I was sorry to find this: NAK.
> 
> Alex, please consider why the authors of these lines (whom you
> did not Cc) chose to write them without BUG_ON(): it has always
> been preferred practice to use BUG_ON() on predicates, but not on
> functionally effective statements (sorry, I've forgotten the proper
> term: I'd say statements with side-effects, but here they are not
> just side-effects: they are their main purpose).
> 

Right!

The original statements may need to be executed whether or not BUG() is
enabled; I overlooked this point. Sorry, my fault!
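
To spell the concern out, here is a userspace sketch. MY_BUG_ON() is a
hypothetical checking macro that can be compiled out the way assert() is under
NDEBUG; it is not the kernel's BUG_ON() definition, and test_and_set() is a
simplified, non-atomic stand-in for the kernel bit helpers.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for a check that can be compiled out.
 * This is NOT how the kernel defines BUG_ON().
 */
#ifdef CHECKS_DISABLED
#define MY_BUG_ON(cond) ((void)0)
#else
#define MY_BUG_ON(cond) assert(!(cond))
#endif

/* Set bit 'nr' in *word and report whether it was already set. */
static bool test_and_set(unsigned long *word, unsigned int nr)
{
	unsigned long mask = 1UL << nr;
	bool was_set = (*word & mask) != 0;

	*word |= mask;
	return was_set;
}

/* The effective statement stays visible; only its result is checked. */
void lock_visible(unsigned long *word)
{
	bool was_set = test_and_set(word, 0);

	MY_BUG_ON(was_set);
	(void)was_set; /* avoids an unused warning when checks are compiled out */
}

/* The bit is set only if the macro evaluates its argument. */
void lock_hidden(unsigned long *word)
{
	MY_BUG_ON(test_and_set(word, 0));
}
```

With checks enabled both functions set the bit. Built with -DCHECKS_DISABLED,
lock_hidden() leaves the word untouched while lock_visible() still sets it,
and a reader skimming lock_hidden() can easily miss that real work happens
inside the check.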

Please revert them.

Thanks
Alex



> We prefer not to hide those away inside BUG macros: please fix your
> "abaci" to respect kernel style here - unless it turns out that the
> kernel has moved away from that, and it's me who's behind the times.
> 
> Andrew, if you agree, please drop
> mm-mmap-replace-if-cond-bug-with-bug_on.patch
> from your stack.
> 
> (And did Minchan really Ack it? I see an Ack from Minchan to a
> similar mm/zsmalloc patch: which surprises me, but is Minchan's
> business not mine; but that patch is not in mmotm.)
> 
> On the whole, I think there are far too many patches submitted,
> where Developer B chooses to rewrite a line to their own preference,
> without respecting that Author A chose to write it in another way.
> That's great when it really does improve readability, but often not.
> 
> Thanks,
> Hugh
> 
>> Cc: Andrew Morton 
>> Cc: linux...@kvack.org
>> Cc: linux-kernel@vger.kernel.org
>> ---
>>  mm/mmap.c | 22 --
>>  1 file changed, 8 insertions(+), 14 deletions(-)
>>
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index 8144fc3c5a78..107fa91bb59f 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -712,9 +712,8 @@ static void __insert_vm_struct(struct mm_struct *mm, 
>> struct vm_area_struct *vma)
>>  struct vm_area_struct *prev;
>>  struct rb_node **rb_link, *rb_parent;
>>  
>> -if (find_vma_links(mm, vma->vm_start, vma->vm_end,
>> -   &prev, &rb_link, &rb_parent))
>> -BUG();
>> +BUG_ON(find_vma_links(mm, vma->vm_start, vma->vm_end,
>> +   &prev, &rb_link, &rb_parent));
>>  __vma_link(mm, vma, prev, rb_link, rb_parent);
>>  mm->map_count++;
>>  }
>> @@ -3585,9 +3584,8 @@ static void vm_lock_anon_vma(struct mm_struct *mm, 
>> struct anon_vma *anon_vma)
>>   * can't change from under us thanks to the
>>   * anon_vma->root->rwsem.
>>   */
>> -if (__test_and_set_bit(0, (unsigned long *)
>> -   &anon_vma->root->rb_root.rb_root.rb_node))
>> -BUG();
>> +BUG_ON(__test_and_set_bit(0, (unsigned long *)
>> +&anon_vma->root->rb_root.rb_root.rb_node));
>>  }
>>  }
>>  
>> @@ -3603,8 +3601,7 @@ static void vm_lock_mapping(struct mm_struct *mm, 
>> struct address_space *mapping)
>>   * mm_all_locks_mutex, there may be other cpus
>>   * changing other bitflags in parallel to us.
>>   */
>> -if (test_and_set_bit(AS_MM_ALL_LOCKS, &mapping->flags))
>> -BUG();
>> +BUG_ON(test_and_set_bit(AS_MM_ALL_LOCKS, &mapping->flags));
>>  down_write_nest_lock(&mapping->i_mmap_rwsem, &mm->mmap_lock);
>>  }
>>  }
>> @@ -3701,9 +3698,8 @@ static void vm_unlock_anon_vma(struct anon_vma 
>> *anon_vma)
>>   * can't change from under us until we release the
>>   * anon_vma->root->rwsem.
>>   */
>> -if (!__test_and_clear_bit(0, (unsigned long *)
>> -   &anon_vma->root->rb_root.rb_root.rb_node))
>> -BUG();
>> +BUG_ON(!__test_and_clear_bit(0, (unsigned long *)
>> +&anon_vma->root->rb_root.rb_root.rb_node));
>>  anon_vma_unlock_write(anon_vma);
>>  }
>>  }
>> @@ -3716,9 +3712,7 @@ static void vm_unlock_mapping(struct address_space 
>> *mapping)
>>   * because we hold the mm_all_locks_mutex.
>>   */
>>  i_mmap_unlock_write(mapping);
>> -if (!test_and_clear_bit(AS_MM_ALL_LOCKS,
>> -   &mapping->flags))
>> -BUG();
>> +BUG_ON(!test_and_clear_bit(AS_MM_ALL_LOCKS, &mapping->flags));
>>  }
>>  }
>>  
>> -- 
>> 2.29.GIT

Re: [PATCH] docs/zh_CN: add Chinese booting and index file

2021-01-05 Thread Alex Shi




On 2021/1/5 at 5:19 PM, siyant...@loongson.cn wrote:
> From: Yanteng Si 
> 
> This is the Chinese version of booting and index file
> 
> Signed-off-by: Yanteng Si 
> ---
>  .../translations/zh_CN/mips/booting.rst   | 47 +++
>  .../translations/zh_CN/mips/index.rst | 45 ++
>  2 files changed, 92 insertions(+)
>  create mode 100644 Documentation/translations/zh_CN/mips/booting.rst
>  create mode 100644 Documentation/translations/zh_CN/mips/index.rst
> 
> diff --git a/Documentation/translations/zh_CN/mips/booting.rst 
> b/Documentation/translations/zh_CN/mips/booting.rst
> new file mode 100644
> index ..12e0aa76b485
> --- /dev/null
> +++ b/Documentation/translations/zh_CN/mips/booting.rst
> @@ -0,0 +1,47 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +Chinese translated version of Documentation/mips/booting.rst
> +
> +If you have any comment or update to the content, please contact the
> +original document maintainer directly.  However, if you have a problem
> +communicating in English you can also ask the Chinese maintainer for
> +help.  Contact the Chinese maintainer if this translation is outdated
> +or if there is a problem with the translation.
> +
> +Chinese maintainer: Yanteng Si 
> +-
> +Documentation/mips/booting.rst 的中文翻译
> +
> +如果想评论或更新本文的内容，请直接联系原文档的维护者。如果你使用英文
> +交流有困难的话，也可以向中文版维护者求助。如果本翻译更新不及时或者翻
> +译存在问题，请联系中文版维护者。
> +
> +中文版维护者： 司延腾  Yanteng Si 
> +中文版翻译者： 司延腾  Yanteng Si 
> +中文版校译者： 司延腾  Yanteng Si 

Could you reuse disclaimer-zh_CN.rst here? Keeping just the translator line
is fine if all the roles are yourself.


> +
> +以下为正文
> +-
> +
> +BMIPS设备树引导
> +
> +
> +  一些bootloaders只支持在内核镜像开始地址处的单一入口点。而其它
> +  bootloaders将跳转到ELF的开始地址处。两种方案都被支持的；因为

How about the following changes?

s/被支持/支持/
> +  CONFIG_BOOT_RAW=y and CONFIG_NO_EXCEPT_FILL=y, 所以第一条指令
> +  会立即跳转到kernel_entry()入口处执行。
> +
> +  与arch/arm情况(b)类似，dt感知的引导加载程序需要设置以下寄存器:
> +
> + a0 : 0
> +
> + a1 : 0x
> +
> + a2 : RAM中指向设备树块的物理指针(在chapterII中定义)。
> +  设备树可以位于前512MB物理地址空间(0x -
> +  0x1fff)的任何位置，以64位边界对齐。
> +
> +  legacy bootloaders不会使用这样的约定，并且它们不传入DT块。

s/legacy/传统/

> +  在这种情况下，Linux将通过选中CONFIG_DT_*查找DTB。
> +
> +  这个约定只在32位系统中定义，因为目前没有任何64位的BMIPS实现。

s/这个/以上/

Thanks
Alex

> diff --git a/Documentation/translations/zh_CN/mips/index.rst 
> b/Documentation/translations/zh_CN/mips/index.rst
> new file mode 100644
> index ..244b16b7ef51
> --- /dev/null
> +++ b/Documentation/translations/zh_CN/mips/index.rst
> @@ -0,0 +1,45 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +Chinese translated version of Documentation/mips/index.rst
> +
> +If you have any comment or update to the content, please contact the
> +original document maintainer directly.  However, if you have a problem
> +communicating in English you can also ask the Chinese maintainer for
> +help.  Contact the Chinese maintainer if this translation is outdated
> +or if there is a problem with the translation.
> +
> +Chinese maintainer: Yanteng Si 
> +-
> +Documentation/mips/index.rst 的中文翻译
> +
> +如果想评论或更新本文的内容，请直接联系原文档的维护者。如果你使用英文
> +交流有困难的话，也可以向中文版维护者求助。如果本翻译更新不及时或者翻
> +译存在问题，请联系中文版维护者。
> +
> +中文版维护者： 司延腾  Yanteng Si 
> +中文版翻译者： 司延腾  Yanteng Si 
> +中文版校译者： 司延腾  Yanteng Si 
> +
> +以下为正文
> +-
> +
> +
> +===
> +MIPS特性文档
> +===
> +
> +.. toctree::
> +   :maxdepth: 2
> +   :numbered:
> +
> +   booting
> +   ingenic-tcu
> +
> +   features
> +
> +.. only::  subproject and html
> +
> +   Indices
> +   ===
> +
> +   * :ref:`genindex`
>

Re: [PATCH] mm/memcontrol: fix warning in mem_cgroup_page_lruvec()

2021-01-04 Thread Alex Shi

Reviewed-by: Alex Shi 

On 2021/1/4 at 1:03 PM, Hugh Dickins wrote:
> Boot a CONFIG_MEMCG=y kernel with "cgroup_disabled=memory" and you are
> met by a series of warnings from the VM_WARN_ON_ONCE_PAGE(!memcg, page)
> recently added to the inline mem_cgroup_page_lruvec().
> 
> An earlier attempt to place that warning, in mem_cgroup_lruvec(), had
> been careful to do so after weeding out the mem_cgroup_disabled() case;
> but was itself invalid because of the mem_cgroup_lruvec(NULL, pgdat) in
> clear_pgdat_congested() and age_active_anon().
> 
> Warning in mem_cgroup_page_lruvec() was once useful in detecting a KSM
> charge bug, so may be worth keeping: but skip if mem_cgroup_disabled().
> 
> Fixes: 9a1ac2288cf1 ("mm/memcontrol:rewrite mem_cgroup_page_lruvec()")
> Signed-off-by: Hugh Dickins 
> ---
> 
>  include/linux/memcontrol.h |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- 5.11-rc2/include/linux/memcontrol.h   2020-12-27 20:39:36.751923135 
> -0800
> +++ linux/include/linux/memcontrol.h  2021-01-03 19:38:24.822978559 -0800
> @@ -665,7 +665,7 @@ static inline struct lruvec *mem_cgroup_
>  {
>   struct mem_cgroup *memcg = page_memcg(page);
>  
> - VM_WARN_ON_ONCE_PAGE(!memcg, page);
> + VM_WARN_ON_ONCE_PAGE(!memcg && !mem_cgroup_disabled(), page);
>   return mem_cgroup_lruvec(memcg, pgdat);
>  }
>  
>

[RFC PATCH 1/4] mm/swap.c: pre-sort pages in pagevec for pagevec_lru_move_fn

2020-12-25 Thread Alex Shi

Pages in a pagevec may belong to different lruvecs, so we have to relock in
pagevec_lru_move_fn(), but a relock may make the current CPU wait a long
time on the same lock because of spinlock fairness.

Before per-memcg lru_lock, we had to bear the relock, since the spinlock
was the only way to serialize a page's memcg/lruvec. Now TestClearPageLRU
can be used to isolate pages exclusively and keep the page's lruvec/memcg
stable. That gives us a chance to sort the pages by lruvec before the
moving action in pagevec_lru_move_fn, so we no longer suffer from the
spinlock's fairness wait.
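
As an illustration only (userspace, not the patch itself), the effect of the
sort can be sketched as below. keys[] stands in for the lruvec addresses and
items[] for the pages; sort_by_key() and count_lock_switches() are hypothetical
names, and the sketch walks the whole gap list so that short arrays are still
fully sorted by the final gap of 1.

```c
#include <assert.h>

/* Gap sequence sized for small arrays; the trailing 0 terminates it. */
static const int gaps[] = { 6, 4, 3, 2, 1, 0 };

/*
 * Shell sort keys[] in ascending order and apply the same permutation
 * to items[], so each item stays paired with its key.
 */
void sort_by_key(unsigned long *keys, int *items, int n)
{
	assert(n >= 0);

	for (int g = 0; gaps[g] > 0; g++) {
		int gap = gaps[g];

		/* gaps >= n simply do no work; the gap of 1 finishes the sort */
		for (int i = gap; i < n; i++) {
			unsigned long key = keys[i];
			int item = items[i];
			int j;

			for (j = i - gap; j >= 0 && keys[j] > key; j -= gap) {
				keys[j + gap] = keys[j];
				items[j + gap] = items[j];
			}
			keys[j + gap] = key;
			items[j + gap] = item;
		}
	}
}

/*
 * Count lock acquisitions when one lock is taken per run of equal keys,
 * which is how the move loop walks the array.
 */
int count_lock_switches(const unsigned long *keys, int n)
{
	int switches = 0;

	for (int i = 0; i < n; i++)
		if (i == 0 || keys[i] != keys[i - 1])
			switches++;

	return switches;
}
```

For six entries with keys 3, 1, 3, 1, 2, 1 the unsorted walk takes a lock six
times, while the sorted walk takes one lock per distinct key, three in total.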

Signed-off-by: Alex Shi 
Cc: Konstantin Khlebnikov 
Cc: Hugh Dickins 
Cc: Yu Zhao 
Cc: Michal Hocko 
Cc: Matthew Wilcox (Oracle) 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/swap.c | 92 +++
 1 file changed, 79 insertions(+), 13 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index c5363bdebe67..994641331bf7 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -201,29 +201,95 @@ int get_kernel_page(unsigned long start, int write, 
struct page **pages)
 }
 EXPORT_SYMBOL_GPL(get_kernel_page);
 
+/* Pratt's gaps for shell sort, https://en.wikipedia.org/wiki/Shellsort */
+static int gaps[] = { 6, 4, 3, 2, 1, 0};
+
+/* Shell sort pagevec[] on page's lruvec.*/
+static void shell_sort(struct pagevec *pvec, unsigned long *lvaddr)
+{
+   int g, i, j, n = pagevec_count(pvec);
+
+   for (g=0; gaps[g] > 0 && gaps[g] <= n/2; g++) {
+   int gap = gaps[g];
+
+   for (i = gap; i < n; i++) {
+   unsigned long tmp = lvaddr[i];
+   struct page *page = pvec->pages[i];
+
+   for (j = i - gap; j >= 0 && lvaddr[j] > tmp; j -= gap) {
+   lvaddr[j + gap] = lvaddr[j];
+   pvec->pages[j + gap] = pvec->pages[j];
+   }
+   lvaddr[j + gap] = tmp;
+   pvec->pages[j + gap] = page;
+   }
+   }
+}
+
+/* Get lru bit cleared page and their lruvec address, release the others */
+void sort_isopv(struct pagevec *pvec, struct pagevec *isopv,
+   unsigned long *lvaddr)
+{
+   int i, j;
+   struct pagevec busypv;
+
+   pagevec_init(&busypv);
+
+   for (i = 0, j = 0; i < pagevec_count(pvec); i++) {
+   struct page *page = pvec->pages[i];
+
+   pvec->pages[i] = NULL;
+
+   /* block memcg migration during page moving between lru */
+   if (!TestClearPageLRU(page)) {
+   pagevec_add(&busypv, page);
+   continue;
+   }
+   lvaddr[j++] = (unsigned long)
+   mem_cgroup_page_lruvec(page, page_pgdat(page));
+   pagevec_add(isopv, page);
+   }
+   pagevec_reinit(pvec);
+   if (pagevec_count(&busypv))
+   release_pages(busypv.pages, busypv.nr);
+
+   shell_sort(isopv, lvaddr);
+}
+
 static void pagevec_lru_move_fn(struct pagevec *pvec,
void (*move_fn)(struct page *page, struct lruvec *lruvec))
 {
-   int i;
+   int i, n;
struct lruvec *lruvec = NULL;
unsigned long flags = 0;
+   unsigned long lvaddr[PAGEVEC_SIZE];
+   struct pagevec isopv;
 
-   for (i = 0; i < pagevec_count(pvec); i++) {
-   struct page *page = pvec->pages[i];
+   pagevec_init(&isopv);
 
-   /* block memcg migration during page moving between lru */
-   if (!TestClearPageLRU(page))
-   continue;
+   sort_isopv(pvec, &isopv, lvaddr);
 
-   lruvec = relock_page_lruvec_irqsave(page, lruvec, &flags);
-   (*move_fn)(page, lruvec);
+   n = pagevec_count(&isopv);
+   if (!n)
+   return;
 
-   SetPageLRU(page);
+   lruvec = (struct lruvec *)lvaddr[0];
+   spin_lock_irqsave(&lruvec->lru_lock, flags);
+
+   for (i = 0; i < n; i++) {
+   /* lock new lruvec if lruvec changes, we have sorted them */
+   if (lruvec != (struct lruvec *)lvaddr[i]) {
+   spin_unlock_irqrestore(&lruvec->lru_lock, flags);
+   lruvec = (struct lruvec *)lvaddr[i];
+   spin_lock_irqsave(&lruvec->lru_lock, flags);
+   }
+
+   (*move_fn)(isopv.pages[i], lruvec);
+
+   SetPageLRU(isopv.pages[i]);
}
-   if (lruvec)
-   unlock_page_lruvec_irqrestore(lruvec, flags);
-   release_pages(pvec->pages, pvec->nr);
-   pagevec_reinit(pvec);
+   spin_unlock_irqrestore(&lruvec->lru_lock, flags);
+   release_pages(isopv.pages, isopv.nr);
 }
 
 static void pagevec_move_tail_fn(struct page *page, struct lruvec *lruvec)
-- 
2.29.GIT

[RFC PATCH 3/4] mm/swap.c: extend the usage to pagevec_lru_add

2020-12-25 Thread Alex Shi

The only difference between __pagevec_lru_add and the other functions that
move pages between lru lists is that a page being added to an lru list does
not need TestClearPageLRU or the lru bit set back. So we can combine them by
adding a clear-lru-bit switch to the sort function's parameters.

Then all lru list operation functions can be unified.

Signed-off-by: Alex Shi 
Cc: Konstantin Khlebnikov 
Cc: Hugh Dickins 
Cc: Yu Zhao 
Cc: Michal Hocko 
Cc: Matthew Wilcox (Oracle) 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/swap.c | 31 ---
 1 file changed, 12 insertions(+), 19 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index bb5300b7e321..9a2269e5099b 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -12,6 +12,7 @@
  * Started 18.12.91
  * Swap aging added 23.2.95, Stephen Tweedie.
  * Buffermem limits added 12.3.98, Rik van Riel.
+ * Pre-sort pagevec added 12.1.20, Alex Shi.
  */
 
 #include 
@@ -227,8 +228,8 @@ static void shell_sort(struct pagevec *pvec, unsigned long 
*lvaddr)
 }
 
 /* Get lru bit cleared page and their lruvec address, release the others */
-void sort_isopv(struct pagevec *pvec, struct pagevec *isopv,
-   unsigned long *lvaddr)
+static void sort_isopv(struct pagevec *pvec, struct pagevec *isopv,
+   unsigned long *lvaddr, bool clearlru)
 {
int i, j;
struct pagevec busypv;
@@ -242,7 +243,7 @@ void sort_isopv(struct pagevec *pvec, struct pagevec *isopv,
pvec->pages[i] = NULL;
 
/* block memcg migration during page moving between lru */
-   if (!TestClearPageLRU(page)) {
+   if (clearlru && !TestClearPageLRU(page)) {
    pagevec_add(&busypv, page);
continue;
}
@@ -266,9 +267,13 @@ static void pagevec_lru_move_fn(struct pagevec *pvec,
unsigned long flags = 0;
unsigned long lvaddr[PAGEVEC_SIZE];
struct pagevec sortedpv;
+   bool clearlru;
+
+   /* don't clear lru bit for new page adding to lru */
+   clearlru = pvec != this_cpu_ptr(&lru_pvecs.lru_add);
 
    pagevec_init(&sortedpv);
-   sort_isopv(pvec, &sortedpv, lvaddr);
+   sort_isopv(pvec, &sortedpv, lvaddr, clearlru);
 
    n = pagevec_count(&sortedpv);
if (!n)
@@ -287,7 +292,8 @@ static void pagevec_lru_move_fn(struct pagevec *pvec,
 
(*move_fn)(sortedpv.pages[i], lruvec);
 
-   SetPageLRU(sortedpv.pages[i]);
+   if (clearlru)
+   SetPageLRU(sortedpv.pages[i]);
}
    spin_unlock_irqrestore(&lruvec->lru_lock, flags);
release_pages(sortedpv.pages, sortedpv.nr);
@@ -,20 +1117,7 @@ static void __pagevec_lru_add_fn(struct page *page, 
struct lruvec *lruvec)
  */
 void __pagevec_lru_add(struct pagevec *pvec)
 {
-   int i;
-   struct lruvec *lruvec = NULL;
-   unsigned long flags = 0;
-
-   for (i = 0; i < pagevec_count(pvec); i++) {
-   struct page *page = pvec->pages[i];
-
-   lruvec = relock_page_lruvec_irqsave(page, lruvec, &flags);
-   __pagevec_lru_add_fn(page, lruvec);
-   }
-   if (lruvec)
-   unlock_page_lruvec_irqrestore(lruvec, flags);
-   release_pages(pvec->pages, pvec->nr);
-   pagevec_reinit(pvec);
+   pagevec_lru_move_fn(pvec, __pagevec_lru_add_fn);
 }
 
 /**
-- 
2.29.GIT

[RFC PATCH 2/4] mm/swap.c: bail out early for no memcg and no numa

2020-12-25 Thread Alex Shi

If a system has memcg disabled and only one NUMA node, like an embedded system,
there is no need to sort the pagevec, since there is just one lruvec in the
system. In this situation we can skip the pagevec sorting.

Signed-off-by: Alex Shi 
Cc: Konstantin Khlebnikov 
Cc: Hugh Dickins 
Cc: Yu Zhao 
Cc: Michal Hocko 
Cc: Matthew Wilcox (Oracle) 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/swap.c | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index 994641331bf7..bb5300b7e321 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -235,6 +235,7 @@ void sort_isopv(struct pagevec *pvec, struct pagevec *isopv,
 
    pagevec_init(&busypv);
 
+
for (i = 0, j = 0; i < pagevec_count(pvec); i++) {
struct page *page = pvec->pages[i];
 
@@ -253,7 +254,8 @@ void sort_isopv(struct pagevec *pvec, struct pagevec *isopv,
    if (pagevec_count(&busypv))
release_pages(busypv.pages, busypv.nr);
 
-   shell_sort(isopv, lvaddr);
+   if (!mem_cgroup_disabled() || num_online_nodes() > 1)
+   shell_sort(isopv, lvaddr);
 }
 
 static void pagevec_lru_move_fn(struct pagevec *pvec,
@@ -263,13 +265,12 @@ static void pagevec_lru_move_fn(struct pagevec *pvec,
struct lruvec *lruvec = NULL;
unsigned long flags = 0;
unsigned long lvaddr[PAGEVEC_SIZE];
-   struct pagevec isopv;
-
-   pagevec_init(&isopv);
+   struct pagevec sortedpv;
 
-   sort_isopv(pvec, &isopv, lvaddr);
+   pagevec_init(&sortedpv);
+   sort_isopv(pvec, &sortedpv, lvaddr);
 
-   n = pagevec_count(&isopv);
+   n = pagevec_count(&sortedpv);
if (!n)
return;
 
@@ -284,12 +285,12 @@ static void pagevec_lru_move_fn(struct pagevec *pvec,
    spin_lock_irqsave(&lruvec->lru_lock, flags);
}
 
-   (*move_fn)(isopv.pages[i], lruvec);
+   (*move_fn)(sortedpv.pages[i], lruvec);
 
-   SetPageLRU(isopv.pages[i]);
+   SetPageLRU(sortedpv.pages[i]);
}
    spin_unlock_irqrestore(&lruvec->lru_lock, flags);
-   release_pages(isopv.pages, isopv.nr);
+   release_pages(sortedpv.pages, sortedpv.nr);
 }
 
 static void pagevec_move_tail_fn(struct page *page, struct lruvec *lruvec)
-- 
2.29.GIT

[RFC PATCH 0/4] pre sort pages on lruvec in pagevec

2020-12-25 Thread Alex Shi

This idea was tried on the per-memcg lru lock patchset v18 and had a good
result: about 5%~20+% performance gain on lru-lock-busy benchmarks
such as case-lru-file-readtwice.

But on the latest kernel I cannot reproduce that result on my box.
I also cannot reproduce Tim's performance gain on my box.

So I don't know whether it is workable in some scenarios; I am just sending
it out in case someone is interested...

Alex Shi (4):
  mm/swap.c: pre-sort pages in pagevec for pagevec_lru_move_fn
  mm/swap.c: bail out early for no memcg and no numa
  mm/swap.c: extend the usage to pagevec_lru_add
  mm/swap.c: no sort if all page's lruvec are same

Cc: Konstantin Khlebnikov 
Cc: Hugh Dickins 
Cc: Yu Zhao 
Cc: Michal Hocko 
Cc: Matthew Wilcox (Oracle) 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org

 mm/swap.c | 118 +-
 1 file changed, 91 insertions(+), 27 deletions(-)

-- 
2.29.GIT
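
The pre-sorting idea from the cover letter above can be illustrated in
userspace. This is only a sketch, not the code from the series: struct page
and struct lruvec are hypothetical one-field stand-ins, qsort() is swapped in
for the series' shell_sort(), and lock switches are counted rather than
performed. It shows that grouping a batch by lruvec bounds the number of lock
switches by the number of distinct lruvecs in the batch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types; only what the sketch needs. */
struct lruvec { int unused; };
struct page { struct lruvec *lruvec; };

/* qsort() comparator: order page pointers by the address of their lruvec. */
static int cmp_by_lruvec(const void *a, const void *b)
{
	const struct page *pa = *(struct page *const *)a;
	const struct page *pb = *(struct page *const *)b;
	uintptr_t la = (uintptr_t)pa->lruvec;
	uintptr_t lb = (uintptr_t)pb->lruvec;

	return (la > lb) - (la < lb);
}

/*
 * Walk a batch and count how often the lock would have to be switched,
 * i.e. how often the lruvec differs from the one currently "locked".
 */
static int count_lock_switches(struct page **pages, size_t n)
{
	struct lruvec *locked = NULL;
	int switches = 0;

	assert(pages != NULL);
	for (size_t i = 0; i < n; i++) {
		if (pages[i]->lruvec != locked) {
			locked = pages[i]->lruvec;	/* unlock old, lock new */
			switches++;
		}
		/* the per-page move function would run here, under the lock */
	}
	return switches;
}

/* Sort the batch first so that pages sharing an lruvec become adjacent. */
static int count_lock_switches_sorted(struct page **pages, size_t n)
{
	qsort(pages, n, sizeof(*pages), cmp_by_lruvec);
	return count_lock_switches(pages, n);
}
```

For a batch that alternates between two lruvecs, the unsorted walk switches
on every page while the sorted walk switches twice. Whether that saving
outweighs the cost of sorting inside the kernel is exactly what the cover
letter leaves open.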

Re: [PATCH v2] docs/zh_CN: Improve Cinese transolation quality.

2020-12-23 Thread Alex Shi

On 2020/12/21 8:34 PM, Ran Wang wrote:
> Hi Alex,
> 
> 
> On Monday, December 21, 2020 3:52 PM, Alex Shi wrote:
> 
>> On 2020/12/19 11:42 AM, Ran Wang wrote:
>>> Hi Jonathan,
>>>  
>>> On Tuesday, December 8, 2020 11:00 PM Jonathan Corbet wrote:
>>>  
>>>> On Tue,  8 Dec 2020 21:16:04 +0800
>>>> Ran Wang  wrote:
>>>>
>>>>> Signed-off-by: Ran Wang 
>>>>> ---
>>>>> Change in v2:
>>>>>    - For 'cn_development_coding' part, change back to '是关于编码过程的'
>>>>>
>>>>>   .../translations/zh_CN/process/1.Intro.rst    | 61 ++-
>>>>>   1 file changed, 32 insertions(+), 29 deletions(-)
>>>>
>>>> Thank you for working to improve the documentation!  Please, though,
>>>> include a changelog with your patch; what does "improve translation
>>>> quality" mean here?
>>>>
>>>> Thanks,
>>>>
>>>> jon
>>>
>>> Sorry I missed your mail.
>>>
>>> Actually I find it difficult to write a change log describing this (after
>>> all, I am not a language teacher :) ).
>>>
>>> I would say the original translation looks like a little bit more by 
>>> machine: English word to Chinese word directly without considering 
>>> particular scenarios (such as software development related terms we used in 
>>> Chinese, a little bit different to normal usage maybe). So I tried to 
>>> re-tell the story in a way more kind of 'human' to make everything clearer 
>>> for Chinese reader.
>>
>> Hi Ran,
>>
>> I don't think that describes your new translation correctly. You are not
>> 're-telling a story' here; this is a standard community co-work process,
>> and we don't need that. Also, the original translation was not done by
>> machine; I did it myself.
> 
> Sorry, I didn't mean to offend. You are right.

That's all right. I guess a few of my translations are easy to misunderstand,
and I owe some explanation, for example in these 2 places:

1. 'There are a great many "reasons" why kernel code should be merged into the
...'
Here, translating 'reasons' as 'benefits/advantages' would fit Chinese
conversational custom better (I don't strongly oppose this), but usually we
keep the original meaning of 'reasons'.

2. 'managing patches with git and reviewing patches "posted" by others.'
For 'posted', I thought a lot about '发布' (publish) versus '提交' (submit);
it cost me some time. In the end I used '发布' rather than '提交', since the
latter is easily confused with git 'commit' in Chinese, while the patches
discussed here are posted by email, not by 'git'.

> 
>> What you did right is polishing the Chinese wording, making it more fluent
>> and a better fit for Chinese custom, although at the cost of some verbosity
>> and a bit of precision.
>
> The word 'polishing' is the perfect word to describe this, thank you.

We are not professional translators or interpreters (although my wife is one).
:)
But there is a standard for Chinese translation: fidelity, fluency, elegance.
Let's hold on to fidelity, and try our best on fluency and elegance. :)

Thanks
Alex

Re: Linux kernel newbie wants to take part in documentation translation

2020-12-23 Thread Alex Shi

CC linux-doc

On 2020/12/23 11:03, [...] wrote:
> [...]
> Linux kernel [...] Documentation/translations/zh_CN

[...]

> [...]
> Documentation/translations/zh_CN [...] Documentation/ [...] patch [...] CodingStyle

[...] coding style, [...] coding style,
[...]
  make help [...] make cleandocs/htmldocs
  htmldocs [...]
[...] patch [...] Jonathan Corbet [...] cc [...]

> [...]

patch review [...]

> [...]

[...]

Alex

Re: [PATCH v2 2/3] mm/memcg: remove rcu locking for lock_page_lruvec function series

2020-12-22 Thread Alex Shi

Cc: Hui Su and Alexander Duyck as Hugh suggested.

On 2020/12/22 1:20 PM, Alex Shi wrote:
> lock_page_lruvec() and its variants used rcu_read_lock() with the
> intention of safeguarding against the mem_cgroup being destroyed
> concurrently; but so long as they are called under the specified
> conditions (as they are), there is no way for the page's mem_cgroup
> to be destroyed.  Delete the unnecessary rcu_read_lock() and _unlock().
> 
> Hugh Dickins polished the commit log, thanks a lot!
> 
> Signed-off-by: Alex Shi 
> Acked-by: Hugh Dickins 
> Cc: Hugh Dickins 
> Cc: Johannes Weiner 
> Cc: Michal Hocko 
> Cc: Vladimir Davydov 
> Cc: Andrew Morton 
> Cc: cgro...@vger.kernel.org
> Cc: linux...@kvack.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  mm/memcontrol.c | 6 --
>  1 file changed, 6 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 8d400efc81b9..0af13c4fe4b3 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1356,10 +1356,8 @@ struct lruvec *lock_page_lruvec(struct page *page)
>   struct lruvec *lruvec;
>   struct pglist_data *pgdat = page_pgdat(page);
>  
> - rcu_read_lock();
>   lruvec = mem_cgroup_page_lruvec(page, pgdat);
>   spin_lock(&lruvec->lru_lock);
> - rcu_read_unlock();
>  
>   lruvec_memcg_debug(lruvec, page);
>  
> @@ -1371,10 +1369,8 @@ struct lruvec *lock_page_lruvec_irq(struct page *page)
>   struct lruvec *lruvec;
>   struct pglist_data *pgdat = page_pgdat(page);
>  
> - rcu_read_lock();
>   lruvec = mem_cgroup_page_lruvec(page, pgdat);
>   spin_lock_irq(&lruvec->lru_lock);
> - rcu_read_unlock();
>  
>   lruvec_memcg_debug(lruvec, page);
>  
> @@ -1386,10 +1382,8 @@ struct lruvec *lock_page_lruvec_irqsave(struct page 
> *page, unsigned long *flags)
>   struct lruvec *lruvec;
>   struct pglist_data *pgdat = page_pgdat(page);
>  
> - rcu_read_lock();
>   lruvec = mem_cgroup_page_lruvec(page, pgdat);
>   spin_lock_irqsave(&lruvec->lru_lock, *flags);
> - rcu_read_unlock();
>  
>   lruvec_memcg_debug(lruvec, page);
>  
>

Re: [PATCH 1/3] mm/memcg: revise the using condition of lock_page_lruvec function series

2020-12-21 Thread Alex Shi




On 2020/12/22 11:01 AM, Hugh Dickins wrote:
> On Thu, 17 Dec 2020, Alex Shi wrote:
> 
>> The series function could be used under lock_page_memcg(), add this and
>> a bit style changes following commit_charge().
>>
>> Signed-off-by: Alex Shi 
>> Cc: Hugh Dickins 
> 
> This patch, or its intention,
> Acked-by: Hugh Dickins 
> but rewording suggested below, and requested above -
> which left me very puzzled before eventually I understood it.
> I don't think we need to talk about "a bit style changes",
> but the cross-reference to commit_charge() is helpful.
> 
> "
> lock_page_lruvec() and its variants are safe to use under the same
> conditions as commit_charge(): add lock_page_memcg() to the comment.
> "

Thanks a lot, Hugh. Yes, your commit log is far better than mine. :)

I will resend with your changes and Ack.

Thanks!
Alex

> 
>> Cc: Johannes Weiner 
>> Cc: Michal Hocko 
>> Cc: Vladimir Davydov 
>> Cc: Andrew Morton 
>> Cc: cgro...@vger.kernel.org
>> Cc: linux...@kvack.org
>> Cc: linux-kernel@vger.kernel.org
>> ---
>>  mm/memcontrol.c | 9 +
>>  1 file changed, 5 insertions(+), 4 deletions(-)
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index b80328f52fb4..e6b50d068b2f 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -1345,10 +1345,11 @@ void lruvec_memcg_debug(struct lruvec *lruvec, 
>> struct page *page)
>>   * lock_page_lruvec - lock and return lruvec for a given page.
>>   * @page: the page
>>   *
>> - * This series functions should be used in either conditions:
>> - * PageLRU is cleared or unset
>> - * or page->_refcount is zero
>> - * or page is locked.
>> + * This series functions should be used in any one of following conditions:
> 
> These functions are safe to use under any of the following conditions:
> 
>> + * - PageLRU is cleared or unset
>> + * - page->_refcount is zero
>> + * - page is locked.
> 
> Remove that full stop...
> 
>> + * - lock_page_memcg()
> 
> ... and, if you wish (I don't care), add full stop at the end of that line.
> 
> Maybe reorder those to the same order as listed in commit_charge().
> Copy its text exactly? I don't think so, actually, I find your wording
> (e.g. _refcount is zero) more explicit: good to have both descriptions.
> 
>>   */
>>  struct lruvec *lock_page_lruvec(struct page *page)
>>  {
>> -- 
>> 2.29.GIT

[PATCH v2 1/3] mm/memcg: revise the using condition of lock_page_lruvec function series

2020-12-21 Thread Alex Shi

lock_page_lruvec() and its variants are safe to use under the same
conditions as commit_charge(): add lock_page_memcg() to the comment.

Polished with Hugh Dickins' suggestions, thanks!

Signed-off-by: Alex Shi 
Acked-by: Hugh Dickins 
Cc: Hugh Dickins 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Vladimir Davydov 
Cc: Andrew Morton 
Cc: cgro...@vger.kernel.org
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/memcontrol.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b80328f52fb4..8d400efc81b9 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1345,10 +1345,11 @@ void lruvec_memcg_debug(struct lruvec *lruvec, struct 
page *page)
  * lock_page_lruvec - lock and return lruvec for a given page.
  * @page: the page
  *
- * This series functions should be used in either conditions:
- * PageLRU is cleared or unset
- * or page->_refcount is zero
- * or page is locked.
+ * These functions are safe to use under any of the following conditions:
+ * - page locked
+ * - PageLRU cleared
+ * - lock_page_memcg()
+ * - page->_refcount is zero
  */
 struct lruvec *lock_page_lruvec(struct page *page)
 {
-- 
2.29.GIT

[PATCH v2 3/3] mm/compaction: remove rcu_read_lock during page compaction

2020-12-21 Thread Alex Shi

isolate_migratepages_block() used rcu_read_lock() with the intention
of safeguarding against the mem_cgroup being destroyed concurrently;
but its TestClearPageLRU already protects against that.  Delete the
unnecessary rcu_read_lock() and _unlock().

Hugh Dickins helped polish the commit log, thanks!

Signed-off-by: Alex Shi 
Acked-by: Hugh Dickins 
Cc: Hugh Dickins 
Cc: Johannes Weiner 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/compaction.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 8049d3530812..02af220fb992 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -995,7 +995,6 @@ isolate_migratepages_block(struct compact_control *cc, 
unsigned long low_pfn,
if (!TestClearPageLRU(page))
goto isolate_fail_put;
 
-   rcu_read_lock();
lruvec = mem_cgroup_page_lruvec(page, pgdat);
 
/* If we already hold the lock, we can skip some rechecking */
@@ -1005,7 +1004,6 @@ isolate_migratepages_block(struct compact_control *cc, 
unsigned long low_pfn,
 
compact_lock_irqsave(&lruvec->lru_lock, &flags, cc);
locked = lruvec;
-   rcu_read_unlock();
 
lruvec_memcg_debug(lruvec, page);
 
@@ -1026,8 +1024,7 @@ isolate_migratepages_block(struct compact_control *cc, 
unsigned long low_pfn,
SetPageLRU(page);
goto isolate_fail_put;
}
-   } else
-   rcu_read_unlock();
+   }
 
/* The whole page is taken off the LRU; skip the tail pages. */
if (PageCompound(page))
-- 
2.29.GIT

[PATCH v2 2/3] mm/memcg: remove rcu locking for lock_page_lruvec function series

2020-12-21 Thread Alex Shi

lock_page_lruvec() and its variants used rcu_read_lock() with the
intention of safeguarding against the mem_cgroup being destroyed
concurrently; but so long as they are called under the specified
conditions (as they are), there is no way for the page's mem_cgroup
to be destroyed.  Delete the unnecessary rcu_read_lock() and _unlock().

Hugh Dickins polished the commit log, thanks a lot!

Signed-off-by: Alex Shi 
Acked-by: Hugh Dickins 
Cc: Hugh Dickins 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Vladimir Davydov 
Cc: Andrew Morton 
Cc: cgro...@vger.kernel.org
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/memcontrol.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 8d400efc81b9..0af13c4fe4b3 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1356,10 +1356,8 @@ struct lruvec *lock_page_lruvec(struct page *page)
struct lruvec *lruvec;
struct pglist_data *pgdat = page_pgdat(page);
 
-   rcu_read_lock();
lruvec = mem_cgroup_page_lruvec(page, pgdat);
spin_lock(&lruvec->lru_lock);
-   rcu_read_unlock();
 
lruvec_memcg_debug(lruvec, page);
 
@@ -1371,10 +1369,8 @@ struct lruvec *lock_page_lruvec_irq(struct page *page)
struct lruvec *lruvec;
struct pglist_data *pgdat = page_pgdat(page);
 
-   rcu_read_lock();
lruvec = mem_cgroup_page_lruvec(page, pgdat);
spin_lock_irq(&lruvec->lru_lock);
-   rcu_read_unlock();
 
lruvec_memcg_debug(lruvec, page);
 
@@ -1386,10 +1382,8 @@ struct lruvec *lock_page_lruvec_irqsave(struct page 
*page, unsigned long *flags)
struct lruvec *lruvec;
struct pglist_data *pgdat = page_pgdat(page);
 
-   rcu_read_lock();
lruvec = mem_cgroup_page_lruvec(page, pgdat);
spin_lock_irqsave(&lruvec->lru_lock, *flags);
-   rcu_read_unlock();
 
lruvec_memcg_debug(lruvec, page);
 
-- 
2.29.GIT

Re: [PATCH v2] docs/zh_CN: Improve Cinese transolation quality.

2020-12-20 Thread Alex Shi




On 2020/12/19 11:42 AM, Ran Wang wrote:
> Hi Jonathan,
> 
> On Tuesday, December 8, 2020 11:00 PM Jonathan Corbet wrote:
>  
>> On Tue,  8 Dec 2020 21:16:04 +0800
>> Ran Wang  wrote:
>>
>>> Signed-off-by: Ran Wang 
>>> ---
>>> Change in v2:
>>>    - For 'cn_development_coding' part, change back to '是关于编码过程的'
>>>
>>>   .../translations/zh_CN/process/1.Intro.rst    | 61 ++-
>>>   1 file changed, 32 insertions(+), 29 deletions(-)
>>
>> Thank you for working to improve the documentation!  Please, though,
>> include a changelog with your patch; what does "improve translation
>> quality" mean here?
>>
>> Thanks,
>>
>> jon
> 
> Sorry I missed your mail.
> 
> Actually I find it difficult to write a change log describing this (after
> all, I am not a language teacher :) ).
> 
> I would say the original translation looks like a little bit more by machine: 
> English word to Chinese word directly without considering particular 
> scenarios (such as software development related terms we used in Chinese, a 
> little bit different to normal usage maybe). So I tried to re-tell the story 
> in a way more kind of 'human' to make everything clearer for Chinese reader.

Hi Ran,

I don't think that describes your new translation correctly. You are not
're-telling a story' here; this is a standard community co-work process,
and we don't need that. Also, the original translation was not done by
machine; I did it myself.

What you did right is polishing the Chinese wording, making it more fluent
and a better fit for Chinese custom, although at the cost of some verbosity
and a bit of precision.


Thanks
Alex

> 
> Anyway, I am willing to provide you such change log if you could provide me 
> an example for reference (this is my first time to post such patch).
> 
> Thanks & Regards,
> Ran
>

[PATCH 2/3] mm/memcg: remove rcu locking for lock_page_lruvec function series

2020-12-16 Thread Alex Shi

rcu_read_lock() was used to block memcg destruction, but under the documented
calling conditions the memcg cannot go away since the page is held. So we no
longer need it; remove it to save locking overhead in debugging mode.

Signed-off-by: Alex Shi 
Cc: Hugh Dickins 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Vladimir Davydov 
Cc: Andrew Morton 
Cc: cgro...@vger.kernel.org
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/memcontrol.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e6b50d068b2f..98bbee1d2faf 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1356,10 +1356,8 @@ struct lruvec *lock_page_lruvec(struct page *page)
struct lruvec *lruvec;
struct pglist_data *pgdat = page_pgdat(page);
 
-   rcu_read_lock();
lruvec = mem_cgroup_page_lruvec(page, pgdat);
spin_lock(&lruvec->lru_lock);
-   rcu_read_unlock();
 
lruvec_memcg_debug(lruvec, page);
 
@@ -1371,10 +1369,8 @@ struct lruvec *lock_page_lruvec_irq(struct page *page)
struct lruvec *lruvec;
struct pglist_data *pgdat = page_pgdat(page);
 
-   rcu_read_lock();
lruvec = mem_cgroup_page_lruvec(page, pgdat);
spin_lock_irq(&lruvec->lru_lock);
-   rcu_read_unlock();
 
lruvec_memcg_debug(lruvec, page);
 
@@ -1386,10 +1382,8 @@ struct lruvec *lock_page_lruvec_irqsave(struct page 
*page, unsigned long *flags)
struct lruvec *lruvec;
struct pglist_data *pgdat = page_pgdat(page);
 
-   rcu_read_lock();
lruvec = mem_cgroup_page_lruvec(page, pgdat);
spin_lock_irqsave(&lruvec->lru_lock, *flags);
-   rcu_read_unlock();
 
lruvec_memcg_debug(lruvec, page);
 
-- 
2.29.GIT

[PATCH 3/3] mm/compaction: remove rcu_read_lock during page compaction

2020-12-16 Thread Alex Shi

rcu_read_lock() was used to guard against memcg destruction; TestClearPageLRU
now prevents that from happening, so we don't need it. Remove it to reduce
locking overhead in debugging mode.

Signed-off-by: Alex Shi 
Cc: Hugh Dickins 
Cc: Johannes Weiner 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/compaction.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 8049d3530812..02af220fb992 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -995,7 +995,6 @@ isolate_migratepages_block(struct compact_control *cc, 
unsigned long low_pfn,
if (!TestClearPageLRU(page))
goto isolate_fail_put;
 
-   rcu_read_lock();
lruvec = mem_cgroup_page_lruvec(page, pgdat);
 
/* If we already hold the lock, we can skip some rechecking */
@@ -1005,7 +1004,6 @@ isolate_migratepages_block(struct compact_control *cc, 
unsigned long low_pfn,
 
compact_lock_irqsave(&lruvec->lru_lock, &flags, cc);
locked = lruvec;
-   rcu_read_unlock();
 
lruvec_memcg_debug(lruvec, page);
 
@@ -1026,8 +1024,7 @@ isolate_migratepages_block(struct compact_control *cc, 
unsigned long low_pfn,
SetPageLRU(page);
goto isolate_fail_put;
}
-   } else
-   rcu_read_unlock();
+   }
 
/* The whole page is taken off the LRU; skip the tail pages. */
if (PageCompound(page))
-- 
2.29.GIT

[PATCH 1/3] mm/memcg: revise the using condition of lock_page_lruvec function series

2020-12-16 Thread Alex Shi

The series function could be used under lock_page_memcg(), add this and
a bit style changes following commit_charge().

Signed-off-by: Alex Shi 
Cc: Hugh Dickins 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Vladimir Davydov 
Cc: Andrew Morton 
Cc: cgro...@vger.kernel.org
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/memcontrol.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b80328f52fb4..e6b50d068b2f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1345,10 +1345,11 @@ void lruvec_memcg_debug(struct lruvec *lruvec, struct 
page *page)
  * lock_page_lruvec - lock and return lruvec for a given page.
  * @page: the page
  *
- * This series functions should be used in either conditions:
- * PageLRU is cleared or unset
- * or page->_refcount is zero
- * or page is locked.
+ * This series functions should be used in any one of following conditions:
+ * - PageLRU is cleared or unset
+ * - page->_refcount is zero
+ * - page is locked.
+ * - lock_page_memcg()
  */
 struct lruvec *lock_page_lruvec(struct page *page)
 {
-- 
2.29.GIT

Re: [PATCH] mm/mmap: replace if (cond) BUG() with BUG_ON()

2020-12-11 Thread Alex Shi



I'm very sorry, there was a typo here. The patch should be updated as follows:

From ed4fa1c6d5bed5766c5f0c35af0c597855d7be06 Mon Sep 17 00:00:00 2001
From: Alex Shi 
Date: Fri, 11 Dec 2020 21:26:46 +0800
Subject: [PATCH] mm/mmap: replace if (cond) BUG() with BUG_ON()

coccinelle reports some warnings:
WARNING: Use BUG_ON instead of if condition followed by BUG.

It could be fixed by BUG_ON().

Reported-by: ab...@linux.alibaba.com
Signed-off-by: Alex Shi 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/mmap.c | 22 --
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 8144fc3c5a78..107fa91bb59f 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -712,9 +712,8 @@ static void __insert_vm_struct(struct mm_struct *mm, struct 
vm_area_struct *vma)
struct vm_area_struct *prev;
struct rb_node **rb_link, *rb_parent;
 
-   if (find_vma_links(mm, vma->vm_start, vma->vm_end,
-  &prev, &rb_link, &rb_parent))
-   BUG();
+   BUG_ON(find_vma_links(mm, vma->vm_start, vma->vm_end,
+  &prev, &rb_link, &rb_parent));
__vma_link(mm, vma, prev, rb_link, rb_parent);
mm->map_count++;
 }
@@ -3585,9 +3584,8 @@ static void vm_lock_anon_vma(struct mm_struct *mm, struct 
anon_vma *anon_vma)
 * can't change from under us thanks to the
 * anon_vma->root->rwsem.
 */
-   if (__test_and_set_bit(0, (unsigned long *)
-  &anon_vma->root->rb_root.rb_root.rb_node))
-   BUG();
+   BUG_ON(__test_and_set_bit(0, (unsigned long *)
+   &anon_vma->root->rb_root.rb_root.rb_node));
}
 }
 
@@ -3603,8 +3601,7 @@ static void vm_lock_mapping(struct mm_struct *mm, struct 
address_space *mapping)
 * mm_all_locks_mutex, there may be other cpus
 * changing other bitflags in parallel to us.
 */
-   if (test_and_set_bit(AS_MM_ALL_LOCKS, &mapping->flags))
-   BUG();
+   BUG_ON(test_and_set_bit(AS_MM_ALL_LOCKS, &mapping->flags));
down_write_nest_lock(&mapping->i_mmap_rwsem, &mm->mmap_lock);
}
 }
@@ -3701,9 +3698,8 @@ static void vm_unlock_anon_vma(struct anon_vma *anon_vma)
 * can't change from under us until we release the
 * anon_vma->root->rwsem.
 */
-   if (!__test_and_clear_bit(0, (unsigned long *)
- &anon_vma->root->rb_root.rb_root.rb_node))
-   BUG();
+   BUG_ON(!__test_and_clear_bit(0, (unsigned long *)
+   &anon_vma->root->rb_root.rb_root.rb_node));
anon_vma_unlock_write(anon_vma);
}
 }
@@ -3716,9 +3712,7 @@ static void vm_unlock_mapping(struct address_space 
*mapping)
 * because we hold the mm_all_locks_mutex.
 */
i_mmap_unlock_write(mapping);
-   if (!test_and_clear_bit(AS_MM_ALL_LOCKS,
-   &mapping->flags))
-   BUG();
+   BUG_ON(!test_and_clear_bit(AS_MM_ALL_LOCKS, &mapping->flags));
}
 }
 
-- 
2.29.GIT
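
For readers unfamiliar with the pattern, the conversion in the patch above
can be modelled in userspace. The macros below are stand-ins that count hits
instead of trapping (the kernel's generic BUG_ON() additionally wraps the
condition in unlikely()), and fake_test_and_set() is a hypothetical,
non-atomic substitute for test_and_set_bit(). The point of the sketch is that
the condition, including its side effect, is still evaluated exactly once in
both forms.

```c
/* Userspace stand-ins: count hits instead of crashing the machine. */
static int bug_hits;
#define BUG()		(bug_hits++)
#define BUG_ON(cond)	do { if (cond) BUG(); } while (0)

/* Hypothetical, non-atomic model of test_and_set_bit(): set, return old. */
static int fake_test_and_set(int *flag)
{
	int old = *flag;

	*flag = 1;
	return old;
}

/* Before the patch: open-coded check followed by BUG(). */
static void lock_old_style(int *flag)
{
	if (fake_test_and_set(flag))
		BUG();
}

/* After the patch: the same check folded into BUG_ON(). */
static void lock_new_style(int *flag)
{
	BUG_ON(fake_test_and_set(flag));
}
```

Calling either function on a clear flag sets it without a hit; calling it
again on the already-set flag records one hit, so the two styles behave the
same in this model.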

[PATCH] mm/zsmalloc: replace if (cond) BUG() with BUG_ON()

2020-12-11 Thread Alex Shi

coccinelle reports some warning:
WARNING: Use BUG_ON instead of if condition followed by BUG.

It could be fixed by BUG_ON().

Reported-by: ab...@linux.alibaba.com
Signed-off-by: Alex Shi 
Cc: Minchan Kim 
Cc: Nitin Gupta 
Cc: Sergey Senozhatsky 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/zsmalloc.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 7289f502ffac..1ea0605dbe94 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1988,8 +1988,7 @@ static int zs_page_migrate(struct address_space *mapping, 
struct page *newpage,
head = obj_to_head(page, addr);
if (head & OBJ_ALLOCATED_TAG) {
handle = head & ~OBJ_ALLOCATED_TAG;
-   if (!testpin_tag(handle))
-   BUG();
+   BUG_ON(!testpin_tag(handle));
 
old_obj = handle_to_obj(handle);
obj_to_location(old_obj, &dummy, &obj_idx);
@@ -2036,8 +2035,8 @@ static int zs_page_migrate(struct address_space *mapping, 
struct page *newpage,
head = obj_to_head(page, addr);
if (head & OBJ_ALLOCATED_TAG) {
handle = head & ~OBJ_ALLOCATED_TAG;
-   if (!testpin_tag(handle))
-   BUG();
+   BUG_ON(!testpin_tag(handle));
+
unpin_tag(handle);
}
}
-- 
2.29.GIT

[PATCH] mm/mmap: replace if (cond) BUG() with BUG_ON()

2020-12-11 Thread Alex Shi

coccinelle reports some warnings:
WARNING: Use BUG_ON instead of if condition followed by BUG.

It could be fixed by BUG_ON().

Reported-by: ab...@linux.alibaba.com
Signed-off-by: Alex Shi 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/mmap.c | 22 --
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 8144fc3c5a78..2dab93dedae6 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -712,9 +712,8 @@ static void __insert_vm_struct(struct mm_struct *mm, struct 
vm_area_struct *vma)
struct vm_area_struct *prev;
struct rb_node **rb_link, *rb_parent;
 
-   if (find_vma_links(mm, vma->vm_start, vma->vm_end,
-  &prev, &rb_link, &rb_parent))
-   BUG();
+   BUF_ON(find_vma_links(mm, vma->vm_start, vma->vm_end,
+  &prev, &rb_link, &rb_parent));
__vma_link(mm, vma, prev, rb_link, rb_parent);
mm->map_count++;
 }
@@ -3585,9 +3584,8 @@ static void vm_lock_anon_vma(struct mm_struct *mm, struct 
anon_vma *anon_vma)
 * can't change from under us thanks to the
 * anon_vma->root->rwsem.
 */
-   if (__test_and_set_bit(0, (unsigned long *)
-  &anon_vma->root->rb_root.rb_root.rb_node))
-   BUG();
+   BUG_ON(__test_and_set_bit(0, (unsigned long *)
+   &anon_vma->root->rb_root.rb_root.rb_node));
}
 }
 
@@ -3603,8 +3601,7 @@ static void vm_lock_mapping(struct mm_struct *mm, struct 
address_space *mapping)
 * mm_all_locks_mutex, there may be other cpus
 * changing other bitflags in parallel to us.
 */
-   if (test_and_set_bit(AS_MM_ALL_LOCKS, &mapping->flags))
-   BUG();
+   BUG_ON(test_and_set_bit(AS_MM_ALL_LOCKS, &mapping->flags));
down_write_nest_lock(&mapping->i_mmap_rwsem, &mm->mmap_lock);
}
 }
@@ -3701,9 +3698,8 @@ static void vm_unlock_anon_vma(struct anon_vma *anon_vma)
 * can't change from under us until we release the
 * anon_vma->root->rwsem.
 */
-   if (!__test_and_clear_bit(0, (unsigned long *)
- &anon_vma->root->rb_root.rb_root.rb_node))
-   BUG();
+   BUG_ON(!__test_and_clear_bit(0, (unsigned long *)
+   &anon_vma->root->rb_root.rb_root.rb_node));
anon_vma_unlock_write(anon_vma);
}
 }
@@ -3716,9 +3712,7 @@ static void vm_unlock_mapping(struct address_space 
*mapping)
 * because we hold the mm_all_locks_mutex.
 */
i_mmap_unlock_write(mapping);
-   if (!test_and_clear_bit(AS_MM_ALL_LOCKS,
-   &mapping->flags))
-   BUG();
+   BUG_ON(!test_and_clear_bit(AS_MM_ALL_LOCKS, &mapping->flags));
}
 }
 
-- 
2.29.GIT

Re: [PATCH 00/11] mm: lru related cleanups

2020-12-10 Thread Alex Shi

Hi Yu,

BTW, after this patchset it becomes possible to cacheline-align each of the
lru lists; did you try that to see whether performance changes?

Thanks
Alex

On 2020/12/8 6:09 AM, Yu Zhao wrote:
> The cleanups are intended to reduce the verbosity in lru list
> operations and make them less error-prone. A typical example
> would be how the patches change __activate_page():
> 
>  static void __activate_page(struct page *page, struct lruvec *lruvec)
>  {
>   if (!PageActive(page) && !PageUnevictable(page)) {
> - int lru = page_lru_base_type(page);
>   int nr_pages = thp_nr_pages(page);
>  
> - del_page_from_lru_list(page, lruvec, lru);
> + del_page_from_lru_list(page, lruvec);
>   SetPageActive(page);
> - lru += LRU_ACTIVE;
> - add_page_to_lru_list(page, lruvec, lru);
> + add_page_to_lru_list(page, lruvec);
>   trace_mm_lru_activate(page);
>  
> There are a few more places like __activate_page() and they are
> unnecessarily repetitive in terms of figuring out which list a page
> should be added onto or deleted from. And with the duplicated code
> removed, they are easier to read, IMO.
> 
> Patch 1 to 5 basically cover the above. Patch 6 and 7 make code more
> robust by improving bug reporting. Patch 8, 9 and 10 take care of
> some dangling helpers left in header files. Patch 11 isn't strictly a
> clean-up patch, but it seems still relevant to include it here.
> 
> Yu Zhao (11):
>   mm: use add_page_to_lru_list()
>   mm: shuffle lru list addition and deletion functions
>   mm: don't pass "enum lru_list" to lru list addition functions
>   mm: don't pass "enum lru_list" to trace_mm_lru_insertion()
>   mm: don't pass "enum lru_list" to del_page_from_lru_list()
>   mm: add __clear_page_lru_flags() to replace page_off_lru()
>   mm: VM_BUG_ON lru page flags
>   mm: fold page_lru_base_type() into its sole caller
>   mm: fold __update_lru_size() into its sole caller
>   mm: make lruvec_lru_size() static
>   mm: enlarge the "int nr_pages" parameter of update_lru_size()
> 
>  include/linux/memcontrol.h |  10 +--
>  include/linux/mm_inline.h  | 115 ++---
>  include/linux/mmzone.h |   2 -
>  include/linux/vmstat.h |   6 +-
>  include/trace/events/pagemap.h |  11 ++--
>  mm/compaction.c|   2 +-
>  mm/memcontrol.c|  10 +--
>  mm/mlock.c |   3 +-
>  mm/swap.c  |  50 ++
>  mm/vmscan.c|  21 ++
>  10 files changed, 91 insertions(+), 139 deletions(-)
>
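
The core of the cleanup quoted above, letting the list helpers work out the
target list from the page instead of having every caller pass it, can be
sketched in userspace as follows. The types are hypothetical simplifications
(page flags as bools, lists as counters), and the helper names merely echo
the kernel ones; this is not the code from the series.

```c
#include <assert.h>
#include <stdbool.h>

enum lru_list {
	LRU_INACTIVE_ANON,
	LRU_ACTIVE_ANON,
	LRU_INACTIVE_FILE,
	LRU_ACTIVE_FILE,
	LRU_UNEVICTABLE,
	NR_LRU_LISTS
};

/* Hypothetical, simplified stand-ins: flags as bools, lists as counters. */
struct page { bool active, unevictable, file_backed; };
struct lruvec { long nr[NR_LRU_LISTS]; };

/* Derive the list from the page's own state instead of taking it as an argument. */
static enum lru_list page_lru(const struct page *page)
{
	int lru;

	if (page->unevictable)
		return LRU_UNEVICTABLE;
	lru = page->file_backed ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON;
	if (page->active)
		lru += 1;	/* the active list follows its inactive sibling */
	return (enum lru_list)lru;
}

static void add_page_to_lru_list(struct page *page, struct lruvec *lruvec)
{
	lruvec->nr[page_lru(page)]++;
}

static void del_page_from_lru_list(struct page *page, struct lruvec *lruvec)
{
	assert(lruvec->nr[page_lru(page)] > 0);	/* page must be on that list */
	lruvec->nr[page_lru(page)]--;
}

/* Mirrors the shape of the cleaned-up __activate_page() quoted above. */
static void activate_page_sketch(struct page *page, struct lruvec *lruvec)
{
	if (!page->active && !page->unevictable) {
		del_page_from_lru_list(page, lruvec);	/* still inactive here */
		page->active = true;
		add_page_to_lru_list(page, lruvec);	/* now lands on the active list */
	}
}
```

Because the list is recomputed from the flags on each call, the order of
operations matters: the page must be deleted before its active flag is set,
otherwise the deletion would be charged to the wrong list.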

Re: [PATCH v2] docs/zh_CN: Improve Cinese transolation quality.

2020-12-08 Thread Alex Shi

Reviewed-by: Alex Shi 


On 2020/12/8 9:16 PM, Ran Wang wrote:
> Signed-off-by: Ran Wang 
> ---
> Change in v2:
>   - For 'cn_development_coding' part, change back to '是关于编码过程的'
> 
>  .../translations/zh_CN/process/1.Intro.rst| 61 ++-
>  1 file changed, 32 insertions(+), 29 deletions(-)
> 
> diff --git a/Documentation/translations/zh_CN/process/1.Intro.rst 
> b/Documentation/translations/zh_CN/process/1.Intro.rst
> index 10a15f3dc282..f05405d96c51 100644
> --- a/Documentation/translations/zh_CN/process/1.Intro.rst
> +++ b/Documentation/translations/zh_CN/process/1.Intro.rst
> @@ -11,33 +11,35 @@
>  执行摘要
>  
>  
> -本节的其余部分涵盖了内核开发过程的范围，以及开发人员及其雇主在这方面可能遇
> -到的各种挫折。内核代码应该合并到正式的（“主线”）内核中有很多原因，包括对用
> -户的自动可用性、多种形式的社区支持以及影响内核开发方向的能力。提供给Linux
> -内核的代码必须在与GPL兼容的许可证下可用。
> +本节的其余部分介绍了内核开发流程相关知识，其中包括开发者及其雇主在这方面可能遇
> +到的各种问题。内核代码合并到官方的（“主线”）仓库会有很多好处，比如能在第一时
> +间让用户获得更新，可以从社区得到多种形式的支持，以及能够以此影响内核未来发展方
> +向。需要注意提供给Linux内核的代码必须是在与GPL兼容的许可证下使用。
>  
> -:ref:`cn_development_process` 介绍了开发过程、内核发布周期和合并窗口的机制。
> -涵盖了补丁开发、审查和合并周期中的各个阶段。有一些关于工具和邮件列表的讨论。
> -鼓励希望开始内核开发的开发人员作为初始练习跟踪并修复bug。
> +:ref:`cn_development_process` 介绍了内核开发流程、发布周期以及合并窗口期相关的
> +机制。同时还讲解了补丁开发、审查和合并周期各个阶段要点。此外还包括一些关于工具
> +和邮件列表的讨论。我们建议那些希望开始内核开发的开发者们以跟踪并修复bug作为初
> +始练习。
>  
>  
> -:ref:`cn_development_early_stage` 包括早期项目规划，重点是尽快让开发社区参与
> +:ref:`cn_development_early_stage` 介绍项目早期的工作规划，重点是尽快让开发社区
> +有机会参与到规划中
>  
> -:ref:`cn_development_coding` 是关于编码过程的；讨论了其他开发人员遇到的几个
> -陷阱。对补丁的一些要求已经涵盖，并且介绍了一些工具，这些工具有助于确保内核
> -补丁是正确的。
> +:ref:`cn_development_coding` 是关于编码过程的；讨论了一些其他开发者曾经走入到
> +的误区。并介绍社区对补丁的要求，同时指导如何通过使用一些工具来帮助确保内核补
> +丁的质量。
>  
> -:ref:`cn_development_posting` 讨论发布补丁以供评审的过程。为了让开发社区
> -认真对待，补丁必须正确格式化和描述，并且必须发送到正确的地方。遵循本节中的
> -建议有助于确保为您的工作提供最好的接纳。
> +:ref:`cn_development_posting` 介绍发布补丁以供评审的流程。补丁只有在符合特定的
> +格式及正确描述，并且发送到正确的地方，开发社区才有可能对其认真审查。遵循本节中
> +的建议有助于确保为您的工作成果提供最好的接纳。
>  
> -:ref:`cn_development_followthrough` 介绍了发布补丁之后发生的事情；该工作
> -在这一点上还远远没有完成。与审阅者一起工作是开发过程中的一个重要部分；本节
> -提供了一些关于如何在这个重要阶段避免问题的提示。当补丁被合并到主线中时，
> -开发人员要注意不要假定任务已经完成。
> +:ref:`cn_development_followthrough` 介绍了提交补丁之后发生的事情；至此工作实际
> +上还远未完成。与审阅者一起合作是开发过程中的重要部分；本节提供了一些关于如何在
> +这个重要阶段避免出现问题的提示。此外，即使当补丁已经被合并到主线中，开发者也不
> +能认为任务就此完成。
>  
>  :ref:`cn_development_advancedtopics` 介绍了两个“高级”主题：
> -使用Git管理补丁和查看其他人发布的补丁。
> +使用Git管理补丁和查看其他人提交的补丁。
>  
>  :ref:`cn_development_conclusion` 总结了有关内核开发的更多信息，附带有带有
>  指向资源的链接.
> @@ -62,19 +64,20 @@ Linux最引人注目的特性之一是这些开发人员可以访问它；任何
>  内核开发周期可以涉及1000多个开发人员，他们为100多个不同的公司
>  （或者根本没有公司）工作。
>  
> -与内核开发社区合作并不是特别困难。但是，尽管如此，许多潜在的贡献者在尝试做
> -内核工作时遇到了困难。内核社区已经发展了自己独特的操作方式，使其能够在每天
> -都要更改数千行代码的环境中顺利运行（并生成高质量的产品）。因此，Linux内核开发
> +与内核开发社区合作并不是特别困难。但是，尽管如此，许多潜在的贡献者在尝试参与
> +内核开发时遇到了困难。内核社区已经发展了自己独特的开发流程，使其能够在每天
> +都要更改数千行代码的环境中顺利运转（并生成高质量的产品）。因此，Linux内核开发
>  过程与专有的开发方法有很大的不同也就不足为奇了。
>  
> -对于新开发人员来说，内核的开发过程可能会让人感到奇怪和恐惧，但这个背后有充分的
> -理由和坚实的经验。一个不了解内核社区的方式的开发人员（或者更糟的是，他们试图
> -抛弃或规避内核社区的方式）会有一个令人沮丧的体验。开发社区, 在帮助那些试图学习
> -的人的同时，没有时间帮助那些不愿意倾听或不关心开发过程的人。
> +对于新开发者来说，内核的开发流程可能会让人感到陌生和望而生畏，但这个背后其实
> +是有充分的理由和坚实的实际经验作支撑。一个不了解内核社区工作方式的开发者（或
> +者更糟的是，如果他们试图抛弃或规避内核社区的方式）将会有一个令人沮丧的体验。
> +毕竟开发社区在帮助那些试图学习的人的同时，没有时间帮助那些不愿意倾听或不关心
> +开发流程的人。
>  
> -希望阅读本文的人能够避免这种令人沮丧的经历。这里有很多材料，但阅读时所做的
> -努力会在短时间内得到回报。开发社区总是需要能让内核变更好的开发人员；下面的
> -文本应该帮助您或为您工作的人员加入我们的社区。
> +希望大家能通过阅读本文来避免那些令人沮丧的经历。这里有很多材料，请相信阅读这
> +些所付出的努力将会在短时间内得到回报。开发社区总是需要那些能让内核变更好的
> +开发者；下面的文章应当能帮助您或为您工作的人加入我们的社区。
>  
>  致谢
>  
>

Re: [PATCH 11/11] mm: enlarge the "int nr_pages" parameter of update_lru_size()

2020-12-08 Thread Alex Shi

Reviewed-by: Alex Shi 

On 2020/12/8 6:09 AM, Yu Zhao wrote:
> update_lru_sizes() defines an unsigned long argument and passes it as
> nr_pages to update_lru_size(). Though this isn't causing any overflows
> I'm aware of, it's a bad idea to go through the demotion given that we
> have recently stumbled on a related type promotion problem fixed by
> commit 2da9f6305f30 ("mm/vmscan: fix NR_ISOLATED_FILE corruption on 64-bit")
> 
> Note that the underlying counters are already in long. This is another
> reason we shouldn't have the demotion.
> 
> This patch enlarges all relevant parameters on the path to the final
> underlying counters:
>   update_lru_size(int -> long)
>   if memcg:
>   __mod_lruvec_state(int -> long)
>   if smp:
>   __mod_node_page_state(long)
>   else:
>   __mod_node_page_state(int -> long)
>   __mod_memcg_lruvec_state(int -> long)
>   __mod_memcg_state(int -> long)
>   else:
>   __mod_lruvec_state(int -> long)
>   if smp:
>   __mod_node_page_state(long)
>   else:
>   __mod_node_page_state(int -> long)
> 
>   __mod_zone_page_state(long)
> 
>   if memcg:
>   mem_cgroup_update_lru_size(int -> long)
> 
> Note that __mod_node_page_state() for the smp case and
> __mod_zone_page_state() already use long. So this change also fixes
> the inconsistency.
> 
> Signed-off-by: Yu Zhao 
> ---
>  include/linux/memcontrol.h | 10 +-
>  include/linux/mm_inline.h  |  2 +-
>  include/linux/vmstat.h |  6 +++---
>  mm/memcontrol.c| 10 +-
>  4 files changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 3febf64d1b80..1454201abb8d 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -810,7 +810,7 @@ static inline bool mem_cgroup_online(struct mem_cgroup 
> *memcg)
>  int mem_cgroup_select_victim_node(struct mem_cgroup *memcg);
>  
>  void mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,
> - int zid, int nr_pages);
> + int zid, long nr_pages);
>  
>  static inline
>  unsigned long mem_cgroup_get_zone_lru_size(struct lruvec *lruvec,
> @@ -896,7 +896,7 @@ static inline unsigned long memcg_page_state_local(struct 
> mem_cgroup *memcg,
>   return x;
>  }
>  
> -void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val);
> +void __mod_memcg_state(struct mem_cgroup *memcg, int idx, long val);
>  
>  /* idx can be of type enum memcg_stat_item or node_stat_item */
>  static inline void mod_memcg_state(struct mem_cgroup *memcg,
> @@ -948,7 +948,7 @@ static inline unsigned long 
> lruvec_page_state_local(struct lruvec *lruvec,
>  }
>  
>  void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
> -   int val);
> +   long val);
>  void __mod_lruvec_kmem_state(void *p, enum node_stat_item idx, int val);
>  
>  static inline void mod_lruvec_kmem_state(void *p, enum node_stat_item idx,
> @@ -1346,7 +1346,7 @@ static inline unsigned long 
> memcg_page_state_local(struct mem_cgroup *memcg,
>  
>  static inline void __mod_memcg_state(struct mem_cgroup *memcg,
>int idx,
> -  int nr)
> +  long nr)
>  {
>  }
>  
> @@ -1369,7 +1369,7 @@ static inline unsigned long 
> lruvec_page_state_local(struct lruvec *lruvec,
>  }
>  
>  static inline void __mod_memcg_lruvec_state(struct lruvec *lruvec,
> - enum node_stat_item idx, int val)
> + enum node_stat_item idx, long val)
>  {
>  }
>  
> diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
> index 355ea1ee32bd..18e85071b44a 100644
> --- a/include/linux/mm_inline.h
> +++ b/include/linux/mm_inline.h
> @@ -26,7 +26,7 @@ static inline int page_is_file_lru(struct page *page)
>  
>  static __always_inline void update_lru_size(struct lruvec *lruvec,
>   enum lru_list lru, enum zone_type zid,
> - int nr_pages)
> + long nr_pages)
>  {
>   struct pglist_data *pgdat = lruvec_pgdat(lruvec);
>  
> diff --g
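
To illustrate the demotion that the commit message above warns about, here is a minimal user-space sketch. It is not kernel code: apply_via_int() and apply_via_long() are hypothetical stand-ins for a call path that narrows nr_pages to int and one that keeps it as long, and the sketch assumes an LP64 platform where long is wider than int.

```c
#include <assert.h>
#include <limits.h>

/* Assumption of this sketch: long is wider than int (LP64). */
static_assert(sizeof(long) > sizeof(int), "sketch assumes LP64");

/*
 * Hypothetical path with an int parameter in the middle: the value is
 * narrowed before it reaches the long counter, so anything above
 * INT_MAX is no longer the value the caller passed.
 */
long apply_via_int(unsigned long nr_pages)
{
	long counter = 0;
	int narrowed = (int)nr_pages;	/* the demotion */

	counter += narrowed;
	return counter;
}

/* Hypothetical path that keeps long all the way to the counter. */
long apply_via_long(unsigned long nr_pages)
{
	long counter = 0;

	counter += (long)nr_pages;
	return counter;
}
```

Calling both helpers with a value just above INT_MAX, for example 1UL << 31, shows the difference: only the long path hands the counter the value that was passed in.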

Re: [PATCH 10/11] mm: make lruvec_lru_size() static

2020-12-08 Thread Alex Shi

Reviewed-by: Alex Shi 

On 2020/12/8 6:09 AM, Yu Zhao wrote:
> All other references to the function were removed after
> commit b910718a948a ("mm: vmscan: detect file thrashing at the reclaim root")
> 
> Signed-off-by: Yu Zhao 
> ---
>  include/linux/mmzone.h | 2 --
>  mm/vmscan.c| 3 ++-
>  2 files changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index b593316bff3d..2fc54e269eaf 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -872,8 +872,6 @@ static inline struct pglist_data *lruvec_pgdat(struct 
> lruvec *lruvec)
>  #endif
>  }
>  
> -extern unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list 
> lru, int zone_idx);
> -
>  #ifdef CONFIG_HAVE_MEMORYLESS_NODES
>  int local_memory_node(int node_id);
>  #else
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 95e581c9d9af..fd0c2313bee4 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -310,7 +310,8 @@ unsigned long zone_reclaimable_pages(struct zone *zone)
>   * @lru: lru to use
>   * @zone_idx: zones to consider (use MAX_NR_ZONES for the whole LRU list)
>   */
> -unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru, int 
> zone_idx)
> +static unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list 
> lru,
> +  int zone_idx)
>  {
>   unsigned long size = 0;
>   int zid;
>

Re: [PATCH 09/11] mm: fold __update_lru_size() into its sole caller

2020-12-08 Thread Alex Shi

Reviewed-by: Alex Shi 

On 2020/12/8 6:09 AM, Yu Zhao wrote:
> All other references to the function were removed after
> commit a892cb6b977f ("mm/vmscan.c: use update_lru_size() in 
> update_lru_sizes()")
> 
> Signed-off-by: Yu Zhao 
> ---
>  include/linux/mm_inline.h | 9 +
>  1 file changed, 1 insertion(+), 8 deletions(-)
> 
> diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
> index 7183c7a03f09..355ea1ee32bd 100644
> --- a/include/linux/mm_inline.h
> +++ b/include/linux/mm_inline.h
> @@ -24,7 +24,7 @@ static inline int page_is_file_lru(struct page *page)
>   return !PageSwapBacked(page);
>  }
>  
> -static __always_inline void __update_lru_size(struct lruvec *lruvec,
> +static __always_inline void update_lru_size(struct lruvec *lruvec,
>   enum lru_list lru, enum zone_type zid,
>   int nr_pages)
>  {
> @@ -33,13 +33,6 @@ static __always_inline void __update_lru_size(struct 
> lruvec *lruvec,
>   __mod_lruvec_state(lruvec, NR_LRU_BASE + lru, nr_pages);
>   __mod_zone_page_state(&pgdat->node_zones[zid],
>   NR_ZONE_LRU_BASE + lru, nr_pages);
> -}
> -
> -static __always_inline void update_lru_size(struct lruvec *lruvec,
> - enum lru_list lru, enum zone_type zid,
> - int nr_pages)
> -{
> - __update_lru_size(lruvec, lru, zid, nr_pages);
>  #ifdef CONFIG_MEMCG
>   mem_cgroup_update_lru_size(lruvec, lru, zid, nr_pages);
>  #endif
>

Re: [PATCH 08/11] mm: fold page_lru_base_type() into its sole caller

2020-12-08 Thread Alex Shi

Reviewed-by: Alex Shi 

On 2020/12/8 6:09 AM, Yu Zhao wrote:
> We've removed all other references to this function.
> 
> Signed-off-by: Yu Zhao 
> ---
>  include/linux/mm_inline.h | 27 ++-
>  1 file changed, 6 insertions(+), 21 deletions(-)
> 
> diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
> index 6d907a4dd6ad..7183c7a03f09 100644
> --- a/include/linux/mm_inline.h
> +++ b/include/linux/mm_inline.h
> @@ -45,21 +45,6 @@ static __always_inline void update_lru_size(struct lruvec 
> *lruvec,
>  #endif
>  }
>  
> -/**
> - * page_lru_base_type - which LRU list type should a page be on?
> - * @page: the page to test
> - *
> - * Used for LRU list index arithmetic.
> - *
> - * Returns the base LRU type - file or anon - @page should be on.
> - */
> -static inline enum lru_list page_lru_base_type(struct page *page)
> -{
> - if (page_is_file_lru(page))
> - return LRU_INACTIVE_FILE;
> - return LRU_INACTIVE_ANON;
> -}
> -
>  /**
>   * __clear_page_lru_flags - clear page lru flags before releasing a page
>   * @page: the page that was on lru and now has a zero reference
> @@ -92,12 +77,12 @@ static __always_inline enum lru_list page_lru(struct page 
> *page)
>   VM_BUG_ON_PAGE(PageActive(page) && PageUnevictable(page), page);
>  
>   if (PageUnevictable(page))
> - lru = LRU_UNEVICTABLE;
> - else {
> - lru = page_lru_base_type(page);
> - if (PageActive(page))
> - lru += LRU_ACTIVE;
> - }
> + return LRU_UNEVICTABLE;
> +
> + lru = page_is_file_lru(page) ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON;
> + if (PageActive(page))
> + lru += LRU_ACTIVE;
> +
>   return lru;
>  }
>  
>
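
The index arithmetic that the folded page_lru() relies on can be sketched in user space as follows. The enum values mirror the kernel's enum lru_list layout as I understand it; struct fake_page and fake_page_lru() are hypothetical names used only for this illustration, with plain booleans standing in for the page flags.

```c
#include <assert.h>
#include <stdbool.h>

#define LRU_BASE	0
#define LRU_ACTIVE	1
#define LRU_FILE	2

enum lru_list {
	LRU_INACTIVE_ANON = LRU_BASE,
	LRU_ACTIVE_ANON = LRU_BASE + LRU_ACTIVE,
	LRU_INACTIVE_FILE = LRU_BASE + LRU_FILE,
	LRU_ACTIVE_FILE = LRU_BASE + LRU_FILE + LRU_ACTIVE,
	LRU_UNEVICTABLE,
	NR_LRU_LISTS
};

/* The "+= LRU_ACTIVE" step only works because of this layout. */
static_assert(LRU_ACTIVE_FILE == LRU_INACTIVE_FILE + LRU_ACTIVE,
	      "active list must directly follow its inactive list");

/* Hypothetical stand-in for struct page: only the flags that matter here. */
struct fake_page {
	bool swap_backed;	/* models PageSwapBacked() */
	bool active;		/* models PageActive() */
	bool unevictable;	/* models PageUnevictable() */
};

/* Same shape as the folded page_lru(): early return, then base + offset. */
enum lru_list fake_page_lru(const struct fake_page *page)
{
	enum lru_list lru;

	if (page->unevictable)
		return LRU_UNEVICTABLE;

	lru = !page->swap_backed ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON;
	if (page->active)
		lru += LRU_ACTIVE;

	return lru;
}
```

A page that is file-backed and active therefore maps to LRU_ACTIVE_FILE, and an unevictable page maps to LRU_UNEVICTABLE regardless of the other flags.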

Re: [PATCH 04/11] mm: don't pass "enum lru_list" to trace_mm_lru_insertion()

2020-12-08 Thread Alex Shi

Reviewed-by: Alex Shi 

On 2020/12/8 6:09 AM, Yu Zhao wrote:
> The parameter is redundant in the sense that it can be extracted
> from the "struct page" parameter by page_lru() correctly.
> 
> Signed-off-by: Yu Zhao 
> ---
>  include/trace/events/pagemap.h | 11 ---
>  mm/swap.c  |  5 +
>  2 files changed, 5 insertions(+), 11 deletions(-)
> 
> diff --git a/include/trace/events/pagemap.h b/include/trace/events/pagemap.h
> index 8fd1babae761..e1735fe7c76a 100644
> --- a/include/trace/events/pagemap.h
> +++ b/include/trace/events/pagemap.h
> @@ -27,24 +27,21 @@
>  
>  TRACE_EVENT(mm_lru_insertion,
>  
> - TP_PROTO(
> - struct page *page,
> - int lru
> - ),
> + TP_PROTO(struct page *page),
>  
> - TP_ARGS(page, lru),
> + TP_ARGS(page),
>  
>   TP_STRUCT__entry(
>   __field(struct page *,  page)
>   __field(unsigned long,  pfn )
> - __field(int,lru )
> + __field(enum lru_list,  lru )
>   __field(unsigned long,  flags   )
>   ),
>  
>   TP_fast_assign(
>   __entry->page   = page;
>   __entry->pfn= page_to_pfn(page);
> - __entry->lru= lru;
> + __entry->lru= page_lru(page);
>   __entry->flags  = trace_pagemap_flags(page);
>   ),
>  
> diff --git a/mm/swap.c b/mm/swap.c
> index 136acabbfab5..e053b4db108a 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -957,7 +957,6 @@ EXPORT_SYMBOL(__pagevec_release);
>  
>  static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec)
>  {
> - enum lru_list lru;
>   int was_unevictable = TestClearPageUnevictable(page);
>   int nr_pages = thp_nr_pages(page);
>  
> @@ -993,11 +992,9 @@ static void __pagevec_lru_add_fn(struct page *page, 
> struct lruvec *lruvec)
>   smp_mb__after_atomic();
>  
>   if (page_evictable(page)) {
> - lru = page_lru(page);
>   if (was_unevictable)
>   __count_vm_events(UNEVICTABLE_PGRESCUED, nr_pages);
>   } else {
> - lru = LRU_UNEVICTABLE;
>   ClearPageActive(page);
>   SetPageUnevictable(page);
>   if (!was_unevictable)
> @@ -1005,7 +1002,7 @@ static void __pagevec_lru_add_fn(struct page *page, 
> struct lruvec *lruvec)
>   }
>  
>   add_page_to_lru_list(page, lruvec);
> - trace_mm_lru_insertion(page, lru);
> + trace_mm_lru_insertion(page);
>  }
>  
>  struct lruvecs {
>

Re: [PATCH 03/11] mm: don't pass "enum lru_list" to lru list addition functions

2020-12-08 Thread Alex Shi




On 2020/12/8 6:09 AM, Yu Zhao wrote:
> The "enum lru_list" parameter to add_page_to_lru_list() and
> add_page_to_lru_list_tail() is redundant in the sense that it can
> be extracted from the "struct page" parameter by page_lru().
> 
> A caveat is that we need to make sure PageActive() or
> PageUnevictable() is correctly set or cleared before calling
> these two functions. And they are indeed.
> 
> Signed-off-by: Yu Zhao 
> ---
>  include/linux/mm_inline.h |  8 ++--
>  mm/swap.c | 15 +++
>  mm/vmscan.c   |  6 ++
>  3 files changed, 15 insertions(+), 14 deletions(-)
> 
> diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
> index 2889741f450a..130ba3201d3f 100644
> --- a/include/linux/mm_inline.h
> +++ b/include/linux/mm_inline.h
> @@ -106,15 +106,19 @@ static __always_inline enum lru_list page_lru(struct 
> page *page)
>  }
>  
>  static __always_inline void add_page_to_lru_list(struct page *page,
> - struct lruvec *lruvec, enum lru_list lru)
> + struct lruvec *lruvec)
>  {
> + enum lru_list lru = page_lru(page);
> +
>   update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page));
>   list_add(&page->lru, &lruvec->lists[lru]);
>  }
>  
>  static __always_inline void add_page_to_lru_list_tail(struct page *page,
> - struct lruvec *lruvec, enum lru_list lru)
> + struct lruvec *lruvec)
>  {
> + enum lru_list lru = page_lru(page);
> +
>   update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page));
>   list_add_tail(&page->lru, &lruvec->lists[lru]);
>  }
> diff --git a/mm/swap.c b/mm/swap.c
> index 5022dfe388ad..136acabbfab5 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -231,7 +231,7 @@ static void pagevec_move_tail_fn(struct page *page, 
> struct lruvec *lruvec)
>   if (!PageUnevictable(page)) {
>   del_page_from_lru_list(page, lruvec, page_lru(page));
>   ClearPageActive(page);
> - add_page_to_lru_list_tail(page, lruvec, page_lru(page));
> + add_page_to_lru_list_tail(page, lruvec);
>   __count_vm_events(PGROTATED, thp_nr_pages(page));
>   }
>  }
> @@ -313,8 +313,7 @@ static void __activate_page(struct page *page, struct 
> lruvec *lruvec)
>  
>   del_page_from_lru_list(page, lruvec, lru);
>   SetPageActive(page);
> - lru += LRU_ACTIVE;

Uh, actually, page-to-lru helpers such as page_lru(page) are always inlined, so
generally there is no increase in instructions, except in a few places like here.

 
> - add_page_to_lru_list(page, lruvec, lru);
> + add_page_to_lru_list(page, lruvec);
>   trace_mm_lru_activate(page);
>  
>   __count_vm_events(PGACTIVATE, nr_pages);
> @@ -543,14 +542,14 @@ static void lru_deactivate_file_fn(struct page *page, 
> struct lruvec *lruvec)
>* It can make readahead confusing.  But race window
>* is _really_ small and  it's non-critical problem.
>*/
> - add_page_to_lru_list(page, lruvec, lru);
> + add_page_to_lru_list(page, lruvec);
>   SetPageReclaim(page);
>   } else {
>   /*
>* The page's writeback ends up during pagevec
>* We moves tha page into tail of inactive.
>*/
> - add_page_to_lru_list_tail(page, lruvec, lru);
> + add_page_to_lru_list_tail(page, lruvec);
>   __count_vm_events(PGROTATED, nr_pages);
>   }
>  
> @@ -570,7 +569,7 @@ static void lru_deactivate_fn(struct page *page, struct 
> lruvec *lruvec)
>   del_page_from_lru_list(page, lruvec, lru + LRU_ACTIVE);
>   ClearPageActive(page);
>   ClearPageReferenced(page);
> - add_page_to_lru_list(page, lruvec, lru);
> + add_page_to_lru_list(page, lruvec);
>  
>   __count_vm_events(PGDEACTIVATE, nr_pages);
>   __count_memcg_events(lruvec_memcg(lruvec), PGDEACTIVATE,
> @@ -595,7 +594,7 @@ static void lru_lazyfree_fn(struct page *page, struct 
> lruvec *lruvec)
>* anonymous pages
>*/
>   ClearPageSwapBacked(page);
> - add_page_to_lru_list(page, lruvec, LRU_INACTIVE_FILE);
> + add_page_to_lru_list(page, lruvec);
>  
>   __count_vm_events(PGLAZYFREE, nr_pages);
>   __count_memcg_events(lruvec_memcg(lruvec), PGLAZYFREE,
> @@ -1005,7 +1004,7 @@ static void __pagevec_lru_add_fn(struct page *page, 
> struct lruvec *lruvec)
>   __count_vm_events(UNEVICTABLE_PGCULLED, nr_pages);
>   }
>  
> - add_page_to_lru_list(page, lruvec, lru);
> + add_page_to_lru_list(page, lruvec);
>   trace_mm_lru_insertion(page, lru);
>  }
>  
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index a174594e40f8..8fc8f2c9d7ec 100644
> --- a/mm/vmscan.c
> +++

Re: [PATCH 03/11] mm: don't pass "enum lru_list" to lru list addition functions

2020-12-08 Thread Alex Shi




On 2020/12/8 6:09 AM, Yu Zhao wrote:
>  
>   __count_vm_events(PGACTIVATE, nr_pages);
> @@ -543,14 +542,14 @@ static void lru_deactivate_file_fn(struct page *page, 
> struct lruvec *lruvec)
>* It can make readahead confusing.  But race window
>* is _really_ small and  it's non-critical problem.
>*/
> - add_page_to_lru_list(page, lruvec, lru);
> + add_page_to_lru_list(page, lruvec);
>   SetPageReclaim(page);
>   } else {
>   /*
>* The page's writeback ends up during pagevec
>* We moves tha page into tail of inactive.
>*/
> - add_page_to_lru_list_tail(page, lruvec, lru);
> + add_page_to_lru_list_tail(page, lruvec);
>   __count_vm_events(PGROTATED, nr_pages);
>   }
>  
> @@ -570,7 +569,7 @@ static void lru_deactivate_fn(struct page *page, struct 
> lruvec *lruvec)
>   del_page_from_lru_list(page, lruvec, lru + LRU_ACTIVE);
>   ClearPageActive(page);
>   ClearPageReferenced(page);
> - add_page_to_lru_list(page, lruvec, lru);
> + add_page_to_lru_list(page, lruvec);
>  
>   __count_vm_events(PGDEACTIVATE, nr_pages);
>   __count_memcg_events(lruvec_memcg(lruvec), PGDEACTIVATE,

It seems that leaving the "lru = xxx" out could save 2 function calls in
lru_deactivate_file_fn(); is this right?

Re: [PATCH 01/11] mm: use add_page_to_lru_list()

2020-12-07 Thread Alex Shi

Reviewed-by: Alex Shi 

On 2020/12/8 6:09 AM, Yu Zhao wrote:
> There is add_page_to_lru_list(), and move_pages_to_lru() should reuse
> it, not duplicate it.
> 
> Signed-off-by: Yu Zhao 
> ---
>  mm/vmscan.c | 6 +-
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 469016222cdb..a174594e40f8 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1821,7 +1821,6 @@ static unsigned noinline_for_stack 
> move_pages_to_lru(struct lruvec *lruvec,
>   int nr_pages, nr_moved = 0;
>   LIST_HEAD(pages_to_free);
>   struct page *page;
> - enum lru_list lru;
>  
>   while (!list_empty(list)) {
>   page = lru_to_page(list);
> @@ -1866,11 +1865,8 @@ static unsigned noinline_for_stack 
> move_pages_to_lru(struct lruvec *lruvec,
>* inhibits memcg migration).
>*/
>   VM_BUG_ON_PAGE(!lruvec_holds_page_lru_lock(page, lruvec), page);
> - lru = page_lru(page);
> + add_page_to_lru_list(page, lruvec, page_lru(page));
>   nr_pages = thp_nr_pages(page);
> -
> - update_lru_size(lruvec, lru, page_zonenum(page), nr_pages);
> - list_add(&page->lru, &lruvec->lists[lru]);
>   nr_moved += nr_pages;
>   if (PageActive(page))
>   workingset_age_nonresident(lruvec, nr_pages);
>

Re: [PATCH] docs/zh_CN: Improve Cinese transolation quality.

2020-12-07 Thread Alex Shi




On 2020/12/7 9:05 PM, Ran Wang wrote:
>>> +:ref:`cn_development_followthrough` 介绍了提交补丁之后发生的事情；至此工作实际
>> Is it beyond 80 chars?
> On my side this line is aligned with its context (in vim), and it passes the
> checkpatch.pl check,

Hmm, right, maybe it is a line alignment issue? But that depends on the editor.
It's fine.
> 
>>> +上还远未完成。与审阅者一起合作是开发过程中的重要部分；本节提供了一些关于如何在
>>> +这个重要阶段避免出现问题的提示。此外，即使当补丁已经被合并到主线中，开发者也不
>>> +能认为任务就此完成。
>>>   
>>>   :ref:`cn_development_advancedtopics` 介绍了两个“高级”主题：
>>> -使用Git管理补丁和查看其他人发布的补丁。
>>> +使用Git管理补丁和查看其他人提交的补丁。
>> Any difference between the above lines?
> 发布 => 提交  :)

Oops, I overlooked that; 提交 is better.

Thanks!

Re: [PATCH] docs/zh_CN: Improve Cinese transolation quality.

2020-12-06 Thread Alex Shi




On 2020/12/5 6:36 PM, Ran Wang wrote:
> Signed-off-by: Ran Wang 
> ---
>  .../translations/zh_CN/process/1.Intro.rst| 61 ++-
>  1 file changed, 32 insertions(+), 29 deletions(-)
> 
> diff --git a/Documentation/translations/zh_CN/process/1.Intro.rst 
> b/Documentation/translations/zh_CN/process/1.Intro.rst
> index 10a15f3dc282..73feae61410c 100644
> --- a/Documentation/translations/zh_CN/process/1.Intro.rst
> +++ b/Documentation/translations/zh_CN/process/1.Intro.rst
> @@ -11,33 +11,35 @@
>  执行摘要
>  
>  
> -本节的其余部分涵盖了内核开发过程的范围，以及开发人员及其雇主在这方面可能遇
> -到的各种挫折。内核代码应该合并到正式的（“主线”）内核中有很多原因，包括对用
> -户的自动可用性、多种形式的社区支持以及影响内核开发方向的能力。提供给Linux
> -内核的代码必须在与GPL兼容的许可证下可用。
> +本节的其余部分介绍了内核开发流程相关知识，其中包括开发者及其雇主在这方面可能遇
> +到的各种问题。内核代码合并到官方的（“主线”）仓库会有很多好处，比如能在第一时
> +间让用户获得更新，可以从社区得到多种形式的支持，以及能够以此影响内核未来发展方
> +向。需要注意提供给Linux内核的代码必须是在与GPL兼容的许可证下使用。

Good.
>  
> -:ref:`cn_development_process` 介绍了开发过程、内核发布周期和合并窗口的机制。
> -涵盖了补丁开发、审查和合并周期中的各个阶段。有一些关于工具和邮件列表的讨论。
> -鼓励希望开始内核开发的开发人员作为初始练习跟踪并修复bug。
> +:ref:`cn_development_process` 介绍了内核开发流程、发布周期以及合并窗口期相关的
> +机制。同时还讲解了补丁开发、审查和合并周期各个阶段要点。此外还包括一些关于工具
> +和邮件列表的讨论。我们建议那些希望开始内核开发的开发者们以跟踪并修复bug作为初
> +始练习。
>  

good.

>  
> -:ref:`cn_development_early_stage` 包括早期项目规划，重点是尽快让开发社区参与
> +:ref:`cn_development_early_stage` 介绍项目早期的工作规划，重点是尽快让开发社区
> +有机会参与到规划中
>  
> -:ref:`cn_development_coding` 是关于编码过程的；讨论了其他开发人员遇到的几个
> -陷阱。对补丁的一些要求已经涵盖，并且介绍了一些工具，这些工具有助于确保内核
> -补丁是正确的。
> +:ref:`cn_development_coding` 代码编写过程相关；

Maybe the original wording is more fluent?

> 讨论了一些其他开发者曾经走入到
> +的误区。并介绍社区对补丁的要求，同时指导如何通过使用一些工具来帮助确保内核补
> +丁的质量。

ok.

>  
> -:ref:`cn_development_posting` 讨论发布补丁以供评审的过程。为了让开发社区
> -认真对待，补丁必须正确格式化和描述，并且必须发送到正确的地方。遵循本节中的
> -建议有助于确保为您的工作提供最好的接纳。
> +:ref:`cn_development_posting` 介绍发布补丁以供评审的流程。补丁只有在符合特定的
> +格式及正确描述，并且发送到正确的地方，开发社区才有可能对其认真审查。遵循本节中
> +的建议有助于确保为您的工作成果提供最好的接纳。
>  
> -:ref:`cn_development_followthrough` 介绍了发布补丁之后发生的事情；该工作
> -在这一点上还远远没有完成。与审阅者一起工作是开发过程中的一个重要部分；本节
> -提供了一些关于如何在这个重要阶段避免问题的提示。当补丁被合并到主线中时，
> -开发人员要注意不要假定任务已经完成。
> +:ref:`cn_development_followthrough` 介绍了提交补丁之后发生的事情；至此工作实际

Is it beyond 80 chars?

> +上还远未完成。与审阅者一起合作是开发过程中的重要部分；本节提供了一些关于如何在
> +这个重要阶段避免出现问题的提示。此外，即使当补丁已经被合并到主线中，开发者也不
> +能认为任务就此完成。
>  
>  :ref:`cn_development_advancedtopics` 介绍了两个“高级”主题：
> -使用Git管理补丁和查看其他人发布的补丁。
> +使用Git管理补丁和查看其他人提交的补丁。

Any difference between the above lines?
>  
>  :ref:`cn_development_conclusion` 总结了有关内核开发的更多信息，附带有带有
>  指向资源的链接.
> @@ -62,19 +64,20 @@ Linux最引人注目的特性之一是这些开发人员可以访问它；任何
>  内核开发周期可以涉及1000多个开发人员，他们为100多个不同的公司
>  （或者根本没有公司）工作。
>  
> -与内核开发社区合作并不是特别困难。但是，尽管如此，许多潜在的贡献者在尝试做
> -内核工作时遇到了困难。内核社区已经发展了自己独特的操作方式，使其能够在每天
> -都要更改数千行代码的环境中顺利运行（并生成高质量的产品）。因此，Linux内核开发
> +与内核开发社区合作并不是特别困难。但是，尽管如此，许多潜在的贡献者在尝试参与
> +内核开发时遇到了困难。内核社区已经发展了自己独特的开发流程，使其能够在每天
> +都要更改数千行代码的环境中顺利运转（并生成高质量的产品）。因此，Linux内核开发
>  过程与专有的开发方法有很大的不同也就不足为奇了。
good.

>  
> -对于新开发人员来说，内核的开发过程可能会让人感到奇怪和恐惧，但这个背后有充分的
> -理由和坚实的经验。一个不了解内核社区的方式的开发人员（或者更糟的是，他们试图
> -抛弃或规避内核社区的方式）会有一个令人沮丧的体验。开发社区, 在帮助那些试图学习
> -的人的同时，没有时间帮助那些不愿意倾听或不关心开发过程的人。
> +对于新开发者来说，内核的开发流程可能会让人感到陌生和望而生畏，但这个背后其实
> +是有充分的理由和坚实的实际经验作支撑。一个不了解内核社区工作方式的开发者（或
> +者更糟的是，如果他们试图抛弃或规避内核社区的方式）将会有一个令人沮丧的体验。
> +毕竟开发社区在帮助那些试图学习的人的同时，没有时间帮助那些不愿意倾听或不关心
> +开发流程的人。

good.

>  
> -希望阅读本文的人能够避免这种令人沮丧的经历。这里有很多材料，但阅读时所做的
> -努力会在短时间内得到回报。开发社区总是需要能让内核变更好的开发人员；下面的
> -文本应该帮助您或为您工作的人员加入我们的社区。
> +希望大家能通过阅读本文来避免那些令人沮丧的经历。这里有很多材料，请相信阅读这
> +些所付出的努力将会在短时间内得到回报。开发社区总是需要那些能让内核变更好的
> +开发者；下面的文章应当能帮助您或为您工作的人加入我们的社区。

good.
>  
>  致谢
>  
>

Re: [PATCH 1/3] mm/swap.c: pre-sort pages in pagevec for pagevec_lru_move_fn

2020-12-01 Thread Alex Shi




On 2020/12/1 4:10 PM, Michal Hocko wrote:
> On Tue 01-12-20 16:02:13, Alex Shi wrote:
>> Pages in a pagevec may belong to different lruvecs, so
>> pagevec_lru_move_fn() has to relock, but a relock may make the current CPU
>> wait a long time on the same lock because of spinlock fairness.
>>
>> Before the per-memcg lru_lock, we had to bear the relock since the spinlock
>> was the only way to serialize a page's memcg/lruvec. Now TestClearPageLRU
>> can be used to isolate pages exclusively and keep the page's lruvec/memcg
>> stable. That gives us a chance to sort the pages by lruvec before the
>> moving action in pagevec_lru_move_fn(), so we no longer suffer from the
>> spinlock's fairness wait.
> Do you have any data to show any improvements from this?
> 

Hi Michal,

Thanks for the quick response.

Not yet. I am still collecting data, but according to the lru_add result,
there should be a big gain in the multiple-memcg scenario.

Also, I don't expect a quick acceptance; I just sent out the idea for
comments while the thread is still warm. :)

Thanks
Alex

[PATCH 3/3] mm/swap.c: extend the usage to pagevec_lru_add

2020-12-01 Thread Alex Shi

The only difference between __pagevec_lru_add and the other functions that
move pages between lru lists is that a page being added to an lru list does
not need TestClearPageLRU or the lru bit set back. So we can combine them
by adding a clear-lru-bit switch as a parameter of the sort function.

Then all lru list operation functions can be unified.

Signed-off-by: Alex Shi 
Cc: Konstantin Khlebnikov 
Cc: Hugh Dickins 
Cc: Yu Zhao 
Cc: Michal Hocko 
Cc: Matthew Wilcox (Oracle) 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/swap.c | 30 +++---
 1 file changed, 11 insertions(+), 19 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index 814809845700..6a7920b2937f 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -12,6 +12,7 @@
  * Started 18.12.91
  * Swap aging added 23.2.95, Stephen Tweedie.
  * Buffermem limits added 12.3.98, Rik van Riel.
+ * Pre-sort pagevec added 1.12.20, Alex Shi.
  */
 
 #include 
@@ -227,8 +228,8 @@ static void shell_sort(struct pagevec *pvec, unsigned long 
*lvaddr)
 }
 
 /* Get lru bit cleared page and their lruvec address, release the others */
-void sort_isopv(struct pagevec *pvec, struct pagevec *isopv,
-   unsigned long *lvaddr)
+static void sort_isopv(struct pagevec *pvec, struct pagevec *isopv,
+   unsigned long *lvaddr, bool clearlru)
 {
int i, j;
struct pagevec busypv;
@@ -241,7 +242,7 @@ void sort_isopv(struct pagevec *pvec, struct pagevec *isopv,
pvec->pages[i] = NULL;
 
/* block memcg migration during page moving between lru */
-   if (!TestClearPageLRU(page)) {
+   if (clearlru && !TestClearPageLRU(page)) {
pagevec_add(&busypv, page);
continue;
}
@@ -265,10 +266,13 @@ static void pagevec_lru_move_fn(struct pagevec *pvec,
unsigned long lvaddr[PAGEVEC_SIZE];
struct pagevec isopv;
struct pagevec *pv;
+   bool clearlru;
+
+   clearlru = pvec != this_cpu_ptr(&lru_pvecs.lru_add);
 
if (!mem_cgroup_disabled() || num_online_nodes() > 1) {
pagevec_init(&isopv);
-   sort_isopv(pvec, &isopv, lvaddr);
+   sort_isopv(pvec, &isopv, lvaddr, clearlru);
pv = &isopv;
} else {
pv = pvec;
@@ -291,7 +295,8 @@ static void pagevec_lru_move_fn(struct pagevec *pvec,
 
(*move_fn)(pv->pages[i], lruvec);
 
-   SetPageLRU(pv->pages[i]);
+   if (clearlru)
+   SetPageLRU(pv->pages[i]);
}
spin_unlock_irqrestore(&lruvec->lru_lock, flags);
release_pages(pv->pages, pv->nr);
@@ -1086,20 +1091,7 @@ static void __pagevec_lru_add_fn(struct page *page, 
struct lruvec *lruvec)
  */
 void __pagevec_lru_add(struct pagevec *pvec)
 {
-   int i;
-   struct lruvec *lruvec = NULL;
-   unsigned long flags = 0;
-
-   for (i = 0; i < pagevec_count(pvec); i++) {
-   struct page *page = pvec->pages[i];
-
-   lruvec = relock_page_lruvec_irqsave(page, lruvec, &flags);
-   __pagevec_lru_add_fn(page, lruvec);
-   }
-   if (lruvec)
-   unlock_page_lruvec_irqrestore(lruvec, flags);
-   release_pages(pvec->pages, pvec->nr);
-   pagevec_reinit(pvec);
+   pagevec_lru_move_fn(pvec, __pagevec_lru_add_fn);
 }
 
 /**
-- 
2.29.GIT

[PATCH 2/3] mm/swap.c: bail out early for no memcg and no numa

2020-12-01 Thread Alex Shi

If a system has memcg disabled and is not NUMA (a single node), like an
embedded system, there is no need to sort the pagevec, since there is just
one lruvec in the system. In this situation, we can skip the pagevec sorting.

Signed-off-by: Alex Shi 
Cc: Konstantin Khlebnikov 
Cc: Hugh Dickins 
Cc: Yu Zhao 
Cc: Michal Hocko 
Cc: Matthew Wilcox (Oracle) 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/swap.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index 17d8990e5ca7..814809845700 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -264,12 +264,17 @@ static void pagevec_lru_move_fn(struct pagevec *pvec,
unsigned long flags = 0;
unsigned long lvaddr[PAGEVEC_SIZE];
struct pagevec isopv;
+   struct pagevec *pv;
 
-   pagevec_init(&isopv);
-
-   sort_isopv(pvec, &isopv, lvaddr);
+   if (!mem_cgroup_disabled() || num_online_nodes() > 1) {
+   pagevec_init(&isopv);
+   sort_isopv(pvec, &isopv, lvaddr);
+   pv = &isopv;
+   } else {
+   pv = pvec;
+   }
 
-   n = pagevec_count(&isopv);
+   n = pagevec_count(pv);
if (!n)
return;
 
@@ -284,12 +289,12 @@ static void pagevec_lru_move_fn(struct pagevec *pvec,
spin_lock_irqsave(&lruvec->lru_lock, flags);
}
 
-   (*move_fn)(isopv.pages[i], lruvec);
+   (*move_fn)(pv->pages[i], lruvec);
 
-   SetPageLRU(isopv.pages[i]);
+   SetPageLRU(pv->pages[i]);
}
spin_unlock_irqrestore(&lruvec->lru_lock, flags);
-   release_pages(isopv.pages, isopv.nr);
+   release_pages(pv->pages, pv->nr);
 }
 
 static void pagevec_move_tail_fn(struct page *page, struct lruvec *lruvec)
-- 
2.29.GIT

[PATCH 1/3] mm/swap.c: pre-sort pages in pagevec for pagevec_lru_move_fn

2020-12-01 Thread Alex Shi

Pages in a pagevec may belong to different lruvecs, so pagevec_lru_move_fn()
has to relock, but a relock may make the current CPU wait a long time on
the same lock because of spinlock fairness.

Before the per-memcg lru_lock, we had to bear the relock since the spinlock
was the only way to serialize a page's memcg/lruvec. Now TestClearPageLRU
can be used to isolate pages exclusively and keep the page's lruvec/memcg
stable. That gives us a chance to sort the pages by lruvec before the
moving action in pagevec_lru_move_fn(), so we no longer suffer from the
spinlock's fairness wait.

Signed-off-by: Alex Shi 
Cc: Konstantin Khlebnikov 
Cc: Hugh Dickins 
Cc: Yu Zhao 
Cc: Michal Hocko 
Cc: Matthew Wilcox (Oracle) 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/swap.c | 92 +++
 1 file changed, 79 insertions(+), 13 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index 490553f3f9ef..17d8990e5ca7 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -201,29 +201,95 @@ int get_kernel_page(unsigned long start, int write, 
struct page **pages)
 }
 EXPORT_SYMBOL_GPL(get_kernel_page);
 
+/* Pratt's gaps for shell sort, https://en.wikipedia.org/wiki/Shellsort */
+static int gaps[] = { 6, 4, 3, 2, 1, 0};
+
+/* Shell sort pagevec[] on page's lruvec.*/
+static void shell_sort(struct pagevec *pvec, unsigned long *lvaddr)
+{
+   int g, i, j, n = pagevec_count(pvec);
+
+   for (g=0; gaps[g] > 0 && gaps[g] <= n/2; g++) {
+   int gap = gaps[g];
+
+   for (i = gap; i < n; i++) {
+   unsigned long tmp = lvaddr[i];
+   struct page *page = pvec->pages[i];
+
+   for (j = i - gap; j >= 0 && lvaddr[j] > tmp; j -= gap) {
+   lvaddr[j + gap] = lvaddr[j];
+   pvec->pages[j + gap] = pvec->pages[j];
+   }
+   lvaddr[j + gap] = tmp;
+   pvec->pages[j + gap] = page;
+   }
+   }
+}
+
+/* Get lru bit cleared page and their lruvec address, release the others */
+void sort_isopv(struct pagevec *pvec, struct pagevec *isopv,
+   unsigned long *lvaddr)
+{
+   int i, j;
+   struct pagevec busypv;
+
+   pagevec_init(&busypv);
+
+   for (i = 0, j = 0; i < pagevec_count(pvec); i++) {
+   struct page *page = pvec->pages[i];
+
+   pvec->pages[i] = NULL;
+
+   /* block memcg migration during page moving between lru */
+   if (!TestClearPageLRU(page)) {
+   pagevec_add(&busypv, page);
+   continue;
+   }
+   lvaddr[j++] = (unsigned long)
+   mem_cgroup_page_lruvec(page, page_pgdat(page));
+   pagevec_add(isopv, page);
+   }
+   pagevec_reinit(pvec);
+   if (pagevec_count(&busypv))
+   release_pages(busypv.pages, busypv.nr);
+
+   shell_sort(isopv, lvaddr);
+}
+
 static void pagevec_lru_move_fn(struct pagevec *pvec,
void (*move_fn)(struct page *page, struct lruvec *lruvec))
 {
-   int i;
+   int i, n;
struct lruvec *lruvec = NULL;
unsigned long flags = 0;
+   unsigned long lvaddr[PAGEVEC_SIZE];
+   struct pagevec isopv;
 
-   for (i = 0; i < pagevec_count(pvec); i++) {
-   struct page *page = pvec->pages[i];
+   pagevec_init(&isopv);
 
-   /* block memcg migration during page moving between lru */
-   if (!TestClearPageLRU(page))
-   continue;
+   sort_isopv(pvec, &isopv, lvaddr);
 
-   lruvec = relock_page_lruvec_irqsave(page, lruvec, &flags);
-   (*move_fn)(page, lruvec);
+   n = pagevec_count(&isopv);
+   if (!n)
+   return;
 
-   SetPageLRU(page);
+   lruvec = (struct lruvec *)lvaddr[0];
+   spin_lock_irqsave(&lruvec->lru_lock, flags);
+
+   for (i = 0; i < n; i++) {
+   /* lock new lruvec if lruvec changes, we have sorted them */
+   if (lruvec != (struct lruvec *)lvaddr[i]) {
+   spin_unlock_irqrestore(&lruvec->lru_lock, flags);
+   lruvec = (struct lruvec *)lvaddr[i];
+   spin_lock_irqsave(&lruvec->lru_lock, flags);
+   }
+
+   (*move_fn)(isopv.pages[i], lruvec);
+
+   SetPageLRU(isopv.pages[i]);
}
-   if (lruvec)
-   unlock_page_lruvec_irqrestore(lruvec, flags);
-   release_pages(pvec->pages, pvec->nr);
-   pagevec_reinit(pvec);
+   spin_unlock_irqrestore(&lruvec->lru_lock, flags);
+   release_pages(isopv.pages, isopv.nr);
 }
 
 static void pagevec_move_tail_fn(struct page *page, struct lruvec *lruvec)
-- 
2.29.GIT
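
The idea in this patch, sorting entries by the lock they need so that each lock is taken once per batch, can be modelled in user space. The sketch below is only an illustration under stated assumptions: plain integers stand in for pages, the key array stands in for lruvec addresses, and shell_sort_by_key(), count_lock_acquisitions() and demo_lock_acquisitions() are hypothetical names. One deliberate difference from the posted hunk: the gap loop here has no "gaps[g] <= n/2" bound, because as far as I can tell that bound ends the loop at the first gap (6) whenever n is below 12; oversized gaps are simply no-ops in this version.

```c
#include <assert.h>

/* Gap sequence copied from the patch; the trailing 0 ends the loop. */
static const int gaps[] = { 6, 4, 3, 2, 1, 0 };

/* Shell sort items[] by key[], moving both arrays together. */
static void shell_sort_by_key(int *items, unsigned long *key, int n)
{
	for (int g = 0; gaps[g] > 0; g++) {
		int gap = gaps[g];

		/* If gap >= n this loop body never runs, which is harmless. */
		for (int i = gap; i < n; i++) {
			unsigned long tmp = key[i];
			int item = items[i];
			int j;

			for (j = i - gap; j >= 0 && key[j] > tmp; j -= gap) {
				key[j + gap] = key[j];
				items[j + gap] = items[j];
			}
			key[j + gap] = tmp;
			items[j + gap] = item;
		}
	}
}

/*
 * Walk key[] in order and count how often a lock would be taken if we
 * only switch locks when the key differs from the one currently held.
 */
static int count_lock_acquisitions(const unsigned long *key, int n)
{
	unsigned long held = 0;
	int taken = 0;

	for (int i = 0; i < n; i++) {
		if (i == 0 || key[i] != held) {
			held = key[i];
			taken++;
		}
	}
	return taken;
}

/* Hypothetical demo: six entries alternating between two lock keys. */
void demo_lock_acquisitions(int *before, int *after)
{
	int items[] = { 0, 1, 2, 3, 4, 5 };
	unsigned long key[] = { 0x2000, 0x1000, 0x2000, 0x1000, 0x2000, 0x1000 };
	int n = 6;

	*before = count_lock_acquisitions(key, n);
	shell_sort_by_key(items, key, n);
	*after = count_lock_acquisitions(key, n);

	/* Each item must still be paired with its original key. */
	for (int i = 0; i < n; i++)
		assert(key[i] == (items[i] % 2 == 0 ? 0x2000UL : 0x1000UL));
}
```

With the alternating input above, the unsorted walk switches locks on every entry, while the sorted walk takes each of the two locks once.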

Re: [PATCH] certs/blacklist: fix kernel doc interface issue

2020-11-30 Thread Alex Shi




On 2020/11/30 6:04 PM, David Howells wrote:
> Alex Shi  wrote:
> 
>>  /**
>>   * mark_hash_blacklisted - Add a hash to the system blacklist
>> - * @hash - The hash as a hex string with a type prefix (eg. 
>> "tbs:23aa429783")
>> + * @hash: - The hash as a hex string with a type prefix (eg. 
>> "tbs:23aa429783")
> 
> You should remove the dash when making this change.  I'll do that for you.


Hi David,

I really appreciate the fix and the reminder!

Regards
Alex

Re: [PATCH] mm/memcg: bail out early when !memcg in mem_cgroup_lruvec

2020-11-28 Thread Alex Shi




On 2020/11/28 12:02 PM, Andrew Morton wrote:
> On Fri, 27 Nov 2020 11:08:35 +0800 Alex Shi  
> wrote:
> 
>> Sometimes we pass a NULL memcg to mem_cgroup_lruvec(memcg, pgdat),
>> so we can bail out early in that situation to avoid useless checking.
>>
>> Also warn if both parameters are NULL.
> 
> Why do you think a warning is needed here?

Uh, considering there has been no problem for a long time, the warning could
be dropped.

> 
>> --- a/include/linux/memcontrol.h
>> +++ b/include/linux/memcontrol.h
>> @@ -613,14 +613,13 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
>>          struct mem_cgroup_per_node *mz;
>>          struct lruvec *lruvec;
>>  
>> -        if (mem_cgroup_disabled()) {
>> +        VM_WARN_ON_ONCE(!memcg && !pgdat);
>> +
>> +        if (mem_cgroup_disabled() || !memcg) {
>>                  lruvec = &pgdat->__lruvec;
>>                  goto out;
>>          }
>>  
>> -        if (!memcg)
>> -                memcg = root_mem_cgroup;
>> -
> 
> This change isn't obviously equivalent, is it?

If !memcg, root_mem_cgroup would still lead to the lruvec of the same pgdat
that was passed in as the parameter.

> 
>>          mz = mem_cgroup_nodeinfo(memcg, pgdat->node_id);
>>          lruvec = &mz->lruvec;
>>  out:
> 
> And the resulting code is awkward:
> 
>         if (mem_cgroup_disabled() || !memcg) {
>                 lruvec = &pgdat->__lruvec;
>                 goto out;
>         }
> 
>         mz = mem_cgroup_nodeinfo(memcg, pgdat->node_id);
>         lruvec = &mz->lruvec;
> out:
> 
> 
> could be
> 
>         if (mem_cgroup_disabled() || !memcg) {
>                 lruvec = &pgdat->__lruvec;
>         } else {
>                 mem_cgroup_per_node mz;
> 
>                 mz = mem_cgroup_nodeinfo(memcg, pgdat->node_id);
>                 lruvec = &mz->lruvec;
>         }
> 

Right. Removing the 'goto' makes the code easier to understand.

So, is the following patch ok?

From 225f29e03b40a7cbaeb4e3bb76f8efbcd7d648a2 Mon Sep 17 00:00:00 2001
From: Alex Shi 
Date: Wed, 25 Nov 2020 14:06:33 +0800
Subject: [PATCH v2] mm/memcg: bail out early when !memcg in mem_cgroup_lruvec

Sometimes mem_cgroup_lruvec(memcg, pgdat) is called with a NULL memcg,
so we can bail out early in that case and avoid useless checks.

Polished per Andrew Morton's suggestion.

Signed-off-by: Alex Shi 
Cc: Andrew Morton 
Cc: Johannes Weiner 
Cc: Shakeel Butt 
Cc: Roman Gushchin 
Cc: Lorenzo Stoakes 
Cc: Stephen Rothwell 
Cc: Alexander Duyck 
Cc: Yafang Shao 
Cc: Wei Yang 
Cc: linux-kernel@vger.kernel.org
---
 include/linux/memcontrol.h | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 3e6a1df3bdb9..4ff2ffe2b73d 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -610,20 +610,17 @@ mem_cgroup_nodeinfo(struct mem_cgroup *memcg, int nid)
 static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
                                                struct pglist_data *pgdat)
 {
-        struct mem_cgroup_per_node *mz;
         struct lruvec *lruvec;
 
-        if (mem_cgroup_disabled()) {
+        if (mem_cgroup_disabled() || !memcg) {
                 lruvec = &pgdat->__lruvec;
-                goto out;
-        }
+        } else {
+                struct mem_cgroup_per_node *mz;
 
-        if (!memcg)
-                memcg = root_mem_cgroup;
+                mz = mem_cgroup_nodeinfo(memcg, pgdat->node_id);
+                lruvec = &mz->lruvec;
+        }
 
-        mz = mem_cgroup_nodeinfo(memcg, pgdat->node_id);
-        lruvec = &mz->lruvec;
-out:
         /*
          * Since a node can be onlined after the mem_cgroup was created,
          * we have to be prepared to initialize lruvec->pgdat here;
-- 
2.29.GIT

[tip: x86/cleanups] x86/PCI: Make a kernel-doc comment a normal one

2020-11-27 Thread tip-bot2 for Alex Shi

The following commit has been merged into the x86/cleanups branch of tip:

Commit-ID:     638920a66a17c8e1f4415cbab0d49dc4a344c2a7
Gitweb:        https://git.kernel.org/tip/638920a66a17c8e1f4415cbab0d49dc4a344c2a7
Author:        Alex Shi
AuthorDate:    Fri, 13 Nov 2020 16:58:14 +08:00
Committer:     Borislav Petkov
CommitterDate: Fri, 27 Nov 2020 13:43:09 +01:00

x86/PCI: Make a kernel-doc comment a normal one

The comment is using kernel-doc markup but that comment isn't a
kernel-doc comment so make it a normal one to avoid:

  arch/x86/pci/i386.c:373: warning: Function parameter or member \
  'pcibios_assign_resources' not described in 'fs_initcall'

 [ bp: Massage and fixup comment while at it. ]

Signed-off-by: Alex Shi 
Signed-off-by: Borislav Petkov 
Link: https://lkml.kernel.org/r/1605257895-5536-5-git-send-email-alex@linux.alibaba.com
---
 arch/x86/pci/i386.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
index fa855bb..f2f4a5d 100644
--- a/arch/x86/pci/i386.c
+++ b/arch/x86/pci/i386.c
@@ -366,9 +366,9 @@ static int __init pcibios_assign_resources(void)
         return 0;
 }
 
-/**
- * called in fs_initcall (one below subsys_initcall),
- * give a chance for motherboard reserve resources
+/*
+ * This is an fs_initcall (one below subsys_initcall) in order to reserve
+ * resources properly.
  */
 fs_initcall(pcibios_assign_resources);

Re: [PATCH next] mm/swap.c: reduce lock contention in lru_cache_add

2020-11-26 Thread Alex Shi




On 2020/11/26 at 11:55 PM, Matthew Wilcox wrote:
> On Thu, Nov 26, 2020 at 04:44:04PM +0100, Vlastimil Babka wrote:
>> However, Matthew wanted to increase pagevec size [1] and once 15^2 becomes
>> 63^2, it starts to be somewhat more worrying.
>>
>> [1] https://lore.kernel.org/linux-mm/20201105172651.2455-1-wi...@infradead.org/
> 
> Well, Tim wanted it ;-)
> 
> I would suggest that rather than an insertion sort (or was it a bubble
> sort?), we should be using a Shell sort.  It's ideal for these kinds of
> smallish arrays.
> 
> https://en.wikipedia.org/wiki/Shellsort
> 

Uh, that looks perfect! I am going to look into it. :)

Thanks!
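
As a rough illustration of the Shell sort suggested above, here is a minimal userspace sketch. It assumes the things being sorted are pointer-sized keys (for example lruvec addresses) in a pagevec-sized array, and it uses the simple n/2, n/4, ..., 1 gap sequence; demo_shell_sort() is a hypothetical name, not code from any posted patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Sort n pointer-sized keys in ascending order with Shell sort.
 * Each pass is an insertion sort over elements that are 'gap' apart;
 * the final pass (gap == 1) is a plain insertion sort on nearly
 * sorted data.
 */
static void demo_shell_sort(uintptr_t *keys, size_t n)
{
        size_t gap, i, j;

        assert(keys != NULL || n == 0);

        for (gap = n / 2; gap > 0; gap /= 2) {
                for (i = gap; i < n; i++) {
                        uintptr_t tmp = keys[i];

                        /* shift larger gap-neighbours up to make room */
                        for (j = i; j >= gap && keys[j - gap] > tmp; j -= gap)
                                keys[j] = keys[j - gap];
                        keys[j] = tmp;
                }
        }
}
```

The sort is in place and needs no extra memory, which is why it suits small fixed-size arrays; whether it beats a plain insertion sort at 15 or 63 entries would have to be measured.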

[PATCH] mm/memcg: bail out early when !memcg in mem_cgroup_lruvec

2020-11-26 Thread Alex Shi

Sometimes mem_cgroup_lruvec(memcg, pgdat) is called with a NULL memcg,
so we can bail out early in that case and avoid useless checks.

Also warn if both parameters are NULL.

Signed-off-by: Alex Shi 
Cc: Andrew Morton 
Cc: Johannes Weiner 
Cc: Shakeel Butt 
Cc: Roman Gushchin 
Cc: Lorenzo Stoakes 
Cc: Stephen Rothwell 
Cc: Alexander Duyck 
Cc: Yafang Shao 
Cc: Wei Yang 
Cc: linux-kernel@vger.kernel.org
---
 include/linux/memcontrol.h | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 3e6a1df3bdb9..4cdb110f84e0 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -613,14 +613,13 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
         struct mem_cgroup_per_node *mz;
         struct lruvec *lruvec;
 
-        if (mem_cgroup_disabled()) {
+        VM_WARN_ON_ONCE(!memcg && !pgdat);
+
+        if (mem_cgroup_disabled() || !memcg) {
                 lruvec = &pgdat->__lruvec;
                 goto out;
         }
 
-        if (!memcg)
-                memcg = root_mem_cgroup;
-
         mz = mem_cgroup_nodeinfo(memcg, pgdat->node_id);
         lruvec = &mz->lruvec;
 out:
-- 
2.29.GIT

Re: [PATCH next] mm/vmscan: __isolate_lru_page_prepare clean up

2020-11-26 Thread Alex Shi




On 2020/11/26 at 11:23 PM, Vlastimil Babka wrote:
>>>
>>> I tried that, and .text became significantly larger, for reasons which
>>> I didn't investigate ;)
> 
> I found out that comparing whole .text doesn't often work as changes might be
> lost in alignment, or once in a while cross the alignment boundary and become
> exaggerated. bloat-o-meter works nice though.
> 
>> Uh, BTW, with gcc 8.3.1 on CentOS 7, the goto and continue versions have the
>> same size on my side, with or without DEBUG_LIST. But this cleanup patch
>> does add 10 bytes, also with or without DEBUG_LIST.
>>
>> Maybe it is related to the different compilers?
> 
> gcc (SUSE Linux) 10.2.1 20201117 [revision 98ba03ffe0b9f37b4916ce6238fad754e00d720b]
> 
> ./scripts/bloat-o-meter vmscan.o.before mm/vmscan.o
> add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-1 (-1)
> Function                                     old     new   delta
> isolate_lru_pages                           1125    1124      -1
> Total: Before=57283, After=57282, chg -0.00%
> 
> Not surprising, as I'd expect the compiler to figure out by itself that
> list_move + continue repeats and can be unified.  The reason for goto to stay
> would be rather readability (subjective).

Hi Vlastimil,

Thanks for sharing the tool! The gcc versions do give different results.

My data was read with the 'size' tool, and the isolate_lru_pages text size
from 'objdump -d'. Maybe that works the same way as bloat-o-meter. :)

Thanks
Alex

Re: [PATCH next] mm/swap.c: reduce lock contention in lru_cache_add

2020-11-26 Thread Alex Shi




On 2020/11/26 at 3:24 PM, Yu Zhao wrote:
> Oh, no, I'm not against your idea. I was saying it doesn't seem
> necessary to sort -- a nested loop would just do the job given
> pagevec is small.
> 
> diff --git a/mm/swap.c b/mm/swap.c
> index cb3794e13b48..1d238edc2907 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -996,15 +996,26 @@ static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec)
>   */
>  void __pagevec_lru_add(struct pagevec *pvec)
>  {
> -        int i;
> +        int i, j;
>          struct lruvec *lruvec = NULL;
>          unsigned long flags = 0;
>  
>          for (i = 0; i < pagevec_count(pvec); i++) {
>                  struct page *page = pvec->pages[i];
>  
> +                if (!page)
> +                        continue;
> +
>                  lruvec = relock_page_lruvec_irqsave(page, lruvec, &flags);
> -                __pagevec_lru_add_fn(page, lruvec);
> +
> +                for (j = i; j < pagevec_count(pvec); j++) {
> +                        if (page_to_nid(pvec->pages[j]) != page_to_nid(page) ||
> +                            page_memcg(pvec->pages[j]) != page_memcg(page))
> +                                continue;
> +
> +                        __pagevec_lru_add_fn(pvec->pages[j], lruvec);
> +                        pvec->pages[j] = NULL;
> +                }

Uh, I have to say your method is better than mine.
It could also be reused for all relock_page_lruvec callers. I expect it could
speed up LRU performance a lot!


>          }
>          if (lruvec)
>                  unlock_page_lruvec_irqrestore(lruvec, flags);
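
The nested-loop idea quoted above can be modelled in userspace to see how many lock acquisitions it needs. This is only a sketch under stated assumptions: each page is reduced to a non-negative integer key standing in for its (node, memcg) pair, DEMO_DONE plays the role of setting pvec->pages[j] to NULL, and demo_grouped_pass() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_DONE (-1)  /* marks an entry that was already handled */

/*
 * Walk the array once; for each unhandled key, "take the lock" and
 * handle every later entry with the same key before moving on.
 * Keys must be non-negative. Returns the number of lock acquisitions,
 * which equals the number of distinct keys.
 */
static size_t demo_grouped_pass(int *key, size_t n)
{
        size_t i, j, acquisitions = 0;

        assert(key != NULL || n == 0);

        for (i = 0; i < n; i++) {
                int cur = key[i];

                if (cur == DEMO_DONE)
                        continue;

                acquisitions++;
                for (j = i; j < n; j++) {
                        if (key[j] != cur)
                                continue;
                        key[j] = DEMO_DONE;
                }
        }
        return acquisitions;
}
```

The cost is a quadratic scan in the worst case, which is the trade-off discussed in the thread: acceptable for 15 entries, more of a concern if the pagevec grows.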

Re: [PATCH next] mm/swap.c: reduce lock contention in lru_cache_add

2020-11-25 Thread Alex Shi




On 2020/11/26 at 12:52 PM, Yu Zhao wrote:
>>   */
>>  void __pagevec_lru_add(struct pagevec *pvec)
>>  {
>> -        int i;
>> -        struct lruvec *lruvec = NULL;
>> +        int i, nr_lruvec;
>>          unsigned long flags = 0;
>> +        struct page *page;
>> +        struct lruvecs lruvecs;
>>  
>> -        for (i = 0; i < pagevec_count(pvec); i++) {
>> -                struct page *page = pvec->pages[i];
>> +        nr_lruvec = sort_page_lruvec(&lruvecs, pvec);
> Simply looping pvec multiple times (15 at most) for different lruvecs
> would be better because 1) it requires no extra data structures and
> therefore has better cache locality (theoretically faster) 2) it only
> loops once when !CONFIG_MEMCG and !CONFIG_NUMA and therefore has no
> impact on Android and Chrome OS.
> 

With multiple memcgs it does help a lot; I got a 30% gain on the readtwice
case. But yes, without MEMCG and NUMA it is good to keep the old behavior. So
would you like to make a proposal for this?

Thanks
Alex

Re: [PATCH next] mm/swap.c: reduce lock contention in lru_cache_add

2020-11-25 Thread Alex Shi




On 2020/11/25 at 11:38 PM, Vlastimil Babka wrote:
> On 11/20/20 9:27 AM, Alex Shi wrote:
>> The current relock logic switches lru_lock whenever it finds a new
>> lruvec, so if 2 memcgs are reading files or allocating pages at the same
>> time, they could hold the lru_lock alternately and wait for each other
>> because of the fairness of the ticket spinlock.
>>
>> This patch sorts the pages by lruvec so that each lru_lock is held only
>> once in the above scenario. That reduces the fairness waiting caused by
>> reacquiring the lock. With it, vm-scalability/case-lru-file-readtwice
>> gets a ~5% performance gain on my 2P*20core*HT machine.
> 
> Hm, once you sort the pages like this, it's a shame not to splice them
> instead of more list_del() + list_add() iterations. update_lru_size() could
> be also called once?

Yes, it looks like a good idea to use splice instead of list_del/add, but pages
may be on different lru lists within the same lruvec, and may also come from
different zones. That could involve 5 passes for the different lists, and more
for the zones...

I gave up on that attempt.
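
The benefit of sorting described in the quoted commit message can be sketched by counting lock acquisitions for a given page order. This is a userspace model only: each page is represented by an integer id of its lruvec, a relock-style loop is assumed to take a new lock whenever the id differs from the previous one, and demo_count_lock_acquisitions() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count how often a relock-style loop would acquire a lock while
 * walking the pages in order: once for the first page, and once more
 * every time the lruvec id changes from one page to the next.
 */
static size_t demo_count_lock_acquisitions(const int *lruvec_id, size_t n)
{
        size_t i, acquisitions = 0;

        assert(lruvec_id != NULL || n == 0);

        for (i = 0; i < n; i++) {
                if (i == 0 || lruvec_id[i] != lruvec_id[i - 1])
                        acquisitions++;
        }
        return acquisitions;
}
```

For two memcgs whose pages arrive interleaved, the count equals the number of pages; after sorting by lruvec it drops to the number of distinct lruvecs.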

Re: [PATCH next] mm/vmscan: __isolate_lru_page_prepare clean up

2020-11-25 Thread Alex Shi




On 2020/11/26 at 7:43 AM, Andrew Morton wrote:
> On Tue, 24 Nov 2020 12:21:28 +0100 Vlastimil Babka  wrote:
> 
>> On 11/22/20 3:00 PM, Alex Shi wrote:
>>> Thanks a lot for all comments, I picked all up and here is the v3:
>>>
>>>  From 167131dd106a96fd08af725df850e0da6ec899af Mon Sep 17 00:00:00 2001
>>> From: Alex Shi 
>>> Date: Fri, 20 Nov 2020 14:49:16 +0800
>>> Subject: [PATCH v3 next] mm/vmscan: __isolate_lru_page_prepare clean up
>>>
>>> The function returns only 2 results, so using a 'switch' to handle its
>>> result is unnecessary; simplify it to a bool function as Vlastimil
>>> suggested.
>>>
>>> Also remove the 'goto' by reusing list_move(), and take Matthew Wilcox's
>>> suggestion to update the comments in the function.
>>
>> I wouldn't mind if the goto stayed, but it's not repeating that much
>> without it (list_move() + continue, 3 times) so...
> 
> I tried that, and .text became significantly larger, for reasons which
> I didn't investigate ;)
> 


Uh, BTW, with gcc 8.3.1 on CentOS 7, the goto and continue versions have the
same size on my side, with or without DEBUG_LIST. But this cleanup patch
does add 10 bytes, also with or without DEBUG_LIST.

Maybe it is related to the different compilers?

Thanks
Alex

Re: [PATCH] mm/memcg: warn on missing memcg on mem_cgroup_page_lruvec()

2020-11-25 Thread Alex Shi

Acked-by: Alex Shi 


On 2020/11/25 at 7:22 PM, Lorenzo Stoakes wrote:
> Move memcg check to mem_cgroup_page_lruvec() as there are callers which
> may invoke this with !memcg in mem_cgroup_lruvec(), whereas they should
> not in mem_cgroup_page_lruvec().
> 
> We expect that we have always charged a page to the memcg before
> mem_cgroup_page_lruvec() is invoked, so add a warning to assert that this
> is the case.
> 
> Signed-off-by: Lorenzo Stoakes 
> Reported-by: syzbot+ce635500093181f39...@syzkaller.appspotmail.com
> ---
>  include/linux/memcontrol.h | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 87ed56dc75f9..3e6a1df3bdb9 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -618,7 +618,6 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
>                  goto out;
>          }
>  
> -        VM_WARN_ON_ONCE(!memcg);
>          if (!memcg)
>                  memcg = root_mem_cgroup;
>  
> @@ -645,7 +644,10 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
>  static inline struct lruvec *mem_cgroup_page_lruvec(struct page *page,
>                                                      struct pglist_data *pgdat)
>  {
> -        return mem_cgroup_lruvec(page_memcg(page), pgdat);
> +        struct mem_cgroup *memcg = page_memcg(page);
> +
> +        VM_WARN_ON_ONCE_PAGE(!memcg, page);
> +        return mem_cgroup_lruvec(memcg, pgdat);
>  }
>  
>  static inline bool lruvec_holds_page_lru_lock(struct page *page,
>

Re: linux-next boot error: WARNING in prepare_kswapd_sleep

2020-11-24 Thread Alex Shi




On 2020/11/25 at 1:59 AM, Lorenzo Stoakes wrote:
> On Tue, 24 Nov 2020 at 07:54, syzbot wrote:
>> syzbot found the following issue on:
>>
>> HEAD commit:    d9137320 Add linux-next specific files for 20201124
> 
> This appears to be a product of 4b2904f3 ("mm/memcg: add missed
> warning in mem_cgroup_lruvec") adding a VM_WARN_ON_ONCE() to
> mem_cgroup_lruvec, which when invoked from a function other than
> mem_cgroup_page_lruvec() can in fact be called with the condition
> false.
> If we move the check back into mem_cgroup_page_lruvec() it resolves
> the issue. I enclose a simple version of this below, happy to submit
> as a proper patch if this is the right approach:
> 
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 87ed56dc75f9..27cc40a490b2 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -618,7 +618,6 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
>                  goto out;
>          }
> 
> -        VM_WARN_ON_ONCE(!memcg);
>          if (!memcg)
>                  memcg = root_mem_cgroup;
> 
> @@ -645,6 +644,7 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
>  static inline struct lruvec *mem_cgroup_page_lruvec(struct page *page,
>                                                      struct pglist_data *pgdat)
>  {
> +        VM_WARN_ON_ONCE_PAGE(!page_memcg(page), page);
>          return mem_cgroup_lruvec(page_memcg(page), pgdat);
>  }
> 

Acked.

Right. Would you like to remove the bad commit 4b2904f3 ("mm/memcg: add missed
warning in mem_cgroup_lruvec") and replace it with yours?

Furthermore, would you like to try another patch?

Thanks
Alex

From 073b222bd06a96c39656b0460c705e48c7eedafc Mon Sep 17 00:00:00 2001
From: Alex Shi 
Date: Wed, 25 Nov 2020 14:06:33 +0800
Subject: [PATCH] mm/memcg: bail out early when !memcg in mem_cgroup_lruvec

In some scenarios mem_cgroup_lruvec(NULL, pgdat) is called with a NULL memcg,
so we can bail out early and skip the unnecessary checks.

Also warn if both parameters are NULL.

Signed-off-by: Alex Shi 
---
 include/linux/memcontrol.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 3a995bb3157f..5e4da83eb9ce 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -613,7 +613,9 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
         struct mem_cgroup_per_node *mz;
         struct lruvec *lruvec;
 
-        if (mem_cgroup_disabled()) {
+        VM_WARN_ON_ONCE(!memcg && !pgdat);
+
+        if (mem_cgroup_disabled() || !memcg) {
                 lruvec = &pgdat->__lruvec;
                 goto out;
         }
-- 
2.29.GIT

Re: INFO: task can't die in shrink_inactive_list (2)

2020-11-24 Thread Alex Shi




On 2020/11/24 at 8:00 PM, Alex Shi wrote:
>>> syzbot found the following issue on:
>>>
>>> HEAD commit:03430750 Add linux-next specific files for 20201116
>>> git tree:   linux-next
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=13f80e5e50
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=a1c4c3f27041fdb8
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=e5a33e700b1dd0da20a2
>>> compiler:   gcc (GCC) 10.1.0-syz 20200507
>>> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=12f7bc5a50
>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10934cf250
> CC Peter Zijlstra.
> 
> I found that next-20200821 had a very similar oops to this one.
> https://groups.google.com/g/syzkaller-upstream-moderation/c/S0pyqK1dZv8/m/dxMoEhGdAQAJ
> So does this mean the bug has existed for a long time, since 5.9-rc1?
> 
> The reproducer triggers randomly on an x86 VM with cpu=2, mem=1600M. It can
> cause the hang again on both kernels, but each with a different kernel stack.
> 
> Maybe the system is just too busy? I will try older kernels with the
> reproducer.

The 5.8 kernel sometimes also fails this test on my 2-CPU VM guest with 2G of
memory.
Any comments on this issue?

Thanks
Alex

[ 5875.750929][  T946] INFO: task repro:31866 blocked for more than 143 seconds.
[ 5875.751618][  T946]   Not tainted 5.8.0 #6
[ 5875.752046][  T946] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables th.
[ 5875.752845][  T946] repro   D12088 31866  1 0x80004086
[ 5875.753436][  T946] Call Trace:
[ 5875.753747][  T946]  __schedule+0x394/0x950
[ 5875.774033][  T946]  ? __mutex_lock+0x46f/0x9c0
[ 5875.774481][  T946]  ? blkdev_put+0x18/0x120
[ 5875.774894][  T946]  schedule+0x37/0xe0
[ 5875.775260][  T946]  schedule_preempt_disabled+0xf/0x20
[ 5875.775753][  T946]  __mutex_lock+0x474/0x9c0
[ 5875.776174][  T946]  ? lock_acquire+0xa7/0x390
[ 5875.776602][  T946]  ? locks_remove_file+0x1e7/0x2d0
[ 5875.777079][  T946]  ? blkdev_put+0x18/0x120
[ 5875.777485][  T946]  blkdev_put+0x18/0x120
[ 5875.777880][  T946]  blkdev_close+0x1f/0x30
[ 5875.778281][  T946]  __fput+0xf0/0x260
[ 5875.778639][  T946]  task_work_run+0x68/0xb0
[ 5875.779054][  T946]  do_exit+0x3df/0xce0
[ 5875.779430][  T946]  ? get_signal+0x11d/0xca0
[ 5875.779846][  T946]  do_group_exit+0x42/0xb0
[ 5875.780261][  T946]  get_signal+0x16a/0xca0
[ 5875.780662][  T946]  ? handle_mm_fault+0xc8f/0x19c0
[ 5875.781134][  T946]  do_signal+0x2b/0x8e0
[ 5875.781521][  T946]  ? trace_hardirqs_off+0xe/0xf0
[ 5875.781989][  T946]  __prepare_exit_to_usermode+0xef/0x1f0
[ 5875.782512][  T946]  ? asm_exc_page_fault+0x8/0x30
[ 5875.782979][  T946]  prepare_exit_to_usermode+0x5/0x30
[ 5875.783461][  T946]  asm_exc_page_fault+0x1e/0x30
[ 5875.783909][  T946] RIP: 0033:0x428dd7
[ 5875.794899][  T946] Code: Bad RIP value.
[ 5875.795290][  T946] RSP: 002b:7f37c99e0d78 EFLAGS: 00010202
[ 5875.795858][  T946] RAX: 2080 RBX:  RCX: 76656f
[ 5875.796588][  T946] RDX: 000c RSI: 004b2370 RDI: 20
[ 5875.797326][  T946] RBP: 7f37c99e0da0 R08: 7f37c99e1700 R09: 7f37c99e10
[ 5875.798063][  T946] R10: 7f37c99e19d0 R11: 0202 R12: 00
[ 5875.798802][  T946] R13: 00021000 R14:  R15: 7f37c99e10

Re: INFO: task can't die in shrink_inactive_list (2)

2020-11-24 Thread Alex Shi




On 2020/11/24 at 8:00 PM, Alex Shi wrote:
>>>
>>> syzbot found the following issue on:
>>>
>>> HEAD commit:03430750 Add linux-next specific files for 20201116
>>> git tree:   linux-next
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=13f80e5e50
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=a1c4c3f27041fdb8
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=e5a33e700b1dd0da20a2
>>> compiler:   gcc (GCC) 10.1.0-syz 20200507
>>> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=12f7bc5a50
>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10934cf250
> CC Peter Zijlstra.
> 
> I found that next-20200821 had a very similar oops to this one.
> https://groups.google.com/g/syzkaller-upstream-moderation/c/S0pyqK1dZv8/m/dxMoEhGdAQAJ
> So does this mean the bug has existed for a long time, since 5.9-rc1?

The 5.8 kernel sometimes also fails this test on my 2-CPU VM guest with 2G of
memory:


Thanks
Alex

[ 5875.750929][  T946] INFO: task repro:31866 blocked for more than 143 seconds.
[ 5875.751618][  T946]   Not tainted 5.8.0 #6
[ 5875.752046][  T946] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables th.
[ 5875.752845][  T946] repro   D12088 31866  1 0x80004086
[ 5875.753436][  T946] Call Trace:
[ 5875.753747][  T946]  __schedule+0x394/0x950
[ 5875.774033][  T946]  ? __mutex_lock+0x46f/0x9c0
[ 5875.774481][  T946]  ? blkdev_put+0x18/0x120
[ 5875.774894][  T946]  schedule+0x37/0xe0
[ 5875.775260][  T946]  schedule_preempt_disabled+0xf/0x20
[ 5875.775753][  T946]  __mutex_lock+0x474/0x9c0
[ 5875.776174][  T946]  ? lock_acquire+0xa7/0x390
[ 5875.776602][  T946]  ? locks_remove_file+0x1e7/0x2d0
[ 5875.777079][  T946]  ? blkdev_put+0x18/0x120
[ 5875.777485][  T946]  blkdev_put+0x18/0x120
[ 5875.777880][  T946]  blkdev_close+0x1f/0x30
[ 5875.778281][  T946]  __fput+0xf0/0x260
[ 5875.778639][  T946]  task_work_run+0x68/0xb0
[ 5875.779054][  T946]  do_exit+0x3df/0xce0
[ 5875.779430][  T946]  ? get_signal+0x11d/0xca0
[ 5875.779846][  T946]  do_group_exit+0x42/0xb0
[ 5875.780261][  T946]  get_signal+0x16a/0xca0
[ 5875.780662][  T946]  ? handle_mm_fault+0xc8f/0x19c0
[ 5875.781134][  T946]  do_signal+0x2b/0x8e0
[ 5875.781521][  T946]  ? trace_hardirqs_off+0xe/0xf0
[ 5875.781989][  T946]  __prepare_exit_to_usermode+0xef/0x1f0
[ 5875.782512][  T946]  ? asm_exc_page_fault+0x8/0x30
[ 5875.782979][  T946]  prepare_exit_to_usermode+0x5/0x30
[ 5875.783461][  T946]  asm_exc_page_fault+0x1e/0x30
[ 5875.783909][  T946] RIP: 0033:0x428dd7
[ 5875.794899][  T946] Code: Bad RIP value.
[ 5875.795290][  T946] RSP: 002b:7f37c99e0d78 EFLAGS: 00010202
[ 5875.795858][  T946] RAX: 2080 RBX:  RCX: 76656f
[ 5875.796588][  T946] RDX: 000c RSI: 004b2370 RDI: 20
[ 5875.797326][  T946] RBP: 7f37c99e0da0 R08: 7f37c99e1700 R09: 7f37c99e10
[ 5875.798063][  T946] R10: 7f37c99e19d0 R11: 0202 R12: 00
[ 5875.798802][  T946] R13: 00021000 R14:  R15: 7f37c99e10

Re: INFO: task can't die in shrink_inactive_list (2)

2020-11-24 Thread Alex Shi




On 2020/11/24 at 11:54 AM, Andrew Morton wrote:
> On Fri, 20 Nov 2020 17:55:22 -0800 syzbot wrote:
> 
>> Hello,
>>
>> syzbot found the following issue on:
>>
>> HEAD commit:03430750 Add linux-next specific files for 20201116
>> git tree:   linux-next
>> console output: https://syzkaller.appspot.com/x/log.txt?x=13f80e5e50
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=a1c4c3f27041fdb8
>> dashboard link: https://syzkaller.appspot.com/bug?extid=e5a33e700b1dd0da20a2
>> compiler:   gcc (GCC) 10.1.0-syz 20200507
>> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=12f7bc5a50
>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10934cf250

CC Peter Zijlstra.

I found that next-20200821 had a very similar oops to this one.
https://groups.google.com/g/syzkaller-upstream-moderation/c/S0pyqK1dZv8/m/dxMoEhGdAQAJ
So does this mean the bug has existed for a long time, since 5.9-rc1?

The reproducer triggers randomly on an x86 VM with cpu=2, mem=1600M. It can
cause the hang again on both kernels, but each with a different kernel stack.

Maybe the system is just too busy? I will try older kernels with the reproducer.

Thanks
Alex

BTW, I removed the drm and wireless configs for my testing.

[ 1861.939128][ T1586] INFO: task systemd-udevd:8999 blocked for more than 143 seconds.
[ 1861.939969][ T1586]   Not tainted 5.9.0-rc1-next-20200821 #5
[ 1861.940553][ T1586] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1861.941369][ T1586] task:systemd-udevd   state:D stack:21192 pid: 8999 ppid:  4717 flags:0x4080
[ 1861.942245][ T1586] Call Trace:
[ 1861.942581][ T1586]  __schedule+0xaab/0x1f20
[ 1861.943014][ T1586]  ? __sched_text_start+0x8/0x8
[ 1861.943482][ T1586]  schedule+0xc4/0x2b0
[ 1861.943872][ T1586]  schedule_preempt_disabled+0xf/0x20
[ 1861.944390][ T1586]  __mutex_lock+0x8a0/0x13e0
[ 1861.944831][ T1586]  ? __blkdev_get+0x4bc/0x1a00
[ 1861.945286][ T1586]  ? mutex_lock_io_nested+0x12c0/0x12c0
[ 1861.945818][ T1586]  ? up_read+0x1a5/0x740
[ 1861.946224][ T1586]  ? down_read+0x10a/0x420
[ 1861.946653][ T1586]  ? kobj_lookup+0x37a/0x480
[ 1861.947095][ T1586]  ? __blkdev_get+0x4bc/0x1a00
[ 1861.947545][ T1586]  __blkdev_get+0x4bc/0x1a00
[ 1861.947997][ T1586]  ? lock_release+0x730/0x730
[ 1861.948464][ T1586]  ? __blkdev_put+0x720/0x720
[ 1861.962189][T15367] systemd-journald[15367]: Sent WATCHDOG=1 notification.
[ 1861.991663][ T1586]  blkdev_get+0x20/0x80
[ 1861.992088][ T1586]  blkdev_open+0x20a/0x290
[ 1861.992514][ T1586]  do_dentry_open+0x69a/0x1240
[ 1861.992975][ T1586]  ? bd_acquire+0x2c0/0x2c0
[ 1861.993414][ T1586]  path_openat+0xdd2/0x26f0
[ 1861.993846][ T1586]  ? path_lookupat.isra.41+0x520/0x520
[ 1861.994368][ T1586]  ? lockdep_hardirqs_on_prepare+0x4d0/0x4d0
[ 1861.994937][ T1586]  ? lockdep_hardirqs_on_prepare+0x4d0/0x4d0
[ 1861.995502][ T1586]  ? ___sys_sendmsg+0x11c/0x180
[ 1861.995954][ T1586]  ? find_held_lock+0x33/0x1c0
[ 1861.996405][ T1586]  ? __might_fault+0x11f/0x1d0
[ 1861.996850][ T1586]  do_filp_open+0x192/0x260
[ 1861.997268][ T1586]  ? may_open_dev+0xf0/0xf0
[ 1861.997699][ T1586]  ? rwlock_bug.part.1+0x90/0x90
[ 1861.998161][ T1586]  ? do_raw_spin_unlock+0x4f/0x260
[ 1861.998650][ T1586]  ? __alloc_fd+0x282/0x600
[ 1862.002012][ T1586]  ? lock_downgrade+0x6f0/0x6f0
[ 1862.007607][ T1586]  do_sys_openat2+0x573/0x850
[ 1862.008112][ T1586]  ? file_open_root+0x3f0/0x3f0
[ 1862.008570][ T1586]  ? trace_hardirqs_on+0x5f/0x220
[ 1862.028918][ T1586]  do_sys_open+0xca/0x140
[ 1862.028932][ T1586]  ? filp_open+0x70/0x70
[ 1862.028945][ T1586]  do_syscall_64+0x2d/0x70
[ 1862.028954][ T1586]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1862.028966][ T1586] RIP: 0033:0x7fc04686eea0
[ 1862.028969][ T1586] Code: Bad RIP value.
[ 1862.028974][ T1586] RSP: 002b:7ffd2c78ae68 EFLAGS: 0246 ORIG_RAX: 0002
[ 1862.028983][ T1586] RAX: ffda RBX: 55785498f3c0 RCX: 7fc04686eea0
[ 1862.028988][ T1586] RDX: 55785498fcd0 RSI: 000a0800 RDI: 55785498fcd0
[ 1862.028992][ T1586] RBP:  R08: 7ffd2c7ad090 R09: 00051dc0
[ 1862.028997][ T1586] R10: 00051dc0 R11: 0246 R12: 557854990340
[ 1862.029002][ T1586] R13: 557854984010 R14: 557854990200 R15: 000c
[ 1862.029024][ T1586] INFO: task repro:17514 can't die for more than 143 seconds.
[ 1862.036603][ T1586] task:repro   state:R  running task stack:25520 pid:17514 ppid:  8947 flags:0x4086
[ 1862.037596][ T1586] Call Trace:
[ 1862.037909][ T1586]  __schedule+0xaab/0x1f20
[ 1862.038322][ T1586]  ? __sched_text_start+0x8/0x8
[ 1862.038776][ T1586]  ? preempt_schedule_irq+0x30/0x90
[ 1862.070004][ T1586]  ? bdev_evict_inode+0x420/0x420
[ 1862.070497][ T1586]  ? _raw_spin_unlock_irqrestore+0x47/0x60
[ 1862.071036][ T1586]  ? blkdev_write_begin+0x40/0x40
[ 1862.071504][ T1586]  ? read_pages+0x1ee/0x1170
[ 1862.071933][ T1586]  ? _raw_spin_unlock_irqrestore+0x34/0x60
[

Re: INFO: task can't die in shrink_inactive_list (2)

2020-11-23 Thread Alex Shi

CC: Hugh Dickins & Johannes,



On 2020/11/24 at 11:54 AM, Andrew Morton wrote:
> On Fri, 20 Nov 2020 17:55:22 -0800 syzbot wrote:
> 
>> Hello,
>>
>> syzbot found the following issue on:
>>
>> HEAD commit:03430750 Add linux-next specific files for 20201116
>> git tree:   linux-next
>> console output: https://syzkaller.appspot.com/x/log.txt?x=13f80e5e50
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=a1c4c3f27041fdb8
>> dashboard link: https://syzkaller.appspot.com/bug?extid=e5a33e700b1dd0da20a2
>> compiler:   gcc (GCC) 10.1.0-syz 20200507
>> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=12f7bc5a50
>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10934cf250
> 
> Alex, your series "per memcg lru lock" changed the vmscan code rather a
> lot.  Could you please take a look at that reproducer?
> 

Sure, I will try to reproduce and look into it.

Thanks!
Alex

>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: syzbot+e5a33e700b1dd0da2...@syzkaller.appspotmail.com
>>
>> INFO: task syz-executor880:8534 can't die for more than 143 seconds.
>> task:syz-executor880 state:R  running task stack:25304 pid: 8534 ppid:  8504 flags:0x4006
>> Call Trace:
>>  context_switch kernel/sched/core.c:4269 [inline]
>>  __schedule+0x890/0x2030 kernel/sched/core.c:5019
>>  preempt_schedule_common+0x45/0xc0 kernel/sched/core.c:5179
>>  preempt_schedule_thunk+0x16/0x18 arch/x86/entry/thunk_64.S:40
>>  __raw_spin_unlock_irq include/linux/spinlock_api_smp.h:169 [inline]
>>  _raw_spin_unlock_irq+0x3c/0x40 kernel/locking/spinlock.c:199
>>  spin_unlock_irq include/linux/spinlock.h:404 [inline]
>>  shrink_inactive_list+0x4b1/0xce0 mm/vmscan.c:1974
>>  shrink_list mm/vmscan.c:2167 [inline]
>>  shrink_lruvec+0x61b/0x11b0 mm/vmscan.c:2462
>>  shrink_node_memcgs mm/vmscan.c:2650 [inline]
>>  shrink_node+0x839/0x1d60 mm/vmscan.c:2767
>>  shrink_zones mm/vmscan.c:2970 [inline]
>>  do_try_to_free_pages+0x38b/0x1440 mm/vmscan.c:3025
>>  try_to_free_pages+0x29f/0x720 mm/vmscan.c:3264
>>  __perform_reclaim mm/page_alloc.c:4360 [inline]
>>  __alloc_pages_direct_reclaim mm/page_alloc.c:4381 [inline]
>>  __alloc_pages_slowpath.constprop.0+0x917/0x2510 mm/page_alloc.c:4785
>>  __alloc_pages_nodemask+0x5f0/0x730 mm/page_alloc.c:4995
>>  alloc_pages_current+0x191/0x2a0 mm/mempolicy.c:2271
>>  alloc_pages include/linux/gfp.h:547 [inline]
>>  __page_cache_alloc mm/filemap.c:977 [inline]
>>  __page_cache_alloc+0x2ce/0x360 mm/filemap.c:962
>>  page_cache_ra_unbounded+0x3a1/0x920 mm/readahead.c:216
>>  do_page_cache_ra+0xf9/0x140 mm/readahead.c:267
>>  do_sync_mmap_readahead mm/filemap.c:2721 [inline]
>>  filemap_fault+0x19d0/0x2940 mm/filemap.c:2809
>>  __do_fault+0x10d/0x4d0 mm/memory.c:3623
>>  do_shared_fault mm/memory.c:4071 [inline]
>>  do_fault mm/memory.c:4149 [inline]
>>  handle_pte_fault mm/memory.c:4385 [inline]
>>  __handle_mm_fault mm/memory.c:4520 [inline]
>>  handle_mm_fault+0x3033/0x55d0 mm/memory.c:4618
>>  do_user_addr_fault+0x55b/0xba0 arch/x86/mm/fault.c:1377
>>  handle_page_fault arch/x86/mm/fault.c:1434 [inline]
>>  exc_page_fault+0x9e/0x180 arch/x86/mm/fault.c:1490
>>  asm_exc_page_fault+0x1e/0x30 arch/x86/include/asm/idtentry.h:580
>> RIP: 0033:0x400e71
>> Code: Unable to access opcode bytes at RIP 0x400e47.
>> RSP: 002b:7f8a5353fdc0 EFLAGS: 00010246
>> RAX: 6c756e2f7665642f RBX: 006dbc38 RCX: 00402824
>> RDX: 928195da81441750 RSI:  RDI: 006dbc30
>> RBP: 006dbc30 R08:  R09: 7f8a53540700
>> R10: 7f8a535409d0 R11: 0202 R12: 006dbc3c
>> R13: 7ffe80747a5f R14: 7f8a535409c0 R15: 0001
>>
>> Showing all locks held in the system:
>> 1 lock held by khungtaskd/1659:
>>  #0: 8b339ce0 (rcu_read_lock){}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:6252
>> 1 lock held by kswapd0/2195:
>> 1 lock held by kswapd1/2196:
>> 1 lock held by in:imklog/8191:
>>  #0: 8880125b1270 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:932
>> 1 lock held by cron/8189:
>> 2 locks held by syz-executor880/8502:
>> 2 locks held by syz-executor880/8505:
>> 2 locks held by syz-executor880/8507:
>> 2 locks held by syz-executor880/11706:
>> 2 locks held by syz-executor880/11709:
>> 3 locks held by syz-executor880/12008:
>> 2 locks held by syz-executor880/12015:
>>
>> =
>>
>>
>>
>> ---
>> This report is generated by a bot. It may contain errors.
>> See https://goo.gl/tpsmEJ for more information about syzbot.
>> syzbot engineers can be reached at syzkal...@googlegroups.com.
>>
>> syzbot will keep track of this issue. See:
>> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>> syzbot can test patches for this issue, for details see:
>> https://goo.gl/tpsmEJ#testing-patches

[REF PATCH] block/loop: remove unused range

2020-11-23 Thread Alex Shi

The variable isn't used, so don't bother to set it.

Signed-off-by: Alex Shi 
Cc: Jens Axboe  
Cc: linux-bl...@vger.kernel.org 
Cc: linux-kernel@vger.kernel.org 
---
 drivers/block/loop.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 9a27d4f1c08a..ff8d25a379f7 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -2312,7 +2312,6 @@ MODULE_ALIAS("devname:loop-control");
 static int __init loop_init(void)
 {
int i, nr;
-   unsigned long range;
struct loop_device *lo;
int err;
 
@@ -2351,10 +2350,8 @@ static int __init loop_init(void)
 */
if (max_loop) {
nr = max_loop;
-   range = max_loop << part_shift;
} else {
nr = CONFIG_BLK_DEV_LOOP_MIN_COUNT;
-   range = 1UL << MINORBITS;
}
 
err = misc_register(&loop_misc);
-- 
2.29.GIT

[PATCH] fs/btrfs: remove parent_level in btrfs_sb_log_location_bdev

2020-11-23 Thread Alex Shi

The variable parent_level isn't used, so don't bother to get it.

Signed-off-by: Alex Shi 
Cc: Chris Mason  
Cc: Josef Bacik  
Cc: David Sterba  
Cc: linux-bt...@vger.kernel.org 
Cc: linux-kernel@vger.kernel.org 
---
 fs/btrfs/ctree.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 32a57a70b98d..e5a0941c4bde 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -1578,13 +1578,10 @@ int btrfs_realloc_node(struct btrfs_trans_handle *trans,
int end_slot;
int i;
int err = 0;
-   int parent_level;
u32 blocksize;
int progress_passed = 0;
struct btrfs_disk_key disk_key;
 
-   parent_level = btrfs_header_level(parent);
-
WARN_ON(trans->transaction != fs_info->running_transaction);
WARN_ON(trans->transid != fs_info->generation);
 
-- 
2.29.GIT

[PATCH] sched/core: remove rq getting in schedule_tail

2020-11-23 Thread Alex Shi

commit dfa50b605c2a ("sched: Make finish_task_switch() return 'struct rq
*'") moved the 'rq' parameter into finish_task_switch, so we don't need
it now in schedule_tail.

Signed-off-by: Alex Shi 
Cc: Ingo Molnar  
Cc: Peter Zijlstra  
Cc: Juri Lelli  
Cc: Vincent Guittot  
Cc: Dietmar Eggemann  
Cc: Steven Rostedt  
Cc: Ben Segall  
Cc: Mel Gorman  
Cc: Daniel Bristot de Oliveira  
Cc: linux-kernel@vger.kernel.org 
---
 kernel/sched/core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 44d526b8d942..ab473fce092b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4239,7 +4239,6 @@ static struct rq *finish_task_switch(struct task_struct 
*prev)
 asmlinkage __visible void schedule_tail(struct task_struct *prev)
__releases(rq->lock)
 {
-   struct rq *rq;
 
/*
 * New tasks start with FORK_PREEMPT_COUNT, see there and
@@ -4249,7 +4248,7 @@ asmlinkage __visible void schedule_tail(struct 
task_struct *prev)
 * and the preempt_enable() will end up enabling preemption.
 */
 
-   rq = finish_task_switch(prev);
+   finish_task_switch(prev);
preempt_enable();
 
if (current->set_child_tid)
-- 
2.29.GIT

Re: [PATCH] Documentation: Chinese translation of Documentation/arm64/elf_hwcaps.rst

2020-11-23 Thread Alex Shi

Thanks Bailu!

Reviewed-by: Alex Shi 


On 2020/11/24 10:38 AM, Bailu Lin wrote:
> This is a Chinese translated version of
>  Documentation/arm64/elf_hwcaps.rst
> 
> Signed-off-by: Bailu Lin 
> ---
> Changes in v2:
>  - Modify five translation issues as Alex suggested.
> ---
>  Documentation/arm64/elf_hwcaps.rst|   2 +
>  .../translations/zh_CN/arm64/elf_hwcaps.rst   | 240 ++
>  .../translations/zh_CN/arm64/index.rst|   1 +
>  3 files changed, 243 insertions(+)
>  create mode 100644 Documentation/translations/zh_CN/arm64/elf_hwcaps.rst
> 
> diff --git a/Documentation/arm64/elf_hwcaps.rst 
> b/Documentation/arm64/elf_hwcaps.rst
> index bbd9cf54db6c..87821662eeb2 100644
> --- a/Documentation/arm64/elf_hwcaps.rst
> +++ b/Documentation/arm64/elf_hwcaps.rst
> @@ -1,3 +1,5 @@
> +.. _elf_hwcaps_index:
> +
>  
>  ARM64 ELF hwcaps
>  
> diff --git a/Documentation/translations/zh_CN/arm64/elf_hwcaps.rst 
> b/Documentation/translations/zh_CN/arm64/elf_hwcaps.rst
> new file mode 100644
> index ..c7e4385ee63f
> --- /dev/null
> +++ b/Documentation/translations/zh_CN/arm64/elf_hwcaps.rst
> @@ -0,0 +1,240 @@
> +.. include:: ../disclaimer-zh_CN.rst
> +
> +:Original: :ref:`Documentation/arm64/elf_hwcaps.rst `
> +
> +Translator: Bailu Lin 
> +
> +
> +ARM64 ELF hwcaps
> +
> +
> +这篇文档描述了 arm64 ELF hwcaps 的用法和语义。
> +
> +
> +1. 简介
> +---
> +
> +有些硬件或软件功能仅在某些 CPU 实现上和/或在具体某个内核配置上可用，但
> +对于处于 EL0 的用户空间代码没有可用的架构发现机制。内核通过在辅助向量表
> +公开一组称为 hwcaps 的标志而把这些功能暴露给用户空间。
> +
> +用户空间软件可以通过获取辅助向量的 AT_HWCAP 或 AT_HWCAP2 条目来测试功能，
> +并测试是否设置了相关标志，例如::
> +
> + bool floating_point_is_present(void)
> + {
> + unsigned long hwcaps = getauxval(AT_HWCAP);
> + if (hwcaps & HWCAP_FP)
> + return true;
> +
> + return false;
> + }
> +
> +如果软件依赖于 hwcap 描述的功能，在尝试使用该功能前则应检查相关的 hwcap
> +标志以验证该功能是否存在。
> +
> +不能通过其他方式探查这些功能。当一个功能不可用时，尝试使用它可能导致不可
> +预测的行为，并且无法保证能确切的知道该功能不可用，例如 SIGILL。
> +
> +
> +2. Hwcaps 的说明
> +
> +
> +大多数 hwcaps 旨在说明通过架构 ID 寄存器(处于 EL0 的用户空间代码无法访问)
> +描述的功能的存在。这些 hwcap 通过 ID 寄存器字段定义，并且应根据 ARM 体系
> +结构参考手册（ARM ARM）中定义的字段来解释说明。
> +
> +这些 hwcaps 以下面的形式描述::
> +
> +idreg.field == val 表示有某个功能。
> +
> +当 idreg.field 中有 val 时，hwcaps 表示 ARM ARM 定义的功能是有效的，但是
> +并不是说要完全和 val 相等，也不是说 idreg.field 描述的其他功能就是缺失的。
> +
> +其他 hwcaps 可能表明无法仅由 ID 寄存器描述的功能的存在。这些 hwcaps 可能
> +没有被 ID 寄存器描述，需要参考其他文档。
> +
> +
> +3. AT_HWCAP 中揭示的 hwcaps
> +---
> +
> +HWCAP_FP
> +ID_AA64PFR0_EL1.FP == 0b 表示有此功能。
> +
> +HWCAP_ASIMD
> +ID_AA64PFR0_EL1.AdvSIMD == 0b 表示有此功能。
> +
> +HWCAP_EVTSTRM
> +通用计时器频率配置为大约100KHz以生成事件。
> +
> +HWCAP_AES
> +ID_AA64ISAR0_EL1.AES == 0b0001 表示有此功能。
> +
> +HWCAP_PMULL
> +ID_AA64ISAR0_EL1.AES == 0b0010 表示有此功能。
> +
> +HWCAP_SHA1
> +ID_AA64ISAR0_EL1.SHA1 == 0b0001 表示有此功能。
> +
> +HWCAP_SHA2
> +ID_AA64ISAR0_EL1.SHA2 == 0b0001 表示有此功能。
> +
> +HWCAP_CRC32
> +ID_AA64ISAR0_EL1.CRC32 == 0b0001 表示有此功能。
> +
> +HWCAP_ATOMICS
> +ID_AA64ISAR0_EL1.Atomic == 0b0010 表示有此功能。
> +
> +HWCAP_FPHP
> +ID_AA64PFR0_EL1.FP == 0b0001 表示有此功能。
> +
> +HWCAP_ASIMDHP
> +ID_AA64PFR0_EL1.AdvSIMD == 0b0001 表示有此功能。
> +
> +HWCAP_CPUID
> +根据 Documentation/arm64/cpu-feature-registers.rst 描述，EL0 可以访问
> +某些 ID 寄存器。
> +
> +这些 ID 寄存器可能表示功能的可用性。
> +
> +HWCAP_ASIMDRDM
> +ID_AA64ISAR0_EL1.RDM == 0b0001 表示有此功能。
> +
> +HWCAP_JSCVT
> +ID_AA64ISAR1_EL1.JSCVT == 0b0001 表示有此功能。
> +
> +HWCAP_FCMA
> +ID_AA64ISAR1_EL1.FCMA == 0b0001 表示有此功能。
> +
> +HWCAP_LRCPC
> +ID_AA64ISAR1_EL1.LRCPC == 0b0001 表示有此功能。
> +
> +HWCAP_DCPOP
> +ID_AA64ISAR1_EL1.DPB == 0b0001 表示有此功能。
> +
> +HWCAP_SHA3
> +ID_AA64ISAR0_EL1.SHA3 == 0b0001 表示有此功能。
> +
> +HWCAP_SM3
> +ID_AA64ISAR0_EL1.SM3 == 0b0001 表示有此功能。
> +
> +HWCAP_SM4
> +ID_AA64ISAR0_EL1.SM4 == 0b0001 表示有此功能。
> +
> +HWCAP_ASIMDDP
> +ID_AA64ISAR0_EL1.DP == 0b0001 表示有此功能。
> +
> +HWCAP_SHA512
> +ID_AA64ISAR0_EL1.SHA2 == 0b0010 表示有此功能。
> +
> +HWCAP_SVE
> +ID_AA64PFR0_EL1.SVE == 0b0001 表示有此功能。
> +
> +HWCAP_ASIMDFHM
> +ID_AA64ISAR0_EL1.FHM == 0b0001 表示有此功能。
> +
> +HWCAP_DIT
> +ID_AA64PFR0_EL1.DIT == 0b0001 表示有此功能。
> +
> +HWCAP_USCAT
> +ID_AA64MMFR2_EL1.AT == 0b0001 表示有此功能。
> +
> +HWCAP_ILRCPC
> +ID_AA64ISAR1_EL1.LRCPC == 0b0010 表示有此功能。
> +
> +HWCAP_FLAGM
> +ID_AA64ISAR0_EL1.TS == 0b0001 表示有此功能。
> +
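
The flag test described in the quoted document can be sketched so that it runs without arm64 hardware. This is only an illustration: the helper name `hwcaps_have()` and the `EXAMPLE_HWCAP_*` bit values are assumptions made for this sketch, and real code should take the `HWCAP_*` constants from `<asm/hwcap.h>` and the value from `getauxval(AT_HWCAP)`.

```c
#include <stdbool.h>

/*
 * Hypothetical bit positions, for illustration only. Real code takes
 * the HWCAP_* constants from <asm/hwcap.h> on arm64.
 */
#define EXAMPLE_HWCAP_FP   (1UL << 0)
#define EXAMPLE_HWCAP_AES  (1UL << 3)

/* Return true only if every flag in 'wanted' is set in 'hwcaps'. */
static bool hwcaps_have(unsigned long hwcaps, unsigned long wanted)
{
	return (hwcaps & wanted) == wanted;
}
```

On an arm64 Linux system the same helper would typically be called as `hwcaps_have(getauxval(AT_HWCAP), HWCAP_FP | HWCAP_AES)` before using either feature.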

Re: [PATCH] Documentation: Chinese translation of Documentation/arm64/elf_hwcaps.rst

2020-11-22 Thread Alex Shi




On 2020/11/21 6:23 PM, Bailu Lin wrote:
> This is a Chinese translated version of
>  Documentation/arm64/elf_hwcaps.rst
> 
> Signed-off-by: Bailu Lin 
> ---
>  Documentation/arm64/elf_hwcaps.rst|   2 +
>  .../translations/zh_CN/arm64/elf_hwcaps.rst   | 240 ++
>  .../translations/zh_CN/arm64/index.rst|   1 +
>  3 files changed, 243 insertions(+)
>  create mode 100644 Documentation/translations/zh_CN/arm64/elf_hwcaps.rst
> 
> diff --git a/Documentation/arm64/elf_hwcaps.rst 
> b/Documentation/arm64/elf_hwcaps.rst
> index bbd9cf54db6c..87821662eeb2 100644
> --- a/Documentation/arm64/elf_hwcaps.rst
> +++ b/Documentation/arm64/elf_hwcaps.rst
> @@ -1,3 +1,5 @@
> +.. _elf_hwcaps_index:
> +
>  
>  ARM64 ELF hwcaps
>  
> diff --git a/Documentation/translations/zh_CN/arm64/elf_hwcaps.rst 
> b/Documentation/translations/zh_CN/arm64/elf_hwcaps.rst
> new file mode 100644
> index ..c7e4385ee63f
> --- /dev/null
> +++ b/Documentation/translations/zh_CN/arm64/elf_hwcaps.rst
> @@ -0,0 +1,240 @@
> +.. include:: ../disclaimer-zh_CN.rst
> +
> +:Original: :ref:`Documentation/arm64/elf_hwcaps.rst `
> +
> +Translator: Bailu Lin 
> +
> +
> +ARM64 ELF hwcaps
> +
> +
> +这篇文档描述了 arm64 ELF hwcaps 的使用方法和规范。

Would it be better to use 语义 for "semantics"? And 用法 is enough for "usage".

> +
> +
> +1. 简介
> +---
> +
> +有些硬件或软件功能仅在某些 CPU 实现上和/或在具体某个内核配置上可用，但
> +对于处于 EL0 的用户空间代码没有可用的架构发现机制。内核通过在辅助向量表
> +公开一组称为 hwcaps 的标志而把这些功能开放给用户空间。

"Expose" means 暴露 or 揭示, not 开放.

> +
> +用户空间软件可以通过获取辅助向量的 AT_HWCAP 或 AT_HWCAP2 条目来测试功能，
> +并测试是否设置了相关标志，例如::
> +
> + bool floating_point_is_present(void)
> + {
> + unsigned long hwcaps = getauxval(AT_HWCAP);
> + if (hwcaps & HWCAP_FP)
> + return true;
> +
> + return false;
> + }
> +
> +如果软件依赖于 hwcap 描述的功能，在尝试使用该功能前则应检查相关的 hwcap
> +标志以验证该功能是否存在。
> +
> +不能通过其他方式探查这些功能。当一个功能不可用时，尝试使用它可能导致不可
> +预测的行为，并且无法保证能确切的知道该功能不可用，例如 SIGILL。
> +
> +
> +2. Hwcaps 的说明
> +
> +
> +大多数 hwcaps 旨在说明通过架构 ID 寄存器(处于 EL0 的用户空间代码无法访问)
> +描述的功能的存在。这些 hwcap 通过 ID 寄存器字段定义，并且应根据 ARM 体系
> +结构参考手册（ARM ARM）中定义的字段来解释说明。
> +
> +这些 hwcaps 以下面的形式描述::
> +
> +Functionality implied by idreg.field == val.

Why isn't this line aligned with the detailed explanations in the next section?
Do we need to translate all of them, or keep the English version?


> +
> +当 idreg.field 中有 val 时，hwcaps 表示 ARM ARM 定义的功能是有效的，但是
> +并不是说要完全和 val 相等，也不是说 idreg.field 描述的其他功能就是缺失的。
> +
> +其他 hwcaps 可能表明无法仅由 ID 寄存器描述的功能的存在。这些 hwcaps 可能
> +没有被 ID 寄存器描述，需要参考其他文档。
> +
> +
> +3. AT_HWCAP 中公开的 hwcaps

It should be 揭示, not 公开.

> +---
> +
> +HWCAP_FP
> +ID_AA64PFR0_EL1.FP == 0b 表示的功能。

ID_AA64PFR0_EL1.FP == 0b 表示有此功能。Is this better?

Thanks
Alex


> +
> +HWCAP_ASIMD
> +ID_AA64PFR0_EL1.AdvSIMD == 0b 表示的功能。
> +
> +HWCAP_EVTSTRM
> +通用计时器频率配置为大约100KHz以生成事件。
> +
> +HWCAP_AES
> +ID_AA64ISAR0_EL1.AES == 0b0001 表示的功能。
> +
> +HWCAP_PMULL
> +ID_AA64ISAR0_EL1.AES == 0b0010 表示的功能。
> +
> +HWCAP_SHA1
> +ID_AA64ISAR0_EL1.SHA1 == 0b0001 表示的功能。
> +
> +HWCAP_SHA2
> +ID_AA64ISAR0_EL1.SHA2 == 0b0001 表示的功能。
> +
> +HWCAP_CRC32
> +ID_AA64ISAR0_EL1.CRC32 == 0b0001 表示的功能。
> +
> +HWCAP_ATOMICS
> +ID_AA64ISAR0_EL1.Atomic == 0b0010 表示的功能。
> +
> +HWCAP_FPHP
> +ID_AA64PFR0_EL1.FP == 0b0001 表示的功能。
> +
> +HWCAP_ASIMDHP
> +ID_AA64PFR0_EL1.AdvSIMD == 0b0001 表示的功能。
> +
> +HWCAP_CPUID
> +根据 Documentation/arm64/cpu-feature-registers.rst 描述，EL0 可以访问
> +某些 ID 寄存器。
> +
> +这些 ID 寄存器可能表示功能的可用性。
> +
> +HWCAP_ASIMDRDM
> +ID_AA64ISAR0_EL1.RDM == 0b0001 表示的功能。
> +
> +HWCAP_JSCVT
> +ID_AA64ISAR1_EL1.JSCVT == 0b0001 表示的功能。
> +
> +HWCAP_FCMA
> +ID_AA64ISAR1_EL1.FCMA == 0b0001 表示的功能。
> +
> +HWCAP_LRCPC
> +ID_AA64ISAR1_EL1.LRCPC == 0b0001 表示的功能。
> +
> +HWCAP_DCPOP
> +ID_AA64ISAR1_EL1.DPB == 0b0001 表示的功能。
> +
> +HWCAP_SHA3
> +ID_AA64ISAR0_EL1.SHA3 == 0b0001 表示的功能。
> +
> +HWCAP_SM3
> +ID_AA64ISAR0_EL1.SM3 == 0b0001 表示的功能。
> +
> +HWCAP_SM4
> +ID_AA64ISAR0_EL1.SM4 == 0b0001 表示的功能。
> +
> +HWCAP_ASIMDDP
> +ID_AA64ISAR0_EL1.DP == 0b0001 表示的功能。
> +
> +HWCAP_SHA512
> +ID_AA64ISAR0_EL1.SHA2 == 0b0010 表示的功能。
> +
> +HWCAP_SVE
> +ID_AA64PFR0_EL1.SVE == 0b0001 表示的功能。
> +
> +HWCAP_ASIMDFHM
> +ID_AA64ISAR0_EL1.FHM == 0b0001 表示的功能。
> +
> +HWCAP_DIT
> +ID_AA64PFR0_EL1.DIT == 0b0001 表示的功能。
> +
> +HWCAP_USCAT
> +ID_AA64MMFR2_EL1.AT == 0b0001 表示的功能。
> +
> +HWCAP_ILRCPC
> +ID_AA64ISAR1_EL1.LRCPC == 0b0010 表示的功能。
> +
> +HWCAP_FLAGM
> +ID_AA64ISAR0_EL1.TS == 0b0001 表示的功能。
> +
> +HWCAP_SSBS
> +ID_AA64PFR1_EL1.SSBS == 0b0010 表示的功能。
> +
> +HWCAP_SB
> +ID_AA64ISAR1_EL1.SB == 0b0001 表示的功能。
> +
> +HWCAP_PACA
> +如 Documentation/arm64/pointer-authentication.rst 所描述，
> +ID_AA64ISAR1_EL1.APA == 0b0001 或 ID_AA64ISAR1_EL1.API == 0b0001
> +表示的功能。
> +
> +HWCAP_PACG
> +如

Re: [PATCH next] mm/swap.c: reduce lock contention in lru_cache_add

2020-11-22 Thread Alex Shi




On 2020/11/21 7:19 AM, Andrew Morton wrote:
> On Fri, 20 Nov 2020 16:27:27 +0800 Alex Shi  
> wrote:
> 
>> The current relock logic switches lru_lock whenever it finds a new
>> lruvec, so if 2 memcgs are reading files or allocating pages at the
>> same time, they can hold the lru_lock alternately and wait for each
>> other because of the fairness attribute of the ticket spin lock.
>>
>> This patch sorts the pages by lruvec so that each lru_lock is taken
>> only once in the above scenario. That reduces the fairness waiting
>> caused by re-acquiring the lock. With it,
>> vm-scalability/case-lru-file-readtwice gets a ~5% performance gain on
>> my 2P*20core*HT machine.
> 
> But what happens when all or most of the pages belong to the same
> lruvec?  This sounds like the common case - won't it suffer?
> 
Hi Andrew,

My testing shows no regression in this situation, e.g. on an original CentOS 7.
Most of the time is spent on lru_lock in the LRU-sensitive case.

Thanks
Alex

Re: [PATCH next] mm/vmscan: __isolate_lru_page_prepare clean up

2020-11-22 Thread Alex Shi




On 2020/11/22 8:35 PM, Matthew Wilcox wrote:
> On Sun, Nov 22, 2020 at 08:00:19PM +0800, Alex Shi wrote:
>>  mm/compaction.c |  2 +-
>>  mm/vmscan.c | 69 +++--
>>  2 files changed, 34 insertions(+), 37 deletions(-)
> 
> How is it possible you're changing the signature of a function without
> touching a header file?  Surely __isolate_lru_page_prepare() must be declared
> in mm/internal.h ?
> 
>> +++ b/mm/vmscan.c
>> @@ -1536,19 +1536,17 @@ unsigned int reclaim_clean_pages_from_list(struct 
>> zone *zone,
>>   * page:page to consider
>>   * mode:one of the LRU isolation modes defined above
>>   *
>> - * returns 0 on success, -ve errno on failure.
>> + * returns ture on success, false on failure.
> 
> "true".
> 
>> @@ -1674,35 +1672,34 @@ static unsigned long isolate_lru_pages(unsigned long 
>> nr_to_scan,
>>   * only when the page is being freed somewhere else.
>>   */
>>  scan += nr_pages;
>> -switch (__isolate_lru_page_prepare(page, mode)) {
>> -case 0:
>> +if (!__isolate_lru_page_prepare(page, mode)) {
>> +/* else it is being freed elsewhere */
> 
> I don't think the word "else" helps here.  Just
>   /* It is being freed elsewhere */
> 
>> +if (!TestClearPageLRU(page)) {
>>  /*
>> + * This page may in other isolation path,
>> + * but we still hold lru_lock.
>>   */
> 
> I don't think this comment helps me understand what's going on here.
> Maybe:
> 
>   /* Another thread is already isolating this page */
> 
>> +put_page(page);
>>  list_move(&page->lru, src);
>> +continue;
>>  }

Hi Matthew,

Thanks a lot for all the comments. I picked them all up, and here is the v3:

From 167131dd106a96fd08af725df850e0da6ec899af Mon Sep 17 00:00:00 2001
From: Alex Shi 
Date: Fri, 20 Nov 2020 14:49:16 +0800
Subject: [PATCH v3 next] mm/vmscan: __isolate_lru_page_prepare clean up

The function returns only 2 results, so using a 'switch' to deal with its
result is unnecessary. Simplify it to a bool function, as Vlastimil
suggested.

Also remove the 'goto' by reusing list_move(), and take Matthew Wilcox's
suggestions to update the comments in the function.

Signed-off-by: Alex Shi 
Cc: Andrew Morton 
Cc: Matthew Wilcox 
Cc: Hugh Dickins 
Cc: Yu Zhao 
Cc: Vlastimil Babka 
Cc: Michal Hocko 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 include/linux/swap.h |  2 +-
 mm/compaction.c  |  2 +-
 mm/vmscan.c  | 68 
 3 files changed, 33 insertions(+), 39 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 596bc2f4d9b0..5bba15ac5a2e 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -356,7 +356,7 @@ extern void lru_cache_add_inactive_or_unevictable(struct 
page *page,
 extern unsigned long zone_reclaimable_pages(struct zone *zone);
 extern unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
gfp_t gfp_mask, nodemask_t *mask);
-extern int __isolate_lru_page_prepare(struct page *page, isolate_mode_t mode);
+extern bool __isolate_lru_page_prepare(struct page *page, isolate_mode_t mode);
 extern unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
  unsigned long nr_pages,
  gfp_t gfp_mask,
diff --git a/mm/compaction.c b/mm/compaction.c
index b68931854253..8d71ffebe6cb 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -988,7 +988,7 @@ isolate_migratepages_block(struct compact_control *cc, 
unsigned long low_pfn,
if (unlikely(!get_page_unless_zero(page)))
goto isolate_fail;
 
-   if (__isolate_lru_page_prepare(page, isolate_mode) != 0)
+   if (!__isolate_lru_page_prepare(page, isolate_mode))
goto isolate_fail_put;
 
/* Try isolate the page */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c6f94e55c3fe..4d2703c43310 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1536,19 +1536,17 @@ unsigned int reclaim_clean_pages_from_list(struct zone 
*zone,
  * page:   page to consider
  * mode:   one of the LRU isolation modes defined above
  *
- * returns 0 on success, -ve errno on failure.
+ * returns true on success, false on failure.
  */
-int __isolate_lru_page_prepare(struct page *page, isolate_mode_t mode)
+bool __isolate_lru_page_prepare(struct page *page, isolate_mode_t mode)
 {
-   int

Re: [PATCH next] mm/vmscan: __isolate_lru_page_prepare clean up

2020-11-22 Thread Alex Shi




On 2020/11/21 7:13 AM, Andrew Morton wrote:
> On Fri, 20 Nov 2020 16:03:33 +0800 Alex Shi  
> wrote:
> 
>> The function returns only 2 results, so using a 'switch' to deal with
>> its result is unnecessary. Simplify it to a bool function, as Vlastimil
>> suggested.
>>
>> Also remove the 'goto' by reusing list_move().
>>
>> ...
>>
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1540,7 +1540,7 @@ unsigned int reclaim_clean_pages_from_list(struct zone 
>> *zone,
>>   */
>>  int __isolate_lru_page_prepare(struct page *page, isolate_mode_t mode)
>>  {
>> -int ret = -EBUSY;
>> +int ret = false;
>>  
>>  /* Only take pages on the LRU. */
>>  if (!PageLRU(page))
>> @@ -1590,7 +1590,7 @@ int __isolate_lru_page_prepare(struct page *page, 
>> isolate_mode_t mode)
>>  if ((mode & ISOLATE_UNMAPPED) && page_mapped(page))
>>  return ret;
>>  
>> -return 0;
>> +return true;
>>  }
> 
> The resulting __isolate_lru_page_prepare() is rather unpleasing.
> 
> - Why return an int and not a bool?
> 
> - `int ret = false' is a big hint that `ret' should have bool type!
> 
> - Why not just remove `ret' and do `return false' in all those `return
>   ret' places?
> 
> - The __isolate_lru_page_prepare() kerneldoc still says "returns 0 on
>   success, -ve errno on failure".  
> 

Hi Andrew,

Thanks a lot for catching this, and sorry for the bad patch.
It was initially an 'int' version, and I changed it to bool in a hurry over
the weekend. I am sorry.

From 36c4fbda2d55633d3c1a3e79f045cd9877453ab7 Mon Sep 17 00:00:00 2001
From: Alex Shi 
Date: Fri, 20 Nov 2020 14:49:16 +0800
Subject: [PATCH v2 next] mm/vmscan: __isolate_lru_page_prepare clean up

The function returns only 2 results, so using a 'switch' to deal with its
result is unnecessary. Simplify it to a bool function, as Vlastimil
suggested.

Also remove the 'goto' by reusing list_move().

Signed-off-by: Alex Shi 
Cc: Andrew Morton 
Cc: Hugh Dickins 
Cc: Yu Zhao 
Cc: Vlastimil Babka 
Cc: Michal Hocko 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/compaction.c |  2 +-
 mm/vmscan.c | 69 +++--
 2 files changed, 34 insertions(+), 37 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index b68931854253..8d71ffebe6cb 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -988,7 +988,7 @@ isolate_migratepages_block(struct compact_control *cc, 
unsigned long low_pfn,
if (unlikely(!get_page_unless_zero(page)))
goto isolate_fail;
 
-   if (__isolate_lru_page_prepare(page, isolate_mode) != 0)
+   if (!__isolate_lru_page_prepare(page, isolate_mode))
goto isolate_fail_put;
 
/* Try isolate the page */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c6f94e55c3fe..ab2fdee0828e 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1536,19 +1536,17 @@ unsigned int reclaim_clean_pages_from_list(struct zone 
*zone,
  * page:   page to consider
  * mode:   one of the LRU isolation modes defined above
  *
- * returns 0 on success, -ve errno on failure.
+ * returns ture on success, false on failure.
  */
-int __isolate_lru_page_prepare(struct page *page, isolate_mode_t mode)
+bool __isolate_lru_page_prepare(struct page *page, isolate_mode_t mode)
 {
-   int ret = -EBUSY;
-
/* Only take pages on the LRU. */
if (!PageLRU(page))
-   return ret;
+   return false;
 
/* Compaction should not handle unevictable pages but CMA can do so */
if (PageUnevictable(page) && !(mode & ISOLATE_UNEVICTABLE))
-   return ret;
+   return false;
 
/*
 * To minimise LRU disruption, the caller can indicate that it only
@@ -1561,7 +1559,7 @@ int __isolate_lru_page_prepare(struct page *page, 
isolate_mode_t mode)
if (mode & ISOLATE_ASYNC_MIGRATE) {
/* All the caller can do on PageWriteback is block */
if (PageWriteback(page))
-   return ret;
+   return false;
 
if (PageDirty(page)) {
struct address_space *mapping;
@@ -1577,20 +1575,20 @@ int __isolate_lru_page_prepare(struct page *page, 
isolate_mode_t mode)
 * from the page cache.
 */
if (!trylock_page(page))
-   return ret;
+   return false;
 
mapping = page_mapping(page);
migrate_dirty = !mapping || mapping->a_ops->migratepage;
unlock_page(page);
if (!migrate_dirty)
-

Re: [PATCH next-akpm] mm/memcg: add missed warning in mem_cgroup_lruvec

2020-11-20 Thread Alex Shi




On 2020/11/20 5:30 PM, Alex Shi wrote:
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 925b09ace986..7809020ef7bd 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -618,6 +618,7 @@ static inline struct lruvec *mem_cgroup_lruvec(struct 
> mem_cgroup *memcg,
>   goto out;
>   }
>  
> + VM_WARN_ON_ONCE_PAGE(!memcg, page);

Very sorry for the typo above! Please use the updated patch.


From 68d69172f21f39928cf8ff204f5ff5cd62ac7776 Mon Sep 17 00:00:00 2001
From: Alex Shi 
Date: Fri, 20 Nov 2020 17:02:41 +0800
Subject: [PATCH] mm/memcg: add missed warning in mem_cgroup_lruvec

Commit ("mm/memcontrol: rewrite mem_cgroup_page_lruvec()") in the mm tree
uses mem_cgroup_lruvec to rewrite mem_cgroup_page_lruvec, but it dropped
the following DEBUG_VM warning. Since we always charge a page before
returning from allocation, adding this warning back is helpful:

    VM_WARN_ON_ONCE(!memcg);

Signed-off-by: Alex Shi 
Cc: Andrew Morton  
Cc: Johannes Weiner  
Cc: Shakeel Butt  
Cc: Roman Gushchin  
Cc: Michal Hocko  
Cc: Yafang Shao  
Cc: Alexander Duyck  
Cc: Hui Su 
Cc: Wei Yang  
Cc: linux-kernel@vger.kernel.org 
---
 include/linux/memcontrol.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 925b09ace986..303438822818 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -618,6 +618,7 @@ static inline struct lruvec *mem_cgroup_lruvec(struct 
mem_cgroup *memcg,
goto out;
}
 
+   VM_WARN_ON_ONCE(!memcg);
if (!memcg)
memcg = root_mem_cgroup;
 
-- 
2.29.GIT

[PATCH next-akpm] mm/memcg: remove incorrect comments

2020-11-20 Thread Alex Shi

Swapcache readahead pages are charged before they get used, so they are
unlikely to be migrated before being charged. Remove the incorrect comment.

Signed-off-by: Alex Shi 
Cc: Johannes Weiner  
Cc: Michal Hocko  
Cc: Vladimir Davydov  
Cc: Andrew Morton  
Cc: cgro...@vger.kernel.org 
Cc: linux...@kvack.org 
Cc: linux-kernel@vger.kernel.org 
---
 mm/memcontrol.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 45465c03a8d7..08c267305725 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6941,7 +6941,6 @@ void mem_cgroup_migrate(struct page *oldpage, struct page 
*newpage)
if (page_memcg(newpage))
return;
 
-   /* Swapcache readahead pages can get replaced before being charged */
memcg = page_memcg(oldpage);
VM_WARN_ON_ONCE_PAGE(!memcg, oldpage);
if (!memcg)
-- 
2.29.GIT

[PATCH next-akpm] mm/memcg: add missed warning in mem_cgroup_lruvec

2020-11-20 Thread Alex Shi

Commit ("mm/memcontrol: rewrite mem_cgroup_page_lruvec()") in the mm tree
uses mem_cgroup_lruvec to rewrite mem_cgroup_page_lruvec, but it dropped
the following DEBUG_VM warning. Since we always charge a page before
returning from allocation, adding this warning back is helpful:

VM_WARN_ON_ONCE_PAGE(!memcg, page);

Signed-off-by: Alex Shi 
Cc: Andrew Morton  
Cc: Johannes Weiner  
Cc: Shakeel Butt  
Cc: Roman Gushchin  
Cc: Michal Hocko  
Cc: Yafang Shao  
Cc: Alexander Duyck  
Cc: Hui Su 
Cc: Wei Yang  
Cc: linux-kernel@vger.kernel.org 
---
 include/linux/memcontrol.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 925b09ace986..7809020ef7bd 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -618,6 +618,7 @@ static inline struct lruvec *mem_cgroup_lruvec(struct 
mem_cgroup *memcg,
goto out;
}
 
+   VM_WARN_ON_ONCE_PAGE(!memcg, page);
if (!memcg)
memcg = root_mem_cgroup;
 
-- 
2.29.GIT

[PATCH next] mm/swap.c: reduce lock contention in lru_cache_add

2020-11-20 Thread Alex Shi

The current relock logic switches lru_lock whenever it finds a new
lruvec, so if 2 memcgs are reading files or allocating pages at the same
time, they can hold the lru_lock alternately and wait for each other
because of the fairness attribute of the ticket spin lock.

This patch sorts the pages by lruvec so that each lru_lock is taken only
once in the above scenario. That reduces the fairness waiting caused by
re-acquiring the lock. With it, vm-scalability/case-lru-file-readtwice
gets a ~5% performance gain on my 2P*20core*HT machine.

Suggested-by: Konstantin Khlebnikov 
Signed-off-by: Alex Shi 
Cc: Konstantin Khlebnikov 
Cc: Andrew Morton 
Cc: Hugh Dickins 
Cc: Yu Zhao 
Cc: Michal Hocko 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/swap.c | 57 +++
 1 file changed, 49 insertions(+), 8 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index 490553f3f9ef..c787b38bf9c0 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -1009,24 +1009,65 @@ static void __pagevec_lru_add_fn(struct page *page, 
struct lruvec *lruvec)
trace_mm_lru_insertion(page, lru);
 }
 
+struct lruvecs {
+   struct list_head lists[PAGEVEC_SIZE];
+   struct lruvec *vecs[PAGEVEC_SIZE];
+};
+
+/* Sort pvec pages on their lruvec */
+int sort_page_lruvec(struct lruvecs *lruvecs, struct pagevec *pvec)
+{
+   int i, j, nr_lruvec;
+   struct page *page;
+   struct lruvec *lruvec = NULL;
+
+   lruvecs->vecs[0] = NULL;
+   for (i = nr_lruvec = 0; i < pagevec_count(pvec); i++) {
+   page = pvec->pages[i];
+   lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
+
+   /* Try to find a same lruvec */
+   for (j = 0; j <= nr_lruvec; j++)
+   if (lruvec == lruvecs->vecs[j])
+   break;
+
+   /* A new lruvec */
+   if (j > nr_lruvec) {
+   INIT_LIST_HEAD(&lruvecs->lists[nr_lruvec]);
+   lruvecs->vecs[nr_lruvec] = lruvec;
+   j = nr_lruvec++;
+   lruvecs->vecs[nr_lruvec] = 0;
+   }
+
+   list_add_tail(&page->lru, &lruvecs->lists[j]);
+   }
+
+   return nr_lruvec;
+}
+
 /*
  * Add the passed pages to the LRU, then drop the caller's refcount
  * on them.  Reinitialises the caller's pagevec.
  */
 void __pagevec_lru_add(struct pagevec *pvec)
 {
-   int i;
-   struct lruvec *lruvec = NULL;
+   int i, nr_lruvec;
unsigned long flags = 0;
+   struct page *page;
+   struct lruvecs lruvecs;
 
-   for (i = 0; i < pagevec_count(pvec); i++) {
-   struct page *page = pvec->pages[i];
+   nr_lruvec = sort_page_lruvec(&lruvecs, pvec);
 
-   lruvec = relock_page_lruvec_irqsave(page, lruvec, &flags);
-   __pagevec_lru_add_fn(page, lruvec);
+   for (i = 0; i < nr_lruvec; i++) {
+   spin_lock_irqsave(&lruvecs.vecs[i]->lru_lock, flags);
+   while (!list_empty(&lruvecs.lists[i])) {
+   page = lru_to_page(&lruvecs.lists[i]);
+   list_del(&page->lru);
+   __pagevec_lru_add_fn(page, lruvecs.vecs[i]);
+   }
+   spin_unlock_irqrestore(&lruvecs.vecs[i]->lru_lock, flags);
}
-   if (lruvec)
-   unlock_page_lruvec_irqrestore(lruvec, flags);
+
release_pages(pvec->pages, pvec->nr);
pagevec_reinit(pvec);
 }
-- 
2.29.GIT
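
The effect the patch above is after can be sketched in user space by counting lock acquisitions for one batch of pages. This is a simplified model, not the kernel code: each page is reduced to an integer "owner" standing in for its lruvec, and the names `locks_taken_relock()`, `locks_taken_sorted()` and `BATCH_MAX` are assumptions made for this illustration.

```c
#include <stddef.h>

#define BATCH_MAX 15	/* assumed batch size, in the spirit of PAGEVEC_SIZE */

/*
 * Relock pattern: the lock is switched every time the owner differs
 * from the previous item's owner, so alternating owners cost one
 * acquisition per item.
 */
static int locks_taken_relock(const int *owner, size_t n)
{
	int taken = 0;

	for (size_t i = 0; i < n; i++)
		if (i == 0 || owner[i] != owner[i - 1])
			taken++;
	return taken;
}

/*
 * Sorted pattern: items are grouped by owner first, so each distinct
 * owner's lock is taken exactly once. Only the first BATCH_MAX items
 * are considered, which bounds the 'seen' array.
 */
static int locks_taken_sorted(const int *owner, size_t n)
{
	int seen[BATCH_MAX];
	int nr_seen = 0;

	for (size_t i = 0; i < n && i < BATCH_MAX; i++) {
		int j;

		/* Try to find an owner we have already recorded. */
		for (j = 0; j < nr_seen; j++)
			if (seen[j] == owner[i])
				break;
		/* A new owner. */
		if (j == nr_seen)
			seen[nr_seen++] = owner[i];
	}
	return nr_seen;
}
```

With two owners strictly alternating, the relock count grows with the batch length while the sorted count stays at two. When every item has the same owner, both counts are one, so the model shows no extra acquisitions in that case; it says nothing about the cost of the sorting pass itself.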

[PATCH next] mm/vmscan: __isolate_lru_page_prepare clean up

2020-11-20 Thread Alex Shi

The function returns only 2 results, so using a 'switch' to deal with its
result is unnecessary. Simplify it to a bool function, as Vlastimil
suggested.

Also remove the 'goto' by reusing list_move().

Signed-off-by: Alex Shi 
Cc: Andrew Morton 
Cc: Hugh Dickins 
Cc: Yu Zhao 
Cc: Vlastimil Babka 
Cc: Michal Hocko 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/compaction.c |  2 +-
 mm/vmscan.c | 53 -
 2 files changed, 27 insertions(+), 28 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index b68931854253..8d71ffebe6cb 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -988,7 +988,7 @@ isolate_migratepages_block(struct compact_control *cc, 
unsigned long low_pfn,
if (unlikely(!get_page_unless_zero(page)))
goto isolate_fail;
 
-   if (__isolate_lru_page_prepare(page, isolate_mode) != 0)
+   if (!__isolate_lru_page_prepare(page, isolate_mode))
goto isolate_fail_put;
 
/* Try isolate the page */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c6f94e55c3fe..7f32c1979804 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1540,7 +1540,7 @@ unsigned int reclaim_clean_pages_from_list(struct zone 
*zone,
  */
 int __isolate_lru_page_prepare(struct page *page, isolate_mode_t mode)
 {
-   int ret = -EBUSY;
+   int ret = false;
 
/* Only take pages on the LRU. */
if (!PageLRU(page))
@@ -1590,7 +1590,7 @@ int __isolate_lru_page_prepare(struct page *page, 
isolate_mode_t mode)
if ((mode & ISOLATE_UNMAPPED) && page_mapped(page))
return ret;
 
-   return 0;
+   return true;
 }
 
 /*
@@ -1674,35 +1674,34 @@ static unsigned long isolate_lru_pages(unsigned long 
nr_to_scan,
 * only when the page is being freed somewhere else.
 */
scan += nr_pages;
-   switch (__isolate_lru_page_prepare(page, mode)) {
-   case 0:
+   if (!__isolate_lru_page_prepare(page, mode)) {
+   /* else it is being freed elsewhere */
+   list_move(&page->lru, src);
+   continue;
+   }
+   /*
+* Be careful not to clear PageLRU until after we're
+* sure the page is not being freed elsewhere -- the
+* page release code relies on it.
+*/
+   if (unlikely(!get_page_unless_zero(page))) {
+   list_move(&page->lru, src);
+   continue;
+   }
+
+   if (!TestClearPageLRU(page)) {
/*
-* Be careful not to clear PageLRU until after we're
-* sure the page is not being freed elsewhere -- the
-* page release code relies on it.
+* This page may in other isolation path,
+* but we still hold lru_lock.
 */
-   if (unlikely(!get_page_unless_zero(page)))
-   goto busy;
-
-   if (!TestClearPageLRU(page)) {
-   /*
-* This page may in other isolation path,
-* but we still hold lru_lock.
-*/
-   put_page(page);
-   goto busy;
-   }
-
-   nr_taken += nr_pages;
-   nr_zone_taken[page_zonenum(page)] += nr_pages;
-   list_move(&page->lru, dst);
-   break;
-
-   default:
-busy:
-   /* else it is being freed elsewhere */
+   put_page(page);
list_move(&page->lru, src);
+   continue;
}
+
+   nr_taken += nr_pages;
+   nr_zone_taken[page_zonenum(page)] += nr_pages;
+   list_move(&page->lru, dst);
}
 
/*
-- 
2.29.GIT
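
The control-flow change in the patch above, a bool predicate plus early `continue` in place of `switch` and `goto`, can be sketched in isolation. The types and names below (`struct item`, `item_prepare()`, `count_isolated()`) are hypothetical stand-ins for this illustration and are not kernel APIs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page: only the fields the sketch needs. */
struct item {
	bool on_lru;
	bool busy;
};

/* Returns true when the item may be isolated, false otherwise. */
static bool item_prepare(const struct item *it)
{
	if (!it->on_lru)
		return false;
	if (it->busy)
		return false;
	return true;
}

/* Count the items that pass the check; rejected ones are skipped early. */
static size_t count_isolated(const struct item *items, size_t n)
{
	size_t taken = 0;

	for (size_t i = 0; i < n; i++) {
		if (!item_prepare(&items[i]))
			continue;	/* skip early instead of switch/goto */
		taken++;
	}
	return taken;
}
```

Because the predicate has only two outcomes, each failure path in the loop becomes a short block ending in `continue`, and the success path reads straight down without a label.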

Re: [PATCH] docs/vm: remove unused 3 items explanation for /proc/vmstat

2020-11-18 Thread Alex Shi




On 2020/11/19 4:46 AM, Jonathan Corbet wrote:
> On Mon, 16 Nov 2020 17:51:22 +0800
> Alex Shi  wrote:
> 
>> Commit 5647bc293ab1 ("mm: compaction: Move migration fail/success
>> stats to migrate.c") removed 3 items from /proc/vmstat, but the docs
>> still explain them. Let's remove them.
>>
>> "compact_blocks_moved",
>> "compact_pages_moved",
>> "compact_pagemigrate_failed",
> 
> So a quick look says that the above-mentioned patch didn't remove those
> three items; two of them were, instead, renamed.  Rather than just taking
> out the old information, it seems we should actually update it to reflect
> current reality?
> 

I thought about replacing them, but a couple of migration events
have no explanation:

#ifdef CONFIG_MIGRATION
"pgmigrate_success",
"pgmigrate_fail",
"thp_migration_success",
"thp_migration_fail",
"thp_migration_split",
#endif

It would be better to document them all together and update the current
explanation accordingly, but I'm not so confident about this now...

Thanks
Alex

Re: [PATCH] khugepaged: add couples parameter explanation for kernel-doc markup

2020-11-17 Thread Alex Shi




On 2020/11/17 3:15 PM, Alex Shi wrote:
>  /**
>   * collapse_file - collapse filemap/tmpfs/shmem pages into huge one.
>   *
> + * @mm: process address space where collapse happens
> + * @file: file that collapse on
> + * @start: collapse start address 

Hi Andrew,

There is trailing whitespace at the end of the line above. Very sorry for this; here is the update:

From 190ff88583fa755ff791fc303e4b7ac75f6c96f7 Mon Sep 17 00:00:00 2001
From: Alex Shi 
Date: Tue, 17 Nov 2020 14:58:01 +0800
Subject: [PATCH] khugepaged: add couples parameter explanation for kernel-doc
 markup

Add the missing parameter explanations to fix some kernel-doc warnings:
mm/khugepaged.c:102: warning: Function parameter or member
'nr_pte_mapped_thp' not described in 'mm_slot'
mm/khugepaged.c:102: warning: Function parameter or member
'pte_mapped_thp' not described in 'mm_slot'
mm/khugepaged.c:1424: warning: Function parameter or member 'mm' not
described in 'collapse_pte_mapped_thp'
mm/khugepaged.c:1424: warning: Function parameter or member 'addr' not
described in 'collapse_pte_mapped_thp'
mm/khugepaged.c:1626: warning: Function parameter or member 'mm' not
described in 'collapse_file'
mm/khugepaged.c:1626: warning: Function parameter or member 'file' not
described in 'collapse_file'
mm/khugepaged.c:1626: warning: Function parameter or member 'start' not
described in 'collapse_file'
mm/khugepaged.c:1626: warning: Function parameter or member 'hpage' not
described in 'collapse_file'
mm/khugepaged.c:1626: warning: Function parameter or member 'node' not
described in 'collapse_file'

Signed-off-by: Alex Shi 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/khugepaged.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 757292532767..2cc78a51c398 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -90,6 +90,8 @@ static struct kmem_cache *mm_slot_cache __read_mostly;
  * @hash: hash collision list
  * @mm_node: khugepaged scan list headed in khugepaged_scan.mm_head
  * @mm: the mm that this information is valid for
+ * @nr_pte_mapped_thp: number of pte mapped THP
+ * @pte_mapped_thp: address array corresponding pte mapped THP
  */
 struct mm_slot {
struct hlist_node hash;
@@ -1414,7 +1416,11 @@ static int khugepaged_add_pte_mapped_thp(struct 
mm_struct *mm,
 }
 
 /**
- * Try to collapse a pte-mapped THP for mm at address haddr.
+ * collapse_pte_mapped_thp - Try to collapse a pte-mapped THP for mm at
+ * address haddr.
+ *
+ * @mm: process address space where collapse happens
+ * @addr: THP collapse address
  *
  * This function checks whether all the PTEs in the PMD are pointing to the
  * right THP. If so, retract the page table so the THP can refault in with
@@ -1605,6 +1611,12 @@ static void retract_page_tables(struct address_space 
*mapping, pgoff_t pgoff)
 /**
  * collapse_file - collapse filemap/tmpfs/shmem pages into huge one.
  *
+ * @mm: process address space where collapse happens
+ * @file: file that collapse on
+ * @start: collapse start address
+ * @hpage: new allocated huge page for collapse
+ * @node: appointed node the new huge page allocate from
+ *
  * Basic scheme is simple, details are more complex:
  *  - allocate and lock a new huge page;
  *  - scan page cache replacing old pages with the new one
-- 
2.29.GIT

[PATCH] mm/truncate: add parameter explanation for invalidate_mapping_pagevec

2020-11-17 Thread Alex Shi

To fix a kernel-doc markups issue:
mm/truncate.c:646: warning: Function parameter or member 'mapping' not
described in 'invalidate_mapping_pagevec'
mm/truncate.c:646: warning: Function parameter or member 'start' not
described in 'invalidate_mapping_pagevec'
mm/truncate.c:646: warning: Function parameter or member 'end' not
described in 'invalidate_mapping_pagevec'
mm/truncate.c:646: warning: Function parameter or member 'nr_pagevec'
not described in 'invalidate_mapping_pagevec'

Signed-off-by: Alex Shi 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/truncate.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/mm/truncate.c b/mm/truncate.c
index 960edf5803ca..c196fad0bb5d 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -637,9 +637,15 @@ unsigned long invalidate_mapping_pages(struct 
address_space *mapping,
 EXPORT_SYMBOL(invalidate_mapping_pages);
 
 /**
- * This helper is similar with the above one, except that it accounts for pages
- * that are likely on a pagevec and count them in @nr_pagevec, which will used 
by
- * the caller.
+ * invalidate_mapping_pagevec - Invalidate all the unlocked pages of one inode
+ * @mapping: the address_space which holds the pages to invalidate
+ * @start: the offset 'from' which to invalidate
+ * @end: the offset 'to' which to invalidate (inclusive)
+ * @nr_pagevec: invalidate failed page number for caller
+ *
+ * This helper is similar with invalidate_mapping_pages, except that it 
accounts
+ * for pages that failed to invalidate on a pagevec and count them in
+ * @nr_pagevec, which will used by the caller.
  */
 void invalidate_mapping_pagevec(struct address_space *mapping,
pgoff_t start, pgoff_t end, unsigned long *nr_pagevec)
-- 
2.29.GIT

[PATCH] mm/mapping_dirty_helpers: enhance the kernel-doc markups

2020-11-17 Thread Alex Shi

Add and update the parameter explanations for wp_pte and clean_record_pte
to avoid W=1 warnings:
mm/mapping_dirty_helpers.c:34: warning: Function parameter or member
'end' not described in 'wp_pte'
mm/mapping_dirty_helpers.c:88: warning: Function parameter or member
'end' not described in 'clean_record_pte'

Signed-off-by: Alex Shi 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/mapping_dirty_helpers.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/mapping_dirty_helpers.c b/mm/mapping_dirty_helpers.c
index 2c7d03675903..b59054ef2e10 100644
--- a/mm/mapping_dirty_helpers.c
+++ b/mm/mapping_dirty_helpers.c
@@ -23,7 +23,8 @@ struct wp_walk {
 /**
  * wp_pte - Write-protect a pte
  * @pte: Pointer to the pte
- * @addr: The virtual page address
+ * @addr: The start of protecting virtual address
+ * @end: The end of protecting virtual address
  * @walk: pagetable walk callback argument
  *
  * The function write-protects a pte and records the range in
@@ -74,7 +75,8 @@ struct clean_walk {
  * clean_record_pte - Clean a pte and record its address space offset in a
  * bitmap
  * @pte: Pointer to the pte
- * @addr: The virtual page address
+ * @addr: The start of virtual address to be clean
+ * @end: The end of virtual address to be clean
  * @walk: pagetable walk callback argument
  *
  * The function cleans a pte and records the range in
-- 
2.29.GIT

[PATCH] mm/vmalloc: add 'align' parameter explanation for pvm_determine_end_from_reverse

2020-11-17 Thread Alex Shi

Kernel-doc markup has an issue with pvm_determine_end_from_reverse:
mm/vmalloc.c:3145: warning: Function parameter or member 'align' not
described in 'pvm_determine_end_from_reverse'
Add an explanation for it to remove the warning.

Signed-off-by: Alex Shi 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/vmalloc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 7e903524e002..4cd7652ce626 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3137,6 +3137,7 @@ pvm_find_va_enclose_addr(unsigned long addr)
  * @va:
  *   in - the VA we start the search(reverse order);
  *   out - the VA with the highest aligned end address.
+ * @align: alignment for required highest address
  *
  * Returns: determined end address within vmap_area
  */
-- 
2.29.GIT

[PATCH] khugepaged: add couples parameter explanation for kernel-doc markup

2020-11-16 Thread Alex Shi

Add the missing parameter explanations to fix some kernel-doc warnings:
mm/khugepaged.c:102: warning: Function parameter or member
'nr_pte_mapped_thp' not described in 'mm_slot'
mm/khugepaged.c:102: warning: Function parameter or member
'pte_mapped_thp' not described in 'mm_slot'
mm/khugepaged.c:1424: warning: Function parameter or member 'mm' not
described in 'collapse_pte_mapped_thp'
mm/khugepaged.c:1424: warning: Function parameter or member 'addr' not
described in 'collapse_pte_mapped_thp'
mm/khugepaged.c:1626: warning: Function parameter or member 'mm' not
described in 'collapse_file'
mm/khugepaged.c:1626: warning: Function parameter or member 'file' not
described in 'collapse_file'
mm/khugepaged.c:1626: warning: Function parameter or member 'start' not
described in 'collapse_file'
mm/khugepaged.c:1626: warning: Function parameter or member 'hpage' not
described in 'collapse_file'
mm/khugepaged.c:1626: warning: Function parameter or member 'node' not
described in 'collapse_file'

Signed-off-by: Alex Shi 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/khugepaged.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 757292532767..8f7aee4efdc3 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -90,6 +90,8 @@ static struct kmem_cache *mm_slot_cache __read_mostly;
  * @hash: hash collision list
  * @mm_node: khugepaged scan list headed in khugepaged_scan.mm_head
  * @mm: the mm that this information is valid for
+ * @nr_pte_mapped_thp: number of pte mapped THP
+ * @pte_mapped_thp: address array corresponding pte mapped THP
  */
 struct mm_slot {
struct hlist_node hash;
@@ -1414,7 +1416,11 @@ static int khugepaged_add_pte_mapped_thp(struct 
mm_struct *mm,
 }
 
 /**
- * Try to collapse a pte-mapped THP for mm at address haddr.
+ * collapse_pte_mapped_thp - Try to collapse a pte-mapped THP for mm at
+ * address haddr.
+ *
+ * @mm: process address space where collapse happens
+ * @addr: THP collapse address
  *
  * This function checks whether all the PTEs in the PMD are pointing to the
  * right THP. If so, retract the page table so the THP can refault in with
@@ -1605,6 +1611,12 @@ static void retract_page_tables(struct address_space 
*mapping, pgoff_t pgoff)
 /**
  * collapse_file - collapse filemap/tmpfs/shmem pages into huge one.
  *
+ * @mm: process address space where collapse happens
+ * @file: file that collapse on
+ * @start: collapse start address 
+ * @hpage: new allocated huge page for collapse
+ * @node: appointed node the new huge page allocate from
+ *
  * Basic scheme is simple, details are more complex:
  *  - allocate and lock a new huge page;
  *  - scan page cache replacing old pages with the new one
-- 
2.29.GIT

[PATCH] mm: add colon to fix kernel-doc markups error for check_pte

2020-11-16 Thread Alex Shi

The function check_pte needs a correct colon for kernel-doc markup,
otherwise, gcc has the following warning for W=1,
mm/page_vma_mapped.c:86: warning: Function parameter or member 'pvmw'
not described in 'check_pte'

Signed-off-by: Alex Shi 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/page_vma_mapped.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 5e77b269c330..86e3a3688d59 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -66,18 +66,19 @@ static inline bool pfn_is_match(struct page *page, unsigned 
long pfn)
 
 /**
  * check_pte - check if @pvmw->page is mapped at the @pvmw->pte
+ * @pvmw: page_vma_mapped_walk struct, includes a pair pte and page for 
checking
  *
  * page_vma_mapped_walk() found a place where @pvmw->page is *potentially*
  * mapped. check_pte() has to validate this.
  *
- * @pvmw->pte may point to empty PTE, swap PTE or PTE pointing to arbitrary
- * page.
+ * pvmw->pte may point to empty PTE, swap PTE or PTE pointing to
+ * arbitrary page.
  *
  * If PVMW_MIGRATION flag is set, returns true if @pvmw->pte contains migration
  * entry that points to @pvmw->page or any subpage in case of THP.
  *
- * If PVMW_MIGRATION flag is not set, returns true if @pvmw->pte points to
- * @pvmw->page or any subpage in case of THP.
+ * If PVMW_MIGRATION flag is not set, returns true if pvmw->pte points to
+ * pvmw->page or any subpage in case of THP.
  *
  * Otherwise, return false.
  *
-- 
2.29.GIT

[PATCH] docs/vm: remove unused 3 items explanation for /proc/vmstat

2020-11-16 Thread Alex Shi

Commit 5647bc293ab1 ("mm: compaction: Move migration fail/success
stats to migrate.c") removed 3 items from /proc/vmstat, but the docs
still have their explanations. Let's remove them.

"compact_blocks_moved",
"compact_pages_moved",
"compact_pagemigrate_failed",

Signed-off-by: Alex Shi 
Cc: Jonathan Corbet  
Cc: Andrew Morton  
Cc: Yang Shi  
Cc: "Kirill A. Shutemov"  
Cc: David Rientjes  
Cc: Zi Yan  
Cc: linux-...@vger.kernel.org 
Cc: linux-kernel@vger.kernel.org 
---
 Documentation/admin-guide/mm/transhuge.rst | 15 ---
 1 file changed, 15 deletions(-)

diff --git a/Documentation/admin-guide/mm/transhuge.rst 
b/Documentation/admin-guide/mm/transhuge.rst
index b2acd0d395ca..3b8a336511a4 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -401,21 +401,6 @@ compact_fail
is incremented if the system tries to compact memory
but failed.
 
-compact_pages_moved
-   is incremented each time a page is moved. If
-   this value is increasing rapidly, it implies that the system
-   is copying a lot of data to satisfy the huge page allocation.
-   It is possible that the cost of copying exceeds any savings
-   from reduced TLB misses.
-
-compact_pagemigrate_failed
-   is incremented when the underlying mechanism
-   for moving a page failed.
-
-compact_blocks_moved
-   is incremented each time memory compaction examines
-   a huge page aligned range of pages.
-
 It is possible to establish how long the stalls were using the function
 tracer to record how long was spent in __alloc_pages_nodemask and
 using the mm_page_alloc tracepoint to identify which allocations were
-- 
2.29.GIT

Re: [PATCH doc] doc: zh_CN: add tmpfs to index tree

2020-11-16 Thread Alex Shi

Reviewed-by: Alex Shi 

On 2020/11/16 2:47 PM, Wang Qing wrote:
> Add tmpfs to the index tree while adding the tmpfs translation.
> 
> Signed-off-by: Wang Qing 
> ---
>  Documentation/translations/zh_CN/filesystems/index.rst | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/Documentation/translations/zh_CN/filesystems/index.rst 
> b/Documentation/translations/zh_CN/filesystems/index.rst
> index 186501d..9f2a8b0
> --- a/Documentation/translations/zh_CN/filesystems/index.rst
> +++ b/Documentation/translations/zh_CN/filesystems/index.rst
> @@ -25,4 +25,5 @@ Linux Kernel中的文件系统
>  
> virtiofs
> debugfs
> +   tmpfs
>  
>

Re: [PATCH v21 00/19] per memcg lru lock

2020-11-15 Thread Alex Shi

Hi Andrew,

With all patches acked by Hugh and Johannes, and fully tested by LKP,
is this patchset ready for more testing on linux-next? Or does anything still
need to be improved?

Thanks
Alex


On 2020/11/5 4:55 PM, Alex Shi wrote:
> This version is rebased on next/master 20201104, with many of Johannes's
> Acks and some changes according to Johannes's comments. It also adds a new
> patch, v21-0006-mm-rmap-stop-store-reordering-issue-on-page-mapp.patch, to
> support v21-0007.
> 
> This patchset followed 2 memcg VM_WARN_ON_ONCE_PAGE patches which were
> added to -mm tree yesterday.
>  
> Many thanks for line by line review by Hugh Dickins, Alexander Duyck and
> Johannes Weiner.
> 
> So now this patchset includes 3 parts:
> 1, some code cleanup and minimal optimization as preparation.
> 2, use TestClearPageLRU as the precondition for page isolation.
> 3, replace the per-node lru_lock with a per-memcg, per-node lru_lock.
> 
> Currently there is one lru_lock per node, pgdat->lru_lock, which guards the
> lru lists, but the lru lists were moved into memcg a long time ago. Still
> using a per-node lru_lock is clearly unscalable: pages in every memcg have
> to compete with each other for the one lru_lock. This patchset uses a per
> lruvec/memcg lru_lock to replace the per-node lru lock guarding the lru
> lists, making it scalable for memcgs and gaining performance.
> 
> Currently lru_lock still guards both the lru list and the page's lru bit;
> that's ok. But if we want to use a specific lruvec lock on the page, we need
> to pin down the page's lruvec/memcg during locking. Just taking the lruvec
> lock first may be undermined by the page's memcg charge/migration. To fix
> this problem, we can take out the clearing of the page's lru bit and use it
> as the pin-down action that blocks memcg changes. That's the reason for the
> new atomic function TestClearPageLRU. So now isolating a page needs both
> actions: TestClearPageLRU and holding the lru_lock.
> 
> The typical usage of this is isolate_migratepages_block() in compaction.c:
> we have to take the lru bit before the lru lock, which serializes page
> isolation against memcg page charge/migration, which would change the page's
> lruvec and hence the lru_lock it uses.
> 
> The above solution was suggested by Johannes Weiner and, based on his new
> memcg charge path, led to this patchset. (Hugh Dickins tested and contributed
> much code, from the compaction fix to general code polish; thanks a lot!)
> 
> Daniel Jordan's testing showed a 62% improvement on the modified readtwice
> case on his 2P * 10 core * 2 HT Broadwell box with v18, which is not much
> different from this v20.
> https://lore.kernel.org/lkml/20200915165807.kpp7uhiw7l3lo...@ca-dmjordan1.us.oracle.com/
> 
> Thanks to Hugh Dickins and Konstantin Khlebnikov, who both brought up this
> idea 8 years ago, and to the others who gave comments as well: Daniel Jordan,
> Mel Gorman, Shakeel Butt, Matthew Wilcox, Alexander Duyck etc.
> 
> Thanks for the testing support from Intel 0day and Rong Chen, Fengguang Wu,
> and Yun Wang. Hugh Dickins also shared his kbuild-swap case. Thanks!

Re: [PATCH 1/6] time: fix kernel-doc markup

2020-11-15 Thread Alex Shi




On 2020/11/16 6:48 AM, Thomas Gleixner wrote:
> On Fri, Nov 13 2020 at 15:24, Alex Shi wrote:
> 
>> The kernel-doc interface error cause some warning:
> 
> I fixed the lot up and applied it. Please look at the changes I did and
> be more careful next time.
> 

Hi Thomas,

Thanks a lot for all the fixes and the kind coaching! I have learned a lot here.

Thanks
Alex

[tip: timers/core] timekeeping: Add missing parameter docs for pvclock_gtod_[un]register_notifier()

2020-11-15 Thread tip-bot2 for Alex Shi

The following commit has been merged into the timers/core branch of tip:

Commit-ID: f27f7c3f100e74a7f451a63a15788f50c52f7cce
Gitweb:
https://git.kernel.org/tip/f27f7c3f100e74a7f451a63a15788f50c52f7cce
Author:Alex Shi 
AuthorDate:Fri, 13 Nov 2020 15:24:32 +08:00
Committer: Thomas Gleixner 
CommitterDate: Sun, 15 Nov 2020 23:47:24 +01:00

timekeeping: Add missing parameter docs for pvclock_gtod_[un]register_notifier()

The kernel-doc parser complains about:
 kernel/time/timekeeping.c:651: warning: Function parameter or member
 'nb' not described in 'pvclock_gtod_register_notifier'
 kernel/time/timekeeping.c:670: warning: Function parameter or member
 'nb' not described in 'pvclock_gtod_unregister_notifier'

Add the missing parameter explanations.

[ tglx: Massaged changelog ]

Signed-off-by: Alex Shi 
Signed-off-by: Thomas Gleixner 
Link: 
https://lore.kernel.org/r/1605252275-63652-3-git-send-email-alex@linux.alibaba.com

---
 kernel/time/timekeeping.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index ab4b831..9c93923 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -663,6 +663,7 @@ static void update_pvclock_gtod(struct timekeeper *tk, bool 
was_set)
 
 /**
  * pvclock_gtod_register_notifier - register a pvclock timedata update listener
+ * @nb: Pointer to the notifier block to register
  */
 int pvclock_gtod_register_notifier(struct notifier_block *nb)
 {
@@ -682,6 +683,7 @@ EXPORT_SYMBOL_GPL(pvclock_gtod_register_notifier);
 /**
  * pvclock_gtod_unregister_notifier - unregister a pvclock
  * timedata update listener
+ * @nb: Pointer to the notifier block to unregister
  */
 int pvclock_gtod_unregister_notifier(struct notifier_block *nb)
 {

[tip: timers/core] timekeeping: Add missing parameter documentation for update_fast_timekeeper()

2020-11-15 Thread tip-bot2 for Alex Shi

The following commit has been merged into the timers/core branch of tip:

Commit-ID: e025b03113d27139ce2b28b82599018e4d8fa5f6
Gitweb:
https://git.kernel.org/tip/e025b03113d27139ce2b28b82599018e4d8fa5f6
Author:Alex Shi 
AuthorDate:Fri, 13 Nov 2020 15:24:31 +08:00
Committer: Thomas Gleixner 
CommitterDate: Sun, 15 Nov 2020 23:47:24 +01:00

timekeeping: Add missing parameter documentation for update_fast_timekeeper()

Address the following warning:

 kernel/time/timekeeping.c:415: warning: Function parameter or member
 'tkf' not described in 'update_fast_timekeeper'

[ tglx: Remove the bogus ktime_get_mono_fast_ns() part ]

Signed-off-by: Alex Shi 
Signed-off-by: Thomas Gleixner 
Link: 
https://lore.kernel.org/r/1605252275-63652-2-git-send-email-alex@linux.alibaba.com

---
 kernel/time/timekeeping.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 570fc50..a823703 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -407,6 +407,7 @@ static inline u64 timekeeping_cycles_to_ns(const struct 
tk_read_base *tkr, u64 c
 /**
  * update_fast_timekeeper - Update the fast and NMI safe monotonic timekeeper.
  * @tkr: Timekeeping readout base from which we take the update
+ * @tkf: Pointer to NMI safe timekeeper
  *
  * We want to use this from any context including NMI and tracing /
  * instrumenting the timekeeping code itself.

[tip: timers/core] time: Add missing colons for parameter documentation of time64_to_tm()

2020-11-15 Thread tip-bot2 for Alex Shi

The following commit has been merged into the timers/core branch of tip:

Commit-ID: a0f5a65fa5faeef708d022698d5fcba290a35856
Gitweb:
https://git.kernel.org/tip/a0f5a65fa5faeef708d022698d5fcba290a35856
Author:Alex Shi 
AuthorDate:Fri, 13 Nov 2020 15:24:30 +08:00
Committer: Thomas Gleixner 
CommitterDate: Sun, 15 Nov 2020 23:47:23 +01:00

time: Add missing colons for parameter documentation of time64_to_tm()

Address these kernel-doc warnings:

 kernel/time/timeconv.c:79: warning: Function parameter or member
 'totalsecs' not described in 'time64_to_tm'
 kernel/time/timeconv.c:79: warning: Function parameter or member
 'offset' not described in 'time64_to_tm'
 kernel/time/timeconv.c:79: warning: Function parameter or member
 'result' not described in 'time64_to_tm'

The parameters are described but lack colons after the parameter name.

[ tglx: Massaged changelog ]

Signed-off-by: Alex Shi 
Signed-off-by: Thomas Gleixner 
Link: 
https://lore.kernel.org/r/1605252275-63652-1-git-send-email-alex@linux.alibaba.com

---
 kernel/time/timeconv.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/time/timeconv.c b/kernel/time/timeconv.c
index 589e0a5..62e3b46 100644
--- a/kernel/time/timeconv.c
+++ b/kernel/time/timeconv.c
@@ -70,10 +70,10 @@ static const unsigned short __mon_yday[2][13] = {
 /**
  * time64_to_tm - converts the calendar time to local broken-down time
  *
- * @totalsecs  the number of seconds elapsed since 00:00:00 on January 1, 1970,
+ * @totalsecs: the number of seconds elapsed since 00:00:00 on January 1, 1970,
  * Coordinated Universal Time (UTC).
- * @offset offset seconds adding to totalsecs.
- * @result pointer to struct tm variable to receive broken-down time
+ * @offset:offset seconds adding to totalsecs.
+ * @result:pointer to struct tm variable to receive broken-down time
  */
 void time64_to_tm(time64_t totalsecs, int offset, struct tm *result)
 {
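
Note: the fix above comes down to the "@name:" form that the kernel-doc
parser expects for each parameter. A minimal sketch of a comment in that
form follows; demo_split_secs() and struct demo_tm are hypothetical names
made up for this illustration and are not part of kernel/time/timeconv.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type, used only for this illustration. */
struct demo_tm {
	long days;
	int secs;
};

/**
 * demo_split_secs - split a second count into whole days and a remainder
 * @totalsecs: number of seconds elapsed since the epoch
 * @offset: seconds to add to @totalsecs; the sum must not be negative
 * @result: pointer to the struct demo_tm that receives the split value
 *
 * Every parameter gets its own "@name:" line; leaving out the colon is
 * what produced the "not described" warnings quoted in the changelog.
 */
void demo_split_secs(long long totalsecs, int offset, struct demo_tm *result)
{
	long long secs = totalsecs + offset;

	assert(result != NULL);
	assert(secs >= 0);

	result->days = (long)(secs / 86400);
	result->secs = (int)(secs % 86400);
}
```

In a kernel tree, a comment like this can be checked by running
scripts/kernel-doc with -none on the file, or by building with W=1.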

[tip: timers/core] timekeeping: Remove static functions from kernel-doc markup

2020-11-15 Thread tip-bot2 for Alex Shi

The following commit has been merged into the timers/core branch of tip:

Commit-ID: 199d280c884de44c3b0daeb77438db43f6db01a2
Gitweb:
https://git.kernel.org/tip/199d280c884de44c3b0daeb77438db43f6db01a2
Author:Alex Shi 
AuthorDate:Fri, 13 Nov 2020 15:24:33 +08:00
Committer: Thomas Gleixner 
CommitterDate: Sun, 15 Nov 2020 23:47:23 +01:00

timekeeping: Remove static functions from kernel-doc markup

Various static functions in the timekeeping code have function comments
which pretend to be kernel-doc, but are incomplete and trigger parser
warnings.

As these functions are local to the timekeeping core code there is no need
to expose them via kernel-doc.

Remove the double star kernel-doc marker and remove excess newlines.

[ tglx: Massaged changelog and removed excess newlines ]

Signed-off-by: Alex Shi 
Signed-off-by: Thomas Gleixner 
Link: 
https://lore.kernel.org/r/1605252275-63652-4-git-send-email-alex@linux.alibaba.com

---
 kernel/time/timekeeping.c | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 6858a31..570fc50 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1415,9 +1415,8 @@ void timekeeping_warp_clock(void)
}
 }
 
-/**
+/*
  * __timekeeping_set_tai_offset - Sets the TAI offset from UTC and monotonic
- *
  */
 static void __timekeeping_set_tai_offset(struct timekeeper *tk, s32 tai_offset)
 {
@@ -1425,7 +1424,7 @@ static void __timekeeping_set_tai_offset(struct 
timekeeper *tk, s32 tai_offset)
tk->offs_tai = ktime_add(tk->offs_real, ktime_set(tai_offset, 0));
 }
 
-/**
+/*
  * change_clocksource - Swaps clocksources if a new one is available
  *
  * Accumulates current time interval and initializes new clocksource
@@ -2023,13 +2022,12 @@ static void timekeeping_adjust(struct timekeeper *tk, 
s64 offset)
}
 }
 
-/**
+/*
  * accumulate_nsecs_to_secs - Accumulates nsecs into secs
  *
  * Helper function that accumulates the nsecs greater than a second
  * from the xtime_nsec field to the xtime_secs field.
  * It also calls into the NTP code to handle leapsecond processing.
- *
  */
 static inline unsigned int accumulate_nsecs_to_secs(struct timekeeper *tk)
 {
@@ -2071,7 +2069,7 @@ static inline unsigned int 
accumulate_nsecs_to_secs(struct timekeeper *tk)
return clock_set;
 }
 
-/**
+/*
  * logarithmic_accumulation - shifted accumulation of cycles
  *
  * This functions accumulates a shifted interval of cycles into
@@ -2314,7 +2312,7 @@ ktime_t ktime_get_update_offsets_now(unsigned int 
*cwsseq, ktime_t *offs_real,
return base;
 }
 
-/**
+/*
  * timekeeping_validate_timex - Ensures the timex is ok for use in do_adjtimex
  */
 static int timekeeping_validate_timex(const struct __kernel_timex *txc)
