Re: [PATCH 4.4 15/47] ubi: fastmap: Correctly handle interrupted erasures in EBA
Ben, Am Samstag, 28. Juli 2018, 03:28:58 CEST schrieb Ben Hutchings: > > > Isn't there a risk here, that a read error leads to erasing data that > > > might be recoverable if the read is retried? > > > > Well, read error means that already something went very wrong. At other > > places > > in UBI wo also don't retry reading headers and consider it as fatal when we > > are unable to read it. > > We could also read the EC header, but what do we gain from that? > > If the VID header is not readable we cannot check fastmap either. > > > > What case exactly do you have in mind? > > I suppose I'm thinking about data recovery from a flash device that has > become unreliable. I'm not familiar with ubi; is there a read-only > mode where the erase wouldn't actually be performed? We have a ready only mode, UBI enters this mode upon fatal errors. But I don't see what this has to do with the case this patch addresses. Here we need to recover from interrupted erases when fastmap is used. With fastmap the mapping still looks valid to the upper layer but due to a bad erase we might face later ECC errors and therefore we have to check whether it is still valid upon first access. If data is present, mapping is ok. If not (no data or ECC error), unmap it to make the state sync with the upper layer. Thanks, //richard
Re: [PATCH 4.4 15/47] ubi: fastmap: Correctly handle interrupted erasures in EBA
Ben, Am Samstag, 28. Juli 2018, 03:28:58 CEST schrieb Ben Hutchings: > > > Isn't there a risk here, that a read error leads to erasing data that > > > might be recoverable if the read is retried? > > > > Well, read error means that already something went very wrong. At other > > places > > in UBI wo also don't retry reading headers and consider it as fatal when we > > are unable to read it. > > We could also read the EC header, but what do we gain from that? > > If the VID header is not readable we cannot check fastmap either. > > > > What case exactly do you have in mind? > > I suppose I'm thinking about data recovery from a flash device that has > become unreliable. I'm not familiar with ubi; is there a read-only > mode where the erase wouldn't actually be performed? We have a ready only mode, UBI enters this mode upon fatal errors. But I don't see what this has to do with the case this patch addresses. Here we need to recover from interrupted erases when fastmap is used. With fastmap the mapping still looks valid to the upper layer but due to a bad erase we might face later ECC errors and therefore we have to check whether it is still valid upon first access. If data is present, mapping is ok. If not (no data or ECC error), unmap it to make the state sync with the upper layer. Thanks, //richard
Re: [PATCH 4/4] cpufreq: intel_pstate: enable boost for Skylake Xeon
Srinivas Pandruvada writes: > Enable HWP boost on Skylake server and workstations. > Please revert this series, it led to significant energy usage and graphics performance regressions [1]. The reasons are roughly the ones we discussed by e-mail off-list last April: This causes the intel_pstate driver to decrease the EPP to zero when the workload blocks on IO frequently enough, which for the regressing benchmarks detailed in [1] is a symptom of the workload being heavily IO-bound, which means they won't benefit at all from the EPP boost since they aren't significantly CPU-bound, and they will suffer a decrease in parallelism due to the active CPU core using a larger fraction of the TDP in order to achieve the same work, causing the GPU to have a lower power budget available, leading to a decrease in system performance. You may want to give a shot to my previous suggestion of using [2] in order to detect whether the system is IO-bound, which you can use as an indicator that the optimization implemented in this series cannot possibly improve performance and can be expected to hurt energy usage. Thanks. [1] https://bugs.freedesktop.org/show_bug.cgi?id=107410 [2] https://patchwork.kernel.org/patch/10312259/ > Reported-by: Mel Gorman > Tested-by: Giovanni Gherdovich > Signed-off-by: Srinivas Pandruvada > --- > drivers/cpufreq/intel_pstate.c | 10 ++ > 1 file changed, 10 insertions(+) > > diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c > index 70bf63bb4e0e..01c8da1f99db 100644 > --- a/drivers/cpufreq/intel_pstate.c > +++ b/drivers/cpufreq/intel_pstate.c > @@ -1794,6 +1794,12 @@ static const struct x86_cpu_id > intel_pstate_cpu_ee_disable_ids[] = { > {} > }; > > +static const struct x86_cpu_id intel_pstate_hwp_boost_ids[] __initconst = { > + ICPU(INTEL_FAM6_SKYLAKE_X, core_funcs), > + ICPU(INTEL_FAM6_SKYLAKE_DESKTOP, core_funcs), > + {} > +}; > + > static int intel_pstate_init_cpu(unsigned int cpunum) > { > struct cpudata *cpu; > @@ -1824,6 +1830,10 @@ static int intel_pstate_init_cpu(unsigned int cpunum) > intel_pstate_disable_ee(cpunum); > > intel_pstate_hwp_enable(cpu); > + > + id = x86_match_cpu(intel_pstate_hwp_boost_ids); > + if (id) > + hwp_boost = true; > } > > intel_pstate_get_cpu_pstates(cpu); > -- > 2.13.6 signature.asc Description: PGP signature
Re: [PATCH 4/4] cpufreq: intel_pstate: enable boost for Skylake Xeon
Srinivas Pandruvada writes: > Enable HWP boost on Skylake server and workstations. > Please revert this series, it led to significant energy usage and graphics performance regressions [1]. The reasons are roughly the ones we discussed by e-mail off-list last April: This causes the intel_pstate driver to decrease the EPP to zero when the workload blocks on IO frequently enough, which for the regressing benchmarks detailed in [1] is a symptom of the workload being heavily IO-bound, which means they won't benefit at all from the EPP boost since they aren't significantly CPU-bound, and they will suffer a decrease in parallelism due to the active CPU core using a larger fraction of the TDP in order to achieve the same work, causing the GPU to have a lower power budget available, leading to a decrease in system performance. You may want to give a shot to my previous suggestion of using [2] in order to detect whether the system is IO-bound, which you can use as an indicator that the optimization implemented in this series cannot possibly improve performance and can be expected to hurt energy usage. Thanks. [1] https://bugs.freedesktop.org/show_bug.cgi?id=107410 [2] https://patchwork.kernel.org/patch/10312259/ > Reported-by: Mel Gorman > Tested-by: Giovanni Gherdovich > Signed-off-by: Srinivas Pandruvada > --- > drivers/cpufreq/intel_pstate.c | 10 ++ > 1 file changed, 10 insertions(+) > > diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c > index 70bf63bb4e0e..01c8da1f99db 100644 > --- a/drivers/cpufreq/intel_pstate.c > +++ b/drivers/cpufreq/intel_pstate.c > @@ -1794,6 +1794,12 @@ static const struct x86_cpu_id > intel_pstate_cpu_ee_disable_ids[] = { > {} > }; > > +static const struct x86_cpu_id intel_pstate_hwp_boost_ids[] __initconst = { > + ICPU(INTEL_FAM6_SKYLAKE_X, core_funcs), > + ICPU(INTEL_FAM6_SKYLAKE_DESKTOP, core_funcs), > + {} > +}; > + > static int intel_pstate_init_cpu(unsigned int cpunum) > { > struct cpudata *cpu; > @@ -1824,6 +1830,10 @@ static int intel_pstate_init_cpu(unsigned int cpunum) > intel_pstate_disable_ee(cpunum); > > intel_pstate_hwp_enable(cpu); > + > + id = x86_match_cpu(intel_pstate_hwp_boost_ids); > + if (id) > + hwp_boost = true; > } > > intel_pstate_get_cpu_pstates(cpu); > -- > 2.13.6 signature.asc Description: PGP signature
Re: [PATCH] scripts: Add Python 3 support to tracing/draw_functrace.py
2018-07-26 0:01 GMT+09:00 Jeremy Cline : > On 07/25/2018 10:39 AM, Masahiro Yamada wrote: >> 2018-07-21 4:35 GMT+09:00 Jeremy Cline : >>> Use the print function. This maintains Python 2 support and should have >>> no functional change. >>> >>> Signed-off-by: Jeremy Cline >>> --- >>> scripts/tracing/draw_functrace.py | 3 ++- >>> 1 file changed, 2 insertions(+), 1 deletion(-) >>> >>> diff --git a/scripts/tracing/draw_functrace.py >>> b/scripts/tracing/draw_functrace.py >>> index db40fa04cd51..7d44e796d362 100755 >>> --- a/scripts/tracing/draw_functrace.py >>> +++ b/scripts/tracing/draw_functrace.py >>> @@ -20,6 +20,7 @@ Usage: >>> $ scripts/draw_functrace.py < raw_trace_func > draw_functrace >>> Then you have your drawn trace in draw_functrace >>> """ >>> +from __future__ import print_function >> >> What do you need this line for? >> >> I have not tested this, >> but I guess print(CallTree.ROOT) will work for Python 2. > > Although "print(CallTree.ROOT)" (as a statement) works in Python 2, > its behavior is different than print (as a function) in Python 3. In > this case, there's no additional arguments being provided so the > behavior will match, but if someone added an argument it would work > differently on Python 2 vs Python 3: > > Python 2.7.15 print("hello", "world") > ('hello', 'world') > > Python 3.6.6 print("hello, "world") > File "", line 1 > print("hello, "world") >^ > SyntaxError: invalid syntax Yes, I know this. > Importing the print_function works on Python 2.6+[0] and changes print > to be a function in Python 2 so it'll behave the same in 2 and 3. Given > that this script doesn't appear to change much it's probably not going > to save anyone from making that mistake, though. Would you prefer a > patch without it? Either will do. If it is tedious to respin, I will pick this up. I saw only one simple print statement in this script, so I wanted to ask you if this had some reason I might be missing. Anyway, we will remove 'from __future__ import print_function' when Python 2 retires. > [0] https://docs.python.org/3/library/__future__.html > > Regards, > Jeremy > >> >> >> >>> >>> import sys, re >>> @@ -123,7 +124,7 @@ def main(): >>> tree = tree.getParent(caller) >>> tree = tree.calls(callee, calltime) >>> >>> - print CallTree.ROOT >>> + print(CallTree.ROOT) >>> >>> if __name__ == "__main__": >>> main() >>> -- >>> 2.17.1 >>> >> >> >> > -- > To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Best Regards Masahiro Yamada
Re: [PATCH] scripts: Add Python 3 support to tracing/draw_functrace.py
2018-07-26 0:01 GMT+09:00 Jeremy Cline : > On 07/25/2018 10:39 AM, Masahiro Yamada wrote: >> 2018-07-21 4:35 GMT+09:00 Jeremy Cline : >>> Use the print function. This maintains Python 2 support and should have >>> no functional change. >>> >>> Signed-off-by: Jeremy Cline >>> --- >>> scripts/tracing/draw_functrace.py | 3 ++- >>> 1 file changed, 2 insertions(+), 1 deletion(-) >>> >>> diff --git a/scripts/tracing/draw_functrace.py >>> b/scripts/tracing/draw_functrace.py >>> index db40fa04cd51..7d44e796d362 100755 >>> --- a/scripts/tracing/draw_functrace.py >>> +++ b/scripts/tracing/draw_functrace.py >>> @@ -20,6 +20,7 @@ Usage: >>> $ scripts/draw_functrace.py < raw_trace_func > draw_functrace >>> Then you have your drawn trace in draw_functrace >>> """ >>> +from __future__ import print_function >> >> What do you need this line for? >> >> I have not tested this, >> but I guess print(CallTree.ROOT) will work for Python 2. > > Although "print(CallTree.ROOT)" (as a statement) works in Python 2, > its behavior is different than print (as a function) in Python 3. In > this case, there's no additional arguments being provided so the > behavior will match, but if someone added an argument it would work > differently on Python 2 vs Python 3: > > Python 2.7.15 print("hello", "world") > ('hello', 'world') > > Python 3.6.6 print("hello, "world") > File "", line 1 > print("hello, "world") >^ > SyntaxError: invalid syntax Yes, I know this. > Importing the print_function works on Python 2.6+[0] and changes print > to be a function in Python 2 so it'll behave the same in 2 and 3. Given > that this script doesn't appear to change much it's probably not going > to save anyone from making that mistake, though. Would you prefer a > patch without it? Either will do. If it is tedious to respin, I will pick this up. I saw only one simple print statement in this script, so I wanted to ask you if this had some reason I might be missing. Anyway, we will remove 'from __future__ import print_function' when Python 2 retires. > [0] https://docs.python.org/3/library/__future__.html > > Regards, > Jeremy > >> >> >> >>> >>> import sys, re >>> @@ -123,7 +124,7 @@ def main(): >>> tree = tree.getParent(caller) >>> tree = tree.calls(callee, calltime) >>> >>> - print CallTree.ROOT >>> + print(CallTree.ROOT) >>> >>> if __name__ == "__main__": >>> main() >>> -- >>> 2.17.1 >>> >> >> >> > -- > To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Best Regards Masahiro Yamada
Re: [PATCH 4.17 00/66] 4.17.11-stable review
On Fri, Jul 27, 2018 at 10:31:48AM -0700, Guenter Roeck wrote: > On Fri, Jul 27, 2018 at 11:44:53AM +0200, Greg Kroah-Hartman wrote: > > This is the start of the stable review cycle for the 4.17.11 release. > > There are 66 patches in this series, all will be posted as a response > > to this one. If anyone has any issues with these being applied, please > > let me know. > > > > Responses should be made by Sun Jul 29 09:37:38 UTC 2018. > > Anything received after that time might be too late. > > > > Build results: > total: 134 pass: 134 fail: 0 > Qemu test results: > total: 177 pass: 177 fail: 0 > > Details are available at http://kerneltests.org/builders/. Thanks for testing all of these and letting me know. greg k-h
Re: [PATCH 4.17 00/66] 4.17.11-stable review
On Fri, Jul 27, 2018 at 10:31:48AM -0700, Guenter Roeck wrote: > On Fri, Jul 27, 2018 at 11:44:53AM +0200, Greg Kroah-Hartman wrote: > > This is the start of the stable review cycle for the 4.17.11 release. > > There are 66 patches in this series, all will be posted as a response > > to this one. If anyone has any issues with these being applied, please > > let me know. > > > > Responses should be made by Sun Jul 29 09:37:38 UTC 2018. > > Anything received after that time might be too late. > > > > Build results: > total: 134 pass: 134 fail: 0 > Qemu test results: > total: 177 pass: 177 fail: 0 > > Details are available at http://kerneltests.org/builders/. Thanks for testing all of these and letting me know. greg k-h
Re: [PATCH 4.17 00/66] 4.17.11-stable review
On Fri, Jul 27, 2018 at 01:49:42PM -0600, Shuah Khan wrote: > On 07/27/2018 03:44 AM, Greg Kroah-Hartman wrote: > > This is the start of the stable review cycle for the 4.17.11 release. > > There are 66 patches in this series, all will be posted as a response > > to this one. If anyone has any issues with these being applied, please > > let me know. > > > > Responses should be made by Sun Jul 29 09:37:38 UTC 2018. > > Anything received after that time might be too late. > > > > The whole patch series can be found in one patch at: > > > > https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.17.11-rc1.gz > > or in the git tree and branch at: > > > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git > > linux-4.17.y > > and the diffstat can be found below. > > > > thanks, > > > > greg k-h > > > > Missed a few releases while I was away :) Welcome back! > Compiled and booted on my test system. No dmesg regressions. Thanks for testing all of these and letting me know. greg k-h
Re: [PATCH 4.17 00/66] 4.17.11-stable review
On Fri, Jul 27, 2018 at 01:49:42PM -0600, Shuah Khan wrote: > On 07/27/2018 03:44 AM, Greg Kroah-Hartman wrote: > > This is the start of the stable review cycle for the 4.17.11 release. > > There are 66 patches in this series, all will be posted as a response > > to this one. If anyone has any issues with these being applied, please > > let me know. > > > > Responses should be made by Sun Jul 29 09:37:38 UTC 2018. > > Anything received after that time might be too late. > > > > The whole patch series can be found in one patch at: > > > > https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.17.11-rc1.gz > > or in the git tree and branch at: > > > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git > > linux-4.17.y > > and the diffstat can be found below. > > > > thanks, > > > > greg k-h > > > > Missed a few releases while I was away :) Welcome back! > Compiled and booted on my test system. No dmesg regressions. Thanks for testing all of these and letting me know. greg k-h
[PATCH 2/5] staging: gasket: apex: fixup undefined PCI class
From: Todd Poynor Apex chips with class 0 (PCI_CLASS_NOT_DEFINED) fixed up to PCI_CLASS_SYSTEM_OTHER to enable PCI resource assignments. Signed-off-by: Todd Poynor --- drivers/staging/gasket/apex_driver.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/staging/gasket/apex_driver.c b/drivers/staging/gasket/apex_driver.c index 73fc2683a834d..ab466d49608a7 100644 --- a/drivers/staging/gasket/apex_driver.c +++ b/drivers/staging/gasket/apex_driver.c @@ -739,3 +739,10 @@ static ssize_t sysfs_show( gasket_sysfs_put_device_data(device, gasket_dev); return ret; } + +static void apex_pci_fixup_class(struct pci_dev *pdev) +{ + pdev->class = (PCI_CLASS_SYSTEM_OTHER << 8) | pdev->class; +} +DECLARE_PCI_FIXUP_CLASS_HEADER(APEX_PCI_VENDOR_ID, APEX_PCI_DEVICE_ID, + PCI_CLASS_NOT_DEFINED, 8, apex_pci_fixup_class); -- 2.18.0.345.g5c9ce644c3-goog
[PATCH 3/5] staging: gasket: sysfs: remove unnecessary NULL check on device ptr
From: Todd Poynor The device pointer passed into get_mapping() will never be NULL; the check is unnecessary. Reported-by: Greg Kroah-Hartman Signed-off-by: Todd Poynor --- drivers/staging/gasket/gasket_sysfs.c | 5 - 1 file changed, 5 deletions(-) diff --git a/drivers/staging/gasket/gasket_sysfs.c b/drivers/staging/gasket/gasket_sysfs.c index 2d8647de697cd..da972ce0e0db0 100644 --- a/drivers/staging/gasket/gasket_sysfs.c +++ b/drivers/staging/gasket/gasket_sysfs.c @@ -67,11 +67,6 @@ static struct gasket_sysfs_mapping *get_mapping(struct device *device) { int i; - if (!device) { - pr_debug("%s: Received NULL device\n", __func__); - return NULL; - } - for (i = 0; i < GASKET_SYSFS_NUM_MAPPINGS; i++) { mutex_lock(_mappings[i].mutex); if (dev_mappings[i].device == device) { -- 2.18.0.345.g5c9ce644c3-goog
[PATCH 1/5] staging: gasket: sysfs: remove check for refcount already zero
From: Todd Poynor Remove the check for refcount already zero, which shouldn't be necessary. Signed-off-by: Todd Poynor --- drivers/staging/gasket/gasket_sysfs.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/staging/gasket/gasket_sysfs.c b/drivers/staging/gasket/gasket_sysfs.c index 418b81797e638..2d8647de697cd 100644 --- a/drivers/staging/gasket/gasket_sysfs.c +++ b/drivers/staging/gasket/gasket_sysfs.c @@ -111,8 +111,6 @@ static void put_mapping(struct gasket_sysfs_mapping *mapping) } mutex_lock(>mutex); - if (refcount_read(>refcount.refcount) == 0) - dev_err(mapping->device, "Refcount is already 0\n"); if (kref_put(>refcount, release_entry)) { dev_dbg(mapping->device, "Removing Gasket sysfs mapping\n"); /* -- 2.18.0.345.g5c9ce644c3-goog
[PATCH 0/5] staging: gasket: fixes and cleanups
From: Todd Poynor The fun continues with gasket+apex: remove dead code and unnecessary stuff, fixup apex PCI class for devices that advertise class 0 (undefined), and make sure the struct device doesn't go away on us. Most of these from review comments of previous patch series. Todd Poynor (5): staging: gasket: sysfs: remove check for refcount already zero staging: gasket: apex: fixup undefined PCI class staging: gasket: sysfs: remove unnecessary NULL check on device ptr staging: gasket: page table: remove code for "no dma_ops" staging: gasket: core: hold reference on device kobj while in use drivers/staging/gasket/apex_driver.c | 7 +++ drivers/staging/gasket/gasket_core.c | 5 +- drivers/staging/gasket/gasket_page_table.c | 58 +++--- drivers/staging/gasket/gasket_page_table.h | 3 +- drivers/staging/gasket/gasket_sysfs.c | 7 --- 5 files changed, 17 insertions(+), 63 deletions(-) -- 2.18.0.345.g5c9ce644c3-goog
[PATCH 3/5] staging: gasket: sysfs: remove unnecessary NULL check on device ptr
From: Todd Poynor The device pointer passed into get_mapping() will never be NULL; the check is unnecessary. Reported-by: Greg Kroah-Hartman Signed-off-by: Todd Poynor --- drivers/staging/gasket/gasket_sysfs.c | 5 - 1 file changed, 5 deletions(-) diff --git a/drivers/staging/gasket/gasket_sysfs.c b/drivers/staging/gasket/gasket_sysfs.c index 2d8647de697cd..da972ce0e0db0 100644 --- a/drivers/staging/gasket/gasket_sysfs.c +++ b/drivers/staging/gasket/gasket_sysfs.c @@ -67,11 +67,6 @@ static struct gasket_sysfs_mapping *get_mapping(struct device *device) { int i; - if (!device) { - pr_debug("%s: Received NULL device\n", __func__); - return NULL; - } - for (i = 0; i < GASKET_SYSFS_NUM_MAPPINGS; i++) { mutex_lock(_mappings[i].mutex); if (dev_mappings[i].device == device) { -- 2.18.0.345.g5c9ce644c3-goog
[PATCH 1/5] staging: gasket: sysfs: remove check for refcount already zero
From: Todd Poynor Remove the check for refcount already zero, which shouldn't be necessary. Signed-off-by: Todd Poynor --- drivers/staging/gasket/gasket_sysfs.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/staging/gasket/gasket_sysfs.c b/drivers/staging/gasket/gasket_sysfs.c index 418b81797e638..2d8647de697cd 100644 --- a/drivers/staging/gasket/gasket_sysfs.c +++ b/drivers/staging/gasket/gasket_sysfs.c @@ -111,8 +111,6 @@ static void put_mapping(struct gasket_sysfs_mapping *mapping) } mutex_lock(>mutex); - if (refcount_read(>refcount.refcount) == 0) - dev_err(mapping->device, "Refcount is already 0\n"); if (kref_put(>refcount, release_entry)) { dev_dbg(mapping->device, "Removing Gasket sysfs mapping\n"); /* -- 2.18.0.345.g5c9ce644c3-goog
[PATCH 0/5] staging: gasket: fixes and cleanups
From: Todd Poynor The fun continues with gasket+apex: remove dead code and unnecessary stuff, fixup apex PCI class for devices that advertise class 0 (undefined), and make sure the struct device doesn't go away on us. Most of these from review comments of previous patch series. Todd Poynor (5): staging: gasket: sysfs: remove check for refcount already zero staging: gasket: apex: fixup undefined PCI class staging: gasket: sysfs: remove unnecessary NULL check on device ptr staging: gasket: page table: remove code for "no dma_ops" staging: gasket: core: hold reference on device kobj while in use drivers/staging/gasket/apex_driver.c | 7 +++ drivers/staging/gasket/gasket_core.c | 5 +- drivers/staging/gasket/gasket_page_table.c | 58 +++--- drivers/staging/gasket/gasket_page_table.h | 3 +- drivers/staging/gasket/gasket_sysfs.c | 7 --- 5 files changed, 17 insertions(+), 63 deletions(-) -- 2.18.0.345.g5c9ce644c3-goog
[PATCH 2/5] staging: gasket: apex: fixup undefined PCI class
From: Todd Poynor Apex chips with class 0 (PCI_CLASS_NOT_DEFINED) fixed up to PCI_CLASS_SYSTEM_OTHER to enable PCI resource assignments. Signed-off-by: Todd Poynor --- drivers/staging/gasket/apex_driver.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/staging/gasket/apex_driver.c b/drivers/staging/gasket/apex_driver.c index 73fc2683a834d..ab466d49608a7 100644 --- a/drivers/staging/gasket/apex_driver.c +++ b/drivers/staging/gasket/apex_driver.c @@ -739,3 +739,10 @@ static ssize_t sysfs_show( gasket_sysfs_put_device_data(device, gasket_dev); return ret; } + +static void apex_pci_fixup_class(struct pci_dev *pdev) +{ + pdev->class = (PCI_CLASS_SYSTEM_OTHER << 8) | pdev->class; +} +DECLARE_PCI_FIXUP_CLASS_HEADER(APEX_PCI_VENDOR_ID, APEX_PCI_DEVICE_ID, + PCI_CLASS_NOT_DEFINED, 8, apex_pci_fixup_class); -- 2.18.0.345.g5c9ce644c3-goog
[PATCH 4/5] staging: gasket: page table: remove code for "no dma_ops"
From: Todd Poynor Remove code with TODOs on it for working around apparent problems previously seen in a qemu environment where dma_ops was not set correctly. There is no user of this in the current code. Signed-off-by: Todd Poynor --- drivers/staging/gasket/gasket_core.c | 2 +- drivers/staging/gasket/gasket_page_table.c | 58 +++--- drivers/staging/gasket/gasket_page_table.h | 3 +- 3 files changed, 8 insertions(+), 55 deletions(-) diff --git a/drivers/staging/gasket/gasket_core.c b/drivers/staging/gasket/gasket_core.c index f44805c38159b..859a6df9e12df 100644 --- a/drivers/staging/gasket/gasket_core.c +++ b/drivers/staging/gasket/gasket_core.c @@ -916,7 +916,7 @@ static int gasket_enable_dev( _dev->bar_data[ driver_desc->page_table_bar_index], _desc->page_table_configs[tbl_idx], - gasket_dev->dev, gasket_dev->pci_dev, true); + gasket_dev->dev, gasket_dev->pci_dev); if (ret) { dev_err(gasket_dev->dev, "Couldn't init page table %d: %d\n", diff --git a/drivers/staging/gasket/gasket_page_table.c b/drivers/staging/gasket/gasket_page_table.c index 32f1c1e10c7e2..722839603f20d 100644 --- a/drivers/staging/gasket/gasket_page_table.c +++ b/drivers/staging/gasket/gasket_page_table.c @@ -212,12 +212,6 @@ struct gasket_page_table { * gasket_mmap function, so user_virt belongs in the driver anyhow. */ struct gasket_coherent_page_entry *coherent_pages; - - /* -* Whether the page table uses arch specific dma_ops or -* whether the driver is supplying its own. -*/ - bool dma_ops; }; /* Mapping declarations */ @@ -290,7 +284,7 @@ int gasket_page_table_init( struct gasket_page_table **ppg_tbl, const struct gasket_bar_data *bar_data, const struct gasket_page_table_config *page_table_config, - struct device *device, struct pci_dev *pci_dev, bool has_dma_ops) + struct device *device, struct pci_dev *pci_dev) { ulong bytes; struct gasket_page_table *pg_tbl; @@ -353,7 +347,6 @@ int gasket_page_table_init( bar_data->virt_base[page_table_config->extended_reg]); pg_tbl->device = device; pg_tbl->pci_dev = pci_dev; - pg_tbl->dma_ops = has_dma_ops; dev_dbg(device, "Page table initialized successfully\n"); @@ -759,33 +752,6 @@ static int gasket_map_extended_pages( return 0; } -/* - * TODO: dma_map_page() is not plugged properly when running under qemu. i.e. - * dma_ops are not set properly, which causes the kernel to assert. - * - * This temporary hack allows the driver to work on qemu, but need to be fixed: - * - either manually set the dma_ops for the architecture (which incidentally - * can't be done in an out-of-tree module) - or get qemu to fill the device tree - * properly so as linux plug the proper dma_ops or so as the driver can detect - * that it is runnig on qemu - */ -static inline dma_addr_t _no_op_dma_map_page( - struct device *dev, struct page *page, size_t offset, size_t size, - enum dma_data_direction dir) -{ - /* -* struct dma_map_ops *ops = get_dma_ops(dev); -* dma_addr_t addr; -* -* kmemcheck_mark_initialized(page_address(page) + offset, size); -* BUG_ON(!valid_dma_direction(dir)); -* addr = ops->map_page(dev, page, offset, size, dir, NULL); -* debug_dma_map_page(dev, page, offset, size, dir, addr, false); -*/ - - return page_to_phys(page); -} - /* * Get and map last level page table buffers. * @pg_tbl: Gasket page table pointer. @@ -856,16 +822,9 @@ static int gasket_perform_mapping( ptes[i].offset = offset; /* Map the page into DMA space. */ - if (pg_tbl->dma_ops) { - /* hook in to kernel map functions */ - ptes[i].dma_addr = dma_map_page(pg_tbl->device, - page, 0, PAGE_SIZE, DMA_BIDIRECTIONAL); - } else { - ptes[i].dma_addr = _no_op_dma_map_page( - pg_tbl->device, page, 0, PAGE_SIZE, - DMA_BIDIRECTIONAL); - } - + ptes[i].dma_addr = + dma_map_page(pg_tbl->device, page, 0, PAGE_SIZE, +DMA_BIDIRECTIONAL); dev_dbg(pg_tbl->device, "%s i %d pte %p pfn %p -> mapped %llx\n", __func__, i, [i], @@ -1042,13 +1001,8 @@ static int gasket_alloc_extended_subtable( } /* Map the page into DMA space. */ - if
[PATCH 5/5] staging: gasket: core: hold reference on device kobj while in use
From: Todd Poynor Hold a reference on the struct device kobject while a pointer to that device is in use by gasket. Reported-by: Greg Kroah-Hartman Signed-off-by: Todd Poynor --- drivers/staging/gasket/gasket_core.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/staging/gasket/gasket_core.c b/drivers/staging/gasket/gasket_core.c index 859a6df9e12df..1dc7273e38734 100644 --- a/drivers/staging/gasket/gasket_core.c +++ b/drivers/staging/gasket/gasket_core.c @@ -449,6 +449,7 @@ static int gasket_alloc_dev( gasket_dev->dev_idx = dev_idx; snprintf(gasket_dev->kobj_name, GASKET_NAME_MAX, "%s", kobj_name); gasket_dev->dev = parent; + kobject_get(_dev->dev->kobj); /* gasket_bar_data is uninitialized. */ gasket_dev->num_page_tables = driver_desc->num_page_tables; /* max_page_table_size and *page table are uninit'ed */ @@ -487,7 +488,7 @@ static void gasket_free_dev(struct gasket_dev *gasket_dev) mutex_lock(_desc->mutex); internal_desc->devs[gasket_dev->dev_idx] = NULL; mutex_unlock(_desc->mutex); - + kobject_put(_dev->dev->kobj); kfree(gasket_dev); } -- 2.18.0.345.g5c9ce644c3-goog
[PATCH 4/5] staging: gasket: page table: remove code for "no dma_ops"
From: Todd Poynor Remove code with TODOs on it for working around apparent problems previously seen in a qemu environment where dma_ops was not set correctly. There is no user of this in the current code. Signed-off-by: Todd Poynor --- drivers/staging/gasket/gasket_core.c | 2 +- drivers/staging/gasket/gasket_page_table.c | 58 +++--- drivers/staging/gasket/gasket_page_table.h | 3 +- 3 files changed, 8 insertions(+), 55 deletions(-) diff --git a/drivers/staging/gasket/gasket_core.c b/drivers/staging/gasket/gasket_core.c index f44805c38159b..859a6df9e12df 100644 --- a/drivers/staging/gasket/gasket_core.c +++ b/drivers/staging/gasket/gasket_core.c @@ -916,7 +916,7 @@ static int gasket_enable_dev( _dev->bar_data[ driver_desc->page_table_bar_index], _desc->page_table_configs[tbl_idx], - gasket_dev->dev, gasket_dev->pci_dev, true); + gasket_dev->dev, gasket_dev->pci_dev); if (ret) { dev_err(gasket_dev->dev, "Couldn't init page table %d: %d\n", diff --git a/drivers/staging/gasket/gasket_page_table.c b/drivers/staging/gasket/gasket_page_table.c index 32f1c1e10c7e2..722839603f20d 100644 --- a/drivers/staging/gasket/gasket_page_table.c +++ b/drivers/staging/gasket/gasket_page_table.c @@ -212,12 +212,6 @@ struct gasket_page_table { * gasket_mmap function, so user_virt belongs in the driver anyhow. */ struct gasket_coherent_page_entry *coherent_pages; - - /* -* Whether the page table uses arch specific dma_ops or -* whether the driver is supplying its own. -*/ - bool dma_ops; }; /* Mapping declarations */ @@ -290,7 +284,7 @@ int gasket_page_table_init( struct gasket_page_table **ppg_tbl, const struct gasket_bar_data *bar_data, const struct gasket_page_table_config *page_table_config, - struct device *device, struct pci_dev *pci_dev, bool has_dma_ops) + struct device *device, struct pci_dev *pci_dev) { ulong bytes; struct gasket_page_table *pg_tbl; @@ -353,7 +347,6 @@ int gasket_page_table_init( bar_data->virt_base[page_table_config->extended_reg]); pg_tbl->device = device; pg_tbl->pci_dev = pci_dev; - pg_tbl->dma_ops = has_dma_ops; dev_dbg(device, "Page table initialized successfully\n"); @@ -759,33 +752,6 @@ static int gasket_map_extended_pages( return 0; } -/* - * TODO: dma_map_page() is not plugged properly when running under qemu. i.e. - * dma_ops are not set properly, which causes the kernel to assert. - * - * This temporary hack allows the driver to work on qemu, but need to be fixed: - * - either manually set the dma_ops for the architecture (which incidentally - * can't be done in an out-of-tree module) - or get qemu to fill the device tree - * properly so as linux plug the proper dma_ops or so as the driver can detect - * that it is runnig on qemu - */ -static inline dma_addr_t _no_op_dma_map_page( - struct device *dev, struct page *page, size_t offset, size_t size, - enum dma_data_direction dir) -{ - /* -* struct dma_map_ops *ops = get_dma_ops(dev); -* dma_addr_t addr; -* -* kmemcheck_mark_initialized(page_address(page) + offset, size); -* BUG_ON(!valid_dma_direction(dir)); -* addr = ops->map_page(dev, page, offset, size, dir, NULL); -* debug_dma_map_page(dev, page, offset, size, dir, addr, false); -*/ - - return page_to_phys(page); -} - /* * Get and map last level page table buffers. * @pg_tbl: Gasket page table pointer. @@ -856,16 +822,9 @@ static int gasket_perform_mapping( ptes[i].offset = offset; /* Map the page into DMA space. */ - if (pg_tbl->dma_ops) { - /* hook in to kernel map functions */ - ptes[i].dma_addr = dma_map_page(pg_tbl->device, - page, 0, PAGE_SIZE, DMA_BIDIRECTIONAL); - } else { - ptes[i].dma_addr = _no_op_dma_map_page( - pg_tbl->device, page, 0, PAGE_SIZE, - DMA_BIDIRECTIONAL); - } - + ptes[i].dma_addr = + dma_map_page(pg_tbl->device, page, 0, PAGE_SIZE, +DMA_BIDIRECTIONAL); dev_dbg(pg_tbl->device, "%s i %d pte %p pfn %p -> mapped %llx\n", __func__, i, [i], @@ -1042,13 +1001,8 @@ static int gasket_alloc_extended_subtable( } /* Map the page into DMA space. */ - if
[PATCH 5/5] staging: gasket: core: hold reference on device kobj while in use
From: Todd Poynor Hold a reference on the struct device kobject while a pointer to that device is in use by gasket. Reported-by: Greg Kroah-Hartman Signed-off-by: Todd Poynor --- drivers/staging/gasket/gasket_core.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/staging/gasket/gasket_core.c b/drivers/staging/gasket/gasket_core.c index 859a6df9e12df..1dc7273e38734 100644 --- a/drivers/staging/gasket/gasket_core.c +++ b/drivers/staging/gasket/gasket_core.c @@ -449,6 +449,7 @@ static int gasket_alloc_dev( gasket_dev->dev_idx = dev_idx; snprintf(gasket_dev->kobj_name, GASKET_NAME_MAX, "%s", kobj_name); gasket_dev->dev = parent; + kobject_get(_dev->dev->kobj); /* gasket_bar_data is uninitialized. */ gasket_dev->num_page_tables = driver_desc->num_page_tables; /* max_page_table_size and *page table are uninit'ed */ @@ -487,7 +488,7 @@ static void gasket_free_dev(struct gasket_dev *gasket_dev) mutex_lock(_desc->mutex); internal_desc->devs[gasket_dev->dev_idx] = NULL; mutex_unlock(_desc->mutex); - + kobject_put(_dev->dev->kobj); kfree(gasket_dev); } -- 2.18.0.345.g5c9ce644c3-goog
Compliment of the day to you Dear Friend.
Compliment of the day to you Dear Friend. Dear Friend. I am Mrs. Amina Kadi. am sending this brief letter to solicit your partnership to transfer $5.5 million US Dollars. I shall send you more information and procedures when I receive positive response from you. Mrs. Amina Kadi
Compliment of the day to you Dear Friend.
Compliment of the day to you Dear Friend. Dear Friend. I am Mrs. Amina Kadi. am sending this brief letter to solicit your partnership to transfer $5.5 million US Dollars. I shall send you more information and procedures when I receive positive response from you. Mrs. Amina Kadi
Re: [alsa-devel] [PATCH] ASoC: soc-pcm: Use delay set in pointer function
On 7/27/2018 8:39 PM, Pierre-Louis Bossart wrote: > On 7/27/18 5:13 AM, Akshu Agrawal wrote: >> There are cases where a pointer function populates >> runtime->delay, such as: >> ./sound/pci/hda/hda_controller.c >> ./sound/soc/intel/atom/sst-mfld-platform-pcm.c >> >> Also, in some cases cpu dai used is generic and the pcm >> driver needs to set delay. >> >> This delay was getting lost and was overwritten by delays >> from codec or cpu dai delay function if exposed. > > Humm, yes the runtime->delay set in the .pointer function would be lost > without this change, but the delay would still be provided in the > followup call to .delay. > With your change, the same delay will be accounted for twice? > It will not be accounted twice because no driver which is setting runtime->delay is defining .delay op for cpu_dai. Vice versa is also true, the drivers which define .delay for cpu_dai don't set runtime->delay. And I think this is expected from drivers else it would be a bug from their side. .delay for codec_dai anyway is different and has to be accounted for. Thanks, Akshu >> >> Signed-off-by: Akshu Agrawal >> --- >> sound/soc/soc-pcm.c | 5 - >> 1 file changed, 4 insertions(+), 1 deletion(-) >> >> diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c >> index 98be04b..b1a2bc2 100644 >> --- a/sound/soc/soc-pcm.c >> +++ b/sound/soc/soc-pcm.c >> @@ -1179,6 +1179,9 @@ static snd_pcm_uframes_t soc_pcm_pointer(struct >> snd_pcm_substream *substream) >> snd_pcm_sframes_t codec_delay = 0; >> int i; >> >> +/* clearing the previous delay */ >> +runtime->delay = 0; >> + >> for_each_rtdcom(rtd, rtdcom) { >> component = rtdcom->component; >> >> @@ -1203,7 +1206,7 @@ static snd_pcm_uframes_t soc_pcm_pointer(struct >> snd_pcm_substream *substream) >> } >> delay += codec_delay; >> >> -runtime->delay = delay; >> +runtime->delay += delay; >> >> return offset; >> } >> >
Re: [alsa-devel] [PATCH] ASoC: soc-pcm: Use delay set in pointer function
On 7/27/2018 8:39 PM, Pierre-Louis Bossart wrote: > On 7/27/18 5:13 AM, Akshu Agrawal wrote: >> There are cases where a pointer function populates >> runtime->delay, such as: >> ./sound/pci/hda/hda_controller.c >> ./sound/soc/intel/atom/sst-mfld-platform-pcm.c >> >> Also, in some cases cpu dai used is generic and the pcm >> driver needs to set delay. >> >> This delay was getting lost and was overwritten by delays >> from codec or cpu dai delay function if exposed. > > Humm, yes the runtime->delay set in the .pointer function would be lost > without this change, but the delay would still be provided in the > followup call to .delay. > With your change, the same delay will be accounted for twice? > It will not be accounted twice because no driver which is setting runtime->delay is defining .delay op for cpu_dai. Vice versa is also true, the drivers which define .delay for cpu_dai don't set runtime->delay. And I think this is expected from drivers else it would be a bug from their side. .delay for codec_dai anyway is different and has to be accounted for. Thanks, Akshu >> >> Signed-off-by: Akshu Agrawal >> --- >> sound/soc/soc-pcm.c | 5 - >> 1 file changed, 4 insertions(+), 1 deletion(-) >> >> diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c >> index 98be04b..b1a2bc2 100644 >> --- a/sound/soc/soc-pcm.c >> +++ b/sound/soc/soc-pcm.c >> @@ -1179,6 +1179,9 @@ static snd_pcm_uframes_t soc_pcm_pointer(struct >> snd_pcm_substream *substream) >> snd_pcm_sframes_t codec_delay = 0; >> int i; >> >> +/* clearing the previous delay */ >> +runtime->delay = 0; >> + >> for_each_rtdcom(rtd, rtdcom) { >> component = rtdcom->component; >> >> @@ -1203,7 +1206,7 @@ static snd_pcm_uframes_t soc_pcm_pointer(struct >> snd_pcm_substream *substream) >> } >> delay += codec_delay; >> >> -runtime->delay = delay; >> +runtime->delay += delay; >> >> return offset; >> } >> >
[PATCH URGENT for ftrace/core] lockdep: Fix error due to incorrect pointer return
From: "Joel Fernandes (Google)" The 0-day bot caught an issue which all my tests missed (will add it into my tests) where this_cpu_ptr is incorrectly returning the wrong pointer from get_lock_stats. Fix it. Fixes: f4ac253a8df0 ("lockdep: use this_cpu_ptr instead of get_cpu_var stats") Signed-off-by: Joel Fernandes (Google) --- kernel/locking/lockdep.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c index a6a6b7eb4b82..03bfaeb9f4e6 100644 --- a/kernel/locking/lockdep.c +++ b/kernel/locking/lockdep.c @@ -249,7 +249,7 @@ void clear_lock_stats(struct lock_class *class) static struct lock_class_stats *get_lock_stats(struct lock_class *class) { - return _cpu_ptr(_lock_stats)[class - lock_classes]; + return _cpu_ptr(cpu_lock_stats)[class - lock_classes]; } static void lock_release_holdtime(struct held_lock *hlock) -- 2.18.0.345.g5c9ce644c3-goog
[PATCH URGENT for ftrace/core] lockdep: Fix error due to incorrect pointer return
From: "Joel Fernandes (Google)" The 0-day bot caught an issue which all my tests missed (will add it into my tests) where this_cpu_ptr is incorrectly returning the wrong pointer from get_lock_stats. Fix it. Fixes: f4ac253a8df0 ("lockdep: use this_cpu_ptr instead of get_cpu_var stats") Signed-off-by: Joel Fernandes (Google) --- kernel/locking/lockdep.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c index a6a6b7eb4b82..03bfaeb9f4e6 100644 --- a/kernel/locking/lockdep.c +++ b/kernel/locking/lockdep.c @@ -249,7 +249,7 @@ void clear_lock_stats(struct lock_class *class) static struct lock_class_stats *get_lock_stats(struct lock_class *class) { - return _cpu_ptr(_lock_stats)[class - lock_classes]; + return _cpu_ptr(cpu_lock_stats)[class - lock_classes]; } static void lock_release_holdtime(struct held_lock *hlock) -- 2.18.0.345.g5c9ce644c3-goog
[PATCH] PM / devfreq: Generic cpufreq governor
Many CPU architectures have caches that can scale independent of the CPUs. Frequency scaling of the caches is necessary to make sure the cache is not a performance bottleneck that leads to poor performance and power. The same idea applies for RAM/DDR. To achieve this, this patch adds a generic devfreq governor that can listen to the frequency transitions of each CPU frequency domain and then adjusts the frequency of the cache (or any devfreq device) based on the frequency of the CPUs. To decide the frequency of the device, the governor does one of the following: * Uses a CPU frequency to device frequency mapping table - Either one mapping table used for all CPU freq policies (typically used for system with homogeneous cores/clusters that have the same OPPs. - One mapping table per CPU freq policy (typically used for ASMP systems with heterogeneous CPUs with different OPPs) OR * Scales the device frequency in proportion to the CPU frequency. So, if the CPUs are running at their max frequency, the device runs at its max frequency. If the CPUs are running at their min frequency, the device runs at its min frequency. And interpolated for frequencies in between. Signed-off-by: Saravana Kannan --- .../bindings/devfreq/devfreq-cpufreq.txt | 53 ++ drivers/devfreq/Kconfig| 8 + drivers/devfreq/Makefile | 1 + drivers/devfreq/governor_cpufreq.c | 583 + 4 files changed, 645 insertions(+) create mode 100644 Documentation/devicetree/bindings/devfreq/devfreq-cpufreq.txt create mode 100644 drivers/devfreq/governor_cpufreq.c diff --git a/Documentation/devicetree/bindings/devfreq/devfreq-cpufreq.txt b/Documentation/devicetree/bindings/devfreq/devfreq-cpufreq.txt new file mode 100644 index 000..6537538 --- /dev/null +++ b/Documentation/devicetree/bindings/devfreq/devfreq-cpufreq.txt @@ -0,0 +1,53 @@ +Devfreq CPUfreq governor + +devfreq-cpufreq is a parent device that contains one or more child devices. +Each child device provides CPU frequency to device frequency mapping for a +specific device. Examples of devices that could use this are: DDR, cache and +CCI. + +Parent device name shall be "devfreq-cpufreq". + +Required child device properties: +- cpu-to-dev-map, or cpu-to-dev-map-: + A list of tuples where each tuple consists of a + CPU frequency (KHz) and the corresponding device + frequency. CPU frequencies not listed in the table + will use the device frequency that corresponds to the + next rounded up CPU frequency. + Use "cpu-to-dev-map" if all CPUs in the system should + share same mapping. + Use cpu-to-dev-map- to describe different + mappings for different CPUs. The property should be + listed only for the first CPU if multiple CPUs are + synchronous. +- target-dev: Phandle to device that this mapping applies to. + +Example: + devfreq-cpufreq { + cpubw-cpufreq { + target-dev = <>; + cpu-to-dev-map = + < 30 1144 >, + < 422400 2288 >, + < 652800 3051 >, + < 883200 5996 >, + < 1190400 8056 >, + < 1497600 10101 >, + < 1728000 12145 >, + < 2649600 16250 >; + }; + + cache-cpufreq { + target-dev = <>; + cpu-to-dev-map = + < 30 30 >, + < 422400 422400 >, + < 652800 499200 >, + < 883200 576000 >, + < 96 96 >, + < 1497600 1036800 >, + < 1574400 1574400 >, + < 1728000 1651200 >, + < 2649600 1728000 >; + }; + }; diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig index 8503018..5eeb33f 100644 --- a/drivers/devfreq/Kconfig +++ b/drivers/devfreq/Kconfig @@ -73,6 +73,14 @@ config DEVFREQ_GOV_PASSIVE through sysfs entries. The passive governor recommends that devfreq device uses the OPP table to get the frequency/voltage. +config DEVFREQ_GOV_CPUFREQ + tristate "CPUfreq" + depends on CPU_FREQ + help + Chooses frequency based on the online CPUs' current frequency and a + CPU frequency to device frequency mapping table(s). This governor + can be useful for controlling devices such
[PATCH] PM / devfreq: Generic cpufreq governor
Many CPU architectures have caches that can scale independent of the CPUs. Frequency scaling of the caches is necessary to make sure the cache is not a performance bottleneck that leads to poor performance and power. The same idea applies for RAM/DDR. To achieve this, this patch adds a generic devfreq governor that can listen to the frequency transitions of each CPU frequency domain and then adjusts the frequency of the cache (or any devfreq device) based on the frequency of the CPUs. To decide the frequency of the device, the governor does one of the following: * Uses a CPU frequency to device frequency mapping table - Either one mapping table used for all CPU freq policies (typically used for system with homogeneous cores/clusters that have the same OPPs. - One mapping table per CPU freq policy (typically used for ASMP systems with heterogeneous CPUs with different OPPs) OR * Scales the device frequency in proportion to the CPU frequency. So, if the CPUs are running at their max frequency, the device runs at its max frequency. If the CPUs are running at their min frequency, the device runs at its min frequency. And interpolated for frequencies in between. Signed-off-by: Saravana Kannan --- .../bindings/devfreq/devfreq-cpufreq.txt | 53 ++ drivers/devfreq/Kconfig| 8 + drivers/devfreq/Makefile | 1 + drivers/devfreq/governor_cpufreq.c | 583 + 4 files changed, 645 insertions(+) create mode 100644 Documentation/devicetree/bindings/devfreq/devfreq-cpufreq.txt create mode 100644 drivers/devfreq/governor_cpufreq.c diff --git a/Documentation/devicetree/bindings/devfreq/devfreq-cpufreq.txt b/Documentation/devicetree/bindings/devfreq/devfreq-cpufreq.txt new file mode 100644 index 000..6537538 --- /dev/null +++ b/Documentation/devicetree/bindings/devfreq/devfreq-cpufreq.txt @@ -0,0 +1,53 @@ +Devfreq CPUfreq governor + +devfreq-cpufreq is a parent device that contains one or more child devices. +Each child device provides CPU frequency to device frequency mapping for a +specific device. Examples of devices that could use this are: DDR, cache and +CCI. + +Parent device name shall be "devfreq-cpufreq". + +Required child device properties: +- cpu-to-dev-map, or cpu-to-dev-map-: + A list of tuples where each tuple consists of a + CPU frequency (KHz) and the corresponding device + frequency. CPU frequencies not listed in the table + will use the device frequency that corresponds to the + next rounded up CPU frequency. + Use "cpu-to-dev-map" if all CPUs in the system should + share same mapping. + Use cpu-to-dev-map- to describe different + mappings for different CPUs. The property should be + listed only for the first CPU if multiple CPUs are + synchronous. +- target-dev: Phandle to device that this mapping applies to. + +Example: + devfreq-cpufreq { + cpubw-cpufreq { + target-dev = <>; + cpu-to-dev-map = + < 30 1144 >, + < 422400 2288 >, + < 652800 3051 >, + < 883200 5996 >, + < 1190400 8056 >, + < 1497600 10101 >, + < 1728000 12145 >, + < 2649600 16250 >; + }; + + cache-cpufreq { + target-dev = <>; + cpu-to-dev-map = + < 30 30 >, + < 422400 422400 >, + < 652800 499200 >, + < 883200 576000 >, + < 96 96 >, + < 1497600 1036800 >, + < 1574400 1574400 >, + < 1728000 1651200 >, + < 2649600 1728000 >; + }; + }; diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig index 8503018..5eeb33f 100644 --- a/drivers/devfreq/Kconfig +++ b/drivers/devfreq/Kconfig @@ -73,6 +73,14 @@ config DEVFREQ_GOV_PASSIVE through sysfs entries. The passive governor recommends that devfreq device uses the OPP table to get the frequency/voltage. +config DEVFREQ_GOV_CPUFREQ + tristate "CPUfreq" + depends on CPU_FREQ + help + Chooses frequency based on the online CPUs' current frequency and a + CPU frequency to device frequency mapping table(s). This governor + can be useful for controlling devices such
Re: [GIT PULL] ACPI fix for v4.18-rc7
> ACPI Warning: \_SB.IETM._ART: Return Package type mismatch .. _ART is the AML method for "Active Cooling Relationship Table". iIt tells the system how fans are related to temperature sensors. It is likely that Windows uses DPTF on this platform, and since DPTF is not using _ART, Linux may be pointing out a vestigial _ART. Anyway, if your system doesn't handle being hot very well -- this error message is related to that capability. cheers, -Len
Re: [GIT PULL] ACPI fix for v4.18-rc7
> ACPI Warning: \_SB.IETM._ART: Return Package type mismatch .. _ART is the AML method for "Active Cooling Relationship Table". iIt tells the system how fans are related to temperature sensors. It is likely that Windows uses DPTF on this platform, and since DPTF is not using _ART, Linux may be pointing out a vestigial _ART. Anyway, if your system doesn't handle being hot very well -- this error message is related to that capability. cheers, -Len
Hello from Lisa
Hello Dear, I am Miss Lisa. I have very important thing to discuss with you. please, this information is very vital. Contact me with my private email so we can talk ( lisajoh...@hotmail.com ) I find your email when I was searching for some that will understand me. Lisa
Hello from Lisa
Hello Dear, I am Miss Lisa. I have very important thing to discuss with you. please, this information is very vital. Contact me with my private email so we can talk ( lisajoh...@hotmail.com ) I find your email when I was searching for some that will understand me. Lisa
Re: [PATCH 13/38] tomoyo: Implement security hooks for the new mount API [ver #10]
On 2018/07/28 2:32, David Howells wrote: > Implement the security hook to check the creation of a new mountpoint for > Tomoyo. > > As far as I can tell, Tomoyo doesn't make use of the mount data or parse > any mount options, so I haven't implemented any of the fs_context hooks for > it. > > Signed-off-by: David Howells > cc: Tetsuo Handa > cc: tomoyo-dev...@lists.sourceforge.jp > cc: linux-security-mod...@vger.kernel.org Would you provide examples of each possible combination as a C program? For example, if one mount point from multiple sources with different options are possible, please describe such pattern using syscall so that LSM modules can run it to see whether they are working as expected.
Re: [PATCH 13/38] tomoyo: Implement security hooks for the new mount API [ver #10]
On 2018/07/28 2:32, David Howells wrote: > Implement the security hook to check the creation of a new mountpoint for > Tomoyo. > > As far as I can tell, Tomoyo doesn't make use of the mount data or parse > any mount options, so I haven't implemented any of the fs_context hooks for > it. > > Signed-off-by: David Howells > cc: Tetsuo Handa > cc: tomoyo-dev...@lists.sourceforge.jp > cc: linux-security-mod...@vger.kernel.org Would you provide examples of each possible combination as a C program? For example, if one mount point from multiple sources with different options are possible, please describe such pattern using syscall so that LSM modules can run it to see whether they are working as expected.
Re: [PATCH v2 3/3] selftests/ftrace: Fix kprobe string testcase to not probe notrace function
On Fri, 27 Jul 2018 10:31:45 -0400 Steven Rostedt wrote: > On Fri, 27 Jul 2018 08:26:40 -0600 > Shuah Khan wrote: > > > Masami, > > > > The patch isn't in my Inbox. Could you please send it. > > You don't subscribe to LKML? ;-) > > I just bounced it to you. > > -- Steve Sorry Shuah and Steve, I've fixed a problem in this patch. I've sent v3 patch to you. Thank you, -- Masami Hiramatsu
Re: [PATCH v2 3/3] selftests/ftrace: Fix kprobe string testcase to not probe notrace function
On Fri, 27 Jul 2018 10:31:45 -0400 Steven Rostedt wrote: > On Fri, 27 Jul 2018 08:26:40 -0600 > Shuah Khan wrote: > > > Masami, > > > > The patch isn't in my Inbox. Could you please send it. > > You don't subscribe to LKML? ;-) > > I just bounced it to you. > > -- Steve Sorry Shuah and Steve, I've fixed a problem in this patch. I've sent v3 patch to you. Thank you, -- Masami Hiramatsu
[PATCH v3 3/3] selftests/ftrace: Fix kprobe string testcase to not probe notrace function
Fix kprobe string argument testcase to not probe notrace function. Instead, it probes tracefs function which must be available with ftrace. Signed-off-by: Masami Hiramatsu --- Changes in v3: - Fix probepoint testcase too. --- .../ftrace/test.d/kprobe/kprobe_args_string.tc | 30 .../selftests/ftrace/test.d/kprobe/probepoint.tc |2 + 2 files changed, 13 insertions(+), 19 deletions(-) diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_string.tc b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_string.tc index a0002563e9ee..1ad70cdaf442 100644 --- a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_string.tc +++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_string.tc @@ -9,28 +9,22 @@ echo > kprobe_events case `uname -m` in x86_64) - ARG2=%si - OFFS=8 + ARG1=%di ;; i[3456]86) - ARG2=%cx - OFFS=4 + ARG1=%ax ;; aarch64) - ARG2=%x1 - OFFS=8 + ARG1=%x0 ;; arm*) - ARG2=%r1 - OFFS=4 + ARG1=%r0 ;; ppc64*) - ARG2=%r4 - OFFS=8 + ARG1=%r3 ;; ppc*) - ARG2=%r4 - OFFS=4 + ARG1=%r3 ;; *) echo "Please implement other architecture here" @@ -38,17 +32,17 @@ ppc*) esac : "Test get argument (1)" -echo "p:testprobe create_trace_kprobe arg1=+0(+0(${ARG2})):string" > kprobe_events +echo "p:testprobe tracefs_create_dir arg1=+0(${ARG1}):string" > kprobe_events echo 1 > events/kprobes/testprobe/enable -! echo test >> kprobe_events -tail -n 1 trace | grep -qe "testprobe.* arg1=\"test\"" +echo "p:test _do_fork" >> kprobe_events +grep -qe "testprobe.* arg1=\"test\"" trace echo 0 > events/kprobes/testprobe/enable : "Test get argument (2)" -echo "p:testprobe create_trace_kprobe arg1=+0(+0(${ARG2})):string arg2=+0(+${OFFS}(${ARG2})):string" > kprobe_events +echo "p:testprobe tracefs_create_dir arg1=+0(${ARG1}):string arg2=+0(${ARG1}):string" > kprobe_events echo 1 > events/kprobes/testprobe/enable -! echo test1 test2 >> kprobe_events -tail -n 1 trace | grep -qe "testprobe.* arg1=\"test1\" arg2=\"test2\"" +echo "p:test _do_fork" >> kprobe_events +grep -qe "testprobe.* arg1=\"test\" arg2=\"test\"" trace echo 0 > events/enable echo > kprobe_events diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/probepoint.tc b/tools/testing/selftests/ftrace/test.d/kprobe/probepoint.tc index 4fda01a08da4..519d2763f5d2 100644 --- a/tools/testing/selftests/ftrace/test.d/kprobe/probepoint.tc +++ b/tools/testing/selftests/ftrace/test.d/kprobe/probepoint.tc @@ -4,7 +4,7 @@ [ -f kprobe_events ] || exit_unsupported # this is configurable -TARGET_FUNC=create_trace_kprobe +TARGET_FUNC=tracefs_create_dir dec_addr() { # hexaddr printf "%d" "0x"`echo $1 | tail -c 8`
[PATCH v3 3/3] selftests/ftrace: Fix kprobe string testcase to not probe notrace function
Fix kprobe string argument testcase to not probe notrace function. Instead, it probes tracefs function which must be available with ftrace. Signed-off-by: Masami Hiramatsu --- Changes in v3: - Fix probepoint testcase too. --- .../ftrace/test.d/kprobe/kprobe_args_string.tc | 30 .../selftests/ftrace/test.d/kprobe/probepoint.tc |2 + 2 files changed, 13 insertions(+), 19 deletions(-) diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_string.tc b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_string.tc index a0002563e9ee..1ad70cdaf442 100644 --- a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_string.tc +++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_string.tc @@ -9,28 +9,22 @@ echo > kprobe_events case `uname -m` in x86_64) - ARG2=%si - OFFS=8 + ARG1=%di ;; i[3456]86) - ARG2=%cx - OFFS=4 + ARG1=%ax ;; aarch64) - ARG2=%x1 - OFFS=8 + ARG1=%x0 ;; arm*) - ARG2=%r1 - OFFS=4 + ARG1=%r0 ;; ppc64*) - ARG2=%r4 - OFFS=8 + ARG1=%r3 ;; ppc*) - ARG2=%r4 - OFFS=4 + ARG1=%r3 ;; *) echo "Please implement other architecture here" @@ -38,17 +32,17 @@ ppc*) esac : "Test get argument (1)" -echo "p:testprobe create_trace_kprobe arg1=+0(+0(${ARG2})):string" > kprobe_events +echo "p:testprobe tracefs_create_dir arg1=+0(${ARG1}):string" > kprobe_events echo 1 > events/kprobes/testprobe/enable -! echo test >> kprobe_events -tail -n 1 trace | grep -qe "testprobe.* arg1=\"test\"" +echo "p:test _do_fork" >> kprobe_events +grep -qe "testprobe.* arg1=\"test\"" trace echo 0 > events/kprobes/testprobe/enable : "Test get argument (2)" -echo "p:testprobe create_trace_kprobe arg1=+0(+0(${ARG2})):string arg2=+0(+${OFFS}(${ARG2})):string" > kprobe_events +echo "p:testprobe tracefs_create_dir arg1=+0(${ARG1}):string arg2=+0(${ARG1}):string" > kprobe_events echo 1 > events/kprobes/testprobe/enable -! echo test1 test2 >> kprobe_events -tail -n 1 trace | grep -qe "testprobe.* arg1=\"test1\" arg2=\"test2\"" +echo "p:test _do_fork" >> kprobe_events +grep -qe "testprobe.* arg1=\"test\" arg2=\"test\"" trace echo 0 > events/enable echo > kprobe_events diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/probepoint.tc b/tools/testing/selftests/ftrace/test.d/kprobe/probepoint.tc index 4fda01a08da4..519d2763f5d2 100644 --- a/tools/testing/selftests/ftrace/test.d/kprobe/probepoint.tc +++ b/tools/testing/selftests/ftrace/test.d/kprobe/probepoint.tc @@ -4,7 +4,7 @@ [ -f kprobe_events ] || exit_unsupported # this is configurable -TARGET_FUNC=create_trace_kprobe +TARGET_FUNC=tracefs_create_dir dec_addr() { # hexaddr printf "%d" "0x"`echo $1 | tail -c 8`
[PATCH v3 1/3] tracing: kprobes: Prohibit probing on notrace function
Prohibit kprobe-events probing on notrace function. Since probing on the notrace function can cause recursive event call. In most case those are just skipped, but in some case it falls into infinit recursive call. This protection can be disabled by the kconfig CONFIG_KPROBE_EVENTS_ON_NOTRACE=y, but it is highly recommended to keep it "n" for normal kernel. Signed-off-by: Masami Hiramatsu Tested-by: Francis Deslauriers --- Changes in v2 - Add CONFIG_KPROBE_EVENTS_ON_NOTRACE kconfig for knocking down the protection. Changes in v3 - Fix to check raw-address (no symbol) probe point correctly. --- kernel/trace/Kconfig| 18 + kernel/trace/trace_kprobe.c | 46 +++ 2 files changed, 55 insertions(+), 9 deletions(-) diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig index dcc0166d1997..24d5a58467a3 100644 --- a/kernel/trace/Kconfig +++ b/kernel/trace/Kconfig @@ -456,6 +456,24 @@ config KPROBE_EVENTS This option is also required by perf-probe subcommand of perf tools. If you want to use perf tools, this option is strongly recommended. +config KPROBE_EVENTS_ON_NOTRACE + bool "Do NOT protect notrace function from kprobe events" + depends on KPROBE_EVENTS + default n + help + This is only for the developers who want to debug ftrace itself + using kprobe events. + + Usually, ftrace related functions are protected from kprobe-events + to prevent an infinit recursion or any unexpected execution path + which leads to a kernel crash. + + This option disables such protection and allows you to put kprobe + events on ftrace functions for debugging ftrace by itself. + Note that this might let you shoot yourself in the foot. + + If unsure, say N. + config UPROBE_EVENTS bool "Enable uprobes-based dynamic events" depends on ARCH_SUPPORTS_UPROBES diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c index 27ace4513c43..2bf75af92f16 100644 --- a/kernel/trace/trace_kprobe.c +++ b/kernel/trace/trace_kprobe.c @@ -87,6 +87,21 @@ static nokprobe_inline unsigned long trace_kprobe_nhit(struct trace_kprobe *tk) return nhit; } +static nokprobe_inline +unsigned long trace_kprobe_address(struct trace_kprobe *tk) +{ + unsigned long addr; + + if (tk->symbol) { + addr = (unsigned long) + kallsyms_lookup_name(trace_kprobe_symbol(tk)); + addr += tk->rp.kp.offset; + } else { + addr = (unsigned long)tk->rp.kp.addr; + } + return addr; +} + bool trace_kprobe_on_func_entry(struct trace_event_call *call) { struct trace_kprobe *tk = (struct trace_kprobe *)call->data; @@ -99,16 +114,8 @@ bool trace_kprobe_on_func_entry(struct trace_event_call *call) bool trace_kprobe_error_injectable(struct trace_event_call *call) { struct trace_kprobe *tk = (struct trace_kprobe *)call->data; - unsigned long addr; - if (tk->symbol) { - addr = (unsigned long) - kallsyms_lookup_name(trace_kprobe_symbol(tk)); - addr += tk->rp.kp.offset; - } else { - addr = (unsigned long)tk->rp.kp.addr; - } - return within_error_injection_list(addr); + return within_error_injection_list(trace_kprobe_address(tk)); } static int register_kprobe_event(struct trace_kprobe *tk); @@ -496,6 +503,21 @@ disable_trace_kprobe(struct trace_kprobe *tk, struct trace_event_file *file) return ret; } +#ifdef CONFIG_KPROBE_EVENTS_ON_NOTRACE +#define within_notrace_func(tk)(false) +#else +static bool within_notrace_func(struct trace_kprobe *tk) +{ + unsigned long offset, size, addr; + + addr = trace_kprobe_address(tk); + if (!kallsyms_lookup_size_offset(addr, , )) + return true;/* Out of range. */ + + return !ftrace_location_range(addr - offset, addr - offset + size); +} +#endif + /* Internal register function - just handle k*probes and flags */ static int __register_trace_kprobe(struct trace_kprobe *tk) { @@ -504,6 +526,12 @@ static int __register_trace_kprobe(struct trace_kprobe *tk) if (trace_probe_is_registered(>tp)) return -EINVAL; + if (within_notrace_func(tk)) { + pr_warn("Could not probe notrace function %s\n", + trace_kprobe_symbol(tk)); + return -EINVAL; + } + for (i = 0; i < tk->tp.nr_args; i++) traceprobe_update_arg(>tp.args[i]);
[PATCH v3 1/3] tracing: kprobes: Prohibit probing on notrace function
Prohibit kprobe-events probing on notrace function. Since probing on the notrace function can cause recursive event call. In most case those are just skipped, but in some case it falls into infinit recursive call. This protection can be disabled by the kconfig CONFIG_KPROBE_EVENTS_ON_NOTRACE=y, but it is highly recommended to keep it "n" for normal kernel. Signed-off-by: Masami Hiramatsu Tested-by: Francis Deslauriers --- Changes in v2 - Add CONFIG_KPROBE_EVENTS_ON_NOTRACE kconfig for knocking down the protection. Changes in v3 - Fix to check raw-address (no symbol) probe point correctly. --- kernel/trace/Kconfig| 18 + kernel/trace/trace_kprobe.c | 46 +++ 2 files changed, 55 insertions(+), 9 deletions(-) diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig index dcc0166d1997..24d5a58467a3 100644 --- a/kernel/trace/Kconfig +++ b/kernel/trace/Kconfig @@ -456,6 +456,24 @@ config KPROBE_EVENTS This option is also required by perf-probe subcommand of perf tools. If you want to use perf tools, this option is strongly recommended. +config KPROBE_EVENTS_ON_NOTRACE + bool "Do NOT protect notrace function from kprobe events" + depends on KPROBE_EVENTS + default n + help + This is only for the developers who want to debug ftrace itself + using kprobe events. + + Usually, ftrace related functions are protected from kprobe-events + to prevent an infinit recursion or any unexpected execution path + which leads to a kernel crash. + + This option disables such protection and allows you to put kprobe + events on ftrace functions for debugging ftrace by itself. + Note that this might let you shoot yourself in the foot. + + If unsure, say N. + config UPROBE_EVENTS bool "Enable uprobes-based dynamic events" depends on ARCH_SUPPORTS_UPROBES diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c index 27ace4513c43..2bf75af92f16 100644 --- a/kernel/trace/trace_kprobe.c +++ b/kernel/trace/trace_kprobe.c @@ -87,6 +87,21 @@ static nokprobe_inline unsigned long trace_kprobe_nhit(struct trace_kprobe *tk) return nhit; } +static nokprobe_inline +unsigned long trace_kprobe_address(struct trace_kprobe *tk) +{ + unsigned long addr; + + if (tk->symbol) { + addr = (unsigned long) + kallsyms_lookup_name(trace_kprobe_symbol(tk)); + addr += tk->rp.kp.offset; + } else { + addr = (unsigned long)tk->rp.kp.addr; + } + return addr; +} + bool trace_kprobe_on_func_entry(struct trace_event_call *call) { struct trace_kprobe *tk = (struct trace_kprobe *)call->data; @@ -99,16 +114,8 @@ bool trace_kprobe_on_func_entry(struct trace_event_call *call) bool trace_kprobe_error_injectable(struct trace_event_call *call) { struct trace_kprobe *tk = (struct trace_kprobe *)call->data; - unsigned long addr; - if (tk->symbol) { - addr = (unsigned long) - kallsyms_lookup_name(trace_kprobe_symbol(tk)); - addr += tk->rp.kp.offset; - } else { - addr = (unsigned long)tk->rp.kp.addr; - } - return within_error_injection_list(addr); + return within_error_injection_list(trace_kprobe_address(tk)); } static int register_kprobe_event(struct trace_kprobe *tk); @@ -496,6 +503,21 @@ disable_trace_kprobe(struct trace_kprobe *tk, struct trace_event_file *file) return ret; } +#ifdef CONFIG_KPROBE_EVENTS_ON_NOTRACE +#define within_notrace_func(tk)(false) +#else +static bool within_notrace_func(struct trace_kprobe *tk) +{ + unsigned long offset, size, addr; + + addr = trace_kprobe_address(tk); + if (!kallsyms_lookup_size_offset(addr, , )) + return true;/* Out of range. */ + + return !ftrace_location_range(addr - offset, addr - offset + size); +} +#endif + /* Internal register function - just handle k*probes and flags */ static int __register_trace_kprobe(struct trace_kprobe *tk) { @@ -504,6 +526,12 @@ static int __register_trace_kprobe(struct trace_kprobe *tk) if (trace_probe_is_registered(>tp)) return -EINVAL; + if (within_notrace_func(tk)) { + pr_warn("Could not probe notrace function %s\n", + trace_kprobe_symbol(tk)); + return -EINVAL; + } + for (i = 0; i < tk->tp.nr_args; i++) traceprobe_update_arg(>tp.args[i]);
[PATCH v3 2/3] selftest/ftrace: Move kprobe selftest function to separate compile unit
From: Francis Deslauriers Move selftest function to its own compile unit so it can be compiled with the ftrace cflags (CC_FLAGS_FTRACE) allowing it to be probed during the ftrace startup tests. Signed-off-by: Francis Deslauriers Signed-off-by: Masami Hiramatsu --- kernel/trace/Makefile|5 + kernel/trace/trace_kprobe.c | 12 +--- kernel/trace/trace_kprobe_selftest.c | 10 ++ kernel/trace/trace_kprobe_selftest.h |7 +++ 4 files changed, 23 insertions(+), 11 deletions(-) create mode 100644 kernel/trace/trace_kprobe_selftest.c create mode 100644 kernel/trace/trace_kprobe_selftest.h diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile index e2538c7638d4..e38771eccb2f 100644 --- a/kernel/trace/Makefile +++ b/kernel/trace/Makefile @@ -13,6 +13,11 @@ obj-y += trace_selftest_dynamic.o endif endif +ifdef CONFIG_FTRACE_STARTUP_TEST +CFLAGS_trace_kprobe_selftest.o = $(CC_FLAGS_FTRACE) +obj-$(CONFIG_KPROBE_EVENTS) += trace_kprobe_selftest.o +endif + # If unlikely tracing is enabled, do not trace these files ifdef CONFIG_TRACING_BRANCHES KBUILD_CFLAGS += -DDISABLE_BRANCH_PROFILING diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c index 2bf75af92f16..88b49396ba08 100644 --- a/kernel/trace/trace_kprobe.c +++ b/kernel/trace/trace_kprobe.c @@ -23,6 +23,7 @@ #include #include +#include "trace_kprobe_selftest.h" #include "trace_probe.h" #define KPROBE_EVENT_SYSTEM "kprobes" @@ -1578,17 +1579,6 @@ fs_initcall(init_kprobe_trace); #ifdef CONFIG_FTRACE_STARTUP_TEST -/* - * The "__used" keeps gcc from removing the function symbol - * from the kallsyms table. 'noinline' makes sure that there - * isn't an inlined version used by the test method below - */ -static __used __init noinline int -kprobe_trace_selftest_target(int a1, int a2, int a3, int a4, int a5, int a6) -{ - return a1 + a2 + a3 + a4 + a5 + a6; -} - static __init struct trace_event_file * find_trace_probe_file(struct trace_kprobe *tk, struct trace_array *tr) { diff --git a/kernel/trace/trace_kprobe_selftest.c b/kernel/trace/trace_kprobe_selftest.c new file mode 100644 index ..16548ee4c8c6 --- /dev/null +++ b/kernel/trace/trace_kprobe_selftest.c @@ -0,0 +1,10 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Function used during the kprobe self test. This function is in a separate + * compile unit so it can be compile with CC_FLAGS_FTRACE to ensure that it + * can be probed by the selftests. + */ +int kprobe_trace_selftest_target(int a1, int a2, int a3, int a4, int a5, int a6) +{ + return a1 + a2 + a3 + a4 + a5 + a6; +} diff --git a/kernel/trace/trace_kprobe_selftest.h b/kernel/trace/trace_kprobe_selftest.h new file mode 100644 index ..4e10ec41c013 --- /dev/null +++ b/kernel/trace/trace_kprobe_selftest.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Function used during the kprobe self test. This function is in a separate + * compile unit so it can be compile with CC_FLAGS_FTRACE to ensure that it + * can be probed by the selftests. + */ +int kprobe_trace_selftest_target(int a1, int a2, int a3, int a4, int a5, int a6);
[PATCH v3 2/3] selftest/ftrace: Move kprobe selftest function to separate compile unit
From: Francis Deslauriers Move selftest function to its own compile unit so it can be compiled with the ftrace cflags (CC_FLAGS_FTRACE) allowing it to be probed during the ftrace startup tests. Signed-off-by: Francis Deslauriers Signed-off-by: Masami Hiramatsu --- kernel/trace/Makefile|5 + kernel/trace/trace_kprobe.c | 12 +--- kernel/trace/trace_kprobe_selftest.c | 10 ++ kernel/trace/trace_kprobe_selftest.h |7 +++ 4 files changed, 23 insertions(+), 11 deletions(-) create mode 100644 kernel/trace/trace_kprobe_selftest.c create mode 100644 kernel/trace/trace_kprobe_selftest.h diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile index e2538c7638d4..e38771eccb2f 100644 --- a/kernel/trace/Makefile +++ b/kernel/trace/Makefile @@ -13,6 +13,11 @@ obj-y += trace_selftest_dynamic.o endif endif +ifdef CONFIG_FTRACE_STARTUP_TEST +CFLAGS_trace_kprobe_selftest.o = $(CC_FLAGS_FTRACE) +obj-$(CONFIG_KPROBE_EVENTS) += trace_kprobe_selftest.o +endif + # If unlikely tracing is enabled, do not trace these files ifdef CONFIG_TRACING_BRANCHES KBUILD_CFLAGS += -DDISABLE_BRANCH_PROFILING diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c index 2bf75af92f16..88b49396ba08 100644 --- a/kernel/trace/trace_kprobe.c +++ b/kernel/trace/trace_kprobe.c @@ -23,6 +23,7 @@ #include #include +#include "trace_kprobe_selftest.h" #include "trace_probe.h" #define KPROBE_EVENT_SYSTEM "kprobes" @@ -1578,17 +1579,6 @@ fs_initcall(init_kprobe_trace); #ifdef CONFIG_FTRACE_STARTUP_TEST -/* - * The "__used" keeps gcc from removing the function symbol - * from the kallsyms table. 'noinline' makes sure that there - * isn't an inlined version used by the test method below - */ -static __used __init noinline int -kprobe_trace_selftest_target(int a1, int a2, int a3, int a4, int a5, int a6) -{ - return a1 + a2 + a3 + a4 + a5 + a6; -} - static __init struct trace_event_file * find_trace_probe_file(struct trace_kprobe *tk, struct trace_array *tr) { diff --git a/kernel/trace/trace_kprobe_selftest.c b/kernel/trace/trace_kprobe_selftest.c new file mode 100644 index ..16548ee4c8c6 --- /dev/null +++ b/kernel/trace/trace_kprobe_selftest.c @@ -0,0 +1,10 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Function used during the kprobe self test. This function is in a separate + * compile unit so it can be compile with CC_FLAGS_FTRACE to ensure that it + * can be probed by the selftests. + */ +int kprobe_trace_selftest_target(int a1, int a2, int a3, int a4, int a5, int a6) +{ + return a1 + a2 + a3 + a4 + a5 + a6; +} diff --git a/kernel/trace/trace_kprobe_selftest.h b/kernel/trace/trace_kprobe_selftest.h new file mode 100644 index ..4e10ec41c013 --- /dev/null +++ b/kernel/trace/trace_kprobe_selftest.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Function used during the kprobe self test. This function is in a separate + * compile unit so it can be compile with CC_FLAGS_FTRACE to ensure that it + * can be probed by the selftests. + */ +int kprobe_trace_selftest_target(int a1, int a2, int a3, int a4, int a5, int a6);
[PATCH v3 0/3] tracing: kprobes: Prohibit probing on notrace functions
Hello, This is the 3rd version of the series to prohibit kprobe on notrace functions which Francis sent before. This version fixes some issues on previous version. Fix to handle the no-symbol kprobe-events correctly and Fix probepoint.tc testcase to not use notrace function as a probe point. Thank you, --- Francis Deslauriers (1): selftest/ftrace: Move kprobe selftest function to separate compile unit Masami Hiramatsu (2): tracing: kprobes: Prohibit probing on notrace function selftests/ftrace: Fix kprobe string testcase to not probe notrace function kernel/trace/Kconfig | 18 ++ kernel/trace/Makefile |5 ++ kernel/trace/trace_kprobe.c| 58 +--- kernel/trace/trace_kprobe_selftest.c | 10 +++ kernel/trace/trace_kprobe_selftest.h |7 ++ .../ftrace/test.d/kprobe/kprobe_args_string.tc | 30 -- .../selftests/ftrace/test.d/kprobe/probepoint.tc |2 - 7 files changed, 91 insertions(+), 39 deletions(-) create mode 100644 kernel/trace/trace_kprobe_selftest.c create mode 100644 kernel/trace/trace_kprobe_selftest.h -- Masami Hiramatsu (Linaro)
[PATCH v3 0/3] tracing: kprobes: Prohibit probing on notrace functions
Hello, This is the 3rd version of the series to prohibit kprobe on notrace functions which Francis sent before. This version fixes some issues on previous version. Fix to handle the no-symbol kprobe-events correctly and Fix probepoint.tc testcase to not use notrace function as a probe point. Thank you, --- Francis Deslauriers (1): selftest/ftrace: Move kprobe selftest function to separate compile unit Masami Hiramatsu (2): tracing: kprobes: Prohibit probing on notrace function selftests/ftrace: Fix kprobe string testcase to not probe notrace function kernel/trace/Kconfig | 18 ++ kernel/trace/Makefile |5 ++ kernel/trace/trace_kprobe.c| 58 +--- kernel/trace/trace_kprobe_selftest.c | 10 +++ kernel/trace/trace_kprobe_selftest.h |7 ++ .../ftrace/test.d/kprobe/kprobe_args_string.tc | 30 -- .../selftests/ftrace/test.d/kprobe/probepoint.tc |2 - 7 files changed, 91 insertions(+), 39 deletions(-) create mode 100644 kernel/trace/trace_kprobe_selftest.c create mode 100644 kernel/trace/trace_kprobe_selftest.h -- Masami Hiramatsu (Linaro)
Re: RCU nocb list not reclaiming causing OOM
On Sat, Jul 28, 2018 at 12:07:19AM +, David Chen wrote: > Hi Paul, > > I wasn't talking about the xchg() though. > > The smp_mb__after_atomic() is not for xchg(), it's for `*tail = > rdp->nocb_gp_head;` > it's stated so in the comment. And I do think we need ordering between > `*tail = rdp->nocb_gp_head;` and wake up, because the waiter is checking on > head not tail. > > swait_event_interruptible(rdp->nocb_wq, > > READ_ONCE(rdp->nocb_follower_head)); > > So what I'm saying is that since we need to maintain ordering between `*tail > = rdp->nocb_gp_head;` > and wake up, we need to change the smp_mb__after_atomic() to smp_mb(). Because > smp_mb__after_atomic() wouldn't guarantee > > So this is what I'm proposing. Good eyes! Hmmm... What do I do about this in mainline? Ah, I introduced a ->nocb_lock in mainline to prevent this from happening. In that same commit that you didn't want to use because it is hard to backport. ;-) So yes, but there might well be other misorderings fixed by the hard-to-backport commit that your change below does not cover. Still I do agree that you need full ordering at that point. Thanx, Paul > diff -ru linux-4.9.37.orig/kernel/rcu/tree_plugin.h > linux-4.9.37/kernel/rcu/tree_plugin.h > --- linux-4.9.37.orig/kernel/rcu/tree_plugin.h2017-07-12 > 06:42:41.0 -0700 > +++ linux-4.9.37/kernel/rcu/tree_plugin.h 2018-07-27 16:23:41.349044259 > -0700 > @@ -2076,7 +2076,7 @@ > /* Append callbacks to follower's "done" list. */ > tail = xchg(>nocb_follower_tail, rdp->nocb_gp_tail); > *tail = rdp->nocb_gp_head; > - smp_mb__after_atomic(); /* Store *tail before wakeup. */ > + smp_mb(); /* Store *tail before wakeup. */ > if (rdp != my_rdp && tail == >nocb_follower_head) { > /* >* List was empty, wake up the follower. > > Thanks, > David > > From: Paul E. McKenney > Sent: Friday, July 27, 2018 4:47 PM > To: David Chen > Cc: linux-kernel@vger.kernel.org > Subject: Re: RCU nocb list not reclaiming causing OOM > > > On Fri, Jul 27, 2018 at 11:16:39PM +, David Chen wrote: > > Hi Paul, > > > > Thanks for the advice. > > The bug is kind of hard to hit, so I can't say for certain yet. > > Well, you can always remove the "tail == >nocb_follower_head" as an > extra belt-and-suspenders safety net. I am not putting that in mainline, > but in the privacy of your own copy of the kernel, I don't see any really > serious problem with it. (As long as you aren't going for absolute maximum > performance, but even then there are other more important tuning actions > and code changes you could make.) > > > Though after another look at the code, I found out the > > `smp_mb__after_atomic();` > > seems to be only a compiler barrier on x86. > > Yes, and that is because the locked xchg instruction used on x86 to > implement xchg() already provides full ordering. ;-) > > > tail = xchg(>nocb_follower_tail, rdp->nocb_gp_tail); > > *tail = rdp->nocb_gp_head; > > smp_mb__after_atomic(); /* Store *tail before wakeup. */ > > if (rdp != my_rdp && tail == >nocb_follower_head) { > > swake_up(>nocb_wq); > > > > But that wouldn't be enough right? Because from 6b5fc3a13318, it stated that > > wakeup operation don't guarantee ordering. And when the follower wakes up, > > it checks > > for nocb_follower_head, which is assigned by `*tail = rdp->nocb_gp_head;` > > which doesn't > > have LOCK prefix. So it's possible for follower to wake up and see the list > > is empty and go > > back to sleep. > > Again, xchg() is defined to provide full ordering against all operations > before and after it. Each architecture is required to do whatever is > necessary to implement that full ordering, and x86 need only provide > its "lock xchg" instruction. > > The smp_mb__after_atomic() has effect only after atomic read-modify-write > operations that do not return a value, for example, atomic_inc(). > If you use it after a value-returning atomic read-modify-write operation > like xchg(), all you do is needlessly slow things down on platforms > that provide non-empty smp_mb__after_atomic() definitions. So again, > smp_mb__after_atomic() after xchg() is pointless. > > Please take a look at Documentation/core-api/atomic_ops.rst in the > Linux-kernel source tree for more information. Or get a v4.17 kernel > source tree and check this using the memory model (tools/memory-model > in that version). > > Thanx, Paul > > > Thanks, > > David > > > > From: Paul E. McKenney > > Sent: Friday, July 27, 2018 3:31 PM > > To: David Chen > > Cc: linux-kernel@vger.kernel.org > > Subject: Re: RCU nocb list not
Re: RCU nocb list not reclaiming causing OOM
On Sat, Jul 28, 2018 at 12:07:19AM +, David Chen wrote: > Hi Paul, > > I wasn't talking about the xchg() though. > > The smp_mb__after_atomic() is not for xchg(), it's for `*tail = > rdp->nocb_gp_head;` > it's stated so in the comment. And I do think we need ordering between > `*tail = rdp->nocb_gp_head;` and wake up, because the waiter is checking on > head not tail. > > swait_event_interruptible(rdp->nocb_wq, > > READ_ONCE(rdp->nocb_follower_head)); > > So what I'm saying is that since we need to maintain ordering between `*tail > = rdp->nocb_gp_head;` > and wake up, we need to change the smp_mb__after_atomic() to smp_mb(). Because > smp_mb__after_atomic() wouldn't guarantee > > So this is what I'm proposing. Good eyes! Hmmm... What do I do about this in mainline? Ah, I introduced a ->nocb_lock in mainline to prevent this from happening. In that same commit that you didn't want to use because it is hard to backport. ;-) So yes, but there might well be other misorderings fixed by the hard-to-backport commit that your change below does not cover. Still I do agree that you need full ordering at that point. Thanx, Paul > diff -ru linux-4.9.37.orig/kernel/rcu/tree_plugin.h > linux-4.9.37/kernel/rcu/tree_plugin.h > --- linux-4.9.37.orig/kernel/rcu/tree_plugin.h2017-07-12 > 06:42:41.0 -0700 > +++ linux-4.9.37/kernel/rcu/tree_plugin.h 2018-07-27 16:23:41.349044259 > -0700 > @@ -2076,7 +2076,7 @@ > /* Append callbacks to follower's "done" list. */ > tail = xchg(>nocb_follower_tail, rdp->nocb_gp_tail); > *tail = rdp->nocb_gp_head; > - smp_mb__after_atomic(); /* Store *tail before wakeup. */ > + smp_mb(); /* Store *tail before wakeup. */ > if (rdp != my_rdp && tail == >nocb_follower_head) { > /* >* List was empty, wake up the follower. > > Thanks, > David > > From: Paul E. McKenney > Sent: Friday, July 27, 2018 4:47 PM > To: David Chen > Cc: linux-kernel@vger.kernel.org > Subject: Re: RCU nocb list not reclaiming causing OOM > > > On Fri, Jul 27, 2018 at 11:16:39PM +, David Chen wrote: > > Hi Paul, > > > > Thanks for the advice. > > The bug is kind of hard to hit, so I can't say for certain yet. > > Well, you can always remove the "tail == >nocb_follower_head" as an > extra belt-and-suspenders safety net. I am not putting that in mainline, > but in the privacy of your own copy of the kernel, I don't see any really > serious problem with it. (As long as you aren't going for absolute maximum > performance, but even then there are other more important tuning actions > and code changes you could make.) > > > Though after another look at the code, I found out the > > `smp_mb__after_atomic();` > > seems to be only a compiler barrier on x86. > > Yes, and that is because the locked xchg instruction used on x86 to > implement xchg() already provides full ordering. ;-) > > > tail = xchg(>nocb_follower_tail, rdp->nocb_gp_tail); > > *tail = rdp->nocb_gp_head; > > smp_mb__after_atomic(); /* Store *tail before wakeup. */ > > if (rdp != my_rdp && tail == >nocb_follower_head) { > > swake_up(>nocb_wq); > > > > But that wouldn't be enough right? Because from 6b5fc3a13318, it stated that > > wakeup operation don't guarantee ordering. And when the follower wakes up, > > it checks > > for nocb_follower_head, which is assigned by `*tail = rdp->nocb_gp_head;` > > which doesn't > > have LOCK prefix. So it's possible for follower to wake up and see the list > > is empty and go > > back to sleep. > > Again, xchg() is defined to provide full ordering against all operations > before and after it. Each architecture is required to do whatever is > necessary to implement that full ordering, and x86 need only provide > its "lock xchg" instruction. > > The smp_mb__after_atomic() has effect only after atomic read-modify-write > operations that do not return a value, for example, atomic_inc(). > If you use it after a value-returning atomic read-modify-write operation > like xchg(), all you do is needlessly slow things down on platforms > that provide non-empty smp_mb__after_atomic() definitions. So again, > smp_mb__after_atomic() after xchg() is pointless. > > Please take a look at Documentation/core-api/atomic_ops.rst in the > Linux-kernel source tree for more information. Or get a v4.17 kernel > source tree and check this using the memory model (tools/memory-model > in that version). > > Thanx, Paul > > > Thanks, > > David > > > > From: Paul E. McKenney > > Sent: Friday, July 27, 2018 3:31 PM > > To: David Chen > > Cc: linux-kernel@vger.kernel.org > > Subject: Re: RCU nocb list not
Re: [GIT PULL] ACPI fix for v4.18-rc7
On Fri, Jul 27, 2018 at 5:02 PM Schmauss, Erik wrote: > > The patch below should be able to fix this. Yes, the error messages seem gone with this. I see another ACPI warning, but that one has always been there: acpi INT33D5:00: intel-hid: created platform device ACPI Warning: \_SB.IETM._ART: Return Package type mismatch at index 0 - found Integer, expected Reference (20180531/nspredef-263) ACPI: Invalid package element [0]: got number, expecting [R] _ART package 0 is invalid, ignored and doesn't seem to cause any problems. Just mentioning it in case somebody goes "yeah.." Linus
Re: [GIT PULL] ACPI fix for v4.18-rc7
On Fri, Jul 27, 2018 at 5:02 PM Schmauss, Erik wrote: > > The patch below should be able to fix this. Yes, the error messages seem gone with this. I see another ACPI warning, but that one has always been there: acpi INT33D5:00: intel-hid: created platform device ACPI Warning: \_SB.IETM._ART: Return Package type mismatch at index 0 - found Integer, expected Reference (20180531/nspredef-263) ACPI: Invalid package element [0]: got number, expecting [R] _ART package 0 is invalid, ignored and doesn't seem to cause any problems. Just mentioning it in case somebody goes "yeah.." Linus
Re: [PATCH 4.4 15/47] ubi: fastmap: Correctly handle interrupted erasures in EBA
On Thu, 2018-07-26 at 08:25 +0200, Richard Weinberger wrote: > Ben, > > Am Donnerstag, 26. Juli 2018, 04:12:54 CEST schrieb Ben Hutchings: > > On Tue, 2018-07-10 at 20:24 +0200, Greg Kroah-Hartman wrote: > > > 4.4-stable review patch. If anyone has any objections, please let me > > > know. > > > > > > -- > > > > > > From: Richard Weinberger > > > > > > commit 781932375ffc6411713ee0926ccae8596ed0261c upstream. > > > > > > Fastmap cannot track the LEB unmap operation, therefore it can > > > happen that after an interrupted erasure the mapping still looks > > > good from Fastmap's point of view, while reading from the PEB will > > > cause an ECC error and confuses the upper layer. > > > > > > Instead of teaching users of UBI how to deal with that, we read back > > > the VID header and check for errors. If the PEB is empty or shows ECC > > > errors we fixup the mapping and schedule the PEB for erasure. > > > > [...] > > > > Isn't there a risk here, that a read error leads to erasing data that > > might be recoverable if the read is retried? > > Well, read error means that already something went very wrong. At other places > in UBI wo also don't retry reading headers and consider it as fatal when we > are unable to read it. > We could also read the EC header, but what do we gain from that? > If the VID header is not readable we cannot check fastmap either. > > What case exactly do you have in mind? I suppose I'm thinking about data recovery from a flash device that has become unreliable. I'm not familiar with ubi; is there a read-only mode where the erase wouldn't actually be performed? Ben. -- Ben Hutchings, Software Developer Codethink Ltd https://www.codethink.co.uk/ Dale House, 35 Dale Street Manchester, M1 2HF, United Kingdom
Re: [PATCH 4.4 15/47] ubi: fastmap: Correctly handle interrupted erasures in EBA
On Thu, 2018-07-26 at 08:25 +0200, Richard Weinberger wrote: > Ben, > > Am Donnerstag, 26. Juli 2018, 04:12:54 CEST schrieb Ben Hutchings: > > On Tue, 2018-07-10 at 20:24 +0200, Greg Kroah-Hartman wrote: > > > 4.4-stable review patch. If anyone has any objections, please let me > > > know. > > > > > > -- > > > > > > From: Richard Weinberger > > > > > > commit 781932375ffc6411713ee0926ccae8596ed0261c upstream. > > > > > > Fastmap cannot track the LEB unmap operation, therefore it can > > > happen that after an interrupted erasure the mapping still looks > > > good from Fastmap's point of view, while reading from the PEB will > > > cause an ECC error and confuses the upper layer. > > > > > > Instead of teaching users of UBI how to deal with that, we read back > > > the VID header and check for errors. If the PEB is empty or shows ECC > > > errors we fixup the mapping and schedule the PEB for erasure. > > > > [...] > > > > Isn't there a risk here, that a read error leads to erasing data that > > might be recoverable if the read is retried? > > Well, read error means that already something went very wrong. At other places > in UBI wo also don't retry reading headers and consider it as fatal when we > are unable to read it. > We could also read the EC header, but what do we gain from that? > If the VID header is not readable we cannot check fastmap either. > > What case exactly do you have in mind? I suppose I'm thinking about data recovery from a flash device that has become unreliable. I'm not familiar with ubi; is there a read-only mode where the erase wouldn't actually be performed? Ben. -- Ben Hutchings, Software Developer Codethink Ltd https://www.codethink.co.uk/ Dale House, 35 Dale Street Manchester, M1 2HF, United Kingdom
Re; HELLO DEAR MY DONATION TO YOU.
HELLO DEAR MY DONATION TO YOU. May the peace of the Almighty God be with you and your Family, With Due Respect and Humility, I was compelled to write you under humanitarian ground. My name is Mrs Rosareio Romarinhio. TIMOR Origin but married with Late Sir Ratnavale from MALAYSIA. I have took a personal decision to donate what I inherited from my late husband to the Charity, less privilege, I am 58 years old and I was diagnosed for cancer since the past 4 years, immediately after the death of my husband. We were both married for many years without a child, I took this decision because I don't have any child that will inherit this money. My late husband was The chairman/CEO of k: E global industries and a philanthropist based in MALAYSIA. I have decided to donate a fund through Credit Account of ( $1,6000.00 (One Million, Six Hundred Thousand us dollars ) to you for this assignment, Charity, less privilege, building of schools, hospitals and also for the assistance of the poor widows, Motherless babies, Charity organization, CHRISTIAN OR MUSLIM, and orphans. I don't know you in person but God knows you, so contacting you for this assignment is a direction from the holy spirit of God to donate this fund outside this country through you as my late husband has already donated alot in this country, so you have to make sure that you use this donation fund as I have directed so that the name of the Almighty God will be glorify forever. Your urgent response is required in this matter due to my present critical condition of my health, it was a Sister nurse working in this hospital that helped me to type this message, because I am loosing strenght every munites of the day. All I need from you is prayers that GOD will accept my soul incase I didn't survive this surgery. Also pray for the soul of my late Husband to rest in peace. Please always be prayerful all through your life, we are visitors on this earth and we must be very careful in whatever we do so that our soul will not be a waste. I wish you all the best and may God bless you abundantly. I AM WAITING FOR YOUR URGENT REPLY Remain blessed in God. Yours. Mother Mrs Rosareio Romarinhio
Re; HELLO DEAR MY DONATION TO YOU.
HELLO DEAR MY DONATION TO YOU. May the peace of the Almighty God be with you and your Family, With Due Respect and Humility, I was compelled to write you under humanitarian ground. My name is Mrs Rosareio Romarinhio. TIMOR Origin but married with Late Sir Ratnavale from MALAYSIA. I have took a personal decision to donate what I inherited from my late husband to the Charity, less privilege, I am 58 years old and I was diagnosed for cancer since the past 4 years, immediately after the death of my husband. We were both married for many years without a child, I took this decision because I don't have any child that will inherit this money. My late husband was The chairman/CEO of k: E global industries and a philanthropist based in MALAYSIA. I have decided to donate a fund through Credit Account of ( $1,6000.00 (One Million, Six Hundred Thousand us dollars ) to you for this assignment, Charity, less privilege, building of schools, hospitals and also for the assistance of the poor widows, Motherless babies, Charity organization, CHRISTIAN OR MUSLIM, and orphans. I don't know you in person but God knows you, so contacting you for this assignment is a direction from the holy spirit of God to donate this fund outside this country through you as my late husband has already donated alot in this country, so you have to make sure that you use this donation fund as I have directed so that the name of the Almighty God will be glorify forever. Your urgent response is required in this matter due to my present critical condition of my health, it was a Sister nurse working in this hospital that helped me to type this message, because I am loosing strenght every munites of the day. All I need from you is prayers that GOD will accept my soul incase I didn't survive this surgery. Also pray for the soul of my late Husband to rest in peace. Please always be prayerful all through your life, we are visitors on this earth and we must be very careful in whatever we do so that our soul will not be a waste. I wish you all the best and may God bless you abundantly. I AM WAITING FOR YOUR URGENT REPLY Remain blessed in God. Yours. Mother Mrs Rosareio Romarinhio
Re: [PATCH v2 1/3] tracing: kprobes: Prohibit probing on notrace function
On Thu, 26 Jul 2018 14:53:27 +0900 Masami Hiramatsu wrote: > Prohibit kprobe-events probing on notrace function. > Since probing on the notrace function can cause recursive > event call. In most case those are just skipped, but > in some case it falls into infinit recursive call. > > This protection can be disabled by the kconfig > CONFIG_KPROBE_EVENTS_ON_NOTRACE=y, but it is highly > recommended to keep it "n" for normal kernel. > > Signed-off-by: Masami Hiramatsu > Tested-by: Francis Deslauriers > --- > Changes from v1 >- Add CONFIG_KPROBE_EVENTS_ON_NOTRACE kconfig for knocking down > the protection. > --- > kernel/trace/Kconfig| 18 ++ > kernel/trace/trace_kprobe.c | 23 +++ > 2 files changed, 41 insertions(+) > > diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig > index dcc0166d1997..24d5a58467a3 100644 > --- a/kernel/trace/Kconfig > +++ b/kernel/trace/Kconfig > @@ -456,6 +456,24 @@ config KPROBE_EVENTS > This option is also required by perf-probe subcommand of perf tools. > If you want to use perf tools, this option is strongly recommended. > > +config KPROBE_EVENTS_ON_NOTRACE > + bool "Do NOT protect notrace function from kprobe events" > + depends on KPROBE_EVENTS > + default n > + help > + This is only for the developers who want to debug ftrace itself > + using kprobe events. > + > + Usually, ftrace related functions are protected from kprobe-events > + to prevent an infinit recursion or any unexpected execution path > + which leads to a kernel crash. > + > + This option disables such protection and allows you to put kprobe > + events on ftrace functions for debugging ftrace by itself. > + Note that this might let you shoot yourself in the foot. > + > + If unsure, say N. > + > config UPROBE_EVENTS > bool "Enable uprobes-based dynamic events" > depends on ARCH_SUPPORTS_UPROBES > diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c > index 27ace4513c43..1f1b4d712a7e 100644 > --- a/kernel/trace/trace_kprobe.c > +++ b/kernel/trace/trace_kprobe.c > @@ -496,6 +496,23 @@ disable_trace_kprobe(struct trace_kprobe *tk, struct > trace_event_file *file) > return ret; > } > > +#ifdef CONFIG_KPROBE_EVENTS_ON_NOTRACE > +#define within_notrace_func(tk) (false) > +#else > +static bool within_notrace_func(struct trace_kprobe *tk) > +{ > + unsigned long offset, size, addr; > + > + addr = kallsyms_lookup_name(trace_kprobe_symbol(tk)); Oops, I found this is wrong. (Thanks for old myself :P) As ftracetest probepoint.tc said, kprobe-events can be defined without symbol (direct address) for debugging. In that case trace_kprobe_symbol will not return the symbol string. I'll fix this series again. Thank you, > + addr += trace_kprobe_offset(tk); > + > + if (!kallsyms_lookup_size_offset(addr, , )) > + return true;/* Out of range. */ > + > + return !ftrace_location_range(addr - offset, addr - offset + size); > +} > +#endif > + > /* Internal register function - just handle k*probes and flags */ > static int __register_trace_kprobe(struct trace_kprobe *tk) > { > @@ -504,6 +521,12 @@ static int __register_trace_kprobe(struct trace_kprobe > *tk) > if (trace_probe_is_registered(>tp)) > return -EINVAL; > > + if (within_notrace_func(tk)) { > + pr_warn("Could not probe notrace function %s\n", > + trace_kprobe_symbol(tk)); > + return -EINVAL; > + } > + > for (i = 0; i < tk->tp.nr_args; i++) > traceprobe_update_arg(>tp.args[i]); > > -- Masami Hiramatsu
Re: [PATCH v2 1/3] tracing: kprobes: Prohibit probing on notrace function
On Thu, 26 Jul 2018 14:53:27 +0900 Masami Hiramatsu wrote: > Prohibit kprobe-events probing on notrace function. > Since probing on the notrace function can cause recursive > event call. In most case those are just skipped, but > in some case it falls into infinit recursive call. > > This protection can be disabled by the kconfig > CONFIG_KPROBE_EVENTS_ON_NOTRACE=y, but it is highly > recommended to keep it "n" for normal kernel. > > Signed-off-by: Masami Hiramatsu > Tested-by: Francis Deslauriers > --- > Changes from v1 >- Add CONFIG_KPROBE_EVENTS_ON_NOTRACE kconfig for knocking down > the protection. > --- > kernel/trace/Kconfig| 18 ++ > kernel/trace/trace_kprobe.c | 23 +++ > 2 files changed, 41 insertions(+) > > diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig > index dcc0166d1997..24d5a58467a3 100644 > --- a/kernel/trace/Kconfig > +++ b/kernel/trace/Kconfig > @@ -456,6 +456,24 @@ config KPROBE_EVENTS > This option is also required by perf-probe subcommand of perf tools. > If you want to use perf tools, this option is strongly recommended. > > +config KPROBE_EVENTS_ON_NOTRACE > + bool "Do NOT protect notrace function from kprobe events" > + depends on KPROBE_EVENTS > + default n > + help > + This is only for the developers who want to debug ftrace itself > + using kprobe events. > + > + Usually, ftrace related functions are protected from kprobe-events > + to prevent an infinit recursion or any unexpected execution path > + which leads to a kernel crash. > + > + This option disables such protection and allows you to put kprobe > + events on ftrace functions for debugging ftrace by itself. > + Note that this might let you shoot yourself in the foot. > + > + If unsure, say N. > + > config UPROBE_EVENTS > bool "Enable uprobes-based dynamic events" > depends on ARCH_SUPPORTS_UPROBES > diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c > index 27ace4513c43..1f1b4d712a7e 100644 > --- a/kernel/trace/trace_kprobe.c > +++ b/kernel/trace/trace_kprobe.c > @@ -496,6 +496,23 @@ disable_trace_kprobe(struct trace_kprobe *tk, struct > trace_event_file *file) > return ret; > } > > +#ifdef CONFIG_KPROBE_EVENTS_ON_NOTRACE > +#define within_notrace_func(tk) (false) > +#else > +static bool within_notrace_func(struct trace_kprobe *tk) > +{ > + unsigned long offset, size, addr; > + > + addr = kallsyms_lookup_name(trace_kprobe_symbol(tk)); Oops, I found this is wrong. (Thanks for old myself :P) As ftracetest probepoint.tc said, kprobe-events can be defined without symbol (direct address) for debugging. In that case trace_kprobe_symbol will not return the symbol string. I'll fix this series again. Thank you, > + addr += trace_kprobe_offset(tk); > + > + if (!kallsyms_lookup_size_offset(addr, , )) > + return true;/* Out of range. */ > + > + return !ftrace_location_range(addr - offset, addr - offset + size); > +} > +#endif > + > /* Internal register function - just handle k*probes and flags */ > static int __register_trace_kprobe(struct trace_kprobe *tk) > { > @@ -504,6 +521,12 @@ static int __register_trace_kprobe(struct trace_kprobe > *tk) > if (trace_probe_is_registered(>tp)) > return -EINVAL; > > + if (within_notrace_func(tk)) { > + pr_warn("Could not probe notrace function %s\n", > + trace_kprobe_symbol(tk)); > + return -EINVAL; > + } > + > for (i = 0; i < tk->tp.nr_args; i++) > traceprobe_update_arg(>tp.args[i]); > > -- Masami Hiramatsu
Re: include architecture Kconfig files from top-level Kconfig v3
2018-07-27 16:48 GMT+09:00 Christoph Hellwig : > On Fri, Jul 27, 2018 at 10:32:19AM +0900, Masahiro Yamada wrote: >> This will just add a new unmet dependency warning. >> CONFIG_PREEMPT_COUNT will be still selected. > > True. I guess we simply need to prohibit CONFIG_DEBUG_ATOMIC_SLEEP > explicitly if PREEMPT_COUNT isn't supported. E.g something like this: > > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug > index 26d3ff7e3cf4..373ce9fecd7e 100644 > --- a/lib/Kconfig.debug > +++ b/lib/Kconfig.debug > @@ -1195,6 +1195,7 @@ config DEBUG_ATOMIC_SLEEP > bool "Sleep inside atomic section checking" > select PREEMPT_COUNT > depends on DEBUG_KERNEL > + depends on !ARCH_NO_PREEMPT > help > If you say Y here, various routines which may sleep will become very > noisy if they are called inside atomic sections: when a spinlock is I will fold this into 07/10. I see another 'select PREEMPT_COUNT' in drivers/gpu/drm/i915/Kconfig.debug but DRM_I915 depends on X86, so this is not a real problem. -- Best Regards Masahiro Yamada
Re: include architecture Kconfig files from top-level Kconfig v3
2018-07-27 16:48 GMT+09:00 Christoph Hellwig : > On Fri, Jul 27, 2018 at 10:32:19AM +0900, Masahiro Yamada wrote: >> This will just add a new unmet dependency warning. >> CONFIG_PREEMPT_COUNT will be still selected. > > True. I guess we simply need to prohibit CONFIG_DEBUG_ATOMIC_SLEEP > explicitly if PREEMPT_COUNT isn't supported. E.g something like this: > > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug > index 26d3ff7e3cf4..373ce9fecd7e 100644 > --- a/lib/Kconfig.debug > +++ b/lib/Kconfig.debug > @@ -1195,6 +1195,7 @@ config DEBUG_ATOMIC_SLEEP > bool "Sleep inside atomic section checking" > select PREEMPT_COUNT > depends on DEBUG_KERNEL > + depends on !ARCH_NO_PREEMPT > help > If you say Y here, various routines which may sleep will become very > noisy if they are called inside atomic sections: when a spinlock is I will fold this into 07/10. I see another 'select PREEMPT_COUNT' in drivers/gpu/drm/i915/Kconfig.debug but DRM_I915 depends on X86, so this is not a real problem. -- Best Regards Masahiro Yamada
[PATCH] sched/fair: sync expires_seq in distribute_cfs_runtime()
Each time we sync cfs_rq->runtime_expires with cfs_b->runtime_expires, we should sync its ->expires_seq too. However it is missing for distribute_cfs_runtime(), especially the slack timer call path. Fixes: 512ac999d275 ("sched/fair: Fix bandwidth timer clock drift condition") Cc: Xunlei Pang Cc: Ben Segall Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Signed-off-by: Cong Wang --- kernel/sched/fair.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 2f0a0be4d344..910c50db3d74 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4857,7 +4857,7 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq) } static u64 distribute_cfs_runtime(struct cfs_bandwidth *cfs_b, - u64 remaining, u64 expires) + u64 remaining, u64 expires, int expires_seq) { struct cfs_rq *cfs_rq; u64 runtime; @@ -4880,6 +4880,7 @@ static u64 distribute_cfs_runtime(struct cfs_bandwidth *cfs_b, cfs_rq->runtime_remaining += runtime; cfs_rq->runtime_expires = expires; + cfs_rq->expires_seq = expires_seq; /* we check whether we're throttled above */ if (cfs_rq->runtime_remaining > 0) @@ -4905,7 +4906,7 @@ static u64 distribute_cfs_runtime(struct cfs_bandwidth *cfs_b, static int do_sched_cfs_period_timer(struct cfs_bandwidth *cfs_b, int overrun) { u64 runtime, runtime_expires; - int throttled; + int throttled, expires_seq; /* no need to continue the timer with no bandwidth constraint */ if (cfs_b->quota == RUNTIME_INF) @@ -4933,6 +4934,7 @@ static int do_sched_cfs_period_timer(struct cfs_bandwidth *cfs_b, int overrun) cfs_b->nr_throttled += overrun; runtime_expires = cfs_b->runtime_expires; + expires_seq = cfs_b->expires_seq; /* * This check is repeated as we are holding onto the new bandwidth while @@ -4946,7 +4948,7 @@ static int do_sched_cfs_period_timer(struct cfs_bandwidth *cfs_b, int overrun) raw_spin_unlock(_b->lock); /* we can't nest cfs_b->lock while distributing bandwidth */ runtime = distribute_cfs_runtime(cfs_b, runtime, -runtime_expires); +runtime_expires, expires_seq); raw_spin_lock(_b->lock); throttled = !list_empty(_b->throttled_cfs_rq); @@ -5055,6 +5057,7 @@ static __always_inline void return_cfs_rq_runtime(struct cfs_rq *cfs_rq) static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b) { u64 runtime = 0, slice = sched_cfs_bandwidth_slice(); + int expires_seq; u64 expires; /* confirm we're still not at a refresh boundary */ @@ -5068,12 +5071,13 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b) runtime = cfs_b->runtime; expires = cfs_b->runtime_expires; + expires_seq = cfs_b->expires_seq; raw_spin_unlock(_b->lock); if (!runtime) return; - runtime = distribute_cfs_runtime(cfs_b, runtime, expires); + runtime = distribute_cfs_runtime(cfs_b, runtime, expires, expires_seq); raw_spin_lock(_b->lock); if (expires == cfs_b->runtime_expires) -- 2.14.4
Re: [PATCH 34/38] vfs: syscall: Add fsinfo() to query filesystem information [ver #10]
Hi David, > On 28 Jul 2018, at 00:49, David Howells wrote: > Jann Horn wrote: >>> +static int fsinfo_generic_name_encoding(struct dentry *dentry, char *buf) >>> +{ >>> + static const char encoding[] = "utf8"; >>> + >>> + if (buf) >>> + memcpy(buf, encoding, sizeof(encoding) - 1); >>> + return sizeof(encoding) - 1; >>> +} >> >> Is this meant to be "encoding to be used by userspace" or "encoding of >> on-disk filenames"? > > The latter. > >> Are there any plans to create filesystems that behave differently? > > isofs, fat, ntfs, cifs for example. > >> If the latter: This is wrong for e.g. a vfat mount that uses a codepage, >> right? Should the default in that case not be "I don't know"? > > Quite possibly. Note that it could also be what you're interpreting it as > because the codepage got overridden by a mount parameter rather than what's on > the disk (assuming the medium actually records this). No, nothing like that is recorded on disk. That would have been way too helpful! (-; The only place Windows records such information is, you may have guessed this: in the registry which of course is local to the computer and unrelated to what removable media is attached... > One thing I'm confused about is that fat has both a codepage and a charset and > I'm not sure of the difference. Oh that is quite simple. (-: The codepage is what is used to translate from/to the on-disk DOS 8.3 style names into the kernel's Unicode character representation. The correct codepage for a particular volume is not stored on disk so it can lead to all sorts of fun if you for example create some names on for example a Japanese Windows on a FAT formatted USB stick and then plug that into a US or European Windows where the default code pages are completely different - all your filenames will appear totally corrupt. (Note this ONLY affects 8.3 style/DOS/short names or whatever you want to call them.) The charset on the other hand is what is used to convert strings coming in from/going out to userspace into the kernel's Unicode character representation. The one nice thing about VFAT (and there aren't many nice things about it!) is that for long names (i.e. not the 8.3 style/DOS/short names), it actually stores on-disk little-endian UTF-16 (since Windows 2000, before that it used little endian UCS-2 - the change was needed to support things like Emojis and some languages that go outside the UCS-2 range of fixed 16-bit unicode). Hope this clears that up. Best regards, Anton > David -- Anton Altaparmakov (replace at with @) Lead in File System Development, Tuxera Inc., http://www.tuxera.com/ Linux NTFS maintainer
Re: [PATCH 34/38] vfs: syscall: Add fsinfo() to query filesystem information [ver #10]
Hi David, > On 28 Jul 2018, at 00:49, David Howells wrote: > Jann Horn wrote: >>> +static int fsinfo_generic_name_encoding(struct dentry *dentry, char *buf) >>> +{ >>> + static const char encoding[] = "utf8"; >>> + >>> + if (buf) >>> + memcpy(buf, encoding, sizeof(encoding) - 1); >>> + return sizeof(encoding) - 1; >>> +} >> >> Is this meant to be "encoding to be used by userspace" or "encoding of >> on-disk filenames"? > > The latter. > >> Are there any plans to create filesystems that behave differently? > > isofs, fat, ntfs, cifs for example. > >> If the latter: This is wrong for e.g. a vfat mount that uses a codepage, >> right? Should the default in that case not be "I don't know"? > > Quite possibly. Note that it could also be what you're interpreting it as > because the codepage got overridden by a mount parameter rather than what's on > the disk (assuming the medium actually records this). No, nothing like that is recorded on disk. That would have been way too helpful! (-; The only place Windows records such information is, you may have guessed this: in the registry which of course is local to the computer and unrelated to what removable media is attached... > One thing I'm confused about is that fat has both a codepage and a charset and > I'm not sure of the difference. Oh that is quite simple. (-: The codepage is what is used to translate from/to the on-disk DOS 8.3 style names into the kernel's Unicode character representation. The correct codepage for a particular volume is not stored on disk so it can lead to all sorts of fun if you for example create some names on for example a Japanese Windows on a FAT formatted USB stick and then plug that into a US or European Windows where the default code pages are completely different - all your filenames will appear totally corrupt. (Note this ONLY affects 8.3 style/DOS/short names or whatever you want to call them.) The charset on the other hand is what is used to convert strings coming in from/going out to userspace into the kernel's Unicode character representation. The one nice thing about VFAT (and there aren't many nice things about it!) is that for long names (i.e. not the 8.3 style/DOS/short names), it actually stores on-disk little-endian UTF-16 (since Windows 2000, before that it used little endian UCS-2 - the change was needed to support things like Emojis and some languages that go outside the UCS-2 range of fixed 16-bit unicode). Hope this clears that up. Best regards, Anton > David -- Anton Altaparmakov (replace at with @) Lead in File System Development, Tuxera Inc., http://www.tuxera.com/ Linux NTFS maintainer
[PATCH] sched/fair: sync expires_seq in distribute_cfs_runtime()
Each time we sync cfs_rq->runtime_expires with cfs_b->runtime_expires, we should sync its ->expires_seq too. However it is missing for distribute_cfs_runtime(), especially the slack timer call path. Fixes: 512ac999d275 ("sched/fair: Fix bandwidth timer clock drift condition") Cc: Xunlei Pang Cc: Ben Segall Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Signed-off-by: Cong Wang --- kernel/sched/fair.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 2f0a0be4d344..910c50db3d74 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4857,7 +4857,7 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq) } static u64 distribute_cfs_runtime(struct cfs_bandwidth *cfs_b, - u64 remaining, u64 expires) + u64 remaining, u64 expires, int expires_seq) { struct cfs_rq *cfs_rq; u64 runtime; @@ -4880,6 +4880,7 @@ static u64 distribute_cfs_runtime(struct cfs_bandwidth *cfs_b, cfs_rq->runtime_remaining += runtime; cfs_rq->runtime_expires = expires; + cfs_rq->expires_seq = expires_seq; /* we check whether we're throttled above */ if (cfs_rq->runtime_remaining > 0) @@ -4905,7 +4906,7 @@ static u64 distribute_cfs_runtime(struct cfs_bandwidth *cfs_b, static int do_sched_cfs_period_timer(struct cfs_bandwidth *cfs_b, int overrun) { u64 runtime, runtime_expires; - int throttled; + int throttled, expires_seq; /* no need to continue the timer with no bandwidth constraint */ if (cfs_b->quota == RUNTIME_INF) @@ -4933,6 +4934,7 @@ static int do_sched_cfs_period_timer(struct cfs_bandwidth *cfs_b, int overrun) cfs_b->nr_throttled += overrun; runtime_expires = cfs_b->runtime_expires; + expires_seq = cfs_b->expires_seq; /* * This check is repeated as we are holding onto the new bandwidth while @@ -4946,7 +4948,7 @@ static int do_sched_cfs_period_timer(struct cfs_bandwidth *cfs_b, int overrun) raw_spin_unlock(_b->lock); /* we can't nest cfs_b->lock while distributing bandwidth */ runtime = distribute_cfs_runtime(cfs_b, runtime, -runtime_expires); +runtime_expires, expires_seq); raw_spin_lock(_b->lock); throttled = !list_empty(_b->throttled_cfs_rq); @@ -5055,6 +5057,7 @@ static __always_inline void return_cfs_rq_runtime(struct cfs_rq *cfs_rq) static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b) { u64 runtime = 0, slice = sched_cfs_bandwidth_slice(); + int expires_seq; u64 expires; /* confirm we're still not at a refresh boundary */ @@ -5068,12 +5071,13 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b) runtime = cfs_b->runtime; expires = cfs_b->runtime_expires; + expires_seq = cfs_b->expires_seq; raw_spin_unlock(_b->lock); if (!runtime) return; - runtime = distribute_cfs_runtime(cfs_b, runtime, expires); + runtime = distribute_cfs_runtime(cfs_b, runtime, expires, expires_seq); raw_spin_lock(_b->lock); if (expires == cfs_b->runtime_expires) -- 2.14.4
Re: Remounting filesystem read-only
On Fri, Jul 27, 2018 at 01:34:31PM -0700, Sodagudi Prasad wrote: > > The error should be pretty clear: "Inode table for bg 0 marked as > > needing zeroing". That should never happen. > > Can you provide any debug patch to detect when this corruption is happening? > Source of this corruption and how this is partition getting corrupted? > Or which file system operation lead to this corruption? Do you have a reliable repro? If it's a one-off, it can be caused by *anything*. Crappy hardware, a bug in some proprietary, binary-only GPU driver dereferencing some wild pointer that corrupts kernel memory, etc. Asking for a debug patch is like asking for "can you create technology that can detect when a cockroach enter my house?" So if you have a reliable repro, then we know what operations might be triggering the corruption, and then you work on creating a minimal repro, and only *then* when we have a restricted set of possibilities that might be the cause (for example, if removing a GPU call makes the problem go away, then the patch would need to be in the proprietary GPU driver) > I am digging code a bit around this warning to understand more. The warning means that a flag in block group descriptor #0 is set that should never be set. How did the flag get set? There is any number of things that could cause that. You might want to look at the block group descriptor via dumpe2fs or debugfs, to see if it's just a single bit getting flipped, or if the entire block group descriptor is garbage. Note that under normal code paths, the flag *never* gets set by ext4 kernel code. The flag will get set on non-block group 0 block group descriptors by ext4, and the ext4 kernel code will only clear the flag. Of course, if there is a bug in some driver that dereferences a pointer widely, all bets are off. - Ted
Re: Remounting filesystem read-only
On Fri, Jul 27, 2018 at 01:34:31PM -0700, Sodagudi Prasad wrote: > > The error should be pretty clear: "Inode table for bg 0 marked as > > needing zeroing". That should never happen. > > Can you provide any debug patch to detect when this corruption is happening? > Source of this corruption and how this is partition getting corrupted? > Or which file system operation lead to this corruption? Do you have a reliable repro? If it's a one-off, it can be caused by *anything*. Crappy hardware, a bug in some proprietary, binary-only GPU driver dereferencing some wild pointer that corrupts kernel memory, etc. Asking for a debug patch is like asking for "can you create technology that can detect when a cockroach enter my house?" So if you have a reliable repro, then we know what operations might be triggering the corruption, and then you work on creating a minimal repro, and only *then* when we have a restricted set of possibilities that might be the cause (for example, if removing a GPU call makes the problem go away, then the patch would need to be in the proprietary GPU driver) > I am digging code a bit around this warning to understand more. The warning means that a flag in block group descriptor #0 is set that should never be set. How did the flag get set? There is any number of things that could cause that. You might want to look at the block group descriptor via dumpe2fs or debugfs, to see if it's just a single bit getting flipped, or if the entire block group descriptor is garbage. Note that under normal code paths, the flag *never* gets set by ext4 kernel code. The flag will get set on non-block group 0 block group descriptors by ext4, and the ext4 kernel code will only clear the flag. Of course, if there is a bug in some driver that dereferences a pointer widely, all bets are off. - Ted
Re: [PATCH v2] PM / devfreq: Add support for QCOM devfreq firmware
On 2018-05-23 07:39, Rob Herring wrote: Reviving an old thread. Sorry about the late reply. Got busy. On Tue, May 22, 2018 at 1:30 PM, Saravana Kannan wrote: On 05/22/2018 11:08 AM, Rob Herring wrote: On Fri, May 18, 2018 at 12:52:40AM -0700, Saravana Kannan wrote: The firmware present in some QCOM chipsets offloads the steps necessary for changing the frequency of some devices (Eg: L3). This driver implements the devfreq interface for this firmware so that various governors could be used to scale the frequency of these devices. Each client (say cluster 0 and cluster 1) that wants to vote for a particular device's frequency (say, L3 frequency) is represented as a separate voter device (qcom,devfreq-fw-voter) that's a child of the firmware device (qcom,devfreq-fw). Signed-off-by: Saravana Kannan --- .../bindings/devfreq/devfreq-qcom-fw.txt | 41 +++ drivers/devfreq/Kconfig| 14 + drivers/devfreq/Makefile | 1 + drivers/devfreq/devfreq_qcom_fw.c | 330 + 4 files changed, 386 insertions(+) create mode 100644 Documentation/devicetree/bindings/devfreq/devfreq-qcom-fw.txt create mode 100644 drivers/devfreq/devfreq_qcom_fw.c diff --git a/Documentation/devicetree/bindings/devfreq/devfreq-qcom-fw.txt b/Documentation/devicetree/bindings/devfreq/devfreq-qcom-fw.txt new file mode 100644 index 000..f882a0b --- /dev/null +++ b/Documentation/devicetree/bindings/devfreq/devfreq-qcom-fw.txt @@ -0,0 +1,41 @@ +QCOM Devfreq firmware device + +Some Qualcomm Technologies, Inc. (QTI) chipsets have a firmware that +offloads the steps for frequency switching. It provides a table of +supported frequencies and a register to request one of the supported +freqencies. + +The qcom,devfreq-fw represents this firmware as a device. Sometimes, +multiple entities want to vote on the frequency request that is sent to the +firmware. The qcom,devfreq-fw-voter represents these voters as child +devices of the corresponding qcom,devfreq-fw device. + +Required properties: +- compatible: Must be "qcom,devfreq-fw" or "qcom,devfreq-fw-voter" No versions of firmware? Sure, I can add a v1. Right now the interface has always been identical. I thought if it changed in the future I'll add -v2. Sounds like you are making up version numbers. If you don't have real h/w or firmware version numbers, then use an SoC specific compatible string. +Only for qcom,devfreq-fw: +- reg: Pairs of physical base addresses and region sizes of + memory mapped registers. Registers? Is this firmware or h/w block? It's a HW block that has its own firmware. So you have 2 things that could change: the h/w interface and the firmware version. Make sure the compatible string(s) is specific enough for the OS to know the exact combination. For all practical purposes, the FW is opaque to the OS. It doesn't affect anything the OS can do with the IP block. So, the HW version is what matters. I'll figure out the actual HW version and use that. +- reg-names: Names of the bases for the above registers. + Required register regions are: + - "en-base": address of register to check if the + firmware is enabled. + - "ftbl-base": address region for the frequency + table. + - "perf-base": address of register to request a + frequency. + +Example: + + qcom,devfreq-l3 { + compatible = "qcom,devfreq-fw"; + reg-names = "en-base", "ftbl-base", "perf-base"; + reg = <0x18321000 0x4>, <0x18321110 0x600>, <0x18321920 0x4>; + + qcom,cpu0-l3 { + compatible = "qcom,devfreq-fw-voter"; There's no point in these nodes. They don't have any properties or resources. These nodes decide how many voters this device supports. Each voter would be a devfreq node that will have its own governor set. For example, one of them would use this governor: http://lkml.iu.edu/hypermail/linux/kernel/1805.2/02474.html You can also attach different devfreq-event devices to each one of these voter devices based on what events you want to use for scaling each voter. So, the devices are definitely needed in the larger context. Sorry, I still don't understand. Ok, let me try to explain. Let's take L3 as an example. Different other IPs might have different requirements on the L3 frequency. For example, little CPUs might want L3 to run at 400 MHz, big CPUs might want L3 to run at 1000 MHz, GPU or some other peripheral might want L3 to run at 800 MHz. The L3 freq needs to be set to max of these requests -- in this case 1000 MHz. I'm trying to represent each of these "votes" on L3 as a device. Once I register these with devfreq, each of
Re: [PATCH v2] PM / devfreq: Add support for QCOM devfreq firmware
On 2018-05-23 07:39, Rob Herring wrote: Reviving an old thread. Sorry about the late reply. Got busy. On Tue, May 22, 2018 at 1:30 PM, Saravana Kannan wrote: On 05/22/2018 11:08 AM, Rob Herring wrote: On Fri, May 18, 2018 at 12:52:40AM -0700, Saravana Kannan wrote: The firmware present in some QCOM chipsets offloads the steps necessary for changing the frequency of some devices (Eg: L3). This driver implements the devfreq interface for this firmware so that various governors could be used to scale the frequency of these devices. Each client (say cluster 0 and cluster 1) that wants to vote for a particular device's frequency (say, L3 frequency) is represented as a separate voter device (qcom,devfreq-fw-voter) that's a child of the firmware device (qcom,devfreq-fw). Signed-off-by: Saravana Kannan --- .../bindings/devfreq/devfreq-qcom-fw.txt | 41 +++ drivers/devfreq/Kconfig| 14 + drivers/devfreq/Makefile | 1 + drivers/devfreq/devfreq_qcom_fw.c | 330 + 4 files changed, 386 insertions(+) create mode 100644 Documentation/devicetree/bindings/devfreq/devfreq-qcom-fw.txt create mode 100644 drivers/devfreq/devfreq_qcom_fw.c diff --git a/Documentation/devicetree/bindings/devfreq/devfreq-qcom-fw.txt b/Documentation/devicetree/bindings/devfreq/devfreq-qcom-fw.txt new file mode 100644 index 000..f882a0b --- /dev/null +++ b/Documentation/devicetree/bindings/devfreq/devfreq-qcom-fw.txt @@ -0,0 +1,41 @@ +QCOM Devfreq firmware device + +Some Qualcomm Technologies, Inc. (QTI) chipsets have a firmware that +offloads the steps for frequency switching. It provides a table of +supported frequencies and a register to request one of the supported +freqencies. + +The qcom,devfreq-fw represents this firmware as a device. Sometimes, +multiple entities want to vote on the frequency request that is sent to the +firmware. The qcom,devfreq-fw-voter represents these voters as child +devices of the corresponding qcom,devfreq-fw device. + +Required properties: +- compatible: Must be "qcom,devfreq-fw" or "qcom,devfreq-fw-voter" No versions of firmware? Sure, I can add a v1. Right now the interface has always been identical. I thought if it changed in the future I'll add -v2. Sounds like you are making up version numbers. If you don't have real h/w or firmware version numbers, then use an SoC specific compatible string. +Only for qcom,devfreq-fw: +- reg: Pairs of physical base addresses and region sizes of + memory mapped registers. Registers? Is this firmware or h/w block? It's a HW block that has its own firmware. So you have 2 things that could change: the h/w interface and the firmware version. Make sure the compatible string(s) is specific enough for the OS to know the exact combination. For all practical purposes, the FW is opaque to the OS. It doesn't affect anything the OS can do with the IP block. So, the HW version is what matters. I'll figure out the actual HW version and use that. +- reg-names: Names of the bases for the above registers. + Required register regions are: + - "en-base": address of register to check if the + firmware is enabled. + - "ftbl-base": address region for the frequency + table. + - "perf-base": address of register to request a + frequency. + +Example: + + qcom,devfreq-l3 { + compatible = "qcom,devfreq-fw"; + reg-names = "en-base", "ftbl-base", "perf-base"; + reg = <0x18321000 0x4>, <0x18321110 0x600>, <0x18321920 0x4>; + + qcom,cpu0-l3 { + compatible = "qcom,devfreq-fw-voter"; There's no point in these nodes. They don't have any properties or resources. These nodes decide how many voters this device supports. Each voter would be a devfreq node that will have its own governor set. For example, one of them would use this governor: http://lkml.iu.edu/hypermail/linux/kernel/1805.2/02474.html You can also attach different devfreq-event devices to each one of these voter devices based on what events you want to use for scaling each voter. So, the devices are definitely needed in the larger context. Sorry, I still don't understand. Ok, let me try to explain. Let's take L3 as an example. Different other IPs might have different requirements on the L3 frequency. For example, little CPUs might want L3 to run at 400 MHz, big CPUs might want L3 to run at 1000 MHz, GPU or some other peripheral might want L3 to run at 800 MHz. The L3 freq needs to be set to max of these requests -- in this case 1000 MHz. I'm trying to represent each of these "votes" on L3 as a device. Once I register these with devfreq, each of
Re: [PATCH 34/38] vfs: syscall: Add fsinfo() to query filesystem information [ver #10]
Jann Horn wrote: > > fs/fat/inode.c: sprintf(buf, "cp%d", sbi->options.codepage); > > sprintf(buf, "cp%d", sbi->options.codepage); > sbi->nls_disk = load_nls(buf); > if (!sbi->nls_disk) { > fat_msg(sb, KERN_ERR, "codepage %s not found", buf); > goto out_fail; > } Sorry, yes. I was reading the print as the part of the display of superblock parameters. David
Re: [PATCH 34/38] vfs: syscall: Add fsinfo() to query filesystem information [ver #10]
Jann Horn wrote: > > fs/fat/inode.c: sprintf(buf, "cp%d", sbi->options.codepage); > > sprintf(buf, "cp%d", sbi->options.codepage); > sbi->nls_disk = load_nls(buf); > if (!sbi->nls_disk) { > fat_msg(sb, KERN_ERR, "codepage %s not found", buf); > goto out_fail; > } Sorry, yes. I was reading the print as the part of the display of superblock parameters. David
Re: RCU nocb list not reclaiming causing OOM
Hi Paul, I wasn't talking about the xchg() though. The smp_mb__after_atomic() is not for xchg(), it's for `*tail = rdp->nocb_gp_head;` it's stated so in the comment. And I do think we need ordering between `*tail = rdp->nocb_gp_head;` and wake up, because the waiter is checking on head not tail. swait_event_interruptible(rdp->nocb_wq, READ_ONCE(rdp->nocb_follower_head)); So what I'm saying is that since we need to maintain ordering between `*tail = rdp->nocb_gp_head;` and wake up, we need to change the smp_mb__after_atomic() to smp_mb(). Because smp_mb__after_atomic() wouldn't guarantee So this is what I'm proposing. diff -ru linux-4.9.37.orig/kernel/rcu/tree_plugin.h linux-4.9.37/kernel/rcu/tree_plugin.h --- linux-4.9.37.orig/kernel/rcu/tree_plugin.h 2017-07-12 06:42:41.0 -0700 +++ linux-4.9.37/kernel/rcu/tree_plugin.h 2018-07-27 16:23:41.349044259 -0700 @@ -2076,7 +2076,7 @@ /* Append callbacks to follower's "done" list. */ tail = xchg(>nocb_follower_tail, rdp->nocb_gp_tail); *tail = rdp->nocb_gp_head; - smp_mb__after_atomic(); /* Store *tail before wakeup. */ + smp_mb(); /* Store *tail before wakeup. */ if (rdp != my_rdp && tail == >nocb_follower_head) { /* * List was empty, wake up the follower. Thanks, David From: Paul E. McKenney Sent: Friday, July 27, 2018 4:47 PM To: David Chen Cc: linux-kernel@vger.kernel.org Subject: Re: RCU nocb list not reclaiming causing OOM On Fri, Jul 27, 2018 at 11:16:39PM +, David Chen wrote: > Hi Paul, > > Thanks for the advice. > The bug is kind of hard to hit, so I can't say for certain yet. Well, you can always remove the "tail == >nocb_follower_head" as an extra belt-and-suspenders safety net. I am not putting that in mainline, but in the privacy of your own copy of the kernel, I don't see any really serious problem with it. (As long as you aren't going for absolute maximum performance, but even then there are other more important tuning actions and code changes you could make.) > Though after another look at the code, I found out the > `smp_mb__after_atomic();` > seems to be only a compiler barrier on x86. Yes, and that is because the locked xchg instruction used on x86 to implement xchg() already provides full ordering. ;-) > tail = xchg(>nocb_follower_tail, rdp->nocb_gp_tail); > *tail = rdp->nocb_gp_head; > smp_mb__after_atomic(); /* Store *tail before wakeup. */ > if (rdp != my_rdp && tail == >nocb_follower_head) { > swake_up(>nocb_wq); > > But that wouldn't be enough right? Because from 6b5fc3a13318, it stated that > wakeup operation don't guarantee ordering. And when the follower wakes up, it > checks > for nocb_follower_head, which is assigned by `*tail = rdp->nocb_gp_head;` > which doesn't > have LOCK prefix. So it's possible for follower to wake up and see the list > is empty and go > back to sleep. Again, xchg() is defined to provide full ordering against all operations before and after it. Each architecture is required to do whatever is necessary to implement that full ordering, and x86 need only provide its "lock xchg" instruction. The smp_mb__after_atomic() has effect only after atomic read-modify-write operations that do not return a value, for example, atomic_inc(). If you use it after a value-returning atomic read-modify-write operation like xchg(), all you do is needlessly slow things down on platforms that provide non-empty smp_mb__after_atomic() definitions. So again, smp_mb__after_atomic() after xchg() is pointless. Please take a look at Documentation/core-api/atomic_ops.rst in the Linux-kernel source tree for more information. Or get a v4.17 kernel source tree and check this using the memory model (tools/memory-model in that version). Thanx, Paul > Thanks, > David > > From: Paul E. McKenney > Sent: Friday, July 27, 2018 3:31 PM > To: David Chen > Cc: linux-kernel@vger.kernel.org > Subject: Re: RCU nocb list not reclaiming causing OOM > > > On Fri, Jul 27, 2018 at 07:07:46PM +, David Chen wrote: > > Hi Paul, > > > > I'd like to opinion again on this subject. > > > > So we are going to backport this patch: > > 6b5fc3a13318 ("rcu: Add memory barriers for NOCB leader wakeup") > > Does this one solve the problem, or are you still seeing hangs? > If you are no longer seeing hangs, my advice is "hands off keyboard", > though some would no doubt point out that I should follow that advice > more myself. ;-) > > > But the other one: > > 8be6e1b15c54 ("rcu: Use timer as backstop for NOCB deferred wakeups") > > It doesn't apply cleanly, and I'm not too comfortable porting it myself. > > Yeah, that one is a bit on the non-trivial side,
Re: RCU nocb list not reclaiming causing OOM
Hi Paul, I wasn't talking about the xchg() though. The smp_mb__after_atomic() is not for xchg(), it's for `*tail = rdp->nocb_gp_head;` it's stated so in the comment. And I do think we need ordering between `*tail = rdp->nocb_gp_head;` and wake up, because the waiter is checking on head not tail. swait_event_interruptible(rdp->nocb_wq, READ_ONCE(rdp->nocb_follower_head)); So what I'm saying is that since we need to maintain ordering between `*tail = rdp->nocb_gp_head;` and wake up, we need to change the smp_mb__after_atomic() to smp_mb(). Because smp_mb__after_atomic() wouldn't guarantee So this is what I'm proposing. diff -ru linux-4.9.37.orig/kernel/rcu/tree_plugin.h linux-4.9.37/kernel/rcu/tree_plugin.h --- linux-4.9.37.orig/kernel/rcu/tree_plugin.h 2017-07-12 06:42:41.0 -0700 +++ linux-4.9.37/kernel/rcu/tree_plugin.h 2018-07-27 16:23:41.349044259 -0700 @@ -2076,7 +2076,7 @@ /* Append callbacks to follower's "done" list. */ tail = xchg(>nocb_follower_tail, rdp->nocb_gp_tail); *tail = rdp->nocb_gp_head; - smp_mb__after_atomic(); /* Store *tail before wakeup. */ + smp_mb(); /* Store *tail before wakeup. */ if (rdp != my_rdp && tail == >nocb_follower_head) { /* * List was empty, wake up the follower. Thanks, David From: Paul E. McKenney Sent: Friday, July 27, 2018 4:47 PM To: David Chen Cc: linux-kernel@vger.kernel.org Subject: Re: RCU nocb list not reclaiming causing OOM On Fri, Jul 27, 2018 at 11:16:39PM +, David Chen wrote: > Hi Paul, > > Thanks for the advice. > The bug is kind of hard to hit, so I can't say for certain yet. Well, you can always remove the "tail == >nocb_follower_head" as an extra belt-and-suspenders safety net. I am not putting that in mainline, but in the privacy of your own copy of the kernel, I don't see any really serious problem with it. (As long as you aren't going for absolute maximum performance, but even then there are other more important tuning actions and code changes you could make.) > Though after another look at the code, I found out the > `smp_mb__after_atomic();` > seems to be only a compiler barrier on x86. Yes, and that is because the locked xchg instruction used on x86 to implement xchg() already provides full ordering. ;-) > tail = xchg(>nocb_follower_tail, rdp->nocb_gp_tail); > *tail = rdp->nocb_gp_head; > smp_mb__after_atomic(); /* Store *tail before wakeup. */ > if (rdp != my_rdp && tail == >nocb_follower_head) { > swake_up(>nocb_wq); > > But that wouldn't be enough right? Because from 6b5fc3a13318, it stated that > wakeup operation don't guarantee ordering. And when the follower wakes up, it > checks > for nocb_follower_head, which is assigned by `*tail = rdp->nocb_gp_head;` > which doesn't > have LOCK prefix. So it's possible for follower to wake up and see the list > is empty and go > back to sleep. Again, xchg() is defined to provide full ordering against all operations before and after it. Each architecture is required to do whatever is necessary to implement that full ordering, and x86 need only provide its "lock xchg" instruction. The smp_mb__after_atomic() has effect only after atomic read-modify-write operations that do not return a value, for example, atomic_inc(). If you use it after a value-returning atomic read-modify-write operation like xchg(), all you do is needlessly slow things down on platforms that provide non-empty smp_mb__after_atomic() definitions. So again, smp_mb__after_atomic() after xchg() is pointless. Please take a look at Documentation/core-api/atomic_ops.rst in the Linux-kernel source tree for more information. Or get a v4.17 kernel source tree and check this using the memory model (tools/memory-model in that version). Thanx, Paul > Thanks, > David > > From: Paul E. McKenney > Sent: Friday, July 27, 2018 3:31 PM > To: David Chen > Cc: linux-kernel@vger.kernel.org > Subject: Re: RCU nocb list not reclaiming causing OOM > > > On Fri, Jul 27, 2018 at 07:07:46PM +, David Chen wrote: > > Hi Paul, > > > > I'd like to opinion again on this subject. > > > > So we are going to backport this patch: > > 6b5fc3a13318 ("rcu: Add memory barriers for NOCB leader wakeup") > > Does this one solve the problem, or are you still seeing hangs? > If you are no longer seeing hangs, my advice is "hands off keyboard", > though some would no doubt point out that I should follow that advice > more myself. ;-) > > > But the other one: > > 8be6e1b15c54 ("rcu: Use timer as backstop for NOCB deferred wakeups") > > It doesn't apply cleanly, and I'm not too comfortable porting it myself. > > Yeah, that one is a bit on the non-trivial side,
Re: [PATCH] tracing: do not leak kernel addresses
On Fri, Jul 27, 2018 at 03:05:43PM -0700, Sandeep Patil wrote: > On Fri, Jul 27, 2018 at 04:21:14PM -0400, Theodore Y. Ts'o wrote: > > On Fri, Jul 27, 2018 at 04:11:03PM -0400, Steven Rostedt wrote: > > > That said, I would assume that > > > other Android utilities are using other debugfs files for system > > > status and such. > > As of today, I think a lot of information in 'bugreports' is read > out of debugfs (including things like binder stats). We do have a plan > to change that. Hmm, if it's only for bugreports, maybe it can be only mounted when about root processes getting tricked into reading from debugfs. > Indeed, I think it can. However, the problem is the last time I tried to > remove this a whole bunch of things just broke. So, it wasn't about losing > a functionality here and there. Agree, we need to clean up platform to not use > debugfs first. Then we can expect Apps or other native processes to not rely > on debugfs at all. Is Android controlling access to debugfs files via SELinux? If so, then access to debugfs can be gradually cranked down as use cases are removed. - Ted
Re: [PATCH] tracing: do not leak kernel addresses
On Fri, Jul 27, 2018 at 03:05:43PM -0700, Sandeep Patil wrote: > On Fri, Jul 27, 2018 at 04:21:14PM -0400, Theodore Y. Ts'o wrote: > > On Fri, Jul 27, 2018 at 04:11:03PM -0400, Steven Rostedt wrote: > > > That said, I would assume that > > > other Android utilities are using other debugfs files for system > > > status and such. > > As of today, I think a lot of information in 'bugreports' is read > out of debugfs (including things like binder stats). We do have a plan > to change that. Hmm, if it's only for bugreports, maybe it can be only mounted when about root processes getting tricked into reading from debugfs. > Indeed, I think it can. However, the problem is the last time I tried to > remove this a whole bunch of things just broke. So, it wasn't about losing > a functionality here and there. Agree, we need to clean up platform to not use > debugfs first. Then we can expect Apps or other native processes to not rely > on debugfs at all. Is Android controlling access to debugfs files via SELinux? If so, then access to debugfs can be gradually cranked down as use cases are removed. - Ted
Re: [PATCH 7/9] irqchip: add a RISC-V PLIC driver
On 7/26/18 7:38 AM, Christoph Hellwig wrote: This patch adds a driver for the Platform Level Interrupt Controller (PLIC) specified as part of the RISC-V supervisor level ISA manual, in the memory layout implemented by SiFive and qemu. The PLIC connects global interrupt sources to the local interrupt controller on each hart. This driver is based on the driver in the RISC-V tree from Palmer Dabbelt, but has been almost entirely rewritten since. Signed-off-by: Christoph Hellwig I tried to boot HighFive Unleashed with the patch series after applying all the patches from riscv-all branch except timer & irq patches. It gets stuck pretty early. Here is my github repo with all the changes: https://github.com/atishp04/riscv-linux/commits/master_chris_cleanup_hifive I am still looking into it. Palmer: Did I miss something? FWIW, here is the boot log. - Boot log --- [0.00] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=5 [0.00] NR_IRQS: 0, nr_irqs: 0, preallocated irqs: 0 [0.00] plic: mapped 53 interrupts to 4 (out of 9) handlers. [0.00] clocksource: riscv_clocksource: mask: 0x max_cycles: 0x1d854df40, max_idle_ns: 3526361616960 ns [0.00] Calibrating delay loop (skipped), value calculated using timer frequency.. 2.00 BogoMIPS (lpj=1) [0.01] pid_max: default: 32768 minimum: 301 [0.01] Mount-cache hash table entries: 16384 (order: 5, 131072 bytes) [0.02] Mountpoint-cache hash table entries: 16384 (order: 5, 131072 bytes) [0.02] Hierarchical SRCU implementation. [0.03] smp: Bringing up secondary CPUs ... --- drivers/irqchip/Kconfig | 13 ++ drivers/irqchip/Makefile | 1 + drivers/irqchip/irq-riscv-plic.c | 219 +++ 3 files changed, 233 insertions(+) create mode 100644 drivers/irqchip/irq-riscv-plic.c diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig index e9233db16e03..5ac6dd922a0d 100644 --- a/drivers/irqchip/Kconfig +++ b/drivers/irqchip/Kconfig @@ -372,3 +372,16 @@ config QCOM_PDC IRQs for Qualcomm Technologies Inc (QTI) mobile chips. endmenu + +config RISCV_PLIC + bool "Platform-Level Interrupt Controller" + depends on RISCV + default y + help + This enables support for the PLIC chip found in standard RISC-V + systems. The PLIC controls devices interrupts and connects them to + each core's local interrupt controller. Aside from timer and + software interrupts, all other interrupt sources (MSI, GPIO, etc) + are subordinate to the PLIC. + + If you don't know what to do here, say Y. diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile index 15f268f646bf..bf9238da8a31 100644 --- a/drivers/irqchip/Makefile +++ b/drivers/irqchip/Makefile @@ -87,3 +87,4 @@ obj-$(CONFIG_MESON_IRQ_GPIO) += irq-meson-gpio.o obj-$(CONFIG_GOLDFISH_PIC)+= irq-goldfish-pic.o obj-$(CONFIG_NDS32) += irq-ativic32.o obj-$(CONFIG_QCOM_PDC)+= qcom-pdc.o +obj-$(CONFIG_RISCV_PLIC) += irq-riscv-plic.o diff --git a/drivers/irqchip/irq-riscv-plic.c b/drivers/irqchip/irq-riscv-plic.c new file mode 100644 index ..274fe2cba544 --- /dev/null +++ b/drivers/irqchip/irq-riscv-plic.c @@ -0,0 +1,219 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2017 SiFive + * Copyright (C) 2018 Christoph Hellwig + */ +#define pr_fmt(fmt) "plic: " fmt +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * From the RISC-V Priviledged Spec v1.10: + * + * Global interrupt sources are assigned small unsigned integer identifiers, + * beginning at the value 1. An interrupt ID of 0 is reserved to mean "no + * interrupt". Interrupt identifiers are also used to break ties when two or + * more interrupt sources have the same assigned priority. Smaller values of + * interrupt ID take precedence over larger values of interrupt ID. + * + * While the RISC-V supervisor spec doesn't define the maximum number of + * devices supported by the PLIC, the largest number supported by devices + * marked as 'riscv,plic0' (which is the only device type this driver supports, + * and is the only extant PLIC as of now) is 1024. As mentioned above, device + * 0 is defined to be non-existent so this device really only supports 1023 + * devices. + */ +#define MAX_DEVICES1024 +#define MAX_CONTEXTS 15872 + Is there any way we can preserve some of the comments in the original patch about memory-mapped control registers or at least a reference where to find the register offset calculations? IMHO, it is helpful for anybody who is not familiar with the details. +/* + * Each interrupt source has a priority register associated
Re: [PATCH 7/9] irqchip: add a RISC-V PLIC driver
On 7/26/18 7:38 AM, Christoph Hellwig wrote: This patch adds a driver for the Platform Level Interrupt Controller (PLIC) specified as part of the RISC-V supervisor level ISA manual, in the memory layout implemented by SiFive and qemu. The PLIC connects global interrupt sources to the local interrupt controller on each hart. This driver is based on the driver in the RISC-V tree from Palmer Dabbelt, but has been almost entirely rewritten since. Signed-off-by: Christoph Hellwig I tried to boot HighFive Unleashed with the patch series after applying all the patches from riscv-all branch except timer & irq patches. It gets stuck pretty early. Here is my github repo with all the changes: https://github.com/atishp04/riscv-linux/commits/master_chris_cleanup_hifive I am still looking into it. Palmer: Did I miss something? FWIW, here is the boot log. - Boot log --- [0.00] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=5 [0.00] NR_IRQS: 0, nr_irqs: 0, preallocated irqs: 0 [0.00] plic: mapped 53 interrupts to 4 (out of 9) handlers. [0.00] clocksource: riscv_clocksource: mask: 0x max_cycles: 0x1d854df40, max_idle_ns: 3526361616960 ns [0.00] Calibrating delay loop (skipped), value calculated using timer frequency.. 2.00 BogoMIPS (lpj=1) [0.01] pid_max: default: 32768 minimum: 301 [0.01] Mount-cache hash table entries: 16384 (order: 5, 131072 bytes) [0.02] Mountpoint-cache hash table entries: 16384 (order: 5, 131072 bytes) [0.02] Hierarchical SRCU implementation. [0.03] smp: Bringing up secondary CPUs ... --- drivers/irqchip/Kconfig | 13 ++ drivers/irqchip/Makefile | 1 + drivers/irqchip/irq-riscv-plic.c | 219 +++ 3 files changed, 233 insertions(+) create mode 100644 drivers/irqchip/irq-riscv-plic.c diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig index e9233db16e03..5ac6dd922a0d 100644 --- a/drivers/irqchip/Kconfig +++ b/drivers/irqchip/Kconfig @@ -372,3 +372,16 @@ config QCOM_PDC IRQs for Qualcomm Technologies Inc (QTI) mobile chips. endmenu + +config RISCV_PLIC + bool "Platform-Level Interrupt Controller" + depends on RISCV + default y + help + This enables support for the PLIC chip found in standard RISC-V + systems. The PLIC controls devices interrupts and connects them to + each core's local interrupt controller. Aside from timer and + software interrupts, all other interrupt sources (MSI, GPIO, etc) + are subordinate to the PLIC. + + If you don't know what to do here, say Y. diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile index 15f268f646bf..bf9238da8a31 100644 --- a/drivers/irqchip/Makefile +++ b/drivers/irqchip/Makefile @@ -87,3 +87,4 @@ obj-$(CONFIG_MESON_IRQ_GPIO) += irq-meson-gpio.o obj-$(CONFIG_GOLDFISH_PIC)+= irq-goldfish-pic.o obj-$(CONFIG_NDS32) += irq-ativic32.o obj-$(CONFIG_QCOM_PDC)+= qcom-pdc.o +obj-$(CONFIG_RISCV_PLIC) += irq-riscv-plic.o diff --git a/drivers/irqchip/irq-riscv-plic.c b/drivers/irqchip/irq-riscv-plic.c new file mode 100644 index ..274fe2cba544 --- /dev/null +++ b/drivers/irqchip/irq-riscv-plic.c @@ -0,0 +1,219 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2017 SiFive + * Copyright (C) 2018 Christoph Hellwig + */ +#define pr_fmt(fmt) "plic: " fmt +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * From the RISC-V Priviledged Spec v1.10: + * + * Global interrupt sources are assigned small unsigned integer identifiers, + * beginning at the value 1. An interrupt ID of 0 is reserved to mean "no + * interrupt". Interrupt identifiers are also used to break ties when two or + * more interrupt sources have the same assigned priority. Smaller values of + * interrupt ID take precedence over larger values of interrupt ID. + * + * While the RISC-V supervisor spec doesn't define the maximum number of + * devices supported by the PLIC, the largest number supported by devices + * marked as 'riscv,plic0' (which is the only device type this driver supports, + * and is the only extant PLIC as of now) is 1024. As mentioned above, device + * 0 is defined to be non-existent so this device really only supports 1023 + * devices. + */ +#define MAX_DEVICES1024 +#define MAX_CONTEXTS 15872 + Is there any way we can preserve some of the comments in the original patch about memory-mapped control registers or at least a reference where to find the register offset calculations? IMHO, it is helpful for anybody who is not familiar with the details. +/* + * Each interrupt source has a priority register associated
RE: [GIT PULL] ACPI fix for v4.18-rc7
> -Original Message- > From: linux-acpi-ow...@vger.kernel.org [mailto:linux-acpi- > ow...@vger.kernel.org] On Behalf Of Schmauss, Erik > Sent: Friday, July 27, 2018 2:51 PM > To: Linus Torvalds ; Rafael J. Wysocki > > Cc: Linux ACPI ; Linux Kernel Mailing List ker...@vger.kernel.org> > Subject: RE: [GIT PULL] ACPI fix for v4.18-rc7 > > > > > -Original Message- > > From: linux-acpi-ow...@vger.kernel.org [mailto:linux-acpi- > > ow...@vger.kernel.org] On Behalf Of Linus Torvalds > > Sent: Friday, July 27, 2018 11:38 AM > > To: Rafael J. Wysocki > > Cc: Linux ACPI ; Linux Kernel Mailing List > > > > Subject: Re: [GIT PULL] ACPI fix for v4.18-rc7 > > > > My XPS13 laptop is unhappy with something recent. > > > > As of yesterday, I get this at bootup: > > > > ACPI: Added _OSI(Module Device) > > ACPI: Added _OSI(Processor Device) > > ACPI: Added _OSI(3.0 _SCP Extensions) > > ACPI: Added _OSI(Processor Aggregator Device) > > ACPI: Added _OSI(Linux-Dell-Video) > > ACPI Error: Result stack is empty! State=(ptrval) > > (20180531/dswstate-65) > > ACPI Error: AE_AML_NO_RETURN_VALUE, Missing or null operand > > (20180531/dsutils-612) > > ACPI Error: AE_AML_NO_RETURN_VALUE, While creating Arg 0 > > (20180531/dsutils-727) > > ACPI Error: Result stack is empty! State=(ptrval) > > (20180531/dswstate-65) > > ACPI Error: AE_AML_NO_RETURN_VALUE, Missing or null operand > > (20180531/dsutils-612) > > ACPI Error: AE_AML_NO_RETURN_VALUE, While creating Arg 0 > > (20180531/dsutils-727) > > ACPI Error: Result stack is empty! State=(ptrval) > > (20180531/dswstate-65) > > ACPI Error: AE_AML_NO_RETURN_VALUE, Missing or null operand > > (20180531/dsutils-612) > > ACPI Error: AE_AML_NO_RETURN_VALUE, While creating Arg 0 > > (20180531/dsutils-727) > > ACPI Error: Result stack is empty! State=(ptrval) > > (20180531/dswstate-65) > > ACPI Error: AE_AML_NO_RETURN_VALUE, Missing or null operand > > (20180531/dsutils-612) > > ACPI Error: AE_AML_NO_RETURN_VALUE, While creating Arg 0 > > (20180531/dsutils-727) > > ACPI Error: Result stack is empty! State=(ptrval) > > (20180531/dswstate-65) > > ACPI Error: AE_AML_NO_RETURN_VALUE, Missing or null operand > > (20180531/dsutils-612) > > ACPI Error: AE_AML_NO_RETURN_VALUE, While creating Arg 0 > > (20180531/dsutils-727) > > ACPI Error: Result stack is empty! State=(ptrval) > > (20180531/dswstate-65) > > ACPI Error: AE_AML_NO_RETURN_VALUE, Missing or null operand > > (20180531/dsutils-612) > > ACPI Error: AE_AML_NO_RETURN_VALUE, While creating Arg 0 > > (20180531/dsutils-727) > > ACPI Error: Result stack is empty! State=(ptrval) > > (20180531/dswstate-65) > > ACPI Error: AE_AML_NO_RETURN_VALUE, Missing or null operand > > (20180531/dsutils-612) > > ACPI Error: AE_AML_NO_RETURN_VALUE, While creating Arg 0 > > (20180531/dsutils-727) > > ACPI Error: Result stack is empty! State=(ptrval) > > (20180531/dswstate-65) > > ACPI Error: AE_AML_NO_RETURN_VALUE, Missing or null operand > > (20180531/dsutils-612) > > ACPI Error: AE_AML_NO_RETURN_VALUE, While creating Arg 0 > > (20180531/dsutils-727) > > ACPI Error: Result stack is empty! State=(ptrval) > > (20180531/dswstate-65) > > ACPI Error: AE_AML_NO_RETURN_VALUE, Missing or null operand > > (20180531/dsutils-612) > > ACPI Error: AE_AML_NO_RETURN_VALUE, While creating Arg 0 > > (20180531/dsutils-727) > > ACPI Error: Result stack is empty! State=(ptrval) > > (20180531/dswstate-65) > > ACPI Error: AE_AML_NO_RETURN_VALUE, Missing or null operand > > (20180531/dsutils-612) > > ACPI Error: AE_AML_NO_RETURN_VALUE, While creating Arg 0 > > (20180531/dsutils-727) > > ACPI Error: Result stack is empty! State=(ptrval) > > (20180531/dswstate-65) > > ACPI Error: AE_AML_NO_RETURN_VALUE, Missing or null operand > > (20180531/dsutils-612) > > ACPI Error: AE_AML_NO_RETURN_VALUE, While creating Arg 0 > > (20180531/dsutils-727) > > ACPI: 9 ACPI AML tables successfully acquired and loaded > > > > It does *not* happen in > > > > Linux version 4.18.0-rc6-00110-g6e77b267723c > > > > but it *does* happen in > > > > Linux version 4.18.0-rc6-00152-gcd3f77d74ac3 > > > > and while I didn't bisect it, I'm assuming it's due to 73c2a01c52b6 > > ("ACPICA: AML Parser: ignore dispatcher error status during table > > load") > > > > Ideas? > > Sorry about the breakage. I found the cause this failure and I'm working on a > fix > right now. > > Erik Hi, The patch below should be able to fix this. From a828a091828599154d8f6e8bfee1495a3df5cf34 Mon Sep 17 00:00:00 2001 From: Erik Schmauss Date: Fri, 27 Jul 2018 15:37:35 -0700 From: Erik Schmauss Subject: [PATCH 1] ACPICA: AML Parser: ignore control method status in module-level code Previous change in the AML parser code blindly set all non-successful dispatcher statuses to AE_OK. This approach is incorrect because successful control method
RE: [GIT PULL] ACPI fix for v4.18-rc7
> -Original Message- > From: linux-acpi-ow...@vger.kernel.org [mailto:linux-acpi- > ow...@vger.kernel.org] On Behalf Of Schmauss, Erik > Sent: Friday, July 27, 2018 2:51 PM > To: Linus Torvalds ; Rafael J. Wysocki > > Cc: Linux ACPI ; Linux Kernel Mailing List ker...@vger.kernel.org> > Subject: RE: [GIT PULL] ACPI fix for v4.18-rc7 > > > > > -Original Message- > > From: linux-acpi-ow...@vger.kernel.org [mailto:linux-acpi- > > ow...@vger.kernel.org] On Behalf Of Linus Torvalds > > Sent: Friday, July 27, 2018 11:38 AM > > To: Rafael J. Wysocki > > Cc: Linux ACPI ; Linux Kernel Mailing List > > > > Subject: Re: [GIT PULL] ACPI fix for v4.18-rc7 > > > > My XPS13 laptop is unhappy with something recent. > > > > As of yesterday, I get this at bootup: > > > > ACPI: Added _OSI(Module Device) > > ACPI: Added _OSI(Processor Device) > > ACPI: Added _OSI(3.0 _SCP Extensions) > > ACPI: Added _OSI(Processor Aggregator Device) > > ACPI: Added _OSI(Linux-Dell-Video) > > ACPI Error: Result stack is empty! State=(ptrval) > > (20180531/dswstate-65) > > ACPI Error: AE_AML_NO_RETURN_VALUE, Missing or null operand > > (20180531/dsutils-612) > > ACPI Error: AE_AML_NO_RETURN_VALUE, While creating Arg 0 > > (20180531/dsutils-727) > > ACPI Error: Result stack is empty! State=(ptrval) > > (20180531/dswstate-65) > > ACPI Error: AE_AML_NO_RETURN_VALUE, Missing or null operand > > (20180531/dsutils-612) > > ACPI Error: AE_AML_NO_RETURN_VALUE, While creating Arg 0 > > (20180531/dsutils-727) > > ACPI Error: Result stack is empty! State=(ptrval) > > (20180531/dswstate-65) > > ACPI Error: AE_AML_NO_RETURN_VALUE, Missing or null operand > > (20180531/dsutils-612) > > ACPI Error: AE_AML_NO_RETURN_VALUE, While creating Arg 0 > > (20180531/dsutils-727) > > ACPI Error: Result stack is empty! State=(ptrval) > > (20180531/dswstate-65) > > ACPI Error: AE_AML_NO_RETURN_VALUE, Missing or null operand > > (20180531/dsutils-612) > > ACPI Error: AE_AML_NO_RETURN_VALUE, While creating Arg 0 > > (20180531/dsutils-727) > > ACPI Error: Result stack is empty! State=(ptrval) > > (20180531/dswstate-65) > > ACPI Error: AE_AML_NO_RETURN_VALUE, Missing or null operand > > (20180531/dsutils-612) > > ACPI Error: AE_AML_NO_RETURN_VALUE, While creating Arg 0 > > (20180531/dsutils-727) > > ACPI Error: Result stack is empty! State=(ptrval) > > (20180531/dswstate-65) > > ACPI Error: AE_AML_NO_RETURN_VALUE, Missing or null operand > > (20180531/dsutils-612) > > ACPI Error: AE_AML_NO_RETURN_VALUE, While creating Arg 0 > > (20180531/dsutils-727) > > ACPI Error: Result stack is empty! State=(ptrval) > > (20180531/dswstate-65) > > ACPI Error: AE_AML_NO_RETURN_VALUE, Missing or null operand > > (20180531/dsutils-612) > > ACPI Error: AE_AML_NO_RETURN_VALUE, While creating Arg 0 > > (20180531/dsutils-727) > > ACPI Error: Result stack is empty! State=(ptrval) > > (20180531/dswstate-65) > > ACPI Error: AE_AML_NO_RETURN_VALUE, Missing or null operand > > (20180531/dsutils-612) > > ACPI Error: AE_AML_NO_RETURN_VALUE, While creating Arg 0 > > (20180531/dsutils-727) > > ACPI Error: Result stack is empty! State=(ptrval) > > (20180531/dswstate-65) > > ACPI Error: AE_AML_NO_RETURN_VALUE, Missing or null operand > > (20180531/dsutils-612) > > ACPI Error: AE_AML_NO_RETURN_VALUE, While creating Arg 0 > > (20180531/dsutils-727) > > ACPI Error: Result stack is empty! State=(ptrval) > > (20180531/dswstate-65) > > ACPI Error: AE_AML_NO_RETURN_VALUE, Missing or null operand > > (20180531/dsutils-612) > > ACPI Error: AE_AML_NO_RETURN_VALUE, While creating Arg 0 > > (20180531/dsutils-727) > > ACPI Error: Result stack is empty! State=(ptrval) > > (20180531/dswstate-65) > > ACPI Error: AE_AML_NO_RETURN_VALUE, Missing or null operand > > (20180531/dsutils-612) > > ACPI Error: AE_AML_NO_RETURN_VALUE, While creating Arg 0 > > (20180531/dsutils-727) > > ACPI: 9 ACPI AML tables successfully acquired and loaded > > > > It does *not* happen in > > > > Linux version 4.18.0-rc6-00110-g6e77b267723c > > > > but it *does* happen in > > > > Linux version 4.18.0-rc6-00152-gcd3f77d74ac3 > > > > and while I didn't bisect it, I'm assuming it's due to 73c2a01c52b6 > > ("ACPICA: AML Parser: ignore dispatcher error status during table > > load") > > > > Ideas? > > Sorry about the breakage. I found the cause this failure and I'm working on a > fix > right now. > > Erik Hi, The patch below should be able to fix this. From a828a091828599154d8f6e8bfee1495a3df5cf34 Mon Sep 17 00:00:00 2001 From: Erik Schmauss Date: Fri, 27 Jul 2018 15:37:35 -0700 From: Erik Schmauss Subject: [PATCH 1] ACPICA: AML Parser: ignore control method status in module-level code Previous change in the AML parser code blindly set all non-successful dispatcher statuses to AE_OK. This approach is incorrect because successful control method
Re: [PATCH v2 3/3] selftests/ftrace: Fix kprobe string testcase to not probe notrace function
On Fri, 27 Jul 2018 17:43:07 -0400 Steven Rostedt wrote: > On Thu, 26 Jul 2018 14:54:23 +0900 > Masami Hiramatsu wrote: > > > Fix kprobe string argument testcase to not probe notrace > > function. Instead, it probes tracefs function which must > > be available with ftrace. > > Hi Masami, > > With these patches applied, this test fails: > > ./ftracetest test.d/kprobe/probepoint.tc > > Gives me the error in dmseg: > > trace_kprobe: Could not probe notrace function create_trace_kprobe > > Is this expected? I don't see create_trace_kprobe as a notrace function > either. > > I even applied patch 3 (going through Shuah's tree) to me test and it > still fails. Oops, why I missed this... OK, I'll fix it. Thanks, > > Config attached. > > -- Steve > > > > > Signed-off-by: Masami Hiramatsu > > --- > > .../ftrace/test.d/kprobe/kprobe_args_string.tc | 30 > > > > 1 file changed, 12 insertions(+), 18 deletions(-) > > > > diff --git > > a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_string.tc > > b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_string.tc > > index a0002563e9ee..1ad70cdaf442 100644 > > --- a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_string.tc > > +++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_string.tc > > @@ -9,28 +9,22 @@ echo > kprobe_events > > > > case `uname -m` in > > x86_64) > > - ARG2=%si > > - OFFS=8 > > + ARG1=%di > > ;; > > i[3456]86) > > - ARG2=%cx > > - OFFS=4 > > + ARG1=%ax > > ;; > > aarch64) > > - ARG2=%x1 > > - OFFS=8 > > + ARG1=%x0 > > ;; > > arm*) > > - ARG2=%r1 > > - OFFS=4 > > + ARG1=%r0 > > ;; > > ppc64*) > > - ARG2=%r4 > > - OFFS=8 > > + ARG1=%r3 > > ;; > > ppc*) > > - ARG2=%r4 > > - OFFS=4 > > + ARG1=%r3 > > ;; > > *) > >echo "Please implement other architecture here" > > @@ -38,17 +32,17 @@ ppc*) > > esac > > > > : "Test get argument (1)" > > -echo "p:testprobe create_trace_kprobe arg1=+0(+0(${ARG2})):string" > > > kprobe_events > > +echo "p:testprobe tracefs_create_dir arg1=+0(${ARG1}):string" > > > kprobe_events > > echo 1 > events/kprobes/testprobe/enable > > -! echo test >> kprobe_events > > -tail -n 1 trace | grep -qe "testprobe.* arg1=\"test\"" > > +echo "p:test _do_fork" >> kprobe_events > > +grep -qe "testprobe.* arg1=\"test\"" trace > > > > echo 0 > events/kprobes/testprobe/enable > > : "Test get argument (2)" > > -echo "p:testprobe create_trace_kprobe arg1=+0(+0(${ARG2})):string > > arg2=+0(+${OFFS}(${ARG2})):string" > kprobe_events > > +echo "p:testprobe tracefs_create_dir arg1=+0(${ARG1}):string > > arg2=+0(${ARG1}):string" > kprobe_events > > echo 1 > events/kprobes/testprobe/enable > > -! echo test1 test2 >> kprobe_events > > -tail -n 1 trace | grep -qe "testprobe.* arg1=\"test1\" arg2=\"test2\"" > > +echo "p:test _do_fork" >> kprobe_events > > +grep -qe "testprobe.* arg1=\"test\" arg2=\"test\"" trace > > > > echo 0 > events/enable > > echo > kprobe_events > -- Masami Hiramatsu
Re: [PATCH v2 3/3] selftests/ftrace: Fix kprobe string testcase to not probe notrace function
On Fri, 27 Jul 2018 17:43:07 -0400 Steven Rostedt wrote: > On Thu, 26 Jul 2018 14:54:23 +0900 > Masami Hiramatsu wrote: > > > Fix kprobe string argument testcase to not probe notrace > > function. Instead, it probes tracefs function which must > > be available with ftrace. > > Hi Masami, > > With these patches applied, this test fails: > > ./ftracetest test.d/kprobe/probepoint.tc > > Gives me the error in dmseg: > > trace_kprobe: Could not probe notrace function create_trace_kprobe > > Is this expected? I don't see create_trace_kprobe as a notrace function > either. > > I even applied patch 3 (going through Shuah's tree) to me test and it > still fails. Oops, why I missed this... OK, I'll fix it. Thanks, > > Config attached. > > -- Steve > > > > > Signed-off-by: Masami Hiramatsu > > --- > > .../ftrace/test.d/kprobe/kprobe_args_string.tc | 30 > > > > 1 file changed, 12 insertions(+), 18 deletions(-) > > > > diff --git > > a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_string.tc > > b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_string.tc > > index a0002563e9ee..1ad70cdaf442 100644 > > --- a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_string.tc > > +++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_string.tc > > @@ -9,28 +9,22 @@ echo > kprobe_events > > > > case `uname -m` in > > x86_64) > > - ARG2=%si > > - OFFS=8 > > + ARG1=%di > > ;; > > i[3456]86) > > - ARG2=%cx > > - OFFS=4 > > + ARG1=%ax > > ;; > > aarch64) > > - ARG2=%x1 > > - OFFS=8 > > + ARG1=%x0 > > ;; > > arm*) > > - ARG2=%r1 > > - OFFS=4 > > + ARG1=%r0 > > ;; > > ppc64*) > > - ARG2=%r4 > > - OFFS=8 > > + ARG1=%r3 > > ;; > > ppc*) > > - ARG2=%r4 > > - OFFS=4 > > + ARG1=%r3 > > ;; > > *) > >echo "Please implement other architecture here" > > @@ -38,17 +32,17 @@ ppc*) > > esac > > > > : "Test get argument (1)" > > -echo "p:testprobe create_trace_kprobe arg1=+0(+0(${ARG2})):string" > > > kprobe_events > > +echo "p:testprobe tracefs_create_dir arg1=+0(${ARG1}):string" > > > kprobe_events > > echo 1 > events/kprobes/testprobe/enable > > -! echo test >> kprobe_events > > -tail -n 1 trace | grep -qe "testprobe.* arg1=\"test\"" > > +echo "p:test _do_fork" >> kprobe_events > > +grep -qe "testprobe.* arg1=\"test\"" trace > > > > echo 0 > events/kprobes/testprobe/enable > > : "Test get argument (2)" > > -echo "p:testprobe create_trace_kprobe arg1=+0(+0(${ARG2})):string > > arg2=+0(+${OFFS}(${ARG2})):string" > kprobe_events > > +echo "p:testprobe tracefs_create_dir arg1=+0(${ARG1}):string > > arg2=+0(${ARG1}):string" > kprobe_events > > echo 1 > events/kprobes/testprobe/enable > > -! echo test1 test2 >> kprobe_events > > -tail -n 1 trace | grep -qe "testprobe.* arg1=\"test1\" arg2=\"test2\"" > > +echo "p:test _do_fork" >> kprobe_events > > +grep -qe "testprobe.* arg1=\"test\" arg2=\"test\"" trace > > > > echo 0 > events/enable > > echo > kprobe_events > -- Masami Hiramatsu
Re: [PATCH 34/38] vfs: syscall: Add fsinfo() to query filesystem information [ver #10]
On Sat, Jul 28, 2018 at 1:51 AM David Howells wrote: > David Howells wrote: > > > One thing I'm confused about is that fat has both a codepage and a charset > > and > > I'm not sure of the difference. > > In fact, it's not clear that the codepage is actually used. > > warthog>git grep '[.>]codepage' > fs/fat/inode.c: opts->codepage = fat_default_codepage; > fs/fat/inode.c: opts->codepage = option; > fs/fat/inode.c: sprintf(buf, "cp%d", sbi->options.codepage); sprintf(buf, "cp%d", sbi->options.codepage); sbi->nls_disk = load_nls(buf); if (!sbi->nls_disk) { fat_msg(sb, KERN_ERR, "codepage %s not found", buf); goto out_fail; }
Re: [PATCH 34/38] vfs: syscall: Add fsinfo() to query filesystem information [ver #10]
On Sat, Jul 28, 2018 at 1:51 AM David Howells wrote: > David Howells wrote: > > > One thing I'm confused about is that fat has both a codepage and a charset > > and > > I'm not sure of the difference. > > In fact, it's not clear that the codepage is actually used. > > warthog>git grep '[.>]codepage' > fs/fat/inode.c: opts->codepage = fat_default_codepage; > fs/fat/inode.c: opts->codepage = option; > fs/fat/inode.c: sprintf(buf, "cp%d", sbi->options.codepage); sprintf(buf, "cp%d", sbi->options.codepage); sbi->nls_disk = load_nls(buf); if (!sbi->nls_disk) { fat_msg(sb, KERN_ERR, "codepage %s not found", buf); goto out_fail; }
Re: [PATCH 34/38] vfs: syscall: Add fsinfo() to query filesystem information [ver #10]
David Howells wrote: > One thing I'm confused about is that fat has both a codepage and a charset and > I'm not sure of the difference. In fact, it's not clear that the codepage is actually used. warthog>git grep '[.>]codepage' fs/fat/inode.c: opts->codepage = fat_default_codepage; fs/fat/inode.c: opts->codepage = option; fs/fat/inode.c: sprintf(buf, "cp%d", sbi->options.codepage); David
Re: [PATCH 34/38] vfs: syscall: Add fsinfo() to query filesystem information [ver #10]
David Howells wrote: > One thing I'm confused about is that fat has both a codepage and a charset and > I'm not sure of the difference. In fact, it's not clear that the codepage is actually used. warthog>git grep '[.>]codepage' fs/fat/inode.c: opts->codepage = fat_default_codepage; fs/fat/inode.c: opts->codepage = option; fs/fat/inode.c: sprintf(buf, "cp%d", sbi->options.codepage); David
Re: [PATCH 34/38] vfs: syscall: Add fsinfo() to query filesystem information [ver #10]
Jann Horn wrote: > > +static int fsinfo_generic_name_encoding(struct dentry *dentry, char *buf) > > +{ > > + static const char encoding[] = "utf8"; > > + > > + if (buf) > > + memcpy(buf, encoding, sizeof(encoding) - 1); > > + return sizeof(encoding) - 1; > > +} > > Is this meant to be "encoding to be used by userspace" or "encoding of > on-disk filenames"? The latter. > Are there any plans to create filesystems that behave differently? isofs, fat, ntfs, cifs for example. > If the latter: This is wrong for e.g. a vfat mount that uses a codepage, > right? Should the default in that case not be "I don't know"? Quite possibly. Note that it could also be what you're interpreting it as because the codepage got overridden by a mount parameter rather than what's on the disk (assuming the medium actually records this). One thing I'm confused about is that fat has both a codepage and a charset and I'm not sure of the difference. David
Re: [PATCH 34/38] vfs: syscall: Add fsinfo() to query filesystem information [ver #10]
Jann Horn wrote: > > +static int fsinfo_generic_name_encoding(struct dentry *dentry, char *buf) > > +{ > > + static const char encoding[] = "utf8"; > > + > > + if (buf) > > + memcpy(buf, encoding, sizeof(encoding) - 1); > > + return sizeof(encoding) - 1; > > +} > > Is this meant to be "encoding to be used by userspace" or "encoding of > on-disk filenames"? The latter. > Are there any plans to create filesystems that behave differently? isofs, fat, ntfs, cifs for example. > If the latter: This is wrong for e.g. a vfat mount that uses a codepage, > right? Should the default in that case not be "I don't know"? Quite possibly. Note that it could also be what you're interpreting it as because the codepage got overridden by a mount parameter rather than what's on the disk (assuming the medium actually records this). One thing I'm confused about is that fat has both a codepage and a charset and I'm not sure of the difference. David
Re: RCU nocb list not reclaiming causing OOM
On Fri, Jul 27, 2018 at 11:16:39PM +, David Chen wrote: > Hi Paul, > > Thanks for the advice. > The bug is kind of hard to hit, so I can't say for certain yet. Well, you can always remove the "tail == >nocb_follower_head" as an extra belt-and-suspenders safety net. I am not putting that in mainline, but in the privacy of your own copy of the kernel, I don't see any really serious problem with it. (As long as you aren't going for absolute maximum performance, but even then there are other more important tuning actions and code changes you could make.) > Though after another look at the code, I found out the > `smp_mb__after_atomic();` > seems to be only a compiler barrier on x86. Yes, and that is because the locked xchg instruction used on x86 to implement xchg() already provides full ordering. ;-) > tail = xchg(>nocb_follower_tail, rdp->nocb_gp_tail); > *tail = rdp->nocb_gp_head; > smp_mb__after_atomic(); /* Store *tail before wakeup. */ > if (rdp != my_rdp && tail == >nocb_follower_head) { > swake_up(>nocb_wq); > > But that wouldn't be enough right? Because from 6b5fc3a13318, it stated that > wakeup operation don't guarantee ordering. And when the follower wakes up, it > checks > for nocb_follower_head, which is assigned by `*tail = rdp->nocb_gp_head;` > which doesn't > have LOCK prefix. So it's possible for follower to wake up and see the list > is empty and go > back to sleep. Again, xchg() is defined to provide full ordering against all operations before and after it. Each architecture is required to do whatever is necessary to implement that full ordering, and x86 need only provide its "lock xchg" instruction. The smp_mb__after_atomic() has effect only after atomic read-modify-write operations that do not return a value, for example, atomic_inc(). If you use it after a value-returning atomic read-modify-write operation like xchg(), all you do is needlessly slow things down on platforms that provide non-empty smp_mb__after_atomic() definitions. So again, smp_mb__after_atomic() after xchg() is pointless. Please take a look at Documentation/core-api/atomic_ops.rst in the Linux-kernel source tree for more information. Or get a v4.17 kernel source tree and check this using the memory model (tools/memory-model in that version). Thanx, Paul > Thanks, > David > > From: Paul E. McKenney > Sent: Friday, July 27, 2018 3:31 PM > To: David Chen > Cc: linux-kernel@vger.kernel.org > Subject: Re: RCU nocb list not reclaiming causing OOM > > > On Fri, Jul 27, 2018 at 07:07:46PM +, David Chen wrote: > > Hi Paul, > > > > I'd like to opinion again on this subject. > > > > So we are going to backport this patch: > > 6b5fc3a13318 ("rcu: Add memory barriers for NOCB leader wakeup") > > Does this one solve the problem, or are you still seeing hangs? > If you are no longer seeing hangs, my advice is "hands off keyboard", > though some would no doubt point out that I should follow that advice > more myself. ;-) > > > But the other one: > > 8be6e1b15c54 ("rcu: Use timer as backstop for NOCB deferred wakeups") > > It doesn't apply cleanly, and I'm not too comfortable porting it myself. > > Yeah, that one is a bit on the non-trivial side, no two ways about it. > > > So I'm wondering if I use the following change to always wake up follower > > thread > > regardless if the list was empty or not, just to be on the safe side. Do > > you think > > this change is reasonable? Do you see any problem it might cause? > > > > Thanks, > > David > > > > diff -ru linux-4.9.37.orig/kernel/rcu/tree_plugin.h > > linux-4.9.37/kernel/rcu/tree_plugin.h > > --- linux-4.9.37.orig/kernel/rcu/tree_plugin.h 2017-07-12 > > 06:42:41.0 -0700 > > +++ linux-4.9.37/kernel/rcu/tree_plugin.h 2018-07-27 11:57:03.582134519 > > -0700 > > @@ -2077,7 +2077,7 @@ > > tail = xchg(>nocb_follower_tail, rdp->nocb_gp_tail); > > *tail = rdp->nocb_gp_head; > > smp_mb__after_atomic(); /* Store *tail before wakeup. */ > > - if (rdp != my_rdp && tail == >nocb_follower_head) { > > + if (rdp != my_rdp) { > > This will burn a bit of extra CPU time, but it should be fine other than > that. > > Thanx, Paul > > > /* > > * List was empty, wake up the follower. > > * Memory barriers supplied by atomic_long_add(). > > > > > > From: David Chen > > Sent: Friday, July 20, 2018 5:12 PM > > To: paul...@linux.vnet.ibm.com > > Cc: linux-kernel@vger.kernel.org > > Subject: Re: RCU nocb list not reclaiming causing OOM > > > > > > Hi Paul, > > > > Ok, I'll try those patches. > > > > Thanks, > > David > > > > From: Paul E. McKenney > > Sent: Friday, July 20, 2018 4:32:12 PM > > To: David
Re: RCU nocb list not reclaiming causing OOM
On Fri, Jul 27, 2018 at 11:16:39PM +, David Chen wrote: > Hi Paul, > > Thanks for the advice. > The bug is kind of hard to hit, so I can't say for certain yet. Well, you can always remove the "tail == >nocb_follower_head" as an extra belt-and-suspenders safety net. I am not putting that in mainline, but in the privacy of your own copy of the kernel, I don't see any really serious problem with it. (As long as you aren't going for absolute maximum performance, but even then there are other more important tuning actions and code changes you could make.) > Though after another look at the code, I found out the > `smp_mb__after_atomic();` > seems to be only a compiler barrier on x86. Yes, and that is because the locked xchg instruction used on x86 to implement xchg() already provides full ordering. ;-) > tail = xchg(>nocb_follower_tail, rdp->nocb_gp_tail); > *tail = rdp->nocb_gp_head; > smp_mb__after_atomic(); /* Store *tail before wakeup. */ > if (rdp != my_rdp && tail == >nocb_follower_head) { > swake_up(>nocb_wq); > > But that wouldn't be enough right? Because from 6b5fc3a13318, it stated that > wakeup operation don't guarantee ordering. And when the follower wakes up, it > checks > for nocb_follower_head, which is assigned by `*tail = rdp->nocb_gp_head;` > which doesn't > have LOCK prefix. So it's possible for follower to wake up and see the list > is empty and go > back to sleep. Again, xchg() is defined to provide full ordering against all operations before and after it. Each architecture is required to do whatever is necessary to implement that full ordering, and x86 need only provide its "lock xchg" instruction. The smp_mb__after_atomic() has effect only after atomic read-modify-write operations that do not return a value, for example, atomic_inc(). If you use it after a value-returning atomic read-modify-write operation like xchg(), all you do is needlessly slow things down on platforms that provide non-empty smp_mb__after_atomic() definitions. So again, smp_mb__after_atomic() after xchg() is pointless. Please take a look at Documentation/core-api/atomic_ops.rst in the Linux-kernel source tree for more information. Or get a v4.17 kernel source tree and check this using the memory model (tools/memory-model in that version). Thanx, Paul > Thanks, > David > > From: Paul E. McKenney > Sent: Friday, July 27, 2018 3:31 PM > To: David Chen > Cc: linux-kernel@vger.kernel.org > Subject: Re: RCU nocb list not reclaiming causing OOM > > > On Fri, Jul 27, 2018 at 07:07:46PM +, David Chen wrote: > > Hi Paul, > > > > I'd like to opinion again on this subject. > > > > So we are going to backport this patch: > > 6b5fc3a13318 ("rcu: Add memory barriers for NOCB leader wakeup") > > Does this one solve the problem, or are you still seeing hangs? > If you are no longer seeing hangs, my advice is "hands off keyboard", > though some would no doubt point out that I should follow that advice > more myself. ;-) > > > But the other one: > > 8be6e1b15c54 ("rcu: Use timer as backstop for NOCB deferred wakeups") > > It doesn't apply cleanly, and I'm not too comfortable porting it myself. > > Yeah, that one is a bit on the non-trivial side, no two ways about it. > > > So I'm wondering if I use the following change to always wake up follower > > thread > > regardless if the list was empty or not, just to be on the safe side. Do > > you think > > this change is reasonable? Do you see any problem it might cause? > > > > Thanks, > > David > > > > diff -ru linux-4.9.37.orig/kernel/rcu/tree_plugin.h > > linux-4.9.37/kernel/rcu/tree_plugin.h > > --- linux-4.9.37.orig/kernel/rcu/tree_plugin.h 2017-07-12 > > 06:42:41.0 -0700 > > +++ linux-4.9.37/kernel/rcu/tree_plugin.h 2018-07-27 11:57:03.582134519 > > -0700 > > @@ -2077,7 +2077,7 @@ > > tail = xchg(>nocb_follower_tail, rdp->nocb_gp_tail); > > *tail = rdp->nocb_gp_head; > > smp_mb__after_atomic(); /* Store *tail before wakeup. */ > > - if (rdp != my_rdp && tail == >nocb_follower_head) { > > + if (rdp != my_rdp) { > > This will burn a bit of extra CPU time, but it should be fine other than > that. > > Thanx, Paul > > > /* > > * List was empty, wake up the follower. > > * Memory barriers supplied by atomic_long_add(). > > > > > > From: David Chen > > Sent: Friday, July 20, 2018 5:12 PM > > To: paul...@linux.vnet.ibm.com > > Cc: linux-kernel@vger.kernel.org > > Subject: Re: RCU nocb list not reclaiming causing OOM > > > > > > Hi Paul, > > > > Ok, I'll try those patches. > > > > Thanks, > > David > > > > From: Paul E. McKenney > > Sent: Friday, July 20, 2018 4:32:12 PM > > To: David
Re: [PATCH 0/10] psi: pressure stall information for CPU, memory, and IO v2
On Thu, Jul 26, 2018 at 1:07 PM, Johannes Weiner wrote: > On Thu, Jul 26, 2018 at 11:07:32AM +1000, Singh, Balbir wrote: >> On 7/25/18 1:15 AM, Johannes Weiner wrote: >> > On Tue, Jul 24, 2018 at 07:14:02AM +1000, Balbir Singh wrote: >> >> Does the mechanism scale? I am a little concerned about how frequently >> >> this infrastructure is monitored/read/acted upon. >> > >> > I expect most users to poll in the frequency ballpark of the running >> > averages (10s, 1m, 5m). Our OOMD defaults to 5s polling of the 10s >> > average; we collect the 1m average once per minute from our machines >> > and cgroups to log the system/workload health trends in our fleet. >> > >> > Suren has been experimenting with adaptive polling down to the >> > millisecond range on Android. >> > >> >> I think this is a bad way of doing things, polling only adds to >> overheads, there needs to be an event driven mechanism and the >> selection of the events need to happen in user space. > > Of course, I'm not saying you should be doing this, and in fact Suren > and I were talking about notification/event infrastructure. I implemented a psi-monitor prototype which allows userspace to specify the max PSI stall it can tolerate (in terms of % of time spent on memory management). When that threshold is breached an event to userspace is generated. I'm still testing it but early results look promising. I'm planning to send it upstream when it's ready and after the main PSI patchset is merged. > > You asked if this scales and I'm telling you it's not impossible to > read at such frequencies. > Yes it's doable. One usecase might be to poll at a higher rate for a short period of time immediately after the initial event is received to clarify the short-term signal dynamics. > Maybe you can clarify your question. > >> >> Why aren't existing mechanisms sufficient >> > >> > Our existing stuff gives a lot of indication when something *may* be >> > an issue, like the rate of page reclaim, the number of refaults, the >> > average number of active processes, one task waiting on a resource. >> > >> > But the real difference between an issue and a non-issue is how much >> > it affects your overall goal of making forward progress or reacting to >> > a request in time. And that's the only thing users really care >> > about. It doesn't matter whether my system is doing 2314 or 6723 page >> > refaults per minute, or scanned 8495 pages recently. I need to know >> > whether I'm losing 1% or 20% of my time on overcommitted memory. >> > >> > Delayacct is time-based, so it's a step in the right direction, but it >> > doesn't aggregate tasks and CPUs into compound productivity states to >> > tell you if only parts of your workload are seeing delays (which is >> > often tolerable for the purpose of ensuring maximum HW utilization) or >> > your system overall is not making forward progress. That aggregation >> > isn't something you can do in userspace with polled delayacct data. >> >> By aggregation you mean cgroup aggregation? > > System-wide and per cgroup.
Re: [PATCH 0/10] psi: pressure stall information for CPU, memory, and IO v2
On Thu, Jul 26, 2018 at 1:07 PM, Johannes Weiner wrote: > On Thu, Jul 26, 2018 at 11:07:32AM +1000, Singh, Balbir wrote: >> On 7/25/18 1:15 AM, Johannes Weiner wrote: >> > On Tue, Jul 24, 2018 at 07:14:02AM +1000, Balbir Singh wrote: >> >> Does the mechanism scale? I am a little concerned about how frequently >> >> this infrastructure is monitored/read/acted upon. >> > >> > I expect most users to poll in the frequency ballpark of the running >> > averages (10s, 1m, 5m). Our OOMD defaults to 5s polling of the 10s >> > average; we collect the 1m average once per minute from our machines >> > and cgroups to log the system/workload health trends in our fleet. >> > >> > Suren has been experimenting with adaptive polling down to the >> > millisecond range on Android. >> > >> >> I think this is a bad way of doing things, polling only adds to >> overheads, there needs to be an event driven mechanism and the >> selection of the events need to happen in user space. > > Of course, I'm not saying you should be doing this, and in fact Suren > and I were talking about notification/event infrastructure. I implemented a psi-monitor prototype which allows userspace to specify the max PSI stall it can tolerate (in terms of % of time spent on memory management). When that threshold is breached an event to userspace is generated. I'm still testing it but early results look promising. I'm planning to send it upstream when it's ready and after the main PSI patchset is merged. > > You asked if this scales and I'm telling you it's not impossible to > read at such frequencies. > Yes it's doable. One usecase might be to poll at a higher rate for a short period of time immediately after the initial event is received to clarify the short-term signal dynamics. > Maybe you can clarify your question. > >> >> Why aren't existing mechanisms sufficient >> > >> > Our existing stuff gives a lot of indication when something *may* be >> > an issue, like the rate of page reclaim, the number of refaults, the >> > average number of active processes, one task waiting on a resource. >> > >> > But the real difference between an issue and a non-issue is how much >> > it affects your overall goal of making forward progress or reacting to >> > a request in time. And that's the only thing users really care >> > about. It doesn't matter whether my system is doing 2314 or 6723 page >> > refaults per minute, or scanned 8495 pages recently. I need to know >> > whether I'm losing 1% or 20% of my time on overcommitted memory. >> > >> > Delayacct is time-based, so it's a step in the right direction, but it >> > doesn't aggregate tasks and CPUs into compound productivity states to >> > tell you if only parts of your workload are seeing delays (which is >> > often tolerable for the purpose of ensuring maximum HW utilization) or >> > your system overall is not making forward progress. That aggregation >> > isn't something you can do in userspace with polled delayacct data. >> >> By aggregation you mean cgroup aggregation? > > System-wide and per cgroup.
Re: RCU nocb list not reclaiming causing OOM
Hi Paul, Thanks for the advice. The bug is kind of hard to hit, so I can't say for certain yet. Though after another look at the code, I found out the `smp_mb__after_atomic();` seems to be only a compiler barrier on x86. tail = xchg(>nocb_follower_tail, rdp->nocb_gp_tail); *tail = rdp->nocb_gp_head; smp_mb__after_atomic(); /* Store *tail before wakeup. */ if (rdp != my_rdp && tail == >nocb_follower_head) { swake_up(>nocb_wq); But that wouldn't be enough right? Because from 6b5fc3a13318, it stated that wakeup operation don't guarantee ordering. And when the follower wakes up, it checks for nocb_follower_head, which is assigned by `*tail = rdp->nocb_gp_head;` which doesn't have LOCK prefix. So it's possible for follower to wake up and see the list is empty and go back to sleep. Thanks, David From: Paul E. McKenney Sent: Friday, July 27, 2018 3:31 PM To: David Chen Cc: linux-kernel@vger.kernel.org Subject: Re: RCU nocb list not reclaiming causing OOM On Fri, Jul 27, 2018 at 07:07:46PM +, David Chen wrote: > Hi Paul, > > I'd like to opinion again on this subject. > > So we are going to backport this patch: > 6b5fc3a13318 ("rcu: Add memory barriers for NOCB leader wakeup") Does this one solve the problem, or are you still seeing hangs? If you are no longer seeing hangs, my advice is "hands off keyboard", though some would no doubt point out that I should follow that advice more myself. ;-) > But the other one: > 8be6e1b15c54 ("rcu: Use timer as backstop for NOCB deferred wakeups") > It doesn't apply cleanly, and I'm not too comfortable porting it myself. Yeah, that one is a bit on the non-trivial side, no two ways about it. > So I'm wondering if I use the following change to always wake up follower > thread > regardless if the list was empty or not, just to be on the safe side. Do you > think > this change is reasonable? Do you see any problem it might cause? > > Thanks, > David > > diff -ru linux-4.9.37.orig/kernel/rcu/tree_plugin.h > linux-4.9.37/kernel/rcu/tree_plugin.h > --- linux-4.9.37.orig/kernel/rcu/tree_plugin.h 2017-07-12 > 06:42:41.0 -0700 > +++ linux-4.9.37/kernel/rcu/tree_plugin.h 2018-07-27 11:57:03.582134519 > -0700 > @@ -2077,7 +2077,7 @@ > tail = xchg(>nocb_follower_tail, rdp->nocb_gp_tail); > *tail = rdp->nocb_gp_head; > smp_mb__after_atomic(); /* Store *tail before wakeup. */ > - if (rdp != my_rdp && tail == >nocb_follower_head) { > + if (rdp != my_rdp) { This will burn a bit of extra CPU time, but it should be fine other than that. Thanx, Paul > /* > * List was empty, wake up the follower. > * Memory barriers supplied by atomic_long_add(). > > > From: David Chen > Sent: Friday, July 20, 2018 5:12 PM > To: paul...@linux.vnet.ibm.com > Cc: linux-kernel@vger.kernel.org > Subject: Re: RCU nocb list not reclaiming causing OOM > > > Hi Paul, > > Ok, I'll try those patches. > > Thanks, > David > > From: Paul E. McKenney > Sent: Friday, July 20, 2018 4:32:12 PM > To: David Chen > Cc: linux-kernel@vger.kernel.org > Subject: Re: RCU nocb list not reclaiming causing OOM > > > On Fri, Jul 20, 2018 at 11:05:52PM +, David Chen wrote: > > Hi Paul, > > > > We hit an RCU issue on 4.9.37 kernel. One of the nocb_follower list grows > > too > > large, and not getting reclaimed, causing the system to OOM. > > > > Printing the culprit rcu_sched_data: > > > > nocb_q_count = { > > counter = 32369635 > > }, > > nocb_follower_head = 0x88ae901c0a00, > > nocb_follower_tail = 0x88af1538b8d8, > > nocb_kthread = 0x88b06d29, > > > > As you can see here, the nocb_follower_head is not empty, so in theory, the > > nocb_kthread shouldn't go to sleep. However, if dump the stack of the > > kthread: > > > > crash> bt 0x88b06d29 > > PID: 21 TASK: 88b06d29 CPU: 3 COMMAND: "rcuos/1" > > #0 [afc9020b7dc0] __schedule at 8d8789dc > > #1 [afc9020b7e38] schedule at 8d878e76 > > #2 [afc9020b7e50] rcu_nocb_kthread at 8d112337 > > #3 [afc9020b7ec8] kthread at 8d0c6ce7 > > #4 [afc9020b7f50] ret_from_fork at 8d87d755 > > > > And if we dis the address at 8d112337: > > > > /usr/src/debug/kernel-4.9.37/linux-4.9.37-29.nutanix.07142017.el7.centos.x86_64/kernel/rcu/tree_plugin.h: > > 2106 > > 0x8d11232d : test %rax,%rax > > 0x8d112330 : jne 0x8d112355 > > > > 0x8d112332 : callq 0x8d878e40 > > > > 0x8d112337 : lea -0x40(%rbp),%rsi > > > > So the kthread is blocked at swait_event_interruptible in the > > nocb_follower_wait. > > This contradict with the fact that nocb_follower_head
Re: RCU nocb list not reclaiming causing OOM
Hi Paul, Thanks for the advice. The bug is kind of hard to hit, so I can't say for certain yet. Though after another look at the code, I found out the `smp_mb__after_atomic();` seems to be only a compiler barrier on x86. tail = xchg(>nocb_follower_tail, rdp->nocb_gp_tail); *tail = rdp->nocb_gp_head; smp_mb__after_atomic(); /* Store *tail before wakeup. */ if (rdp != my_rdp && tail == >nocb_follower_head) { swake_up(>nocb_wq); But that wouldn't be enough right? Because from 6b5fc3a13318, it stated that wakeup operation don't guarantee ordering. And when the follower wakes up, it checks for nocb_follower_head, which is assigned by `*tail = rdp->nocb_gp_head;` which doesn't have LOCK prefix. So it's possible for follower to wake up and see the list is empty and go back to sleep. Thanks, David From: Paul E. McKenney Sent: Friday, July 27, 2018 3:31 PM To: David Chen Cc: linux-kernel@vger.kernel.org Subject: Re: RCU nocb list not reclaiming causing OOM On Fri, Jul 27, 2018 at 07:07:46PM +, David Chen wrote: > Hi Paul, > > I'd like to opinion again on this subject. > > So we are going to backport this patch: > 6b5fc3a13318 ("rcu: Add memory barriers for NOCB leader wakeup") Does this one solve the problem, or are you still seeing hangs? If you are no longer seeing hangs, my advice is "hands off keyboard", though some would no doubt point out that I should follow that advice more myself. ;-) > But the other one: > 8be6e1b15c54 ("rcu: Use timer as backstop for NOCB deferred wakeups") > It doesn't apply cleanly, and I'm not too comfortable porting it myself. Yeah, that one is a bit on the non-trivial side, no two ways about it. > So I'm wondering if I use the following change to always wake up follower > thread > regardless if the list was empty or not, just to be on the safe side. Do you > think > this change is reasonable? Do you see any problem it might cause? > > Thanks, > David > > diff -ru linux-4.9.37.orig/kernel/rcu/tree_plugin.h > linux-4.9.37/kernel/rcu/tree_plugin.h > --- linux-4.9.37.orig/kernel/rcu/tree_plugin.h 2017-07-12 > 06:42:41.0 -0700 > +++ linux-4.9.37/kernel/rcu/tree_plugin.h 2018-07-27 11:57:03.582134519 > -0700 > @@ -2077,7 +2077,7 @@ > tail = xchg(>nocb_follower_tail, rdp->nocb_gp_tail); > *tail = rdp->nocb_gp_head; > smp_mb__after_atomic(); /* Store *tail before wakeup. */ > - if (rdp != my_rdp && tail == >nocb_follower_head) { > + if (rdp != my_rdp) { This will burn a bit of extra CPU time, but it should be fine other than that. Thanx, Paul > /* > * List was empty, wake up the follower. > * Memory barriers supplied by atomic_long_add(). > > > From: David Chen > Sent: Friday, July 20, 2018 5:12 PM > To: paul...@linux.vnet.ibm.com > Cc: linux-kernel@vger.kernel.org > Subject: Re: RCU nocb list not reclaiming causing OOM > > > Hi Paul, > > Ok, I'll try those patches. > > Thanks, > David > > From: Paul E. McKenney > Sent: Friday, July 20, 2018 4:32:12 PM > To: David Chen > Cc: linux-kernel@vger.kernel.org > Subject: Re: RCU nocb list not reclaiming causing OOM > > > On Fri, Jul 20, 2018 at 11:05:52PM +, David Chen wrote: > > Hi Paul, > > > > We hit an RCU issue on 4.9.37 kernel. One of the nocb_follower list grows > > too > > large, and not getting reclaimed, causing the system to OOM. > > > > Printing the culprit rcu_sched_data: > > > > nocb_q_count = { > > counter = 32369635 > > }, > > nocb_follower_head = 0x88ae901c0a00, > > nocb_follower_tail = 0x88af1538b8d8, > > nocb_kthread = 0x88b06d29, > > > > As you can see here, the nocb_follower_head is not empty, so in theory, the > > nocb_kthread shouldn't go to sleep. However, if dump the stack of the > > kthread: > > > > crash> bt 0x88b06d29 > > PID: 21 TASK: 88b06d29 CPU: 3 COMMAND: "rcuos/1" > > #0 [afc9020b7dc0] __schedule at 8d8789dc > > #1 [afc9020b7e38] schedule at 8d878e76 > > #2 [afc9020b7e50] rcu_nocb_kthread at 8d112337 > > #3 [afc9020b7ec8] kthread at 8d0c6ce7 > > #4 [afc9020b7f50] ret_from_fork at 8d87d755 > > > > And if we dis the address at 8d112337: > > > > /usr/src/debug/kernel-4.9.37/linux-4.9.37-29.nutanix.07142017.el7.centos.x86_64/kernel/rcu/tree_plugin.h: > > 2106 > > 0x8d11232d : test %rax,%rax > > 0x8d112330 : jne 0x8d112355 > > > > 0x8d112332 : callq 0x8d878e40 > > > > 0x8d112337 : lea -0x40(%rbp),%rsi > > > > So the kthread is blocked at swait_event_interruptible in the > > nocb_follower_wait. > > This contradict with the fact that nocb_follower_head
[PATCH 07/10] staging:rtl8192u: Refactor dm_dig_connect_e - Style
The enumerated type dm_dig_connect_e is only used to group constant values, as the actual type is never used as the type for the variables which use the defined constants (cur_connect_state and pre_connect_state). These two member variables have had there defined types changed to properly reflect there usage and to permit compiler type checks to be performed. In addition the definition of the enumerated type has been moved above the structure which uses the type. The typedef of the enumerated type has been removed to clear the checkpatch issue with defining new types and the enumerated value DIG_CONNECT_MAX has been removed since it is never used in code. The resulting changes are all coding style in nature and should not impact runtime code execution. Signed-off-by: John Whitmore --- drivers/staging/rtl8192u/r8192U_dm.h | 15 +++ 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/drivers/staging/rtl8192u/r8192U_dm.h b/drivers/staging/rtl8192u/r8192U_dm.h index 450bcbe3095e..825df06e17a8 100644 --- a/drivers/staging/rtl8192u/r8192U_dm.h +++ b/drivers/staging/rtl8192u/r8192U_dm.h @@ -76,6 +76,11 @@ enum dm_dig_sta_e { DM_STA_DIG_MAX }; +enum dm_dig_connect_e { + DIG_DISCONNECT = 0, + DIG_CONNECT = 1, +}; + /* 2007/10/04 MH Define upper and lower threshold of DIG enable or disable. */ struct dig { u8 dig_enable_flag; @@ -91,8 +96,8 @@ struct dig { enum dm_dig_sta_e dig_state; enum dm_dig_sta_e dig_highpwr_state; - u8 cur_connect_state; - u8 pre_connect_state; + enum dm_dig_connect_e cur_connect_state; + enum dm_dig_connect_e pre_connect_state; u8 curpd_thstate; u8 prepd_thstate; @@ -110,12 +115,6 @@ struct dig { longrssi_val; }; -typedef enum tag_dig_connect_definition { - DIG_DISCONNECT = 0, - DIG_CONNECT = 1, - DIG_CONNECT_MAX -} dm_dig_connect_e; - typedef enum tag_dig_packetdetection_threshold_definition { DIG_PD_AT_LOW_POWER = 0, DIG_PD_AT_NORMAL_POWER = 1, -- 2.18.0
[PATCH 07/10] staging:rtl8192u: Refactor dm_dig_connect_e - Style
The enumerated type dm_dig_connect_e is only used to group constant values, as the actual type is never used as the type for the variables which use the defined constants (cur_connect_state and pre_connect_state). These two member variables have had there defined types changed to properly reflect there usage and to permit compiler type checks to be performed. In addition the definition of the enumerated type has been moved above the structure which uses the type. The typedef of the enumerated type has been removed to clear the checkpatch issue with defining new types and the enumerated value DIG_CONNECT_MAX has been removed since it is never used in code. The resulting changes are all coding style in nature and should not impact runtime code execution. Signed-off-by: John Whitmore --- drivers/staging/rtl8192u/r8192U_dm.h | 15 +++ 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/drivers/staging/rtl8192u/r8192U_dm.h b/drivers/staging/rtl8192u/r8192U_dm.h index 450bcbe3095e..825df06e17a8 100644 --- a/drivers/staging/rtl8192u/r8192U_dm.h +++ b/drivers/staging/rtl8192u/r8192U_dm.h @@ -76,6 +76,11 @@ enum dm_dig_sta_e { DM_STA_DIG_MAX }; +enum dm_dig_connect_e { + DIG_DISCONNECT = 0, + DIG_CONNECT = 1, +}; + /* 2007/10/04 MH Define upper and lower threshold of DIG enable or disable. */ struct dig { u8 dig_enable_flag; @@ -91,8 +96,8 @@ struct dig { enum dm_dig_sta_e dig_state; enum dm_dig_sta_e dig_highpwr_state; - u8 cur_connect_state; - u8 pre_connect_state; + enum dm_dig_connect_e cur_connect_state; + enum dm_dig_connect_e pre_connect_state; u8 curpd_thstate; u8 prepd_thstate; @@ -110,12 +115,6 @@ struct dig { longrssi_val; }; -typedef enum tag_dig_connect_definition { - DIG_DISCONNECT = 0, - DIG_CONNECT = 1, - DIG_CONNECT_MAX -} dm_dig_connect_e; - typedef enum tag_dig_packetdetection_threshold_definition { DIG_PD_AT_LOW_POWER = 0, DIG_PD_AT_NORMAL_POWER = 1, -- 2.18.0
[PATCH 04/10] staging:rtl8192u: Remove enum dm_dig_op_e
Remove the enumerated type dm_dig_op_e. The type is only used as a parameter to the function dm_change_dynamic_initgain_thresh(), but that function is never referenced in the code at all. I would consider this to be a coding style change as the function is never referenced and as a result the enumeration is never used. In any case there should be no impact on runtime code execution. Signed-off-by: John Whitmore --- drivers/staging/rtl8192u/r8192U_dm.c | 91 drivers/staging/rtl8192u/r8192U_dm.h | 20 -- 2 files changed, 111 deletions(-) diff --git a/drivers/staging/rtl8192u/r8192U_dm.c b/drivers/staging/rtl8192u/r8192U_dm.c index c4e4e3ba394b..16dbe55d0e15 100644 --- a/drivers/staging/rtl8192u/r8192U_dm.c +++ b/drivers/staging/rtl8192u/r8192U_dm.c @@ -1613,97 +1613,6 @@ static void dm_bb_initialgain_backup(struct net_device *dev) } /* dm_BBInitialGainBakcup */ #endif -/*- - * Function: dm_change_dynamic_initgain_thresh() - * - * Overview: - * - * Input: NONE - * - * Output: NONE - * - * Return: NONE - * - * Revised History: - * WhenWho Remark - * 05/29/2008 amy Create Version 0 porting from windows code. - * - *---*/ - -void dm_change_dynamic_initgain_thresh(struct net_device *dev, u32 dm_type, - u32 dm_value) -{ - switch (dm_type) { - case DIG_TYPE_THRESH_HIGH: - dm_digtable.rssi_high_thresh = dm_value; - break; - - case DIG_TYPE_THRESH_LOW: - dm_digtable.rssi_low_thresh = dm_value; - break; - - case DIG_TYPE_THRESH_HIGHPWR_HIGH: - dm_digtable.rssi_high_power_highthresh = dm_value; - break; - - case DIG_TYPE_THRESH_HIGHPWR_LOW: - dm_digtable.rssi_high_power_lowthresh = dm_value; - break; - - case DIG_TYPE_ENABLE: - dm_digtable.dig_state = DM_STA_DIG_MAX; - dm_digtable.dig_enable_flag = true; - break; - - case DIG_TYPE_DISABLE: - dm_digtable.dig_state = DM_STA_DIG_MAX; - dm_digtable.dig_enable_flag = false; - break; - - case DIG_TYPE_DBG_MODE: - if (dm_value >= DM_DBG_MAX) - dm_value = DM_DBG_OFF; - dm_digtable.dbg_mode= (u8)dm_value; - break; - - case DIG_TYPE_RSSI: - if (dm_value > 100) - dm_value = 30; - dm_digtable.rssi_val= (long)dm_value; - break; - - case DIG_TYPE_ALGORITHM: - if (dm_value >= DIG_ALGO_MAX) - dm_value = DIG_ALGO_BY_FALSE_ALARM; - if (dm_digtable.dig_algorithm != (u8)dm_value) - dm_digtable.dig_algorithm_switch = 1; - dm_digtable.dig_algorithm = (u8)dm_value; - break; - - case DIG_TYPE_BACKOFF: - if (dm_value > 30) - dm_value = 30; - dm_digtable.backoff_val = (u8)dm_value; - break; - - case DIG_TYPE_RX_GAIN_MIN: - if (dm_value == 0) - dm_value = 0x1; - dm_digtable.rx_gain_range_min = (u8)dm_value; - break; - - case DIG_TYPE_RX_GAIN_MAX: - if (dm_value > 0x50) - dm_value = 0x50; - dm_digtable.rx_gain_range_max = (u8)dm_value; - break; - - default: - break; - } - -} /* DM_ChangeDynamicInitGainThresh */ - /*- * Function: dm_dig_init() * diff --git a/drivers/staging/rtl8192u/r8192U_dm.h b/drivers/staging/rtl8192u/r8192U_dm.h index 942e9ba8d3ad..7d7eeb1cb9fc 100644 --- a/drivers/staging/rtl8192u/r8192U_dm.h +++ b/drivers/staging/rtl8192u/r8192U_dm.h @@ -105,24 +105,6 @@ struct dig { longrssi_val; }; -/* 2007/10/11 MH Define DIG operation type. */ -typedef enum tag_dynamic_init_gain_operation_type_definition { - DIG_TYPE_THRESH_HIGH= 0, - DIG_TYPE_THRESH_LOW = 1, - DIG_TYPE_THRESH_HIGHPWR_HIGH= 2, - DIG_TYPE_THRESH_HIGHPWR_LOW = 3, - DIG_TYPE_DBG_MODE = 4, - DIG_TYPE_RSSI = 5, - DIG_TYPE_ALGORITHM = 6, - DIG_TYPE_BACKOFF= 7, - DIG_TYPE_PWDB_FACTOR= 8, - DIG_TYPE_RX_GAIN_MIN= 9, - DIG_TYPE_RX_GAIN_MAX
[PATCH 04/10] staging:rtl8192u: Remove enum dm_dig_op_e
Remove the enumerated type dm_dig_op_e. The type is only used as a parameter to the function dm_change_dynamic_initgain_thresh(), but that function is never referenced in the code at all. I would consider this to be a coding style change as the function is never referenced and as a result the enumeration is never used. In any case there should be no impact on runtime code execution. Signed-off-by: John Whitmore --- drivers/staging/rtl8192u/r8192U_dm.c | 91 drivers/staging/rtl8192u/r8192U_dm.h | 20 -- 2 files changed, 111 deletions(-) diff --git a/drivers/staging/rtl8192u/r8192U_dm.c b/drivers/staging/rtl8192u/r8192U_dm.c index c4e4e3ba394b..16dbe55d0e15 100644 --- a/drivers/staging/rtl8192u/r8192U_dm.c +++ b/drivers/staging/rtl8192u/r8192U_dm.c @@ -1613,97 +1613,6 @@ static void dm_bb_initialgain_backup(struct net_device *dev) } /* dm_BBInitialGainBakcup */ #endif -/*- - * Function: dm_change_dynamic_initgain_thresh() - * - * Overview: - * - * Input: NONE - * - * Output: NONE - * - * Return: NONE - * - * Revised History: - * WhenWho Remark - * 05/29/2008 amy Create Version 0 porting from windows code. - * - *---*/ - -void dm_change_dynamic_initgain_thresh(struct net_device *dev, u32 dm_type, - u32 dm_value) -{ - switch (dm_type) { - case DIG_TYPE_THRESH_HIGH: - dm_digtable.rssi_high_thresh = dm_value; - break; - - case DIG_TYPE_THRESH_LOW: - dm_digtable.rssi_low_thresh = dm_value; - break; - - case DIG_TYPE_THRESH_HIGHPWR_HIGH: - dm_digtable.rssi_high_power_highthresh = dm_value; - break; - - case DIG_TYPE_THRESH_HIGHPWR_LOW: - dm_digtable.rssi_high_power_lowthresh = dm_value; - break; - - case DIG_TYPE_ENABLE: - dm_digtable.dig_state = DM_STA_DIG_MAX; - dm_digtable.dig_enable_flag = true; - break; - - case DIG_TYPE_DISABLE: - dm_digtable.dig_state = DM_STA_DIG_MAX; - dm_digtable.dig_enable_flag = false; - break; - - case DIG_TYPE_DBG_MODE: - if (dm_value >= DM_DBG_MAX) - dm_value = DM_DBG_OFF; - dm_digtable.dbg_mode= (u8)dm_value; - break; - - case DIG_TYPE_RSSI: - if (dm_value > 100) - dm_value = 30; - dm_digtable.rssi_val= (long)dm_value; - break; - - case DIG_TYPE_ALGORITHM: - if (dm_value >= DIG_ALGO_MAX) - dm_value = DIG_ALGO_BY_FALSE_ALARM; - if (dm_digtable.dig_algorithm != (u8)dm_value) - dm_digtable.dig_algorithm_switch = 1; - dm_digtable.dig_algorithm = (u8)dm_value; - break; - - case DIG_TYPE_BACKOFF: - if (dm_value > 30) - dm_value = 30; - dm_digtable.backoff_val = (u8)dm_value; - break; - - case DIG_TYPE_RX_GAIN_MIN: - if (dm_value == 0) - dm_value = 0x1; - dm_digtable.rx_gain_range_min = (u8)dm_value; - break; - - case DIG_TYPE_RX_GAIN_MAX: - if (dm_value > 0x50) - dm_value = 0x50; - dm_digtable.rx_gain_range_max = (u8)dm_value; - break; - - default: - break; - } - -} /* DM_ChangeDynamicInitGainThresh */ - /*- * Function: dm_dig_init() * diff --git a/drivers/staging/rtl8192u/r8192U_dm.h b/drivers/staging/rtl8192u/r8192U_dm.h index 942e9ba8d3ad..7d7eeb1cb9fc 100644 --- a/drivers/staging/rtl8192u/r8192U_dm.h +++ b/drivers/staging/rtl8192u/r8192U_dm.h @@ -105,24 +105,6 @@ struct dig { longrssi_val; }; -/* 2007/10/11 MH Define DIG operation type. */ -typedef enum tag_dynamic_init_gain_operation_type_definition { - DIG_TYPE_THRESH_HIGH= 0, - DIG_TYPE_THRESH_LOW = 1, - DIG_TYPE_THRESH_HIGHPWR_HIGH= 2, - DIG_TYPE_THRESH_HIGHPWR_LOW = 3, - DIG_TYPE_DBG_MODE = 4, - DIG_TYPE_RSSI = 5, - DIG_TYPE_ALGORITHM = 6, - DIG_TYPE_BACKOFF= 7, - DIG_TYPE_PWDB_FACTOR= 8, - DIG_TYPE_RX_GAIN_MIN= 9, - DIG_TYPE_RX_GAIN_MAX
[PATCH 05/10] staging:rtl8192u: Refactor enum dm_dig_alg_e - Style
The enumerated type dm_dig_alg_e is only used by one variable in the code, 'dig_algorithm', a member variable of the structure dig. That member variable was defined to be of type 'u8' thus negating any advantage of the use of an enumerated type, (compiler type-checking). The type of the variable 'dig_algorithm' has been change to reflect its use of the enumeration and the enumerated type moved in the file so that it appears before it is used in the file. Additionally the 'typedef' has been removed to clear the checkpatch issue with defining new types. These changes are all coding style in nature and as such should have no impact on runtime code execution. Signed-off-by: John Whitmore --- drivers/staging/rtl8192u/r8192U_dm.h | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/drivers/staging/rtl8192u/r8192U_dm.h b/drivers/staging/rtl8192u/r8192U_dm.h index 7d7eeb1cb9fc..22e25a5cf40a 100644 --- a/drivers/staging/rtl8192u/r8192U_dm.h +++ b/drivers/staging/rtl8192u/r8192U_dm.h @@ -65,6 +65,11 @@ /*--Define structure*/ +enum dm_dig_alg_e { + DIG_ALGO_BY_FALSE_ALARM = 0, + DIG_ALGO_BY_RSSI= 1, +}; + enum dm_dig_sta_e { DM_STA_DIG_OFF = 0, DM_STA_DIG_ON, @@ -74,7 +79,7 @@ enum dm_dig_sta_e { /* 2007/10/04 MH Define upper and lower threshold of DIG enable or disable. */ struct dig { u8 dig_enable_flag; - u8 dig_algorithm; + enum dm_dig_alg_e dig_algorithm; u8 dbg_mode; u8 dig_algorithm_switch; @@ -105,12 +110,6 @@ struct dig { longrssi_val; }; -typedef enum tag_dig_algorithm_definition { - DIG_ALGO_BY_FALSE_ALARM = 0, - DIG_ALGO_BY_RSSI= 1, - DIG_ALGO_MAX -} dm_dig_alg_e; - typedef enum tag_dig_dbgmode_definition { DIG_DBG_OFF = 0, DIG_DBG_ON = 1, -- 2.18.0
[PATCH 09/10] staging:rtl8192u: Refactor enum dm_dig_cs_ratio_e - Style
The enumerated type dm_dig_cs_ratio_e is never actually used as a type, but only as a collection of related constants. This is because the variables, which use the defined constant values, are defined as being of type u8 rather then enum dm_dig_cs_ratio_e. This omission negates the possibility of taking advantage of compiler type checking. To enable the use of compiler type checking of the enumeration the two variables, (curcs_ratio_state & precs_ratio_state), which use the type's constants have their types changed from u8 to enum dm_dig_cs_ratio_e. Additionally the types declaration has been moved above the dig structure, where the type is used. The 'typedef' keyword has been removed from the type to clear the checkpatch issue with defining new types. And the constant DIG_CS_MAX has been removed since this is never used in the code. These changes are purely coding style changes and should not impact on runtime code execution. Signed-off-by: John Whitmore --- drivers/staging/rtl8192u/r8192U_dm.h | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/staging/rtl8192u/r8192U_dm.h b/drivers/staging/rtl8192u/r8192U_dm.h index 41b08211e958..2aef330379cb 100644 --- a/drivers/staging/rtl8192u/r8192U_dm.h +++ b/drivers/staging/rtl8192u/r8192U_dm.h @@ -87,6 +87,11 @@ enum dm_dig_pd_th_e { DIG_PD_AT_HIGH_POWER = 2, }; +enum dm_dig_cs_ratio_e { + DIG_CS_RATIO_LOWER = 0, + DIG_CS_RATIO_HIGHER = 1, +}; + /* 2007/10/04 MH Define upper and lower threshold of DIG enable or disable. */ struct dig { u8 dig_enable_flag; @@ -107,8 +112,8 @@ struct dig { enum dm_dig_pd_th_e curpd_thstate; enum dm_dig_pd_th_e prepd_thstate; - u8 curcs_ratio_state; - u8 precs_ratio_state; + enum dm_dig_cs_ratio_e curcs_ratio_state; + enum dm_dig_cs_ratio_e precs_ratio_state; u32 pre_ig_value; u32 cur_ig_value; @@ -121,11 +126,6 @@ struct dig { longrssi_val; }; -typedef enum tag_dig_cck_cs_ratio_state_definition { - DIG_CS_RATIO_LOWER = 0, - DIG_CS_RATIO_HIGHER = 1, - DIG_CS_MAX -} dm_dig_cs_ratio_e; struct dynamic_rx_path_sel { u8 Enable; u8 DbgMode; -- 2.18.0
[PATCH 02/10] staging:rtl8192u: Refactor use of enum dm_dig_sta_e - Style
Refactor the use of the enumerated type dm_dig_sta_e, which is not actually used for type checking by the compiler. The enumerated type defines values for the enumeration, which are used by both dig_state and dig_highpwr_state, (members of the struct dig). Both of those variables were defined as being of type u8. This negates any usefulness of the use of the enumeration, (compiler type checking). To make use of the compiler's type-checking the two member variables, dig_state and dig_highpwr_state have been changed to being of type enum dm_dig_sta_e. The enumerated type has been moved above the struct dig definition so that the enumeration is already defined when compiler reaches the two types using the enumerated type. In addition the 'typedef' of the enumerated type has been removed to clear the checkpatch issue with declaring new types. These changes, whilst convoluted, are purely coding style in nature and should not impact runtime code execution. Signed-off-by: John Whitmore --- drivers/staging/rtl8192u/r8192U_dm.h | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/staging/rtl8192u/r8192U_dm.h b/drivers/staging/rtl8192u/r8192U_dm.h index e86dda99c223..2444e1c1357b 100644 --- a/drivers/staging/rtl8192u/r8192U_dm.h +++ b/drivers/staging/rtl8192u/r8192U_dm.h @@ -64,6 +64,13 @@ /*--Define structure*/ + +enum dm_dig_sta_e { + DM_STA_DIG_OFF = 0, + DM_STA_DIG_ON, + DM_STA_DIG_MAX +}; + /* 2007/10/04 MH Define upper and lower threshold of DIG enable or disable. */ struct dig { u8 dig_enable_flag; @@ -77,8 +84,8 @@ struct dig { longrssi_high_power_lowthresh; longrssi_high_power_highthresh; - u8 dig_state; - u8 dig_highpwr_state; + enum dm_dig_sta_e dig_state; + enum dm_dig_sta_e dig_highpwr_state; u8 cur_connect_state; u8 pre_connect_state; @@ -98,13 +105,6 @@ struct dig { longrssi_val; }; -typedef enum tag_dynamic_init_gain_state_definition { - DM_STA_DIG_OFF = 0, - DM_STA_DIG_ON, - DM_STA_DIG_MAX -} dm_dig_sta_e; - - /* 2007/10/08 MH Define RATR state. */ typedef enum tag_dynamic_ratr_state_definition { DM_RATR_STA_HIGH = 0, -- 2.18.0
[PATCH 03/10] staging:rtl8192u: Refactor enum dm_ratr_sta_e usage - Style
The enumerated type dm_ratr_sta_e was defined in the file drivers/staging/rtl8192u/r8192U_dm.h but never actually used in that file. The only variable which uses this enumerated type is 'ratr_state', a member variable of the _rate_adaptive structure defined in the file drivers/staging/rtl8192u/r8192U.h. To clarify and place the enumerated type close to where it is used the type was moved to the drivers/staging/rtl8192u/r8192U.h file. In addition the member variable 'ratr_state' which uses the enumerated constants was declared as being of type 'u8'. This negates any advantage of the enumerated type, compiler type-checking, so that member variable's type has been changed to being of the enumerated type. Additionally the typedef from the enumerated type has been removed to clear the checkpatch issue with defining new types. This is a coding style change and should have no impact on runtime code execution. Signed-off-by: John Whitmore --- drivers/staging/rtl8192u/r8192U.h| 10 +- drivers/staging/rtl8192u/r8192U_dm.h | 8 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/staging/rtl8192u/r8192U.h b/drivers/staging/rtl8192u/r8192U.h index e615d9b3f6b1..b00781e1f5ee 100644 --- a/drivers/staging/rtl8192u/r8192U.h +++ b/drivers/staging/rtl8192u/r8192U.h @@ -662,9 +662,17 @@ typedef enum _RT_RF_TYPE_819xU { RF_PSEUDO_11N = 4, } RT_RF_TYPE_819xU, *PRT_RF_TYPE_819xU; +/* 2007/10/08 MH Define RATR state. */ +enum dm_ratr_sta_e { + DM_RATR_STA_HIGH = 0, + DM_RATR_STA_MIDDLE = 1, + DM_RATR_STA_LOW = 2, + DM_RATR_STA_MAX +}; + typedef struct _rate_adaptive { u8 rate_adaptive_disabled; - u8 ratr_state; + enum dm_ratr_sta_e ratr_state; u16 reserve; u32 high_rssi_thresh_for_ra; diff --git a/drivers/staging/rtl8192u/r8192U_dm.h b/drivers/staging/rtl8192u/r8192U_dm.h index 2444e1c1357b..942e9ba8d3ad 100644 --- a/drivers/staging/rtl8192u/r8192U_dm.h +++ b/drivers/staging/rtl8192u/r8192U_dm.h @@ -105,14 +105,6 @@ struct dig { longrssi_val; }; -/* 2007/10/08 MH Define RATR state. */ -typedef enum tag_dynamic_ratr_state_definition { - DM_RATR_STA_HIGH = 0, - DM_RATR_STA_MIDDLE = 1, - DM_RATR_STA_LOW = 2, - DM_RATR_STA_MAX -} dm_ratr_sta_e; - /* 2007/10/11 MH Define DIG operation type. */ typedef enum tag_dynamic_init_gain_operation_type_definition { DIG_TYPE_THRESH_HIGH= 0, -- 2.18.0
[PATCH 05/10] staging:rtl8192u: Refactor enum dm_dig_alg_e - Style
The enumerated type dm_dig_alg_e is only used by one variable in the code, 'dig_algorithm', a member variable of the structure dig. That member variable was defined to be of type 'u8' thus negating any advantage of the use of an enumerated type, (compiler type-checking). The type of the variable 'dig_algorithm' has been change to reflect its use of the enumeration and the enumerated type moved in the file so that it appears before it is used in the file. Additionally the 'typedef' has been removed to clear the checkpatch issue with defining new types. These changes are all coding style in nature and as such should have no impact on runtime code execution. Signed-off-by: John Whitmore --- drivers/staging/rtl8192u/r8192U_dm.h | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/drivers/staging/rtl8192u/r8192U_dm.h b/drivers/staging/rtl8192u/r8192U_dm.h index 7d7eeb1cb9fc..22e25a5cf40a 100644 --- a/drivers/staging/rtl8192u/r8192U_dm.h +++ b/drivers/staging/rtl8192u/r8192U_dm.h @@ -65,6 +65,11 @@ /*--Define structure*/ +enum dm_dig_alg_e { + DIG_ALGO_BY_FALSE_ALARM = 0, + DIG_ALGO_BY_RSSI= 1, +}; + enum dm_dig_sta_e { DM_STA_DIG_OFF = 0, DM_STA_DIG_ON, @@ -74,7 +79,7 @@ enum dm_dig_sta_e { /* 2007/10/04 MH Define upper and lower threshold of DIG enable or disable. */ struct dig { u8 dig_enable_flag; - u8 dig_algorithm; + enum dm_dig_alg_e dig_algorithm; u8 dbg_mode; u8 dig_algorithm_switch; @@ -105,12 +110,6 @@ struct dig { longrssi_val; }; -typedef enum tag_dig_algorithm_definition { - DIG_ALGO_BY_FALSE_ALARM = 0, - DIG_ALGO_BY_RSSI= 1, - DIG_ALGO_MAX -} dm_dig_alg_e; - typedef enum tag_dig_dbgmode_definition { DIG_DBG_OFF = 0, DIG_DBG_ON = 1, -- 2.18.0
[PATCH 09/10] staging:rtl8192u: Refactor enum dm_dig_cs_ratio_e - Style
The enumerated type dm_dig_cs_ratio_e is never actually used as a type, but only as a collection of related constants. This is because the variables, which use the defined constant values, are defined as being of type u8 rather then enum dm_dig_cs_ratio_e. This omission negates the possibility of taking advantage of compiler type checking. To enable the use of compiler type checking of the enumeration the two variables, (curcs_ratio_state & precs_ratio_state), which use the type's constants have their types changed from u8 to enum dm_dig_cs_ratio_e. Additionally the types declaration has been moved above the dig structure, where the type is used. The 'typedef' keyword has been removed from the type to clear the checkpatch issue with defining new types. And the constant DIG_CS_MAX has been removed since this is never used in the code. These changes are purely coding style changes and should not impact on runtime code execution. Signed-off-by: John Whitmore --- drivers/staging/rtl8192u/r8192U_dm.h | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/staging/rtl8192u/r8192U_dm.h b/drivers/staging/rtl8192u/r8192U_dm.h index 41b08211e958..2aef330379cb 100644 --- a/drivers/staging/rtl8192u/r8192U_dm.h +++ b/drivers/staging/rtl8192u/r8192U_dm.h @@ -87,6 +87,11 @@ enum dm_dig_pd_th_e { DIG_PD_AT_HIGH_POWER = 2, }; +enum dm_dig_cs_ratio_e { + DIG_CS_RATIO_LOWER = 0, + DIG_CS_RATIO_HIGHER = 1, +}; + /* 2007/10/04 MH Define upper and lower threshold of DIG enable or disable. */ struct dig { u8 dig_enable_flag; @@ -107,8 +112,8 @@ struct dig { enum dm_dig_pd_th_e curpd_thstate; enum dm_dig_pd_th_e prepd_thstate; - u8 curcs_ratio_state; - u8 precs_ratio_state; + enum dm_dig_cs_ratio_e curcs_ratio_state; + enum dm_dig_cs_ratio_e precs_ratio_state; u32 pre_ig_value; u32 cur_ig_value; @@ -121,11 +126,6 @@ struct dig { longrssi_val; }; -typedef enum tag_dig_cck_cs_ratio_state_definition { - DIG_CS_RATIO_LOWER = 0, - DIG_CS_RATIO_HIGHER = 1, - DIG_CS_MAX -} dm_dig_cs_ratio_e; struct dynamic_rx_path_sel { u8 Enable; u8 DbgMode; -- 2.18.0