Re: [PATCH rdma-rc 1/3] RDMA/hns: Fix the Oops during rmmod or insmod ko when reset occurs
On 2019/1/12 5:34, Jason Gunthorpe wrote: > On Thu, Jan 10, 2019 at 09:57:41PM +0800, Wei Hu (Xavier) wrote: >> +/* Check the status of the current software reset process, if in >> + * software reset process, wait until software reset process finished, >> + * in order to ensure that reset process and this function will not call >> + * __hns_roce_hw_v2_uninit_instance at the same time. >> + * If a timeout occurs, it indicates that the network subsystem has >> + * encountered a serious error and cannot be recovered from the reset >> + * processing. >> + */ >> +if (ops->ae_dev_resetting(handle)) { >> +dev_warn(dev, "Device is busy in resetting state. waiting.\n"); >> +end = msecs_to_jiffies(HNS_ROCE_V2_RST_PRC_MAX_TIME) + jiffies; >> +while (ops->ae_dev_resetting(handle) && >> + time_before(jiffies, end)) >> +msleep(20); > Really? Does this have to be so ugly? Why isn't there just a simple > lock someplace that is held during reset? > > I'm skeptical that all this strange looking stuff is properly locked > and concurrency safe. Hi, Jason The hns3 NIC driver notifies the hns RoCE driver to perform reset related processing by calling the .reset_notify() interface registered by the RoCE driver. There is a constraint on the hip08 chip, the NIC driver needs to stop the flow before hardware startup reset, otherwise the chip may hang up. We've also thought about using locks, but found using locks can lead to more serious problems because of that restriction of the chip. If using locks here, reset processing may wait for uninstallation to complete, this may lead that NIC driver fails to stop the flow in time in the reset process, thus causing the chip to hang up. Regards Xavier > Also, this series seems a bit big for -rc > > Jason > > . >
[PATCH v2] x86/boot/e820: Use kmemdup instead of duplicating its function
From: "Huang Zijiang" changes since v1: redo the patch based on suggestions from Boris. v1 link:https://lkml.org/lkml/2019/1/8/30 kmemdup is the same function as kmalloc() + memcpy(). Signed-off-by: Huang Zijiang --- arch/x86/kernel/e820.c | 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c index 50895c2..a687d10 100644 --- a/arch/x86/kernel/e820.c +++ b/arch/x86/kernel/e820.c @@ -671,21 +671,18 @@ __init void e820__reallocate_tables(void) int size; size = offsetof(struct e820_table, entries) + sizeof(struct e820_entry)*e820_table->nr_entries; - n = kmalloc(size, GFP_KERNEL); + n = kmemdup(e820_table, size, GFP_KERNEL); BUG_ON(!n); - memcpy(n, e820_table, size); e820_table = n; size = offsetof(struct e820_table, entries) + sizeof(struct e820_entry)*e820_table_kexec->nr_entries; - n = kmalloc(size, GFP_KERNEL); + n = kmemdup(e820_table_kexec, size, GFP_KERNEL); BUG_ON(!n); - memcpy(n, e820_table_kexec, size); e820_table_kexec = n; size = offsetof(struct e820_table, entries) + sizeof(struct e820_entry)*e820_table_firmware->nr_entries; - n = kmalloc(size, GFP_KERNEL); + n = kmemdup(e820_table_firmware, size, GFP_KERNEL); BUG_ON(!n); - memcpy(n, e820_table_firmware, size); e820_table_firmware = n; } -- 1.8.3.1
[PATCH] fgraph: record function return value
This patch adds a new trace option 'funcgraph-retval' and is disabled by default. When this option is enabled, fgraph tracer will show the return value of each function. This is useful to find/analyze a original error source in a call graph. One limitation is that kernel doesn't know the prototype of functions. So fgraph assumes all functions have a retvalue of type int. You must ignore the value of *void* function. And if the retvalue looks like an error code then both hexadecimal and decimal number are displayed. In this patch, only x86 and ARM platforms are supported. Here is example showing the error is caused by vmx_create_vcpu() and the error code is -5 (-EIO). with echo 1 > /sys/kernel/debug/tracing/options/funcgraph-retval 3) | kvm_vm_ioctl() { 3) |mutex_lock() { 3) | _cond_resched() { 3) 0.234 us|rcu_all_qs(); /* ret=0x8000 */ 3) 0.704 us| } /* ret=0x0 */ 3) 1.226 us|} /* ret=0x0 */ 3) 0.247 us|mutex_unlock(); /* ret=0x8880738ed040 */ 3) |kvm_arch_vcpu_create() { 3) | vmx_create_vcpu() { 3) + 17.969 us |kmem_cache_alloc(); /* ret=0x88813a980040 */ 3) + 15.948 us |kmem_cache_alloc(); /* ret=0x88813aa99200 */ 3) 0.653 us|allocate_vpid.part.88(); /* ret=0x1 */ 3) 6.964 us|kvm_vcpu_init(); /* ret=0xfffb */ 3) 0.323 us|free_vpid.part.89(); /* ret=0x1 */ 3) 9.985 us|kmem_cache_free(); /* ret=0x8000 */ 3) 9.491 us|kmem_cache_free(); /* ret=0x8000 */ 3) + 69.858 us | } /* ret=0xfffb/-5 */ 3) + 70.631 us |} /* ret=0xfffb/-5 */ 3) |mutex_lock() { 3) | _cond_resched() { 3) 0.199 us|rcu_all_qs(); /* ret=0x8000 */ 3) 0.594 us| } /* ret=0x0 */ 3) 1.067 us|} /* ret=0x0 */ 3) 0.337 us|mutex_unlock(); /* ret=0x8880738ed040 */ 3) + 92.730 us | } /* ret=0xfffb/-5 */ Signed-off-by: Changbin Du --- Documentation/trace/ftrace.rst | 5 arch/arm/kernel/entry-ftrace.S | 1 + arch/arm64/kernel/entry-ftrace.S | 1 + arch/x86/kernel/ftrace_32.S | 1 + arch/x86/kernel/ftrace_64.S | 1 + include/linux/ftrace.h | 1 + kernel/trace/Kconfig | 4 +++ kernel/trace/fgraph.c| 17 +++- kernel/trace/trace.h | 1 + kernel/trace/trace_entries.h | 1 + kernel/trace/trace_functions_graph.c | 39 11 files changed, 66 insertions(+), 6 deletions(-) diff --git a/Documentation/trace/ftrace.rst b/Documentation/trace/ftrace.rst index 0131df7f5968..9a4581698eb5 100644 --- a/Documentation/trace/ftrace.rst +++ b/Documentation/trace/ftrace.rst @@ -1227,6 +1227,11 @@ Options for function_graph tracer: only a closing curly bracket "}" is displayed for the return of a function. + funcgraph-retval - At the end of each function (the return) the +return value the function (though the function may not +really have it) is displayed in hex and negative number +if it looks like a error code. + sleep-time When running function graph tracer, to include the time a task schedules out in its function. diff --git a/arch/arm/kernel/entry-ftrace.S b/arch/arm/kernel/entry-ftrace.S index 0be69e551a64..e0d049e5873d 100644 --- a/arch/arm/kernel/entry-ftrace.S +++ b/arch/arm/kernel/entry-ftrace.S @@ -263,6 +263,7 @@ ENDPROC(ftrace_graph_regs_caller) .globl return_to_handler return_to_handler: stmdb sp!, {r0-r3} + mov r1, r0 @ return value mov r0, fp @ frame pointer bl ftrace_return_to_handler mov lr, r0 @ r0 has real ret addr diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S index 81b8eb5c4633..223f4ad269d4 100644 --- a/arch/arm64/kernel/entry-ftrace.S +++ b/arch/arm64/kernel/entry-ftrace.S @@ -202,6 +202,7 @@ ENTRY(return_to_handler) stp x4, x5, [sp, #32] stp x6, x7, [sp, #48] + mov x1, x0 // return value mov x0, x29 // parent's fp bl ftrace_return_to_handler// addr = ftrace_return_to_hander(fp); mov x30, x0 // restore the original return address diff --git a/arch/x86/kernel/ftrace_32.S b/arch/x86/kernel/ftrace_32.S index 4c8440de3355..02e84d39b0a0 100644 --- a/arch/x86/kernel/ftrace_32.S +++ b/arch/x86/kernel/ftrace_32.S @@ -234,6 +234,7 @@ END(ftrace_graph_caller) return_to_handler: pushl %eax pushl %edx + movl%eax, %edx #ifdef CC_USING_FENTRY movl$0, %eax #else diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S index 75f2b
Re: 5.0-rc1 KVM inspired "BUG: Bad page state in process" spew
On Wed, 2019-01-09 at 11:26 -0800, Sean Christopherson wrote: > > I'll try to bisect. Good luck with that. I gave it a go, but that apparently invalidated the warrantee of my vm image :) -Mike
Re: linux-next: build failure after merge of the pwm tree
Hi all, On Sat, 12 Jan 2019 17:01:17 +1100 Stephen Rothwell wrote: > > [Reported by "kernelci.org bot" ] [Also reported by Mark Brown ] > > After merging the pwm tree, today's linux-next build (arm64-allmodconfig) > failed like this: > > In file included from drivers/pwm/pwm-imx27.c:15: > drivers/pwm/pwm-imx27.c:292:25: error: 'imx_pwm_dt_ids' undeclared here (not > in a function); did you mean 'pwm_imx27_dt_ids'? > MODULE_DEVICE_TABLE(of, imx_pwm_dt_ids); > ^~ > include/linux/module.h:213:15: note: in definition of macro > 'MODULE_DEVICE_TABLE' > extern typeof(name) __mod_##type##__##name##_device_table \ >^~~~ > include/linux/module.h:213:21: error: '__mod_of__imx_pwm_dt_ids_device_table' > aliased to undefined symbol 'imx_pwm_dt_ids' > extern typeof(name) __mod_##type##__##name##_device_table \ > ^~ > drivers/pwm/pwm-imx27.c:292:1: note: in expansion of macro > 'MODULE_DEVICE_TABLE' > MODULE_DEVICE_TABLE(of, imx_pwm_dt_ids); > > Caused by commit > > 5a309d380019 ("pwm: imx: Split into two drivers") -- Cheers, Stephen Rothwell pgp8ijuevIrVu.pgp Description: OpenPGP digital signature
Re: [alsa-devel] [PATCH] ASoC: soc-core: Fix null pointer dereference in soc_find_component
On 1/12/2019 3:19 AM, Pierre-Louis Bossart wrote: Thanks for the overnight fix. This update fixes the issue on my Skylake XPS13 test device (blind testing since I don't understand what the code does). Tested-by: Pierre-Louis Bossart I need to take this back, this set of changes (initial+fix) causes an error with our HDMI support [ 17.437684] sof-audio sof-audio: created machine bxt-pcm512x [ 17.585279] bxt-pcm512x bxt-pcm512x: ASoC: failed to init link iDisp1 [ 17.585639] bxt-pcm512x bxt-pcm512x: snd_soc_register_card failed -517 Removing your changes restores the functionality Looks like we should revert generic implementation for defering probe and move to call from machine driver as done in v1. https://lore.kernel.org/patchwork/patch/1027560/ https://lore.kernel.org/patchwork/patch/1027561/ @Mark, Do you have suggestion to refine current patch? Adding some traces I can see that the the platform name we use doesn't seem compatible with your logic. All the Intel boards used a constant platform name matching the PCI ID, see e.g. [1], which IIRC is used to bind components. Liam, do you recall in more details if this is really required? [1] https://elixir.bootlin.com/linux/latest/source/sound/soc/intel/boards/bxt_da7219_max98357a.c#L475 I think if it does not set platform name as :00:0e.0, it should take snd-soc-dummy as platform name and that might fix the issue. snd-soc-dummy does have component associated with it. https://elixir.bootlin.com/linux/latest/source/sound/soc/soc-utils.c#L281 [ 18.205812] plb: platform name sof-audio [ 18.206059] plb: cpu_name (null) [ 18.206234] plb: platform name :00:0e.0 [ 18.206459] plb: returning -EPROBE_DEFER 1 [ 18.206686] bxt-pcm512x bxt-pcm512x: ASoC: failed to init link iDisp1 [ 18.207054] bxt-pcm512x bxt-pcm512x: snd_soc_register_card failed -517 diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c index cbafbdd02483..ae731212f82b 100644 --- a/sound/soc/soc-core.c +++ b/sound/soc/soc-core.c @@ -1133,11 +1133,15 @@ static int soc_init_dai_link(struct snd_soc_card *card, * Defer card registration if platform dai component is not added to * component list. */ + pr_err("plb: platform name %s\n", link->platform->name); if (link->platform->of_node || link->platform->name) if (!soc_find_component(link->platform->of_node, link->platform->name)) - return -EPROBE_DEFER; + { + pr_err("plb: returning -EPROBE_DEFER 1\n"); + return -EPROBE_DEFER; + } /* * CPU device may be specified by either name or OF node, but * can be left unspecified, and will be matched based on DAI @@ -1154,9 +1158,14 @@ static int soc_init_dai_link(struct snd_soc_card *card, * Defer card registration if cpu dai component is not added to * component list. */ + pr_err("plb: cpu_name %s\n", link->cpu_name); if (link->cpu_of_node || link->cpu_name) if (!soc_find_component(link->cpu_of_node, link->cpu_name)) - return -EPROBE_DEFER; + { + pr_err("plb: returning -EPROBE_DEFER 2\n"); + return -EPROBE_DEFER; + + } /* * At least one of CPU DAI name or CPU device name/node must be Thanks, Rohit -- Qualcomm INDIA, on behalf of Qualcomm Innovation Center, Inc.is a member of the Code Aurora Forum, hosted by the Linux Foundation. --- This email has been checked for viruses by Avast antivirus software. https://www.avast.com/antivirus
[PATCH] MAINTAINERS: drop PREEMPTIBLE KERNEL section entry
The PREEMPTIBLE KERNEL section entry seems quite outdated: Robert Love is not actively maintaining the file anymore, nor a recorded contributor to the files in the PREEMPTIBLE KERNEL section for the last few years. The mailing list kpreempt-t...@lists.sourceforge.net does not exist anymore; the website just points to some very old patches for v2.4/v2.5. So, let's delete the PREEMPTIBLE KERNEL section entry and clean this up: - Documentation/preempt-locking.txt is not modified much anyway, and the changes in that file are generally maintained by Jonathan Corbet. So, we do not need to explicitly mention Documentation/preempt-locking.txt in MAINTAINERS. - include/linux/preempt.h is maintained by Peter and Ingo, so we simply add that file to the SCHEDULER section entry. I got directed to this issue, as I could not subscribe to the outdated mailing list address and decided to investigate and then cleaned this up. Signed-off-by: Lukas Bulwahn --- MAINTAINERS | 9 + 1 file changed, 1 insertion(+), 8 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index 4b7967af08a4..14b991ebf2c1 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -12219,30 +12219,22 @@ F:Documentation/ABI/testing/sysfs-pps F: drivers/pps/ F: include/linux/pps*.h F: include/uapi/linux/pps.h PPTP DRIVER M: Dmitry Kozlov L: net...@vger.kernel.org S: Maintained F: drivers/net/ppp/pptp.c W: http://sourceforge.net/projects/accel-pptp -PREEMPTIBLE KERNEL -M: Robert Love -L: kpreempt-t...@lists.sourceforge.net -W: https://www.kernel.org/pub/linux/kernel/people/rml/preempt-kernel -S: Supported -F: Documentation/preempt-locking.txt -F: include/linux/preempt.h - PRINTK M: Petr Mladek M: Sergey Senozhatsky R: Steven Rostedt S: Maintained F: kernel/printk/ F: include/linux/printk.h PRISM54 WIRELESS DRIVER M: Luis Chamberlain L: linux-wirel...@vger.kernel.org @@ -13449,22 +13441,23 @@ F:drivers/watchdog/sc1200wdt.c SCHEDULER M: Ingo Molnar M: Peter Zijlstra L: linux-kernel@vger.kernel.org T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core S: Maintained F: kernel/sched/ F: include/linux/sched.h F: include/uapi/linux/sched.h F: include/linux/wait.h +F: include/linux/preempt.h SCR24X CHIP CARD INTERFACE DRIVER M: Lubomir Rintel S: Supported F: drivers/char/pcmcia/scr24x_cs.c SCSI CDROM DRIVER M: Jens Axboe L: linux-s...@vger.kernel.org W: http://www.kernel.dk S: Maintained -- 2.17.1
linux-next: build failure after merge of the pwm tree
Hi all, [Reported by "kernelci.org bot" ] After merging the pwm tree, today's linux-next build (arm64-allmodconfig) failed like this: In file included from drivers/pwm/pwm-imx27.c:15: drivers/pwm/pwm-imx27.c:292:25: error: 'imx_pwm_dt_ids' undeclared here (not in a function); did you mean 'pwm_imx27_dt_ids'? MODULE_DEVICE_TABLE(of, imx_pwm_dt_ids); ^~ include/linux/module.h:213:15: note: in definition of macro 'MODULE_DEVICE_TABLE' extern typeof(name) __mod_##type##__##name##_device_table \ ^~~~ include/linux/module.h:213:21: error: '__mod_of__imx_pwm_dt_ids_device_table' aliased to undefined symbol 'imx_pwm_dt_ids' extern typeof(name) __mod_##type##__##name##_device_table \ ^~ drivers/pwm/pwm-imx27.c:292:1: note: in expansion of macro 'MODULE_DEVICE_TABLE' MODULE_DEVICE_TABLE(of, imx_pwm_dt_ids); Caused by commit 5a309d380019 ("pwm: imx: Split into two drivers") -- Cheers, Stephen Rothwell pgpbeqdwPn_NR.pgp Description: OpenPGP digital signature
[PATCH] fs: drop unused fput_atomic definition
commit d7065da03822 ("get rid of the magic around f_count in aio") added fput_atomic to include/linux/fs.h, motivated by its use in __aio_put_req() in fs/aio.c. Later, commit 3ffa3c0e3f6e ("aio: now fput() is OK from interrupt context; get rid of manual delayed __fput()") removed the only use of fput_atomic in __aio_put_req(), but did not remove the since then unused fput_atomic definition in include/linux/fs.h. We curate this now and finally remove the unused definition. This issue was identified during a code review due to a coccinelle warning from the atomic_as_refcounter.cocci rule pointing to the use of atomic_t in fput_atomic. Suggested-by: Krystian Radlak Signed-off-by: Lukas Bulwahn --- compile-tested with defconfig & allyesconfig on v4.20 include/linux/fs.h | 1 - 1 file changed, 1 deletion(-) diff --git a/include/linux/fs.h b/include/linux/fs.h index 811c77743dad..ddf7de4d522b 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -952,7 +952,6 @@ static inline struct file *get_file(struct file *f) return f; } #define get_file_rcu(x) atomic_long_inc_not_zero(&(x)->f_count) -#define fput_atomic(x) atomic_long_add_unless(&(x)->f_count, -1, 1) #define file_count(x) atomic_long_read(&(x)->f_count) #defineMAX_NON_LFS ((1UL<<31) - 1) -- 2.17.1
[PATCH] fcoe: make use of fip_mode enum complete
commit 1917d42d14b7 ("fcoe: use enum for fip_mode") introduces a separate enum for the fip_mode that shall be used during initialisation handling until it is passed to fcoe_ctrl_link_up to set the initial fip_state. That change was incomplete and gcc quietly converted in various places between the fip_mode and the fip_state enum values with implicit enum conversions, which fortunately cannot cause any issues in the actual code's execution. clang however warns about these implicit enum conversions in the scsi drivers. This commit consolidates the use of the two enums, guided by clang's enum-conversion warnings. This commit now completes the use of the fip_mode: It expects and uses fip_mode in {bnx2fc,fcoe}_interface_create and fcoe_ctlr_init, and it explicitly converts from fip_mode to fip_state only at the single point in fcoe_ctlr_link_up. To eliminate that adding or removing values from fip_mode or fip_state enum break the right mapping of values, all fip_mode values are assigned to their fip_state counterparts. Signed-off-by: Lukas Bulwahn --- resending of this patch: https://patchwork.kernel.org/patch/10589865/ drivers/scsi/bnx2fc/bnx2fc_fcoe.c | 2 +- drivers/scsi/fcoe/fcoe.c | 2 +- drivers/scsi/fcoe/fcoe_ctlr.c | 4 ++-- drivers/scsi/fcoe/fcoe_transport.c | 2 +- drivers/scsi/qedf/qedf_main.c | 2 +- include/scsi/libfcoe.h | 8 6 files changed, 10 insertions(+), 10 deletions(-) diff --git a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c index f00045813378..83aaab4fdab5 100644 --- a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c +++ b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c @@ -1445,7 +1445,7 @@ static struct bnx2fc_hba *bnx2fc_hba_create(struct cnic_dev *cnic) static struct bnx2fc_interface * bnx2fc_interface_create(struct bnx2fc_hba *hba, struct net_device *netdev, - enum fip_state fip_mode) + enum fip_mode fip_mode) { struct fcoe_ctlr_device *ctlr_dev; struct bnx2fc_interface *interface; diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c index f46b312d04bc..6768b2e8148a 100644 --- a/drivers/scsi/fcoe/fcoe.c +++ b/drivers/scsi/fcoe/fcoe.c @@ -390,7 +390,7 @@ static int fcoe_interface_setup(struct fcoe_interface *fcoe, * Returns: pointer to a struct fcoe_interface or NULL on error */ static struct fcoe_interface *fcoe_interface_create(struct net_device *netdev, - enum fip_state fip_mode) + enum fip_mode fip_mode) { struct fcoe_ctlr_device *ctlr_dev; struct fcoe_ctlr *ctlr; diff --git a/drivers/scsi/fcoe/fcoe_ctlr.c b/drivers/scsi/fcoe/fcoe_ctlr.c index 54da3166da8d..a52d3ad94876 100644 --- a/drivers/scsi/fcoe/fcoe_ctlr.c +++ b/drivers/scsi/fcoe/fcoe_ctlr.c @@ -147,7 +147,7 @@ static void fcoe_ctlr_map_dest(struct fcoe_ctlr *fip) * fcoe_ctlr_init() - Initialize the FCoE Controller instance * @fip: The FCoE controller to initialize */ -void fcoe_ctlr_init(struct fcoe_ctlr *fip, enum fip_state mode) +void fcoe_ctlr_init(struct fcoe_ctlr *fip, enum fip_mode mode) { fcoe_ctlr_set_state(fip, FIP_ST_LINK_WAIT); fip->mode = mode; @@ -454,7 +454,7 @@ void fcoe_ctlr_link_up(struct fcoe_ctlr *fip) mutex_unlock(&fip->ctlr_mutex); fc_linkup(fip->lp); } else if (fip->state == FIP_ST_LINK_WAIT) { - fcoe_ctlr_set_state(fip, fip->mode); + fcoe_ctlr_set_state(fip, (enum fip_state) fip->mode); switch (fip->mode) { default: LIBFCOE_FIP_DBG(fip, "invalid mode %d\n", fip->mode); diff --git a/drivers/scsi/fcoe/fcoe_transport.c b/drivers/scsi/fcoe/fcoe_transport.c index f4909cd206d3..b381ab066b9c 100644 --- a/drivers/scsi/fcoe/fcoe_transport.c +++ b/drivers/scsi/fcoe/fcoe_transport.c @@ -873,7 +873,7 @@ static int fcoe_transport_create(const char *buffer, int rc = -ENODEV; struct net_device *netdev = NULL; struct fcoe_transport *ft = NULL; - enum fip_state fip_mode = (enum fip_state)(long)kp->arg; + enum fip_mode fip_mode = (enum fip_mode)(long)kp->arg; mutex_lock(&ft_mutex); diff --git a/drivers/scsi/qedf/qedf_main.c b/drivers/scsi/qedf/qedf_main.c index 0a5dd5595dd3..cd61905ca2f5 100644 --- a/drivers/scsi/qedf/qedf_main.c +++ b/drivers/scsi/qedf/qedf_main.c @@ -1418,7 +1418,7 @@ static struct libfc_function_template qedf_lport_template = { static void qedf_fcoe_ctlr_setup(struct qedf_ctx *qedf) { - fcoe_ctlr_init(&qedf->ctlr, FIP_ST_AUTO); + fcoe_ctlr_init(&qedf->ctlr, FIP_MODE_AUTO); qedf->ctlr.send = qedf_fip_send; qedf->ctlr.get_src_addr = qedf_get_src_mac; diff --git a/include/scsi/libfcoe.h b/include/scsi/libfcoe.h index cb8a273732cf..f8f1f039a611 100644 --- a/include/scsi/libfcoe.h +++ b/include/scsi/libfcoe.h @@ -
[PATCH v2] target: fix a missing check of match_int
When match_int fails, "arg" is left uninitialized and may contain random value, thus should not be used. The fix checks if match_int fails, and if so, returns its error code. Signed-off-by: Kangjie Lu --- drivers/target/target_core_rd.c | 15 +-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/drivers/target/target_core_rd.c b/drivers/target/target_core_rd.c index a6e8106abd6f..3b7657b2f2f1 100644 --- a/drivers/target/target_core_rd.c +++ b/drivers/target/target_core_rd.c @@ -559,6 +559,7 @@ static ssize_t rd_set_configfs_dev_params(struct se_device *dev, char *orig, *ptr, *opts; substring_t args[MAX_OPT_ARGS]; int arg, token; + int ret; opts = kstrdup(page, GFP_KERNEL); if (!opts) @@ -573,14 +574,24 @@ static ssize_t rd_set_configfs_dev_params(struct se_device *dev, token = match_token(ptr, tokens, args); switch (token) { case Opt_rd_pages: - match_int(args, &arg); + ret = match_int(args, &arg); + if (ret) { + kfree(orig); + return ret; + } + rd_dev->rd_page_count = arg; pr_debug("RAMDISK: Referencing Page" " Count: %u\n", rd_dev->rd_page_count); rd_dev->rd_flags |= RDF_HAS_PAGE_COUNT; break; case Opt_rd_nullio: - match_int(args, &arg); + ret = match_int(args, &arg); + if (ret) { + kfree(orig); + return ret; + } + if (arg != 1) break; -- 2.17.2 (Apple Git-113)
Re: [PATCH 2/6] crypto: kdf - SP800-108 Key Derivation Function
On Fri, Jan 11, 2019 at 08:10:02PM +0100, Stephan Müller wrote: > The SP800-108 compliant Key Derivation Function is implemented as a > random number generator considering that it behaves like a deterministic > RNG. > > All three KDF types specified in SP800-108 are implemented. > > The code comments provide details about how to invoke the different KDF > types. > > Signed-off-by: Stephan Mueller > --- > crypto/Kconfig | 7 + > crypto/Makefile | 1 + > crypto/kdf.c| 492 > 3 files changed, 500 insertions(+) > create mode 100644 crypto/kdf.c > > diff --git a/crypto/Kconfig b/crypto/Kconfig > index 86960aa53e0f..cc80d89e0cf5 100644 > --- a/crypto/Kconfig > +++ b/crypto/Kconfig > @@ -561,6 +561,13 @@ config CRYPTO_HMAC > HMAC: Keyed-Hashing for Message Authentication (RFC2104). > This is required for IPSec. > > +config CRYPTO_KDF > + tristate "Key Derivation Function (SP800-108)" > + select CRYPTO_RNG > + help > + Support for KDF compliant to SP800-108. All three types of > + KDF specified in SP800-108 are implemented. > + > config CRYPTO_XCBC > tristate "XCBC support" > select CRYPTO_HASH > diff --git a/crypto/Makefile b/crypto/Makefile > index 799ed5e94606..69a0bb64b0ac 100644 > --- a/crypto/Makefile > +++ b/crypto/Makefile > @@ -58,6 +58,7 @@ crypto_user-y := crypto_user_base.o > crypto_user-$(CONFIG_CRYPTO_STATS) += crypto_user_stat.o > obj-$(CONFIG_CRYPTO_CMAC) += cmac.o > obj-$(CONFIG_CRYPTO_HMAC) += hmac.o > +obj-$(CONFIG_CRYPTO_KDF) += kdf.o This naming is too generic. CONFIG_CRYPTO_KDF and kdf.c imply that this is related to all KDFs. But actually it is an implementation of a few specific KDFs. Can you give it a clearer name, like KDF_SP800? - Eric
Re: [PATCH 3/6] crypto: kdf - add known answer tests
On Fri, Jan 11, 2019 at 08:10:22PM +0100, Stephan Müller wrote: > Add known answer tests to the testmgr for the KDF (SP800-108) cipher. > > Signed-off-by: Stephan Mueller > --- > crypto/testmgr.c | 226 +++ > crypto/testmgr.h | 110 +++ > 2 files changed, 336 insertions(+) > > diff --git a/crypto/testmgr.c b/crypto/testmgr.c > index 0f684a414acb..ff9051bffa1f 100644 > --- a/crypto/testmgr.c > +++ b/crypto/testmgr.c > @@ -110,6 +110,11 @@ struct drbg_test_suite { > unsigned int count; > }; > > +struct kdf_test_suite { > + struct kdf_testvec *vecs; > + unsigned int count; > +}; > + > struct akcipher_test_suite { > const struct akcipher_testvec *vecs; > unsigned int count; > @@ -133,6 +138,7 @@ struct alg_test_desc { > struct hash_test_suite hash; > struct cprng_test_suite cprng; > struct drbg_test_suite drbg; > + struct kdf_test_suite kdf; > struct akcipher_test_suite akcipher; > struct kpp_test_suite kpp; > } suite; > @@ -2020,6 +2026,64 @@ static int drbg_cavs_test(const struct drbg_testvec > *test, int pr, > return ret; > } > > +static int kdf_cavs_test(struct kdf_testvec *test, > + const char *driver, u32 type, u32 mask) Why not just "kdf_test()"? > +{ > + int ret = -EAGAIN; > + struct crypto_rng *drng; > + unsigned char *buf = kzalloc(test->expectedlen, GFP_KERNEL); s/unsigned char/u8 > + > + if (!buf) > + return -ENOMEM; > + > + drng = crypto_alloc_rng(driver, type | CRYPTO_ALG_INTERNAL, mask); > + if (IS_ERR(drng)) { > + printk(KERN_ERR "alg: kdf: could not allocate cipher handle " > +"for %s\n", driver); pr_err > + kzfree(buf); kfree is fine here. > + return -ENOMEM; > + } > + > + ret = crypto_rng_reset(drng, test->K1, test->K1len); > + if (ret) { > + printk(KERN_ERR "alg: kdf: could not set key derivation key\n"); pr_err > + goto err; > + } > + > + ret = crypto_rng_generate(drng, test->context, test->contextlen, > + buf, test->expectedlen); > + if (ret) { > + printk(KERN_ERR "alg: kdf: could not obtain key data\n"); pr_err > + goto err; > + } > + > + ret = memcmp(test->expected, buf, test->expectedlen); Elsewhere this function returns an -errno value but this is different. > + > +err: > + crypto_free_rng(drng); > + kzfree(buf); kfree would be fine here too. > + return ret; > +} > + > +static int alg_test_kdf(const struct alg_test_desc *desc, const char *driver, > + u32 type, u32 mask) > +{ > + int err = 0; > + unsigned int i = 0; > + struct kdf_testvec *template = desc->suite.kdf.vecs; const - Eric
Re: [PATCH 5/6] crypto: hkdf - add known answer tests
On Fri, Jan 11, 2019 at 08:10:56PM +0100, Stephan Müller wrote: > Add known answer tests to the testmgr for the HKDF (RFC5869) cipher. > > The known answer tests are derived from RFC 5869 appendix A. > > Note, the HKDF is considered to be a FIPS 140-2 allowed (not approved) > cipher as of now. Yet, an allowed cipher is usable under FIPS 140-2 > rules. > > Signed-off-by: Stephan Mueller > --- > crypto/testmgr.c | 32 + > crypto/testmgr.h | 115 +++ > 2 files changed, 147 insertions(+) > > diff --git a/crypto/testmgr.c b/crypto/testmgr.c > index ff9051bffa1f..aba7a3645293 100644 > --- a/crypto/testmgr.c > +++ b/crypto/testmgr.c > @@ -3187,6 +3187,38 @@ static const struct alg_test_desc alg_test_descs[] = { > .suite = { > .hash = __VECS(ghash_tv_template) > } > + }, { > + .alg = "hkdf(hmac(sha1))", > + .test = alg_test_kdf, > + .fips_allowed = 1, > + .suite = { > + .kdf = { > + .vecs = hkdf_hmac_sha1_tv_template, > + .count = ARRAY_SIZE(hkdf_hmac_sha1_tv_template) Use the __VECS() macro. > + } > + } > + }, { > + .alg = "hkdf(hmac(sha224))", > + .test = alg_test_null, > + .fips_allowed = 1, I think it is dumb to add algorithms to the testmgr with no tests just so the 'fips_allowed' flag can be set. And doesn't FIPS sometimes require tests anyway? I don't think the "null test" should count as a test :-) Perhaps just include sha256 and sha512, and have tests for them? > + }, { > + .alg = "hkdf(hmac(sha256))", > + .test = alg_test_kdf, > + .fips_allowed = 1, > + .suite = { > + .kdf = { > + .vecs = hkdf_hmac_sha256_tv_template, > + .count = > ARRAY_SIZE(hkdf_hmac_sha256_tv_template) > + } > + } > + }, { > + .alg = "hkdf(hmac(sha384))", > + .test = alg_test_null, > + .fips_allowed = 1, > + }, { > + .alg = "hkdf(hmac(sha512))", > + .test = alg_test_null, > + .fips_allowed = 1, > }, { > .alg = "hmac(md5)", > .test = alg_test_hash, > diff --git a/crypto/testmgr.h b/crypto/testmgr.h > index a729b66f8757..7c4aa694e0f3 100644 > --- a/crypto/testmgr.h > +++ b/crypto/testmgr.h > @@ -28002,6 +28002,121 @@ static struct kdf_testvec > kdf_dpi_hmac_sha256_tv_template[] = { > } > }; > > +/* Test vectors from RFC 5869 appendix A */ > +static struct kdf_testvec hkdf_hmac_sha256_tv_template[] = { const Likewise for all other kdf_testvecs. > + { > + .K1 = (unsigned char *) No need for this cast if you make the pointers const in struct kdf_testvec. Likewise for all other testvecs. > +#ifdef __LITTLE_ENDIAN > + "\x08\x00" /* rta length */ > + "\x01\x00" /* rta type */ > + "\x0d\x00\x00\x00" /* salt length */ > +#else > + "\x00\x08" /* rta length */ > + "\x00\x01" /* rta type */ > + "\x00\x00\x00\x0d" /* salt length */ > +#endif > + "\x00\x01\x02\x03\x04\x05\x06\x07" > + "\x08\x09\x0a\x0b\x0c" /* salt */ > + "\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b" > + "\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b" > + "\x0b\x0b\x0b\x0b\x0b\x0b", /* IKM */ > + .K1len = 43, > + .context = (unsigned char *) > + "\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7" > + "\xf8\xf9", > + .contextlen = 10, > + .expected = (unsigned char *) > + "\x3c\xb2\x5f\x25\xfa\xac\xd5\x7a" > + "\x90\x43\x4f\x64\xd0\x36\x2f\x2a" > + "\x2d\x2d\x0a\x90\xcf\x1a\x5a\x4c" > + "\x5d\xb0\x2d\x56\xec\xc4\xc5\xbf" > + "\x34\x00\x72\x08\xd5\xb8\x87\x18" > + "\x58\x65", > + .expectedlen = 42 > + }, { > + .K1 = (unsigned char *) > +#ifdef __LITTLE_ENDIAN > + "\x08\x00" /* rta length */ > + "\x01\x00" /* rta type */ > +#else > + "\x00\x08" /* rta length */ > + "\x00\x01" /* rta type */ > +#endif > + "\x00\x00\x00\x00" /* salt length */ > + "\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b" > + "\x0b\x0b\x0b\x0b\x0b\x0b\x
Re: [PATCH 4/6] crypto: hkdf - RFC5869 Key Derivation Function
Hi Stephan, On Fri, Jan 11, 2019 at 08:10:39PM +0100, Stephan Müller wrote: > The RFC5869 compliant Key Derivation Function is implemented as a > random number generator considering that it behaves like a deterministic > RNG. > Thanks for the proof of concept! I guess it ended up okay. But can you explain more the benefits of using the crypto_rng interface, as opposed to just some crypto_hkdf_*() helper functions that are exported for modules to use? > The extract and expand phases use different instances of the underlying > keyed message digest cipher to ensure that while the extraction phase > generates a new key for the expansion phase, the cipher for the > expansion phase can still be used. This approach is intended to aid > multi-threaded uses cases. I think you partially misunderstood what I was asking for. One HMAC tfm is sufficient as long as HKDF-Expand is separated from HKDF-Extract, which you've done. So just use one HMAC tfm, and in crypto_hkdf_seed() key it with the 'salt', and then afterwards with the 'prk'. Also everywhere in this patchset, please avoid using the word "cipher" to refer to algorithms that are not encryption/decryption. I know a lot of the crypto API docs do this, but I think it is a mistake and confusing. Hash algorithms and KDFs are not "ciphers". > > Signed-off-by: Stephan Mueller > --- > crypto/Kconfig | 6 + > crypto/Makefile | 1 + > crypto/hkdf.c | 290 > 3 files changed, 297 insertions(+) > create mode 100644 crypto/hkdf.c > > diff --git a/crypto/Kconfig b/crypto/Kconfig > index cc80d89e0cf5..0eee5e129fa3 100644 > --- a/crypto/Kconfig > +++ b/crypto/Kconfig > @@ -568,6 +568,12 @@ config CRYPTO_KDF > Support for KDF compliant to SP800-108. All three types of > KDF specified in SP800-108 are implemented. > > +config CRYPTO_HKDF > + tristate "HMAC-based Extract-and expand Key Derivation Function" > + select CRYPTO_RNG > + help > + Support for KDF compliant to RFC5869. > + Make the description "HMAC-based Extract-and-Expand Key Derivation Function (HKDF)"? There is a missing dash, and probably the acronym "HKDF" should be in there. > config CRYPTO_XCBC > tristate "XCBC support" > select CRYPTO_HASH > diff --git a/crypto/Makefile b/crypto/Makefile > index 69a0bb64b0ac..6bbb0a4dea13 100644 > --- a/crypto/Makefile > +++ b/crypto/Makefile > @@ -59,6 +59,7 @@ crypto_user-$(CONFIG_CRYPTO_STATS) += crypto_user_stat.o > obj-$(CONFIG_CRYPTO_CMAC) += cmac.o > obj-$(CONFIG_CRYPTO_HMAC) += hmac.o > obj-$(CONFIG_CRYPTO_KDF) += kdf.o > +obj-$(CONFIG_CRYPTO_HKDF) += hkdf.o > obj-$(CONFIG_CRYPTO_VMAC) += vmac.o > obj-$(CONFIG_CRYPTO_XCBC) += xcbc.o > obj-$(CONFIG_CRYPTO_NULL2) += crypto_null.o > diff --git a/crypto/hkdf.c b/crypto/hkdf.c > new file mode 100644 > index ..35a975ed64a8 > --- /dev/null > +++ b/crypto/hkdf.c > @@ -0,0 +1,290 @@ > +// SPDX-License-Identifier: GPL-2.0 > + > +/* > + * RFC 5869 Key-derivation function > + * People don't know what RFC xyz is. Please be more explicit than just the RFC number, e.g. "HKDF: HMAC-based Extract-and-Expand Key Derivation Function (RFC 5869)" > + * Copyright (C) 2019, Stephan Mueller > + */ > + > +/* > + * The HKDF extract phase is applied with crypto_rng_reset(). > + * The HKDF expand phase is applied with crypto_rng_generate(). > + * > + * NOTE: In-place cipher operations are not supported. > + */ What does an "in-place cipher operation" mean in this context? That the 'info' buffer must not overlap the 'dst' buffer? Maybe crypto_rng_generate() should check that for all crypto_rngs? Or is it different for different crypto_rngs? > + > +#include > +#include > +#include > +#include > +#include > +#include > + > +struct crypto_hkdf_ctx { > + struct crypto_shash *extract_kmd; > + struct crypto_shash *expand_kmd; > +}; > + > +#define CRYPTO_HKDF_MAX_DIGESTSIZE 64 > + > +/* > + * HKDF expand phase > + */ > +static int crypto_hkdf_random(struct crypto_rng *rng, > + const u8 *info, unsigned int infolen, > + u8 *dst, unsigned int dlen) Why call it crypto_hkdf_random() instead of crypto_hkdf_generate()? The latter would match rng_alg::generate. > +{ > + struct crypto_hkdf_ctx *ctx = crypto_tfm_ctx(crypto_rng_tfm(rng)); const > + struct crypto_shash *expand_kmd = ctx->expand_kmd; > + SHASH_DESC_ON_STACK(desc, expand_kmd); > + unsigned int h = crypto_shash_digestsize(expand_kmd); const > + int err = 0; > + u8 *dst_orig = dst; > + const u8 *prev = NULL; > + uint8_t ctr = 0x01; u8 instead of uint8_t > + > + if (dlen > h * 255) > + return -EINVAL; > + > + desc->tfm = expand_kmd; > + desc->flags = crypto_shash_get_flags(expand_kmd) & > + CRYPTO_TFM_REQ_MAY_SLEEP; This line setting desc->flags doesn't make sense. How is the user meant to control whether
Re: [PATCH lora-next 4/4] dt-bindings: lora: sx130x: add clock bindings
Am 08.01.19 um 09:41 schrieb Ben Whitten: > The sx130x family consumes two clocks, a 32MHz clock provided by a > connected IQ transceiver, and a 133MHz high speed clock. > > In the example we connect the concentrator to output 0 of a fixed clock > providing the 133MHz high speed clock, and we connect to output 0 of a > connected transceiver 32MHz clock. > > The connected radios are both fed from output 0 of a fixed 32MHz clock, > with only one being the clock source back with one output to the > sx130x concentrator. No, no, no... See below. > > Signed-off-by: Ben Whitten > --- > .../{ => net}/lora/semtech,sx130x.yaml| 39 ++- > 1 file changed, 38 insertions(+), 1 deletion(-) > rename Documentation/devicetree/bindings/{ => net}/lora/semtech,sx130x.yaml > (62%) > > diff --git a/Documentation/devicetree/bindings/lora/semtech,sx130x.yaml > b/Documentation/devicetree/bindings/net/lora/semtech,sx130x.yaml > similarity index 62% > rename from Documentation/devicetree/bindings/lora/semtech,sx130x.yaml > rename to Documentation/devicetree/bindings/net/lora/semtech,sx130x.yaml > index ad263bc4e60d..23a096ca2912 100644 > --- a/Documentation/devicetree/bindings/lora/semtech,sx130x.yaml > +++ b/Documentation/devicetree/bindings/net/lora/semtech,sx130x.yaml > @@ -15,7 +15,8 @@ description: | >demodulating LoRa signals on 8 channels simultaneously. > >It is typically paired with two sx125x IQ radios controlled over an > - SPI directly from the concentrator. > + SPI directly from the concentrator. One of the radios will provide > + a 32MHz clock back into the concentrator. The SX125x does not seem to document the frequency of the output clock, so I assume it's pass-through of whatever went in, and 32 MHz seems correct for the SX130x input side. > >The concentrator itself it controlled over SPI. > > @@ -41,6 +42,20 @@ properties: >in Hz. Maximum SPI frequency is 10MHz although 8MHz is typically used >on a number of cards. > > + clocks: > +maxItems: 2 > +items: > + - description: 32MHz clock provider > + - description: 133MHz high speed clock provider > +description: The chip requires two clock inputs; A 32MHz clock at CMOS > + level which is provided from a connected radio. > + And a 133MHz high speed clock at CMOS level provided by an oscillator. > + > + clock-names: > +items: > + - const: clk32m The pin is called CLK32M; in the specs there's XTAL32F/XTAL32T. > + - const: clkhs The pin is called CLKHS; in the specs it's HSC_F. So I assume both names are good, but wanted to double-check. > + >radio-spi: > description: The concentrator has two radios connected which are > contained >within the following node. > @@ -64,11 +79,27 @@ required: > > examples: >- | > +tcxo: dummy32m { > + compatible = "fixed-clock"; > + clock-frequency = <3200>; > + clock-output-names = "tcxo"; > + #clock-cells = <0>; > +}; > + > +clkhs: dummy133m { > + compatible = "fixed-clock"; > + clock-frequency = <13300>; > + clock-output-names = "clkhs"; > + #clock-cells = <0>; > +}; > + > concentrator0: lora@0 { >compatible = "semtech,sx1301"; >reg = <0>; >reset-gpios = <&pioB 27 GPIO_ACTIVE_HIGH>; >spi-max-frequency = <800>; > + clocks = <&radio1 0>, <&clkhs 0>; Once again, this is wrong use of #clock-cells: <&radio1>, <&clkhs> Otherwise you have four clocks, and your "clkhs" will be <0>. > + clock-names = "clk32m", "clkhs"; > >radio-spi { > #address-cells = <1>; > @@ -77,11 +108,17 @@ examples: > radio0: lora@0 { >compatible = "semtech,sx1257"; >reg = <0>; > + clocks = <&tcxo 0>; Ditto, <&tcxo>. > + clock-names = "tcxo"; > }; > > radio1: lora@1 { >compatible = "semtech,sx1257"; >reg = <1>; > + clocks = <&tcxo 0>; <&tcxo> > + clock-names = "tcxo"; > + clock-output-names = "clk32m"; > + #clock-cells = <0>; > }; >}; > }; This example may need to be updated to use a fixed-clock instead of the radio'S output, due to unsolved clk provider implementation problems that will render the example unusable. Same as for the radio, we should also prepare optional regulator inputs. There's at least 1.8 V and 3.3 V supplies here. Regards, Andreas -- SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg)
[PATCH] regulator: pwm: No need to make a copy of regulator_ops per instance
Having instance specific copy of desc is enough to support multiple instance of pwm regulator. The regulator_ops is never changed so no need to copy it per instance, make pwm_regulator_voltage_table_ops and pwm_regulator_voltage_continuous_ops const to ensure they won't be changed. The pwm_regulator_desc is a template to be copied so also make it const. Signed-off-by: Axel Lin --- drivers/regulator/pwm-regulator.c | 17 + 1 file changed, 5 insertions(+), 12 deletions(-) diff --git a/drivers/regulator/pwm-regulator.c b/drivers/regulator/pwm-regulator.c index a2fd140eff81..3f53f9134b32 100644 --- a/drivers/regulator/pwm-regulator.c +++ b/drivers/regulator/pwm-regulator.c @@ -40,9 +40,6 @@ struct pwm_regulator_data { /* regulator descriptor */ struct regulator_desc desc; - /* Regulator ops */ - struct regulator_ops ops; - int state; /* Enable GPIO */ @@ -231,7 +228,7 @@ static int pwm_regulator_set_voltage(struct regulator_dev *rdev, return 0; } -static struct regulator_ops pwm_regulator_voltage_table_ops = { +static const struct regulator_ops pwm_regulator_voltage_table_ops = { .set_voltage_sel = pwm_regulator_set_voltage_sel, .get_voltage_sel = pwm_regulator_get_voltage_sel, .list_voltage= pwm_regulator_list_voltage, @@ -241,7 +238,7 @@ static struct regulator_ops pwm_regulator_voltage_table_ops = { .is_enabled = pwm_regulator_is_enabled, }; -static struct regulator_ops pwm_regulator_voltage_continuous_ops = { +static const struct regulator_ops pwm_regulator_voltage_continuous_ops = { .get_voltage = pwm_regulator_get_voltage, .set_voltage = pwm_regulator_set_voltage, .enable = pwm_regulator_enable, @@ -249,7 +246,7 @@ static struct regulator_ops pwm_regulator_voltage_continuous_ops = { .is_enabled = pwm_regulator_is_enabled, }; -static struct regulator_desc pwm_regulator_desc = { +static const struct regulator_desc pwm_regulator_desc = { .name = "pwm-regulator", .type = REGULATOR_VOLTAGE, .owner = THIS_MODULE, @@ -287,9 +284,7 @@ static int pwm_regulator_init_table(struct platform_device *pdev, drvdata->state = -EINVAL; drvdata->duty_cycle_table = duty_cycle_table; - memcpy(&drvdata->ops, &pwm_regulator_voltage_table_ops, - sizeof(drvdata->ops)); - drvdata->desc.ops = &drvdata->ops; + drvdata->desc.ops = &pwm_regulator_voltage_table_ops; drvdata->desc.n_voltages= length / sizeof(*duty_cycle_table); return 0; @@ -301,9 +296,7 @@ static int pwm_regulator_init_continuous(struct platform_device *pdev, u32 dutycycle_range[2] = { 0, 100 }; u32 dutycycle_unit = 100; - memcpy(&drvdata->ops, &pwm_regulator_voltage_continuous_ops, - sizeof(drvdata->ops)); - drvdata->desc.ops = &drvdata->ops; + drvdata->desc.ops = &pwm_regulator_voltage_continuous_ops; drvdata->desc.continuous_voltage_range = true; of_property_read_u32_array(pdev->dev.of_node, -- 2.17.1
Re: [PATCH 2/2] sh: generate uapi header and syscall table header files
On 1/10/19 5:54 PM, Guenter Roeck wrote: > On Wed, Jan 02, 2019 at 09:07:25PM +0530, Firoz Khan wrote: >> Unified system call table generation script must be run to >> generate unistd_32.h and syscall_table.h files. This patch >> will have changes which will invokes the script. >> >> This patch will generate unistd_32.h and syscall_table.h >> files by the syscall table generation script invoked by >> sh/Makefile and the generated files against the removed >> files must be identical. >> >> The generated uapi header file will be included in uapi/- >> asm/unistd.h and generated system call table header file >> will be included by kernel/syscall_32.S file. >> >> Signed-off-by: Firoz Khan > > Have you tested this patch ? I tested it at one point, but not recently. (It was before 4.20 came out...) Rob
Re: [PATCH lora-next 3/4] dt-bindings: lora: sx125x: add clock bindings
Am 08.01.19 um 09:41 schrieb Ben Whitten: > The sx125x consumes a 32MHz clock and if it is coupled with a sx130x > concentrator may also provide a gated version of this 32MHz for the > concentrator. "SX125x", "SX130x", "32 MHz" > > In the example we connect to output 0 of "txco" clock source. The radio > also provides a single clock output, hence "#clock-cells = <0>", named > "clk32m" for consumption by the sx130x concentrator. No, as pointed out before, output 0 with #clock-cells = <0> is wrong! Either it's 1 and then you can use a single-cell index, or it's 0, in which case that's the next (invalid) clock. > > Signed-off-by: Ben Whitten > --- > .../{ => net}/lora/semtech,sx125x.yaml| 27 +++ > 1 file changed, 27 insertions(+) > rename Documentation/devicetree/bindings/{ => net}/lora/semtech,sx125x.yaml > (67%) > > diff --git a/Documentation/devicetree/bindings/lora/semtech,sx125x.yaml > b/Documentation/devicetree/bindings/net/lora/semtech,sx125x.yaml > similarity index 67% > rename from Documentation/devicetree/bindings/lora/semtech,sx125x.yaml > rename to Documentation/devicetree/bindings/net/lora/semtech,sx125x.yaml In wrong patch? > index 5eadec860b70..c2fb4ac06341 100644 > --- a/Documentation/devicetree/bindings/lora/semtech,sx125x.yaml > +++ b/Documentation/devicetree/bindings/net/lora/semtech,sx125x.yaml > @@ -33,13 +33,40 @@ properties: > description: The frequency of the SPI communication to the radio, >in Hz. Maximum SPI frequency is 10MHz. > > + clocks: > +maxItems: 1 > +description: 32MHz clock provider > + > + clock-names: > +items: > + - const: txco The name TCXO and 32 MHz appear to come from the SX1301 manual's reference example? In the SX1257 manual I see XTA/XTB connected to an XTAL, and CLK_IN. Both CLK_IN and CLK_OUT are documented as 36 MHz (page 8). In table 2-4 (page 11) however FXOSC is 32 to 36 MHz, typically 36 MHz; FCLK_IN also has a range of 32 to 36 MHz there. So I guess we should be having two named clocks here and be more permissive in the description? While getting into the hairy details of clock names, we should also prepare optional properties for supply voltages, in case they come from some PMIC that needs a regulator drifer. By my count there's 7 inputs: VBAT1/2/3, VR_PA, VR_ANA1/2, VR_DIG. Do we need a named -supply property for each? > + > + clock-output-names: > +items: > + - const: clk32m > +description: 32MHz output clock name > + > + '#clock-cells': > +const: 0 > + > required: >- compatible >- reg > > examples: >- | > +tcxo: dummy32m { > + compatible = "fixed-clock"; > + clock-frequency = <3200>; > + clock-output-names = "tcxo"; > + #clock-cells = <0>; > +}; > + > radio0: lora@0 { >compatible = "semtech,sx1257"; >reg = <0>; > + clocks = <&tcxo 0>; just <&tcxo> > + clock-names = "tcxo"; > + clock-output-names = "clk32m"; > + #clock-cells = <0>; > }; > Regards, Andreas -- SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg)
Re: [PATCH 0/5] Improve the latency tracers
Hi Steven, Have you checked this serias yet? :) On Tue, Jan 01, 2019 at 11:46:09PM +0800, Changbin Du wrote: > Happy new year! > > This series make some improments for the kernel latency tracers, especilly > for the wakeup tracers. The latency tracers will show us more useful > information. With this series, the wakeup tracers look like this > when display-graph is enabled: > > # tracer: wakeup > # > # wakeup latency trace v1.1.5 on 4.20.0+ > # > # latency: 593 us, #674/674, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:4) > #- > #| task: kworker/0:1H-339 (uid:0 nice:-20 policy:0 rt_prio:0) > #- > # > # _-=> irqs-off > # / _=> need-resched > # | / _---=> hardirq/softirq > # || / _--=> preempt-depth > # ||| / > # REL TIME CPU TASK/PID DURATION > FUNCTION CALLS > # | | || | | | > | | | > 0 us | 0)-0| dNs. | | /* 0:120:R > + [000] 339:100:R kworker/0:1H */ > 3 us | 0)-0| dNs. | 0.000 us| > (null)(); > -0 0dNs. 66us : > => try_to_wake_up > => __queue_work > => queue_work_on > => call_timer_fn > => run_timer_softirq > => __do_softirq > => irq_exit > => smp_apic_timer_interrupt > => apic_timer_interrupt > => native_safe_halt > => default_idle > => default_idle_call > => do_idle > => cpu_startup_entry > => start_kernel > => secondary_startup_64 >67 us | 0)-0| dNs. | 0.721 us| ttwu_stat(); >69 us | 0)-0| dNs. | 0.607 us| > _raw_spin_unlock_irqrestore(); >71 us | 0)-0| .Ns. | 0.598 us| > _raw_spin_lock_irq(); >72 us | 0)-0| .Ns. | 0.584 us| > _raw_spin_lock_irq(); >73 us | 0)-0| dNs. | 1.125 us| > __next_timer_interrupt(); >75 us | 0)-0| dNs. | | call_timer_fn() > { >76 us | 0)-0| dNs. | | > delayed_work_timer_fn() { > [...] > 564 us | 0) kworker-13 | d... | | > set_next_entity() { > 565 us | 0) kworker-13 | d... | 0.524 us| > __update_load_avg_se(); > 566 us | 0) kworker-13 | d... | 0.562 us| > __update_load_avg_cfs_rq(); > 567 us | 0) kworker-13 | d... | 2.765 us| } > 568 us | 0) kworker-13 | d... | + 10.077 us |} > 569 us | 0) kworker-13 | d... | 0.000 us| __schedule(); > kworker/-13 0d... 593us : > => schedule > => worker_thread > => kthread > => ret_from_fork > 593 us | 0) kworker-13 | d... | | /* 13:120:I > ==> [000] 339:100:R kworker/0:1H */ > > Changbin Du (5): > function_graph: Support displaying relative timestamp > sched/tracing: Show more info for funcgraph wakeup tracers > sched/tracing: Put a margin between flags and duration for wakeup > tracers > sched/tracing: Show stacktrace for wakeup tracers > trace/doc: Add latency tracer funcgraph example > > Documentation/trace/ftrace.rst | 51 > kernel/trace/trace.h | 9 ++--- > kernel/trace/trace_functions_graph.c | 30 ++-- > kernel/trace/trace_irqsoff.c | 2 +- > kernel/trace/trace_sched_wakeup.c| 11 -- > 5 files changed, 94 insertions(+), 9 deletions(-) > > -- > 2.17.1 > -- Cheers, Changbin Du
stack-protector: fix CC_HAS_STACKPROTECTOR_NONE depend on -fno-stack-protector
commit(2a61f4747eeaa85ce26ca9fbd81421b15facd018)rename CC_STACKPROTECTOR_NONE config. but unfortunately if the compiler support option -fno-stack-protector, CC_HAS_STACKPROTECTOR_NONE will not be disabled. CC_HAS_STACKPROTECTOR_NONE and CC_STACKPROTECTOR_STRONG will be enabled at once, as the following conditions: 1. gcc support -fno-stack-protector & -fstack-protector-strong 2. enabled CC_STACKPROTECTOR_STRONG & STACKPROTECTOR 3. disabled CC_HAS_STACKPROTECTOR_NONE 0001-stack-protector-fix-CC_HAS_STACKPROTECTOR_NONE-depen.patch Description: Binary data
Re: [PATCH lora-next 2/4] dt-bindings: lora: sx125x: add basic documentation
Am 08.01.19 um 09:41 schrieb Ben Whitten: > The sx125x family are IQ radio transceivers from Semtech configured over > SPI, they are typically connected to an sx130x series concentrator however > may be connected to a host directly. "SX125x" and "SX130x" > > Required properties include the radio number of the host or concentrator > bus. > > Signed-off-by: Ben Whitten > --- > .../bindings/lora/semtech,sx125x.yaml | 45 +++ > 1 file changed, 45 insertions(+) > create mode 100644 Documentation/devicetree/bindings/lora/semtech,sx125x.yaml > > diff --git a/Documentation/devicetree/bindings/lora/semtech,sx125x.yaml > b/Documentation/devicetree/bindings/lora/semtech,sx125x.yaml > new file mode 100644 > index ..5eadec860b70 > --- /dev/null > +++ b/Documentation/devicetree/bindings/lora/semtech,sx125x.yaml > @@ -0,0 +1,45 @@ > +# SPDX-License-Identifier: GPL-2.0 > +%YAML 1.2 > +--- > +$id: http://devicetree.org/schemas/lora/semtech,sx125x.yaml# > +$schema: http://devicetree.org/meta-schemas/core.yaml# > + > +title: Semtech IQ modulator/de-modulator transeiver > + > +maintainers: > + - Andreas Färber > + - Ben Whitten > + > +description: | > + The sx125x family are highly integrated RF front-end to digital I and Q > + modulator/demodulator Multi-PHY mode transceiver capable of supporting > + multiple constant and non-constant envelope modulation schemes. "SX125x" "are ... transceivers" > + > +properties: > + compatible: > +items: > + - enum: > +- semtech,sx1255 > +- semtech,sx1257 > +- semtech,sx1258 Do we need all three? Probably yes for the concentrator to make decisions? > + > + reg: > +maxItems: 1 > +description: The chip select on the SPI bus or radio number in > concentrator. ", with radio A = 0 and radio B = 1." > + > + spi-max-frequency: > +maximum: 1000 > +default: 800 > +description: The frequency of the SPI communication to the radio, > + in Hz. Maximum SPI frequency is 10MHz. If on the concentrator's SPI bus then this is unused. You don't mark it required below, so maybe add a textual note? > + > +required: > + - compatible > + - reg > + > +examples: > + - | > +radio0: lora@0 { > + compatible = "semtech,sx1257"; > + reg = <0>; > +}; If you compare my sx127x implementation, should we prepare a "radio-frequency" (or "radio-center-frequency") property for the antenna, to preconfigure the driver during probe? Contents would then be <86800>, for example. Thanks, Andreas -- SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg)
Re: [PATCH V7 2/2] drm/panel: Add Sitronix ST7701 panel driver
On Sat, Jan 12, 2019 at 2:49 AM Sam Ravnborg wrote: > > Hi Jagan. > > Gave this another more detailed read - triggered some additional comments. > Depite the comments it looks good, and this is all more or > less cosmetic improvements. Thanks for the review. > > Sam > > > +struct st7701_panel_desc { > > + const struct drm_display_mode *mode; > > + unsigned int lanes; > > + unsigned long flags; > > + enum mipi_dsi_pixel_format format; > > + const char *const *supply_names; > > + unsigned int num_supplies; > > + unsigned int panel_sleep_delay; > > +}; > > + > > +struct st7701 { > > + struct drm_panel panel; > > + struct mipi_dsi_device *dsi; > > + const struct st7701_panel_desc *desc; > > + > > + struct backlight_device *backlight; > > + struct regulator_bulk_data *supplies; > > + unsigned int num_supplies; > I cannot see that num_supplies in this struct are used? Yes it is used in the code, please check in struct st7701_panel_desc. > > > > + struct gpio_desc *reset; > > + unsigned int sleep_delay; > > +}; > > + > > +static inline struct st7701 *panel_to_st7701(struct drm_panel *panel) > > +{ > > + return container_of(panel, struct st7701, panel); > > +} > > + > > +static inline int st7701_dsi_write(struct st7701 *st7701, const void *seq, > > +size_t len) > > +{ > > + return mipi_dsi_dcs_write_buffer(st7701->dsi, seq, len); > > +} > > + > > > > +static int st7701_prepare(struct drm_panel *panel) > > +{ > > + struct st7701 *st7701 = panel_to_st7701(panel); > > + int ret; > > + > > + gpiod_set_value(st7701->reset, 0); > > + msleep(20); > > + > > + ret = regulator_bulk_enable(st7701->desc->num_supplies, > > + st7701->supplies); > > + if (ret < 0) > > + return ret; > > + msleep(20); > > + > > + gpiod_set_value(st7701->reset, 1); > > + msleep(20); > > + > > + gpiod_set_value(st7701->reset, 0); > > + msleep(30); > > + > > + gpiod_set_value(st7701->reset, 1); > > + msleep(150); > > + > > + st7701_init_sequence(st7701); > > + > > + return 0; > > +} > > + > > > +static int st7701_unprepare(struct drm_panel *panel) > > +{ > > + struct st7701 *st7701 = panel_to_st7701(panel); > > + > > + ST7701_DSI(st7701, MIPI_DCS_EXIT_SLEEP_MODE, 0x00); > > + > > + msleep(st7701->sleep_delay); > > + > > + gpiod_set_value(st7701->reset, 0); > > + > > + gpiod_set_value(st7701->reset, 1); > > + > > + gpiod_set_value(st7701->reset, 0); > No timing constrains here? In prepare there are sleeps intermixed. Delay while doing unprare is not needed I suppose. > > > + > > + regulator_bulk_disable(st7701->desc->num_supplies, st7701->supplies); > > + > > + return 0; > > +} > > + > > +static int st7701_get_modes(struct drm_panel *panel) > > +{ > > + struct st7701 *st7701 = panel_to_st7701(panel); > > + const struct drm_display_mode *desc_mode = st7701->desc->mode; > > + struct drm_display_mode *mode; > > + > > + mode = drm_mode_duplicate(panel->drm, desc_mode); > > + if (!mode) { > > + dev_err(&st7701->dsi->dev, "failed to add mode %ux%ux@%u\n", > > + desc_mode->hdisplay, desc_mode->vdisplay, > > + desc_mode->vrefresh); > Use something like: > DRM_DEV_ERROR(st7701->dev, "failed to add mode" DRM_MODE_FMT > ", > DRM_MODE_ARG(desc_mode)); > > > + return -ENOMEM; > > + } > > + > > + drm_mode_set_name(mode); > > + drm_mode_probed_add(panel->connector, mode); > > + > > + panel->connector->display_info.width_mm = desc_mode->width_mm; > > + panel->connector->display_info.height_mm = desc_mode->height_mm; > > + > > + return 1; > > +} > > + > > > +static const struct st7701_panel_desc ts8550b_desc = { > > + .mode = &ts8550b_mode, > > + .lanes = 2, > > + .flags = MIPI_DSI_MODE_VIDEO, > > + .format = MIPI_DSI_FMT_RGB888, > > + .supply_names = ts8550b_supply_names, > > + .num_supplies = ARRAY_SIZE(ts8550b_supply_names), > > + .panel_sleep_delay = 80, /* panel need extra 80ms for sleep out cmd */ > In the only place this is used there is added 120 ms too. > Looks inconsistent - do we have the same delay twice? 120ms is the one recommended by st7701 controller delay after a sleep out command, please check it in datasheet [1], page 188. And this 80ms is specific to TS8550B panel as per BSP code this doesn't have proper documentation but the BSP code doing this extra 80ms. > > > > + > > +static int st7701_dsi_probe(struct mipi_dsi_device *dsi) > > +{ > > + const struct st7701_panel_desc *desc; > > + struct device_node *np; > > + struct st7701 *st7701; > > + int ret, i; > > + > > + st7701 = devm_kzalloc(&dsi->dev, sizeof(*st7701), GFP_KERNEL); > > + if (!st7701) > > + return -ENOMEM; > > + > > + de
Re: [PATCH] scsi: ufs: Remove select of phy-qcom-ufs from ufs-qcom
Evan, > CONFIG_SCSI_UFS_QCOM selects CONFIG_PHY_QCOM_UFS, assuming that > this was the only possible PHY driver Qualcomm's UFS controller > would use. But in SDM845, the UFS driver is bundled into phy-qcom-qmp, > and phy-qcom-ufs is unused. > > Remove the select, since for SDM845 it adds useless drivers to the > build. Applied to 5.1/scsi-queue. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH lora-next 1/4] dt-bindings: lora: sx130x: add basic documentation
Hi Ben, Am 08.01.19 um 09:41 schrieb Ben Whitten: > Add basic documentation in YAML format for the sx130x series concentrators > from Semtech. > Required is; the location on the SPI bus, the reset gpio and the node for > downstream IQ radios, typically sx125x. > > Signed-off-by: Ben Whitten > --- > .../bindings/lora/semtech,sx130x.yaml | 87 +++ > 1 file changed, 87 insertions(+) > create mode 100644 Documentation/devicetree/bindings/lora/semtech,sx130x.yaml Patch 3/4 moves this to net/lora/, which I think is more appropriate. > > diff --git a/Documentation/devicetree/bindings/lora/semtech,sx130x.yaml > b/Documentation/devicetree/bindings/lora/semtech,sx130x.yaml > new file mode 100644 > index ..ad263bc4e60d > --- /dev/null > +++ b/Documentation/devicetree/bindings/lora/semtech,sx130x.yaml > @@ -0,0 +1,87 @@ > +# SPDX-License-Identifier: GPL-2.0 > +%YAML 1.2 > +--- > +$id: http://devicetree.org/schemas/lora/semtech,sx130x.yaml# > +$schema: http://devicetree.org/meta-schemas/core.yaml# > + > +title: Semtech LoRa concentrator > + > +maintainers: > + - Andreas Färber > + - Ben Whitten > + > +description: | > + Semtech LoRa concentrator sx130x digital baseband chip is capable of SX130x or SX1301/SX1308, to distinguish from driver name sx130x and to avoid ambiguities of which x is a wildcard. > + demodulating LoRa signals on 8 channels simultaneously. > + > + It is typically paired with two sx125x IQ radios controlled over an Ditto, SX125x > + SPI directly from the concentrator. > + > + The concentrator itself it controlled over SPI. > + > +properties: > + compatible: > +items: > + - enum: > +- semtech,sx1301 > +- semtech,sx1308 We should only list both if we expect differences between the two models. Otherwise SX1308 can reuse the SX1301 compatible. If we want to mark it up just in case then rearranging the above to be a sequence of "semtech,sx1308", "semtech,sx1301" would be an alternative. > + > + reg: > +maxItems: 1 > +description: The chip select on the SPI bus. Is this mandatory now or not with maxItems? > + > + reset-gpios: > +maxItems: 1 > +description: A connection of the reset gpio line. This needs to be optional, which I think the maxItems syntax says unlike the commit message? On an mPCIe card you won't have such a GPIO, for instance. We do a Soft Reset, so it's not functionally mandatory. > + > + spi-max-frequency: > +maximum: 1000 > +default: 800 > +description: The frequency of the SPI communication to the concentrator, > + in Hz. Maximum SPI frequency is 10MHz although 8MHz is typically used > + on a number of cards. Do we really need to describe this here? It should be covered in the common SPI bindings, and only applies to SPI bus, not USB picoGW. > + > + radio-spi: > +description: The concentrator has two radios connected which are > contained > + within the following node. "can have" > + > +properties: > + '#address-cells': > +const: 1 > + > + '#size-cells': > +const: 0 > + > +required: > + - '#address-cells' > + - '#size-cells' I'm pretty sure that Rob would like to have a compatible here, even if unneeded in our Linux driver? BTW if someone has better naming suggestions than "radio-spi"... I just wanted to avoid having it in the main node directly, in case we need other sub-nodes, too. > + > +required: > + - compatible > + - reg > + - reset-gpios Must be optional. > + - radio-spi Should be optional. (Driver needs it today, but that's another problem.) > + > +examples: > + - | > +concentrator0: lora@0 { > + compatible = "semtech,sx1301"; > + reg = <0>; > + reset-gpios = <&pioB 27 GPIO_ACTIVE_HIGH>; > + spi-max-frequency = <800>; > + > + radio-spi { > +#address-cells = <1>; > +#size-cells = <0>; > + > +radio0: lora@0 { > + compatible = "semtech,sx1257"; > + reg = <0>; > +}; > + > +radio1: lora@1 { > + compatible = "semtech,sx1257"; > + reg = <1>; > +}; > + }; > +}; Thanks for looking into this, Andreas -- SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg)
Re: [PATCH] scsi: qla1280: set 64bit coherent mask
Thomas, > Reason is that SGI IP27 always generates 64bit DMA addresses and has > no fallback mode for 32bit DMA addresses implemented. QLA1280 > supports 64bit addressing for all DMA accesses so setting coherent mask > to 64bit fixes the issue. Applied to 5.0/scsi-fixes, thanks! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH 1/1] doc: scsi: remove reference to tmscsim.txt file
Otto, > The tmscsim.txt doc file was removed in c121107d0f84. Applied to 5.1/scsi-queue, thanks. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions
On Fri, Jan 11, 2019 at 07:06:08PM -0800, John Hubbard wrote: > On 1/11/19 6:46 PM, Jerome Glisse wrote: > > On Fri, Jan 11, 2019 at 06:38:44PM -0800, John Hubbard wrote: > >> On 1/11/19 6:02 PM, Jerome Glisse wrote: > >>> On Fri, Jan 11, 2019 at 05:04:05PM -0800, John Hubbard wrote: > On 1/11/19 8:51 AM, Jerome Glisse wrote: > > On Thu, Jan 10, 2019 at 06:59:31PM -0800, John Hubbard wrote: > >> On 1/3/19 6:44 AM, Jerome Glisse wrote: > >>> On Thu, Jan 03, 2019 at 10:26:54AM +0100, Jan Kara wrote: > On Wed 02-01-19 20:55:33, Jerome Glisse wrote: > > On Wed, Dec 19, 2018 at 12:08:56PM +0100, Jan Kara wrote: > >> On Tue 18-12-18 21:07:24, Jerome Glisse wrote: > >>> On Tue, Dec 18, 2018 at 03:29:34PM -0800, John Hubbard wrote: > > [...] > > Hi Jerome, > > Looks good, in a conceptual sense. Let me do a brain dump of how I see > it, > in case anyone spots a disastrous conceptual error (such as the lock_page > point), while I'm putting together the revised patchset. > > I've studied this carefully, and I agree that using mapcount in > this way is viable, *as long* as we use a lock (or a construct that > looks just > like one: your "memory barrier, check, retry" is really just a lock) in > order to hold off gup() while page_mkclean() is in progress. In other > words, > nothing that increments mapcount may proceed while page_mkclean() is > running. > >>> > >>> No, increment to page->_mapcount are fine while page_mkclean() is running. > >>> The above solution do work no matter what happens thanks to the memory > >>> barrier. By clearing the pin flag first and reading the page->_mapcount > >>> after (and doing the reverse in GUP) we know that a racing GUP will either > >>> have its pin page clear but the incremented mapcount taken into account by > >>> page_mkclean() or page_mkclean() will miss the incremented mapcount but > >>> it will also no clear the pin flag set concurrently by any GUP. > >>> > >>> Here are all the possible time line: > >>> [T1]: > >>> GUP on CPU0 | page_mkclean() on CPU1 > >>> | > >>> [G2] atomic_inc(&page->mapcount) | > >>> [G3] smp_wmb(); | > >>> [G4] SetPagePin(page); | > >>> ... > >>> | [C1] pined = TestClearPagePin(page); > >> > >> It appears that you're using the "page pin is clear" to indicate that > >> page_mkclean() is running. The problem is, that approach leads to toggling > >> the PagePin flag, and so an observer (other than gup or page_mkclean) will > >> see intervals during which the PagePin flag is clear, when conceptually it > >> should be set. > >> > >> Jan and other FS people, is it definitely the case that we only have to > >> take > >> action (defer, wait, revoke, etc) for gup-pinned pages, in page_mkclean()? > >> Because I recall from earlier experiments that there were several places, > >> not > >> just page_mkclean(). > > > > Yes and it is fine to temporarily have the pin flag unstable. Anything > > that need stable page content will have to lock the page so will have > > to sync against any page_mkclean() and in the end the only thing were > > we want to check the pin flag is when doing write back ie after > > page_mkclean() while the page is still locked. If they are any other > > place that need to check the pin flag then they will need to lock the > > page. But i can not think of any other place right now. > > > > > > OK. Yes, since the clearing and resetting happens under page lock, that will > suffice to synchronize it. That's a good point. > > > [...] > > > The other idea that you and Dan (and maybe others) pointed out was a > debug > option, which we'll certainly need in order to safely convert all the > call > sites. (Mirror the mappings at a different kernel offset, so that > put_page() > and put_user_page() can verify that the right call was made.) That will > be > a separate patchset, as you recommended. > > I'll even go as far as recommending the page lock itself. I realize that > this > adds overhead to gup(), but we *must* hold off page_mkclean(), and I > believe > that this (below) has similar overhead to the notes above--but is *much* > easier > to verify correct. (If the page lock is unacceptable due to being so > widely used, > then I'd recommend using another page bit to do the same thing.) > >>> > >>> Please page lock is pointless and it will not work for GUP fast. The above > >>> scheme do work and is fine. I spend the day again thinking about all > >>> memory > >>> ordering and i do not see any issues. > >>> > >> > >> Why is it that page lock cannot be used for gup fast, btw? > > > > Well it can not happen within the preempt disable section. But after >
Re: [PATCH 0/3] libsas: SATA PHY connection rate matching during disovery
John, > This patchset looks to resolve an issue we see whereby a SATA end device > negotiated linkrate may exceed the min linkrate to the initiator, and > not be able to establish a connection. Looks good to me. Applied to 5.1/scsi-queue. Thanks! -- Martin K. Petersen Oracle Linux Engineering
RE: [PATCH v2] soc: imx: Break dependency on SOC_IMX8MQ for GPCv2
> From: Shawn Guo [mailto:shawn...@kernel.org] > Sent: Saturday, January 12, 2019 11:09 AM > On Sun, Dec 23, 2018 at 01:20:34PM +, Abel Vesa wrote: > > Since this is going to be used on more SoCs than just i.MX8MQ, make > > the dependency here more generic. > > > > Signed-off-by: Abel Vesa > > Reviewed-by: Dong Aisheng > > --- > > Changes since v1: > > * removed the SOC_IMX7D since it's included by ARCH_MXC > > This should really be mentioned in the commit log. > Yes, you're right. We should put it in commit log. Abel, Would you resend with this updated? Regards Dong Aisheng > Shawn > > >as suggested by Dong Aisheng. > > > > drivers/soc/imx/Kconfig | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/soc/imx/Kconfig b/drivers/soc/imx/Kconfig index > > 2112d18..d80f899 100644 > > --- a/drivers/soc/imx/Kconfig > > +++ b/drivers/soc/imx/Kconfig > > @@ -2,7 +2,7 @@ menu "i.MX SoC drivers" > > > > config IMX_GPCV2_PM_DOMAINS > > bool "i.MX GPCv2 PM domains" > > - depends on SOC_IMX7D || SOC_IMX8MQ || (COMPILE_TEST && OF) > > + depends on ARCH_MXC || (COMPILE_TEST && OF) > > depends on PM > > select PM_GENERIC_DOMAINS > > default y if SOC_IMX7D > > -- > > 2.7.4 > >
Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions
On Fri, Jan 11, 2019 at 06:38:44PM -0800, John Hubbard wrote: > On 1/11/19 6:02 PM, Jerome Glisse wrote: > > On Fri, Jan 11, 2019 at 05:04:05PM -0800, John Hubbard wrote: > >> On 1/11/19 8:51 AM, Jerome Glisse wrote: > >>> On Thu, Jan 10, 2019 at 06:59:31PM -0800, John Hubbard wrote: > On 1/3/19 6:44 AM, Jerome Glisse wrote: > > On Thu, Jan 03, 2019 at 10:26:54AM +0100, Jan Kara wrote: > >> On Wed 02-01-19 20:55:33, Jerome Glisse wrote: > >>> On Wed, Dec 19, 2018 at 12:08:56PM +0100, Jan Kara wrote: > On Tue 18-12-18 21:07:24, Jerome Glisse wrote: > > On Tue, Dec 18, 2018 at 03:29:34PM -0800, John Hubbard wrote: > >>> [...] > >> > >> Hi Jerome, > >> > >> Looks good, in a conceptual sense. Let me do a brain dump of how I see it, > >> in case anyone spots a disastrous conceptual error (such as the lock_page > >> point), while I'm putting together the revised patchset. > >> > >> I've studied this carefully, and I agree that using mapcount in > >> this way is viable, *as long* as we use a lock (or a construct that looks > >> just > >> like one: your "memory barrier, check, retry" is really just a lock) in > >> order to hold off gup() while page_mkclean() is in progress. In other > >> words, > >> nothing that increments mapcount may proceed while page_mkclean() is > >> running. > > > > No, increment to page->_mapcount are fine while page_mkclean() is running. > > The above solution do work no matter what happens thanks to the memory > > barrier. By clearing the pin flag first and reading the page->_mapcount > > after (and doing the reverse in GUP) we know that a racing GUP will either > > have its pin page clear but the incremented mapcount taken into account by > > page_mkclean() or page_mkclean() will miss the incremented mapcount but > > it will also no clear the pin flag set concurrently by any GUP. > > > > Here are all the possible time line: > > [T1]: > > GUP on CPU0 | page_mkclean() on CPU1 > > | > > [G2] atomic_inc(&page->mapcount) | > > [G3] smp_wmb(); | > > [G4] SetPagePin(page); | > > ... > > | [C1] pined = TestClearPagePin(page); > > It appears that you're using the "page pin is clear" to indicate that > page_mkclean() is running. The problem is, that approach leads to toggling > the PagePin flag, and so an observer (other than gup or page_mkclean) will > see intervals during which the PagePin flag is clear, when conceptually it > should be set. Also forgot to stress that i am not using the pin flag to report page_mkclean is running, i am clearing it first because clearing that bit is the thing that is racy. If we clear it first and then read map and pin count and then count number of real mapping we get a proper ordering and we will always detect pined page and properly restore the pin flag at the end of page_mkclean. In fact GUP or PUP never need to check if the flag is clear. The check in GUP in my pseudo code is an optimization for the write back ordering (no need to do any ordering if the pin flag was already set before the current GUP). Cheers, Jérôme
Re: [PATCH v2] soc: imx: Break dependency on SOC_IMX8MQ for GPCv2
On Sun, Dec 23, 2018 at 01:20:34PM +, Abel Vesa wrote: > Since this is going to be used on more SoCs than just i.MX8MQ, > make the dependency here more generic. > > Signed-off-by: Abel Vesa > Reviewed-by: Dong Aisheng > --- > Changes since v1: > * removed the SOC_IMX7D since it's included by ARCH_MXC This should really be mentioned in the commit log. Shawn >as suggested by Dong Aisheng. > > drivers/soc/imx/Kconfig | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/soc/imx/Kconfig b/drivers/soc/imx/Kconfig > index 2112d18..d80f899 100644 > --- a/drivers/soc/imx/Kconfig > +++ b/drivers/soc/imx/Kconfig > @@ -2,7 +2,7 @@ menu "i.MX SoC drivers" > > config IMX_GPCV2_PM_DOMAINS > bool "i.MX GPCv2 PM domains" > - depends on SOC_IMX7D || SOC_IMX8MQ || (COMPILE_TEST && OF) > + depends on ARCH_MXC || (COMPILE_TEST && OF) > depends on PM > select PM_GENERIC_DOMAINS > default y if SOC_IMX7D > -- > 2.7.4 >
Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions
On 1/11/19 6:46 PM, Jerome Glisse wrote: > On Fri, Jan 11, 2019 at 06:38:44PM -0800, John Hubbard wrote: >> On 1/11/19 6:02 PM, Jerome Glisse wrote: >>> On Fri, Jan 11, 2019 at 05:04:05PM -0800, John Hubbard wrote: On 1/11/19 8:51 AM, Jerome Glisse wrote: > On Thu, Jan 10, 2019 at 06:59:31PM -0800, John Hubbard wrote: >> On 1/3/19 6:44 AM, Jerome Glisse wrote: >>> On Thu, Jan 03, 2019 at 10:26:54AM +0100, Jan Kara wrote: On Wed 02-01-19 20:55:33, Jerome Glisse wrote: > On Wed, Dec 19, 2018 at 12:08:56PM +0100, Jan Kara wrote: >> On Tue 18-12-18 21:07:24, Jerome Glisse wrote: >>> On Tue, Dec 18, 2018 at 03:29:34PM -0800, John Hubbard wrote: > [...] Hi Jerome, Looks good, in a conceptual sense. Let me do a brain dump of how I see it, in case anyone spots a disastrous conceptual error (such as the lock_page point), while I'm putting together the revised patchset. I've studied this carefully, and I agree that using mapcount in this way is viable, *as long* as we use a lock (or a construct that looks just like one: your "memory barrier, check, retry" is really just a lock) in order to hold off gup() while page_mkclean() is in progress. In other words, nothing that increments mapcount may proceed while page_mkclean() is running. >>> >>> No, increment to page->_mapcount are fine while page_mkclean() is running. >>> The above solution do work no matter what happens thanks to the memory >>> barrier. By clearing the pin flag first and reading the page->_mapcount >>> after (and doing the reverse in GUP) we know that a racing GUP will either >>> have its pin page clear but the incremented mapcount taken into account by >>> page_mkclean() or page_mkclean() will miss the incremented mapcount but >>> it will also no clear the pin flag set concurrently by any GUP. >>> >>> Here are all the possible time line: >>> [T1]: >>> GUP on CPU0 | page_mkclean() on CPU1 >>> | >>> [G2] atomic_inc(&page->mapcount) | >>> [G3] smp_wmb(); | >>> [G4] SetPagePin(page); | >>> ... >>> | [C1] pined = TestClearPagePin(page); >> >> It appears that you're using the "page pin is clear" to indicate that >> page_mkclean() is running. The problem is, that approach leads to toggling >> the PagePin flag, and so an observer (other than gup or page_mkclean) will >> see intervals during which the PagePin flag is clear, when conceptually it >> should be set. >> >> Jan and other FS people, is it definitely the case that we only have to take >> action (defer, wait, revoke, etc) for gup-pinned pages, in page_mkclean()? >> Because I recall from earlier experiments that there were several places, >> not >> just page_mkclean(). > > Yes and it is fine to temporarily have the pin flag unstable. Anything > that need stable page content will have to lock the page so will have > to sync against any page_mkclean() and in the end the only thing were > we want to check the pin flag is when doing write back ie after > page_mkclean() while the page is still locked. If they are any other > place that need to check the pin flag then they will need to lock the > page. But i can not think of any other place right now. > > OK. Yes, since the clearing and resetting happens under page lock, that will suffice to synchronize it. That's a good point. > [...] > The other idea that you and Dan (and maybe others) pointed out was a debug option, which we'll certainly need in order to safely convert all the call sites. (Mirror the mappings at a different kernel offset, so that put_page() and put_user_page() can verify that the right call was made.) That will be a separate patchset, as you recommended. I'll even go as far as recommending the page lock itself. I realize that this adds overhead to gup(), but we *must* hold off page_mkclean(), and I believe that this (below) has similar overhead to the notes above--but is *much* easier to verify correct. (If the page lock is unacceptable due to being so widely used, then I'd recommend using another page bit to do the same thing.) >>> >>> Please page lock is pointless and it will not work for GUP fast. The above >>> scheme do work and is fine. I spend the day again thinking about all memory >>> ordering and i do not see any issues. >>> >> >> Why is it that page lock cannot be used for gup fast, btw? > > Well it can not happen within the preempt disable section. But after > as a post pass before GUP_fast return and after reenabling preempt then > it is fine like it would be for regular GUP. But locking page for GUP > is also likely to slow down some workload (with direct-IO). > Right, and so to crux of the matter: taking an uncontended page lock involves pretty much the s
Re: [PATCH] mISDN: hfcsusb: Use struct_size() in kzalloc()
From: "Gustavo A. R. Silva" Date: Tue, 8 Jan 2019 15:27:05 -0600 > One of the more common cases of allocation size calculations is finding the > size of a structure that has a zero-sized array at the end, along with memory > for some number of elements for that array. For example: > > struct foo { > int stuff; > void *entry[]; > }; > > instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL); > > Instead of leaving these open-coded and prone to type mistakes, we can now > use the new struct_size() helper: > > instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL); > > This code was detected with the help of Coccinelle. > > Signed-off-by: Gustavo A. R. Silva Applied.
Re: [PATCH v2] rbtree: fix the red root
On Fri, Jan 11, 2019 at 3:47 PM David Lechner wrote: > > On 1/11/19 2:58 PM, Qian Cai wrote: > > A GPF was reported, > > > > kasan: CONFIG_KASAN_INLINE enabled > > kasan: GPF could be caused by NULL-ptr deref or user memory access > > general protection fault: [#1] SMP KASAN > > kasan_die_handler.cold.22+0x11/0x31 > > notifier_call_chain+0x17b/0x390 > > atomic_notifier_call_chain+0xa7/0x1b0 > > notify_die+0x1be/0x2e0 > > do_general_protection+0x13e/0x330 > > general_protection+0x1e/0x30 > > rb_insert_color+0x189/0x1480 > > create_object+0x785/0xca0 > > kmemleak_alloc+0x2f/0x50 > > kmem_cache_alloc+0x1b9/0x3c0 > > getname_flags+0xdb/0x5d0 > > getname+0x1e/0x20 > > do_sys_open+0x3a1/0x7d0 > > __x64_sys_open+0x7e/0xc0 > > do_syscall_64+0x1b3/0x820 > > entry_SYSCALL_64_after_hwframe+0x49/0xbe > > > > It turned out, > > > > gparent = rb_red_parent(parent); > > tmp = gparent->rb_right; <-- GPF was triggered here. > > > > Apparently, "gparent" is NULL which indicates "parent" is rbtree's root > > which is red. Otherwise, it will be treated properly a few lines above. > > > > /* > > * If there is a black parent, we are done. > > * Otherwise, take some corrective action as, > > * per 4), we don't want a red root or two > > * consecutive red nodes. > > */ > > if(rb_is_black(parent)) > > break; > > > > Hence, it violates the rule #1 (the root can't be red) and need a fix > > up, and also add a regression test for it. This looks like was > > introduced by 6d58452dc06 where it no longer always paint the root as > > black. > > > > Fixes: 6d58452dc06 (rbtree: adjust root color in rb_insert_color() only > > when necessary) > > Reported-by: Esme > > Tested-by: Joey Pabalinas > > Signed-off-by: Qian Cai > > --- > > Tested-by: David Lechner > FWIW, this fixed the following crash for me: > > Unable to handle kernel NULL pointer dereference at virtual address 0004 Just to clarify, do you have a way to reproduce this crash without the fix ? I don't think the fix is correct, because it just silently ignores a corrupted rbtree (red root node). But the code that creates this situation certainly needs to be fixed - having a reproduceable test case would certainly help here. Regarding 6d58452dc06, the reasoning was that this code expects to be called after inserting a new (red) leaf into an rbtree that had all of its data structure invariants satisfied. So in this context, it should not be necessary to always reset the root to black, as this should already be the case...
Re: [PATCH 05/41] scsi: aic7xxx: aic79xx: mark expected switch fall-through
Hannes, >> Friendly ping (second one): >> >> Who can ack/review/take this patch, please? Applied to 5.1/scsi-queue. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH 10/41] scsi: bfa: bfa_fcs_lport: Mark expected switch fall-throughs
Gustavo, > In preparation to enabling -Wimplicit-fallthrough, mark switch cases > where we are expecting to fall through. Applied this and two other bfa patches to 5.1/scsi-queue. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH 1/3] arm64: dts: add dpaa2-console node for DPAA2 platforms
On Fri, Dec 21, 2018 at 03:31:09PM +, Ioana Ciornei wrote: > Add the dpaa2-console device tree node for the following > DPAA2 based platforms: LS1088A, LS2080A and LS2088A. > > Signed-off-by: Ioana Ciornei DTS patch generally goes to the last in a series. Shawn
Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions
On Fri, Jan 11, 2019 at 06:38:44PM -0800, John Hubbard wrote: > On 1/11/19 6:02 PM, Jerome Glisse wrote: > > On Fri, Jan 11, 2019 at 05:04:05PM -0800, John Hubbard wrote: > >> On 1/11/19 8:51 AM, Jerome Glisse wrote: > >>> On Thu, Jan 10, 2019 at 06:59:31PM -0800, John Hubbard wrote: > On 1/3/19 6:44 AM, Jerome Glisse wrote: > > On Thu, Jan 03, 2019 at 10:26:54AM +0100, Jan Kara wrote: > >> On Wed 02-01-19 20:55:33, Jerome Glisse wrote: > >>> On Wed, Dec 19, 2018 at 12:08:56PM +0100, Jan Kara wrote: > On Tue 18-12-18 21:07:24, Jerome Glisse wrote: > > On Tue, Dec 18, 2018 at 03:29:34PM -0800, John Hubbard wrote: > >>> [...] > >> > >> Hi Jerome, > >> > >> Looks good, in a conceptual sense. Let me do a brain dump of how I see it, > >> in case anyone spots a disastrous conceptual error (such as the lock_page > >> point), while I'm putting together the revised patchset. > >> > >> I've studied this carefully, and I agree that using mapcount in > >> this way is viable, *as long* as we use a lock (or a construct that looks > >> just > >> like one: your "memory barrier, check, retry" is really just a lock) in > >> order to hold off gup() while page_mkclean() is in progress. In other > >> words, > >> nothing that increments mapcount may proceed while page_mkclean() is > >> running. > > > > No, increment to page->_mapcount are fine while page_mkclean() is running. > > The above solution do work no matter what happens thanks to the memory > > barrier. By clearing the pin flag first and reading the page->_mapcount > > after (and doing the reverse in GUP) we know that a racing GUP will either > > have its pin page clear but the incremented mapcount taken into account by > > page_mkclean() or page_mkclean() will miss the incremented mapcount but > > it will also no clear the pin flag set concurrently by any GUP. > > > > Here are all the possible time line: > > [T1]: > > GUP on CPU0 | page_mkclean() on CPU1 > > | > > [G2] atomic_inc(&page->mapcount) | > > [G3] smp_wmb(); | > > [G4] SetPagePin(page); | > > ... > > | [C1] pined = TestClearPagePin(page); > > It appears that you're using the "page pin is clear" to indicate that > page_mkclean() is running. The problem is, that approach leads to toggling > the PagePin flag, and so an observer (other than gup or page_mkclean) will > see intervals during which the PagePin flag is clear, when conceptually it > should be set. > > Jan and other FS people, is it definitely the case that we only have to take > action (defer, wait, revoke, etc) for gup-pinned pages, in page_mkclean()? > Because I recall from earlier experiments that there were several places, not > just page_mkclean(). Yes and it is fine to temporarily have the pin flag unstable. Anything that need stable page content will have to lock the page so will have to sync against any page_mkclean() and in the end the only thing were we want to check the pin flag is when doing write back ie after page_mkclean() while the page is still locked. If they are any other place that need to check the pin flag then they will need to lock the page. But i can not think of any other place right now. [...] > >> The other idea that you and Dan (and maybe others) pointed out was a debug > >> option, which we'll certainly need in order to safely convert all the call > >> sites. (Mirror the mappings at a different kernel offset, so that > >> put_page() > >> and put_user_page() can verify that the right call was made.) That will be > >> a separate patchset, as you recommended. > >> > >> I'll even go as far as recommending the page lock itself. I realize that > >> this > >> adds overhead to gup(), but we *must* hold off page_mkclean(), and I > >> believe > >> that this (below) has similar overhead to the notes above--but is *much* > >> easier > >> to verify correct. (If the page lock is unacceptable due to being so > >> widely used, > >> then I'd recommend using another page bit to do the same thing.) > > > > Please page lock is pointless and it will not work for GUP fast. The above > > scheme do work and is fine. I spend the day again thinking about all memory > > ordering and i do not see any issues. > > > > Why is it that page lock cannot be used for gup fast, btw? Well it can not happen within the preempt disable section. But after as a post pass before GUP_fast return and after reenabling preempt then it is fine like it would be for regular GUP. But locking page for GUP is also likely to slow down some workload (with direct-IO). Cheers, Jérôme
Re: [PATCH 14/41] scsi: esas2r: esas2r_init: mark expected switch fall-throughs
Gustavo, > In preparation to enabling -Wimplicit-fallthrough, mark switch cases > where we are expecting to fall through. Applied to 5.1/scsi-queue. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH 1/2] arm64: dts: lx2160a: Add fsl-mc node
On Thu, Dec 20, 2018 at 05:02:03PM +, Ioana Ciocoi Radulescu wrote: > Add the fsl-mc node in the LX2160A device tree. > > Signed-off-by: Ioana Radulescu Applied both, thanks.
Re: [PATCH] scsi: hisi_sas: Set protection parameters prior to adding SCSI host
John, > Currently we set the protection parameters after calling scsi_add_host() > for v3 hw. > > They should be set beforehand, so make this change. Applied to 5.0/scsi-fixes. Thanks! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions
On 1/11/19 6:02 PM, Jerome Glisse wrote: > On Fri, Jan 11, 2019 at 05:04:05PM -0800, John Hubbard wrote: >> On 1/11/19 8:51 AM, Jerome Glisse wrote: >>> On Thu, Jan 10, 2019 at 06:59:31PM -0800, John Hubbard wrote: On 1/3/19 6:44 AM, Jerome Glisse wrote: > On Thu, Jan 03, 2019 at 10:26:54AM +0100, Jan Kara wrote: >> On Wed 02-01-19 20:55:33, Jerome Glisse wrote: >>> On Wed, Dec 19, 2018 at 12:08:56PM +0100, Jan Kara wrote: On Tue 18-12-18 21:07:24, Jerome Glisse wrote: > On Tue, Dec 18, 2018 at 03:29:34PM -0800, John Hubbard wrote: >>> [...] >> >> Hi Jerome, >> >> Looks good, in a conceptual sense. Let me do a brain dump of how I see it, >> in case anyone spots a disastrous conceptual error (such as the lock_page >> point), while I'm putting together the revised patchset. >> >> I've studied this carefully, and I agree that using mapcount in >> this way is viable, *as long* as we use a lock (or a construct that looks >> just >> like one: your "memory barrier, check, retry" is really just a lock) in >> order to hold off gup() while page_mkclean() is in progress. In other words, >> nothing that increments mapcount may proceed while page_mkclean() is running. > > No, increment to page->_mapcount are fine while page_mkclean() is running. > The above solution do work no matter what happens thanks to the memory > barrier. By clearing the pin flag first and reading the page->_mapcount > after (and doing the reverse in GUP) we know that a racing GUP will either > have its pin page clear but the incremented mapcount taken into account by > page_mkclean() or page_mkclean() will miss the incremented mapcount but > it will also no clear the pin flag set concurrently by any GUP. > > Here are all the possible time line: > [T1]: > GUP on CPU0 | page_mkclean() on CPU1 > | > [G2] atomic_inc(&page->mapcount) | > [G3] smp_wmb(); | > [G4] SetPagePin(page); | > ... > | [C1] pined = TestClearPagePin(page); It appears that you're using the "page pin is clear" to indicate that page_mkclean() is running. The problem is, that approach leads to toggling the PagePin flag, and so an observer (other than gup or page_mkclean) will see intervals during which the PagePin flag is clear, when conceptually it should be set. Jan and other FS people, is it definitely the case that we only have to take action (defer, wait, revoke, etc) for gup-pinned pages, in page_mkclean()? Because I recall from earlier experiments that there were several places, not just page_mkclean(). One more quick question below... > | [C2] smp_mb(); > | [C3] map_and_pin_count = > |atomic_read(&page->mapcount) > > It is fine because page_mkclean() will read the correct page->mapcount > which include the GUP that happens before [C1] > > > [T2]: > GUP on CPU0 | page_mkclean() on CPU1 > | > | [C1] pined = TestClearPagePin(page); > | [C2] smp_mb(); > | [C3] map_and_pin_count = > |atomic_read(&page->mapcount) > ... > [G2] atomic_inc(&page->mapcount) | > [G3] smp_wmb(); | > [G4] SetPagePin(page); | > > It is fine because [G4] set the pin flag so it does not matter that [C3] > did miss the mapcount increase from the GUP. > > > [T3]: > GUP on CPU0 | page_mkclean() on CPU1 > [G4] SetPagePin(page); | [C1] pined = TestClearPagePin(page); > > No matter which CPU ordering we get ie either: > - [G4] is overwritten by [C1] in that case [C3] will see the mapcount > that was incremented by [G2] so we will map_count < map_and_pin_count > and we will set the pin flag again at the end of page_mkclean() > - [C1] is overwritten by [G4] in that case the pin flag is set and thus > it does not matter that [C3] also see the mapcount that was incremented > by [G2] > > > This is totaly race free ie at the end of page_mkclean() the pin flag will > be set for all page that are pin and for some page that are no longer pin. > What matter is that they are no false negative. > > >> I especially am intrigued by your idea about a fuzzy count that allows >> false positives but no false negatives. To do that, we need to put a hard >> lock protecting the increment operation, but we can be loose (no lock) on >> decrement. That turns out to be a perfect match for the problem here, because >> as I recall from my earlier efforts, put_user_page() must *not* take locks-- >> and that's where we just decrement. Sweet! See below. > > You do not need lock, lock are easier to think with but they are not alwa
Re: [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()
John, > So how about just drop these APIs and let the user set the shost > protection parameters directly, like other shost parameters, The protection interfaces here obviously predate the block layer allocation changes that made this particular issue pop up. > which should make it a bit clearer when these should be set, > i.e. before scsi_add_host()? In general, I am not so keen on the somewhat messy intersection between the host parameters and the host template. The static host templates made lots of sense in the days of Seagate ST01 and fixed hardware capabilities. But reality is that most modern controllers have to query firmware interfaces to figure out what their actual capabilities are. So in this case I think that accessor functions are actually better because they allow us to print a big fat warning when you twiddle something you shouldn't post-initialization. So that's something I think we could--and should--improve. -- Martin K. Petersen Oracle Linux Engineering
[PATCH v2 9/9] kprobes: Prohibit probing on lockdep functions
Some lockdep functions can be involved in breakpoint handling and probing on those functions can cause a breakpoint recursion. Prohibit probing on those functions by blacklist. Signed-off-by: Masami Hiramatsu --- kernel/locking/lockdep.c |7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c index 95932333a48b..bc35a54ae3d4 100644 --- a/kernel/locking/lockdep.c +++ b/kernel/locking/lockdep.c @@ -50,6 +50,7 @@ #include #include #include +#include #include @@ -2814,6 +2815,7 @@ void lockdep_hardirqs_on(unsigned long ip) __trace_hardirqs_on_caller(ip); current->lockdep_recursion = 0; } +NOKPROBE_SYMBOL(lockdep_hardirqs_on); /* * Hardirqs were disabled: @@ -2843,6 +2845,7 @@ void lockdep_hardirqs_off(unsigned long ip) } else debug_atomic_inc(redundant_hardirqs_off); } +NOKPROBE_SYMBOL(lockdep_hardirqs_off); /* * Softirqs will be enabled: @@ -3650,7 +3653,8 @@ __lock_release(struct lockdep_map *lock, int nested, unsigned long ip) return 0; } -static int __lock_is_held(const struct lockdep_map *lock, int read) +static nokprobe_inline +int __lock_is_held(const struct lockdep_map *lock, int read) { struct task_struct *curr = current; int i; @@ -3883,6 +3887,7 @@ int lock_is_held_type(const struct lockdep_map *lock, int read) return ret; } EXPORT_SYMBOL_GPL(lock_is_held_type); +NOKPROBE_SYMBOL(lock_is_held_type); struct pin_cookie lock_pin_lock(struct lockdep_map *lock) {
[PATCH v2 8/9] kprobes: Prohibit probing on RCU debug routine
Since kprobe itself depends on RCU, probing on RCU debug routine can cause recursive breakpoint problem. Prohibit probing on RCU debug routines. int3 ->do_int3() ->ist_enter() ->RCU_LOCKDEP_WARN() ->debug_lockdep_rcu_enabled() -> int3 Signed-off-by: Masami Hiramatsu --- kernel/rcu/update.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c index 1971869c4072..f4ca36d92138 100644 --- a/kernel/rcu/update.c +++ b/kernel/rcu/update.c @@ -52,6 +52,7 @@ #include #include #include +#include #define CREATE_TRACE_POINTS @@ -249,6 +250,7 @@ int notrace debug_lockdep_rcu_enabled(void) current->lockdep_recursion == 0; } EXPORT_SYMBOL_GPL(debug_lockdep_rcu_enabled); +NOKPROBE_SYMBOL(debug_lockdep_rcu_enabled); /** * rcu_read_lock_held() - might we be in RCU read-side critical section?
[PATCH v2 6/9] kprobes: Prohibit probing on hardirq tracers
Since kprobes breakpoint handling involves hardirq tracer, probing these functions cause breakpoint recursion problem. Prohibit probing on those functions. Signed-off-by: Masami Hiramatsu Acked-by: Steven Rostedt (VMware) --- kernel/trace/trace_irqsoff.c|9 +++-- kernel/trace/trace_preemptirq.c |5 + 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c index 98ea6d28df15..829709bfec3d 100644 --- a/kernel/trace/trace_irqsoff.c +++ b/kernel/trace/trace_irqsoff.c @@ -14,6 +14,7 @@ #include #include #include +#include #include "trace.h" @@ -368,7 +369,7 @@ check_critical_timing(struct trace_array *tr, __trace_function(tr, CALLER_ADDR0, parent_ip, flags, pc); } -static inline void +static nokprobe_inline void start_critical_timing(unsigned long ip, unsigned long parent_ip, int pc) { int cpu; @@ -404,7 +405,7 @@ start_critical_timing(unsigned long ip, unsigned long parent_ip, int pc) atomic_dec(&data->disabled); } -static inline void +static nokprobe_inline void stop_critical_timing(unsigned long ip, unsigned long parent_ip, int pc) { int cpu; @@ -446,6 +447,7 @@ void start_critical_timings(void) start_critical_timing(CALLER_ADDR0, CALLER_ADDR1, pc); } EXPORT_SYMBOL_GPL(start_critical_timings); +NOKPROBE_SYMBOL(start_critical_timings); void stop_critical_timings(void) { @@ -455,6 +457,7 @@ void stop_critical_timings(void) stop_critical_timing(CALLER_ADDR0, CALLER_ADDR1, pc); } EXPORT_SYMBOL_GPL(stop_critical_timings); +NOKPROBE_SYMBOL(stop_critical_timings); #ifdef CONFIG_FUNCTION_TRACER static bool function_enabled; @@ -615,6 +618,7 @@ void tracer_hardirqs_on(unsigned long a0, unsigned long a1) if (!preempt_trace(pc) && irq_trace()) stop_critical_timing(a0, a1, pc); } +NOKPROBE_SYMBOL(tracer_hardirqs_on); void tracer_hardirqs_off(unsigned long a0, unsigned long a1) { @@ -623,6 +627,7 @@ void tracer_hardirqs_off(unsigned long a0, unsigned long a1) if (!preempt_trace(pc) && irq_trace()) start_critical_timing(a0, a1, pc); } +NOKPROBE_SYMBOL(tracer_hardirqs_off); static int irqsoff_tracer_init(struct trace_array *tr) { diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c index 71f553cceb3c..4d8e99fdbbbe 100644 --- a/kernel/trace/trace_preemptirq.c +++ b/kernel/trace/trace_preemptirq.c @@ -9,6 +9,7 @@ #include #include #include +#include #include "trace.h" #define CREATE_TRACE_POINTS @@ -30,6 +31,7 @@ void trace_hardirqs_on(void) lockdep_hardirqs_on(CALLER_ADDR0); } EXPORT_SYMBOL(trace_hardirqs_on); +NOKPROBE_SYMBOL(trace_hardirqs_on); void trace_hardirqs_off(void) { @@ -43,6 +45,7 @@ void trace_hardirqs_off(void) lockdep_hardirqs_off(CALLER_ADDR0); } EXPORT_SYMBOL(trace_hardirqs_off); +NOKPROBE_SYMBOL(trace_hardirqs_off); __visible void trace_hardirqs_on_caller(unsigned long caller_addr) { @@ -56,6 +59,7 @@ __visible void trace_hardirqs_on_caller(unsigned long caller_addr) lockdep_hardirqs_on(CALLER_ADDR0); } EXPORT_SYMBOL(trace_hardirqs_on_caller); +NOKPROBE_SYMBOL(trace_hardirqs_on_caller); __visible void trace_hardirqs_off_caller(unsigned long caller_addr) { @@ -69,6 +73,7 @@ __visible void trace_hardirqs_off_caller(unsigned long caller_addr) lockdep_hardirqs_off(CALLER_ADDR0); } EXPORT_SYMBOL(trace_hardirqs_off_caller); +NOKPROBE_SYMBOL(trace_hardirqs_off_caller); #endif /* CONFIG_TRACE_IRQFLAGS */ #ifdef CONFIG_TRACE_PREEMPT_TOGGLE
[PATCH v2 7/9] kprobes: Prohibit probing on preempt_check debug functions
Since kprobes depends on preempt disable/enable, probing on the preempt debug routine can cause recursive breakpoint problem. Signed-off-by: Masami Hiramatsu --- lib/smp_processor_id.c |7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/lib/smp_processor_id.c b/lib/smp_processor_id.c index 85925aaa4fff..157d9e31f6c2 100644 --- a/lib/smp_processor_id.c +++ b/lib/smp_processor_id.c @@ -5,10 +5,11 @@ * DEBUG_PREEMPT variant of smp_processor_id(). */ #include +#include #include -notrace static unsigned int check_preemption_disabled(const char *what1, - const char *what2) +notrace static nokprobe_inline +unsigned int check_preemption_disabled(const char *what1, const char *what2) { int this_cpu = raw_smp_processor_id(); @@ -56,9 +57,11 @@ notrace unsigned int debug_smp_processor_id(void) return check_preemption_disabled("smp_processor_id", ""); } EXPORT_SYMBOL(debug_smp_processor_id); +NOKPROBE_SYMBOL(debug_smp_processor_id); notrace void __this_cpu_preempt_check(const char *op) { check_preemption_disabled("__this_cpu_", op); } EXPORT_SYMBOL(__this_cpu_preempt_check); +NOKPROBE_SYMBOL(__this_cpu_preempt_check);
[PATCH v2 4/9] x86/kprobes: Prohibit probing on IRQ handlers directly
Prohibit probing on IRQ handlers in irqentry_text because if it interrupts user mode, at that point we haven't changed to kernel space yet and which eventually leads a double fault. E.g. # echo p apic_timer_interrupt > kprobe_events # echo 1 > events/kprobes/enable PANIC: double fault, error_code: 0x0 CPU: 1 PID: 814 Comm: less Not tainted 4.20.0-rc3+ #30 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) RIP: 0010:error_entry+0x12/0xf0 [snip] Call Trace: ? native_iret+0x7/0x7 ? async_page_fault+0x8/0x30 ? trace_hardirqs_on_thunk+0x1c/0x1c ? error_entry+0x7c/0xf0 ? async_page_fault+0x8/0x30 ? native_iret+0x7/0x7 ? int3+0xa/0x20 ? trace_hardirqs_on_thunk+0x1c/0x1c ? error_entry+0x7c/0xf0 ? int3+0xa/0x20 ? apic_timer_interrupt+0x1/0x20 Kernel panic - not syncing: Machine halted. Kernel Offset: disabled ---[ end Kernel panic - not syncing: Machine halted. ]--- Signed-off-by: Masami Hiramatsu --- arch/x86/kernel/kprobes/core.c |7 +++ 1 file changed, 7 insertions(+) diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c index f4b954ff5b89..fed46ddb1eef 100644 --- a/arch/x86/kernel/kprobes/core.c +++ b/arch/x86/kernel/kprobes/core.c @@ -1072,6 +1072,13 @@ NOKPROBE_SYMBOL(kprobe_fault_handler); int __init arch_populate_kprobe_blacklist(void) { + int ret; + + ret = kprobe_add_area_blacklist((unsigned long)__irqentry_text_start, +(unsigned long)__irqentry_text_end); + if (ret) + return ret; + return kprobe_add_area_blacklist((unsigned long)__entry_text_start, (unsigned long)__entry_text_end); }
[PATCH v2 5/9] kprobes: Search non-suffixed symbol in blacklist
Newer gcc can generate some different instances of a function with suffixed symbols if the function is optimized and only has a part of that. (e.g. .constprop, .part etc.) In this case, it is not enough to check the entry of kprobe blacklist because it only records non-suffixed symbol address. To fix this issue, search non-suffixed symbol in blacklist if given address is within a symbol which has a suffix. Note that this can cause false positive cases if a kprobe-safe function is optimized to suffixed instance and has same name symbol which is blacklisted. But I would like to chose a fail-safe design for this issue. Signed-off-by: Masami Hiramatsu --- kernel/kprobes.c | 21 - 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/kernel/kprobes.c b/kernel/kprobes.c index e8c76164f541..faa519f07aad 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -1396,7 +1396,7 @@ bool __weak arch_within_kprobe_blacklist(unsigned long addr) addr < (unsigned long)__kprobes_text_end; } -bool within_kprobe_blacklist(unsigned long addr) +static bool __within_kprobe_blacklist(unsigned long addr) { struct kprobe_blacklist_entry *ent; @@ -1410,7 +1410,26 @@ bool within_kprobe_blacklist(unsigned long addr) if (addr >= ent->start_addr && addr < ent->end_addr) return true; } + return false; +} +bool within_kprobe_blacklist(unsigned long addr) +{ + char symname[KSYM_NAME_LEN], *p; + + if (__within_kprobe_blacklist(addr)) + return true; + + /* Check if the address is on a suffixed-symbol */ + if (!lookup_symbol_name(addr, symname)) { + p = strchr(symname, '.'); + if (!p) + return false; + *p = '\0'; + addr = (unsigned long)kprobe_lookup_name(symname, 0); + if (addr) + return __within_kprobe_blacklist(addr); + } return false; }
[PATCH v2 3/9] x86/kprobes: Prohibit probing on functions before kprobe_int3_handler()
Prohibit probing on the functions called before kprobe_int3_handler() in do_int3(). More specifically, ftrace_int3_handler(), poke_int3_handler(), and ist_enter(). And since rcu_nmi_enter() is called by ist_enter(), it also should be marked as NOKPROBE_SYMBOL. Since those are handled before kprobe_int3_handler(), probing those functions can cause a breakpoint recursion and crash the kernel. Signed-off-by: Masami Hiramatsu --- arch/x86/kernel/alternative.c |3 ++- arch/x86/kernel/ftrace.c |3 ++- arch/x86/kernel/traps.c |1 + kernel/rcu/tree.c |2 ++ 4 files changed, 7 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index ebeac487a20c..e8b628b1b279 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include #include @@ -764,8 +765,8 @@ int poke_int3_handler(struct pt_regs *regs) regs->ip = (unsigned long) bp_int3_handler; return 1; - } +NOKPROBE_SYMBOL(poke_int3_handler); /** * text_poke_bp() -- update instructions on live kernel on SMP diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c index 7ee8067cbf45..22a548919228 100644 --- a/arch/x86/kernel/ftrace.c +++ b/arch/x86/kernel/ftrace.c @@ -269,7 +269,7 @@ int ftrace_update_ftrace_func(ftrace_func_t func) return ret; } -static int is_ftrace_caller(unsigned long ip) +static nokprobe_inline int is_ftrace_caller(unsigned long ip) { if (ip == ftrace_update_func) return 1; @@ -299,6 +299,7 @@ int ftrace_int3_handler(struct pt_regs *regs) return 1; } +NOKPROBE_SYMBOL(ftrace_int3_handler); static int ftrace_write(unsigned long ip, const char *val, int size) { diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 9b7c4ca8f0a7..e289ce1332ab 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -111,6 +111,7 @@ void ist_enter(struct pt_regs *regs) /* This code is a bit fragile. Test it. */ RCU_LOCKDEP_WARN(!rcu_is_watching(), "ist_enter didn't work"); } +NOKPROBE_SYMBOL(ist_enter); void ist_exit(struct pt_regs *regs) { diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 9180158756d2..74db52a0a466 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -62,6 +62,7 @@ #include #include #include +#include #include "tree.h" #include "rcu.h" @@ -872,6 +873,7 @@ void rcu_nmi_enter(void) { rcu_nmi_enter_common(false); } +NOKPROBE_SYMBOL(rcu_nmi_enter); /** * rcu_irq_enter - inform RCU that current CPU is entering irq away from idle
[PATCH v2 2/9] x86/kprobes: Move trampoline code into RODATA
Move optprobe trampoline code into RODATA since it is not executed, but copied and modified to be used on a trampoline buffer. Signed-off-by: Masami Hiramatsu --- arch/x86/kernel/kprobes/opt.c |9 ++--- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c index 544bd41a514c..f14262952015 100644 --- a/arch/x86/kernel/kprobes/opt.c +++ b/arch/x86/kernel/kprobes/opt.c @@ -97,6 +97,7 @@ static void synthesize_set_arg1(kprobe_opcode_t *addr, unsigned long val) } asm ( + ".pushsection .rodata\n" "optprobe_template_func:\n" ".global optprobe_template_entry\n" "optprobe_template_entry:\n" @@ -136,16 +137,10 @@ asm ( #endif ".global optprobe_template_end\n" "optprobe_template_end:\n" - ".type optprobe_template_func, @function\n" - ".size optprobe_template_func, .-optprobe_template_func\n"); + ".popsection\n"); void optprobe_template_func(void); STACK_FRAME_NON_STANDARD(optprobe_template_func); -NOKPROBE_SYMBOL(optprobe_template_func); -NOKPROBE_SYMBOL(optprobe_template_entry); -NOKPROBE_SYMBOL(optprobe_template_val); -NOKPROBE_SYMBOL(optprobe_template_call); -NOKPROBE_SYMBOL(optprobe_template_end); #define TMPL_MOVE_IDX \ ((long)optprobe_template_val - (long)optprobe_template_entry)
[PATCH v2 1/9] x86/kprobes: Prohibit probing on optprobe template code
Prohibit probing on optprobe template code, since it is not a code but a template instruction sequence. If we modify this template, copied template must be broken. Signed-off-by: Masami Hiramatsu Fixes: 9326638cbee2 ("kprobes, x86: Use NOKPROBE_SYMBOL() instead of __kprobes annotation") Cc: sta...@vger.kernel.org --- arch/x86/kernel/kprobes/opt.c |5 + 1 file changed, 5 insertions(+) diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c index 6adf6e6c2933..544bd41a514c 100644 --- a/arch/x86/kernel/kprobes/opt.c +++ b/arch/x86/kernel/kprobes/opt.c @@ -141,6 +141,11 @@ asm ( void optprobe_template_func(void); STACK_FRAME_NON_STANDARD(optprobe_template_func); +NOKPROBE_SYMBOL(optprobe_template_func); +NOKPROBE_SYMBOL(optprobe_template_entry); +NOKPROBE_SYMBOL(optprobe_template_val); +NOKPROBE_SYMBOL(optprobe_template_call); +NOKPROBE_SYMBOL(optprobe_template_end); #define TMPL_MOVE_IDX \ ((long)optprobe_template_val - (long)optprobe_template_entry)
[PATCH v2 0/9] kprobes: Fix and improve blacklist symbols
Hi, Here is the v2 series of kprobes blacklist bugfix and improvements mainly on x86 (since I started testing on qemu-x86). >From v1, I just removed stable-ml from Cc (but tagged [1/9]) and added Steve's Ack. This has been started from discussion about KPROBE_ENENTS_ON_NOTRACE configuration. I tried to find notrace functions which can cause kernel crash with kprobes using following script. #!/bin/sh i=0; cat notrace_functions | while read f ; do if echo p:event$i $f >> /sys/kernel/debug/tracing/kprobe_events; then echo "Probing on $f" echo 1 > /sys/kernel/debug/tracing/events/kprobes/event$i/enable fi i=$((i+1)) done And I found several functions which must be blacklisted. - optprobe template code, which is just a template code and never be executed. Moreover, since it can be copied and reused, if we probe it, it modifies the template code and can cause a crash. ([1/9][2/9]) - functions which is called before kprobe_int3_handler() handles kprobes. This can cause a breakpoint recursion. ([3/9]) - IRQ entry text, which should not be probed since register/pagetable status has not been stable at that point. ([4/9]) - Suffixed symbols, like .constprop, .part etc. Those suffixed symbols never be blacklisted even if the non-suffixed version has been blacklisted. ([5/9]) - hardirq tracer also works before int3 handling. ([6/9]) - preempt_check debug function also is involved in int3 handling. ([7/9]) - RCU debug routine is also called before kprobe_int3_handler(). ([8/9]) - Some lockdep functions are also involved in int3 handling. ([9/9]) Of course there still may be some functions which can be called by configuration change, I'll continue to test it. Thank you, --- Masami Hiramatsu (9): x86/kprobes: Prohibit probing on optprobe template code x86/kprobes: Move trampoline code into RODATA x86/kprobes: Prohibit probing on functions before kprobe_int3_handler() x86/kprobes: Prohibit probing on IRQ handlers directly kprobes: Search non-suffixed symbol in blacklist kprobes: Prohibit probing on hardirq tracers kprobes: Prohibit probing on preempt_check debug functions kprobes: Prohibit probing on RCU debug routine kprobes: Prohibit probing on lockdep functions arch/x86/kernel/alternative.c |3 ++- arch/x86/kernel/ftrace.c|3 ++- arch/x86/kernel/kprobes/core.c |7 +++ arch/x86/kernel/kprobes/opt.c |4 ++-- arch/x86/kernel/traps.c |1 + kernel/kprobes.c| 21 - kernel/locking/lockdep.c|7 ++- kernel/rcu/tree.c |2 ++ kernel/rcu/update.c |2 ++ kernel/trace/trace_irqsoff.c|9 +++-- kernel/trace/trace_preemptirq.c |5 + lib/smp_processor_id.c |7 +-- 12 files changed, 61 insertions(+), 10 deletions(-) -- Masami Hiramatsu (Linaro)
Re: [PATCH net v4 0/4] net: bpfilter: fix two bugs in bpfilter
From: Taehee Yoo Date: Wed, 9 Jan 2019 02:23:42 +0900 > This patches fix two bugs in the bpfilter_umh which are related in > iptables command. ... Series applied, thank you.
Re: [PATCH v4] RISC-V: defconfig: Enable Generic PCIE by default
On Sat, 12 Jan 2019, Alistair Francis wrote: > Enable generic PCIe by default in the RISC-V defconfig, this allows us > to use QEMU's PCIe support out of the box. > > Signed-off-by: Alistair Francis Reviewed-by: Paul Walmsley - Paul
Re: [PATCH 36/41] scsi: qla4xxx: ql4_os: mark expected switch fall-through
Nilesh, >>In preparation to enabling -Wimplicit-fallthrough, mark switch cases >>where we are expecting to fall through. >> >>Notice that, in this particular case, I replaced "allow fall-through" >>with a "fall through" annotation, which is what GCC is expecting to >>find. Applied to 5.1/scsi-queue. Thanks, Gustavo. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions
On Fri, Jan 11, 2019 at 05:04:05PM -0800, John Hubbard wrote: > On 1/11/19 8:51 AM, Jerome Glisse wrote: > > On Thu, Jan 10, 2019 at 06:59:31PM -0800, John Hubbard wrote: > >> On 1/3/19 6:44 AM, Jerome Glisse wrote: > >>> On Thu, Jan 03, 2019 at 10:26:54AM +0100, Jan Kara wrote: > On Wed 02-01-19 20:55:33, Jerome Glisse wrote: > > On Wed, Dec 19, 2018 at 12:08:56PM +0100, Jan Kara wrote: > >> On Tue 18-12-18 21:07:24, Jerome Glisse wrote: > >>> On Tue, Dec 18, 2018 at 03:29:34PM -0800, John Hubbard wrote: > > > > [...] > > > > Now page_mkclean: > > > > int page_mkclean(struct page *page) > > { > > int cleaned = 0; > > + int real_mapcount = 0; > > struct address_space *mapping; > > struct rmap_walk_control rwc = { > > .arg = (void *)&cleaned, > > .rmap_one = page_mkclean_one, > > .invalid_vma = invalid_mkclean_vma, > > + .mapcount = &real_mapcount, > > }; > > + int mapcount1, mapcount2; > > > > BUG_ON(!PageLocked(page)); > > > > if (!page_mapped(page)) > > return 0; > > > > mapping = page_mapping(page); > > if (!mapping) > > return 0; > > > > + mapcount1 = page_mapcount(page); > > // rmap_walk need to change to count mapping and return value > > // in .mapcount easy one > > rmap_walk(page, &rwc); > > So what prevents GUP_fast() to grab reference here and the test below > would > think the page is not pinned? Or do you assume that every page_mkclean() > call will be protected by PageWriteback (currently it is not) so that > GUP_fast() blocks / bails out? > >> > >> Continuing this thread, still focusing only on the "how to maintain a > >> PageDmaPinned > >> for each page" question (ignoring, for now, what to actually *do* in > >> response to > >> that flag being set): > >> > >> 1. Jan's point above is still a problem: PageWriteback != "page_mkclean is > >> happening". > >> This is probably less troubling than the next point, but it does undermine > >> all the > >> complicated schemes involving PageWriteback, that try to synchronize gup() > >> with > >> page_mkclean(). > >> > >> 2. Also, the mapcount approach here still does not reliably avoid false > >> negatives > >> (that is, a page may have been gup'd, but page_mkclean could miss that): > >> gup() > >> can always jump in and increment the mapcount, while page_mkclean is in > >> the middle > >> of making (wrong) decisions based on that mapcount. There's no lock to > >> prevent that. > >> > >> Again: mapcount can go up *or* down, so I'm not seeing a true solution yet. > > > > Both point is address by the solution at the end of this email. > > > >>> > >>> So GUP_fast() becomes: > >>> > >>> GUP_fast_existing() { ... } > >>> GUP_fast() > >>> { > >>> GUP_fast_existing(); > >>> > >>> for (i = 0; i < npages; ++i) { > >>> if (PageWriteback(pages[i])) { > >>> // need to force slow path for this page > >>> } else { > >>> SetPageDmaPinned(pages[i]); > >>> atomic_inc(pages[i]->mapcount); > >>> } > >>> } > >>> } > >>> > >>> This is a minor slow down for GUP fast and it takes care of a > >>> write back race on behalf of caller. This means that page_mkclean > >>> can not see a mapcount value that increase. This simplify thing > >>> we can relax that. Note that what this is doing is making sure > >>> that GUP_fast never get lucky :) ie never GUP a page that is in > >>> the process of being write back but has not yet had its pte > >>> updated to reflect that. > >>> > >>> > But I think that detecting pinned pages with small false positive rate is > OK. The extra page bouncing will cost some performance but if it is rare, > then we are OK. So I think we can go for the simple version of detecting > pinned pages as you mentioned in some earlier email. We just have to be > sure there are no false negatives. > >>> > >> > >> Agree with that sentiment, but there are still false negatives and I'm not > >> yet seeing any solutions for that. > > > > So here is the solution: > > > > > > Is a page pin ? With no false negative: > > === > > > > get_user_page*() aka GUP: > > if (!PageAnon(page)) { > > bool write_back = PageWriteback(page); > > bool page_is_pin = PagePin(page); > > if (write_back && !page_is_pin) { > > /* Wait for write back a re-try GUP */ > > ... > > goto retry; > > } > > [G1]smp_rmb(); > > [G2]atomic_inc(&page->_mapcount) > > [G3]smp_wmb(); > > [G4]SetPagePin(page); > > [G5]smp_wmb(); > > [G6]if (!write_back && !page_is_pin && PageWriteback(page)) { > > /* Back-off as write back might have miss us */ > > atomic_dec(&page->_mapcount); > > /*
Good News for 2019 and Beyond: National Neuroscience Institute Singapore MRI Brain Scan Stroke Protocol Radiology Report of 29 Dec 2018
Subject: Good News for 2019 and Beyond: National Neuroscience Institute Singapore MRI Brain Scan Stroke Protocol Radiology Report of 29 Dec 2018 National Neuroscience Institute SingHealth, Singapore Radiology Report Patient Name: TEO EN MING Patient MRN: Gender: Male Date of Birth: 1978 Ordering Doctor: YEO ENG HUI DAMIAN (17684B) Procedure: LIMITED STUDIES (MRA) - NNI Location: Exam Date: 29-Dec-2018 03:38 Exam ID: 862664 Main report text: Examd Id: 862664 REPORT STATUS: APPROVED LIMITED STUDIES (MRA) of 29-DEC-2018: REASON FOR EXAM: Left sided numbness (Correction by Teo En Ming: It is not left sided numbness. It is tingling sensation on left hand and left leg. And it is known as paraesthesia.) REPORT: Stroke protocol MRI brain was performed. MRI brain dated 1 Sep 2016 was reviewed. There is no restricted diffusion to suggest an acute infarct. No abnormal susceptibility is detected to indicate intracranial haemorrhage. There is no mass effect, hydrocephalus or midline shift. The basal cisterns are preserved. The major intracranial arteries demonstrate normal signal voids. Fast MRA reveals no severe stenosis or major arterial occlusion in the anterior and posterior circulation. The imaged paranasal sinuses and mastoid cells are clear. IMPRESSION: No acute infarct or intracranial haemorrhage is demonstrated. No major arterial occlusion or flow limiting stenosis on fast MRA. Reported by: DR KWOK YING CHRISTINE on 29-DEC-2018 07:19 AM Approved by: DR JOANNA PEARLY TI on 29-DEC-2018 09:05 AM Dictated by: Dr Kwok Ying Christine at 29-Dec-2018 07:19 Approved by: Dr Joanna Pearly Ti at 29-Dec-2018 09:05 Verified by: KOAY LEPING (61807A) on 29-Dec-2018 10:47 Printed by: KOAY LEPING on 29-Dec-2018 14:39:52 This is an electronic copy. No signature required. 1 of 1 REFERENCE RESOURCES == Two Redundant Links to Scanned Radiology Report: [LINK 1] https://tdtemcerts.blogspot.com/2019/01/good-news-for-2019-and-beyond-national.html [LINK 2] https://tdtemcerts.wordpress.com/2019/01/12/good-news-for-2019-and-beyond-national-neuroscience-institute-singapore-mri-brain-scan-stroke-protocol-radiology-report-of-29-dec-2018/ ===BEGIN SIGNATURE=== Turritopsis Dohrnii Teo En Ming's Academic Qualifications as at 30 Oct 2017 [1] https://tdtemcerts.wordpress.com/ [2] http://tdtemcerts.blogspot.sg/ [3] https://www.scribd.com/user/270125049/Teo-En-Ming ===END SIGNATURE===
Re: [PATCH] isdn: i4l: isdn_tty: Fix some concurrency double-free bugs
From: Jia-Ju Bai Date: Tue, 8 Jan 2019 21:04:48 +0800 > The functions isdn_tty_tiocmset() and isdn_tty_set_termios() may be > concurrently executed. ... > These possible bugs are found by a static tool written by myself and > my manual code review. > > To fix these possible bugs, the mutex lock "modem_info_mutex" used in > isdn_tty_tiocmset() is added in isdn_tty_set_termios(). > > Signed-off-by: Jia-Ju Bai Applied.
Re: [PATCH 1/3] media: dt: bindings: sunxi-ir: Add A64 compatible
On Sat, Jan 12, 2019 at 1:30 AM Jernej Skrabec wrote: > > A64 IR is compatible with A13, so add A64 compatible with A13 as a > fallback. We ask people to add the SoC-specific compatible as a contigency, in case things turn out to be not so "compatible". To be consistent with all the other SoCs and other peripherals, unless you already spotted a "compatible" difference in the hardware, i.e. the hardware isn't completely the same, this patch isn't needed. On the other hand, if you did, please mention the differences in the commit log. ChenYu > Signed-off-by: Jernej Skrabec > --- > Documentation/devicetree/bindings/media/sunxi-ir.txt | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/Documentation/devicetree/bindings/media/sunxi-ir.txt > b/Documentation/devicetree/bindings/media/sunxi-ir.txt > index 278098987edb..ecac6964b69b 100644 > --- a/Documentation/devicetree/bindings/media/sunxi-ir.txt > +++ b/Documentation/devicetree/bindings/media/sunxi-ir.txt > @@ -1,7 +1,10 @@ > Device-Tree bindings for SUNXI IR controller found in sunXi SoC family > > Required properties: > -- compatible : "allwinner,sun4i-a10-ir" or "allwinner,sun5i-a13-ir" > +- compatible : value must be one of: > + * "allwinner,sun4i-a10-ir" > + * "allwinner,sun5i-a13-ir" > + * "allwinner,sun50i-a64-ir", "allwinner,sun5i-a13-ir" > - clocks : list of clock specifiers, corresponding to > entries in clock-names property; > - clock-names : should contain "apb" and "ir" entries; > -- > 2.20.1 >
Re: UBSAN: Undefined behaviour in drivers/pps/pps.c
On Wed, Jan 09, 2019 at 10:20:50AM +0100, Rodolfo Giometti wrote: > On 08/01/2019 21:24, Kyungtae Kim wrote: > > We report a bug in linux-4.20: "UBSAN: Undefined behaviour in > > drivers/pps/pps.c" > > > > kernel config: https://kt0755.github.io/etc/config_v4.20_stable > > repro: https://kt0755.github.io/etc/repro.a6372.c > > > > pps_cdev_pps_fetch() lacks the bounds checking for computing > > fdata->timeout.sec * HZ, that causes such integer overflow when the result > > is larger than the boundary. > > The patch below checks the possibility of overflow right before the > > multiplication. > > > > = > > UBSAN: Undefined behaviour in drivers/pps/pps.c:82:30 > > signed integer overflow: > > -7557201428062104791 * 100 cannot be represented in type 'long long int' > > CPU: 0 PID: 10159 Comm: syz-executor6 Not tainted 4.20.0 #1 > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 > > Call Trace: > > __dump_stack lib/dump_stack.c:77 [inline] > > dump_stack+0xb1/0x118 lib/dump_stack.c:113 > > ubsan_epilogue+0x12/0x94 lib/ubsan.c:159 > > handle_overflow+0x1cf/0x21a lib/ubsan.c:190 > > __ubsan_handle_mul_overflow+0x2a/0x35 lib/ubsan.c:214 > > pps_cdev_pps_fetch+0x575/0x5b0 drivers/pps/pps.c:82 > > pps_cdev_ioctl+0x567/0x910 drivers/pps/pps.c:191 > > vfs_ioctl fs/ioctl.c:46 [inline] > > do_vfs_ioctl+0x1aa/0x1160 fs/ioctl.c:698 > > ksys_ioctl+0x9e/0xb0 fs/ioctl.c:713 > > __do_sys_ioctl fs/ioctl.c:720 [inline] > > __se_sys_ioctl fs/ioctl.c:718 [inline] > > __x64_sys_ioctl+0x7e/0xc0 fs/ioctl.c:718 > > do_syscall_64+0xbe/0x4f0 arch/x86/entry/common.c:290 > > entry_SYSCALL_64_after_hwframe+0x49/0xbe > > RIP: 0033:0x4497b9 > > Code: e8 8c 9f 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 > > 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d > > 01 f0 ff ff 0f 83 9b 6b fc ff c3 66 2e 0f 1f 84 00 00 00 00 > > RSP: 002b:7f8cf875bc68 EFLAGS: 0246 ORIG_RAX: 0010 > > RAX: ffda RBX: 7f8cf875c6cc RCX: 004497b9 > > RDX: 2240 RSI: c00870a4 RDI: 0014 > > RBP: 0071bea0 R08: R09: > > R10: R11: 0246 R12: > > R13: 5c10 R14: 006eecb0 R15: 7f8cf875c700 > > = > > > > --- > > drivers/pps/pps.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/drivers/pps/pps.c b/drivers/pps/pps.c > > index 8febacb..66002e1 100644 > > --- a/drivers/pps/pps.c > > +++ b/drivers/pps/pps.c > > @@ -79,6 +79,8 @@ static int pps_cdev_pps_fetch(struct pps_device > > *pps, struct pps_fdata *fdata) > > dev_dbg(pps->dev, "timeout %lld.%09d\n", > > (long long) fdata->timeout.sec, > > fdata->timeout.nsec); > > + if (fdata->timeout.sec > S64_MAX / HZ) > > + return -EINVAL; > > ticks = fdata->timeout.sec * HZ; > > ticks += fdata->timeout.nsec / (NSEC_PER_SEC / HZ); > > It looks good to me. Do you think is better adding a check for timeout.nsec > also? Another option is to use check_mul_overflow(). > > Now you have to produce a patch according to > linux/Documentation/process/submitting-patches.rst and then submitting it! > :-) > Thanks. -- Dmitry
Re: [PATCH] net: nvidia: forcedeth: Fix two possible concurrency use-after-free bugs
From: Jia-Ju Bai Date: Tue, 8 Jan 2019 20:45:18 +0800 > In drivers/net/ethernet/nvidia/forcedeth.c, the functions > nv_start_xmit() and nv_start_xmit_optimized() can be concurrently > executed with nv_poll_controller(). > > nv_start_xmit > line 2321: prev_tx_ctx->skb = skb; > > nv_start_xmit_optimized > line 2479: prev_tx_ctx->skb = skb; > > nv_poll_controller > nv_do_nic_poll > line 4134: spin_lock(&np->lock); > nv_drain_rxtx > nv_drain_tx > nv_release_txskb > line 2004: dev_kfree_skb_any(tx_skb->skb); > > Thus, two possible concurrency use-after-free bugs may occur. I do not think so, the netif_tx_lock_bh() done will prevent the parallel execution.
[PATCH v7 8/8] gpio: pcie-idio-24: Utilize for_each_set_clump8 macro
Replace verbose implementation in get_multiple/set_multiple callbacks with for_each_set_clump8 macro to simplify code and improve clarity. Signed-off-by: William Breathitt Gray --- drivers/gpio/gpio-pcie-idio-24.c | 109 --- 1 file changed, 40 insertions(+), 69 deletions(-) diff --git a/drivers/gpio/gpio-pcie-idio-24.c b/drivers/gpio/gpio-pcie-idio-24.c index 52f1647a46fd..1cfe39fb8e9e 100644 --- a/drivers/gpio/gpio-pcie-idio-24.c +++ b/drivers/gpio/gpio-pcie-idio-24.c @@ -198,52 +198,34 @@ static int idio_24_gpio_get_multiple(struct gpio_chip *chip, unsigned long *mask, unsigned long *bits) { struct idio_24_gpio *const idio24gpio = gpiochip_get_data(chip); - size_t i; - const unsigned int gpio_reg_size = 8; - unsigned int bits_offset; - size_t word_index; - unsigned int word_offset; - unsigned long word_mask; - const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0); - unsigned long port_state; + unsigned int offset; + unsigned long gpio_mask; void __iomem *ports[] = { &idio24gpio->reg->out0_7, &idio24gpio->reg->out8_15, &idio24gpio->reg->out16_23, &idio24gpio->reg->in0_7, &idio24gpio->reg->in8_15, &idio24gpio->reg->in16_23, }; + size_t index; + unsigned long port_state; const unsigned long out_mode_mask = BIT(1); /* clear bits array to a clean slate */ bitmap_zero(bits, chip->ngpio); - /* get bits are evaluated a gpio port register at a time */ - for (i = 0; i < ARRAY_SIZE(ports) + 1; i++) { - /* gpio offset in bits array */ - bits_offset = i * gpio_reg_size; - - /* word index for bits array */ - word_index = BIT_WORD(bits_offset); - - /* gpio offset within current word of bits array */ - word_offset = bits_offset % BITS_PER_LONG; - - /* mask of get bits for current gpio within current word */ - word_mask = mask[word_index] & (port_mask << word_offset); - if (!word_mask) { - /* no get bits in this port so skip to next one */ - continue; - } + for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) { + index = offset / 8; /* read bits from current gpio port (port 6 is TTL GPIO) */ - if (i < 6) - port_state = ioread8(ports[i]); + if (index < 6) + port_state = ioread8(ports[index]); else if (ioread8(&idio24gpio->reg->ctl) & out_mode_mask) port_state = ioread8(&idio24gpio->reg->ttl_out0_7); else port_state = ioread8(&idio24gpio->reg->ttl_in0_7); - /* store acquired bits at respective bits array offset */ - bits[word_index] |= (port_state << word_offset) & word_mask; + port_state &= gpio_mask; + + bitmap_set_value8(bits, port_state, offset); } return 0; @@ -294,59 +276,48 @@ static void idio_24_gpio_set_multiple(struct gpio_chip *chip, unsigned long *mask, unsigned long *bits) { struct idio_24_gpio *const idio24gpio = gpiochip_get_data(chip); - size_t i; - unsigned long bits_offset; + unsigned int offset; unsigned long gpio_mask; - const unsigned int gpio_reg_size = 8; - const unsigned long port_mask = GENMASK(gpio_reg_size, 0); - unsigned long flags; - unsigned int out_state; void __iomem *ports[] = { &idio24gpio->reg->out0_7, &idio24gpio->reg->out8_15, &idio24gpio->reg->out16_23 }; + size_t index; + unsigned int bitmask; + unsigned long flags; + unsigned int out_state; const unsigned long out_mode_mask = BIT(1); - const unsigned int ttl_offset = 48; - const size_t ttl_i = BIT_WORD(ttl_offset); - const unsigned int word_offset = ttl_offset % BITS_PER_LONG; - const unsigned long ttl_mask = (mask[ttl_i] >> word_offset) & port_mask; - const unsigned long ttl_bits = (bits[ttl_i] >> word_offset) & ttl_mask; - - /* set bits are processed a gpio port register at a time */ - for (i = 0; i < ARRAY_SIZE(ports); i++) { - /* gpio offset in bits array */ - bits_offset = i * gpio_reg_size; - - /* check if any set bits for current port */ - gpio_mask = (*mask >> bits_offset) & port_mask; - if (!gpio_mask) { - /* no set bits for this port so move on to next port */ - continue; - } - raw_spin_lock_irqsave(&idio24gpio->lock, flags); + for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) { +
[PATCH v7 7/8] gpio: pci-idio-16: Utilize for_each_set_clump8 macro
Replace verbose implementation in get_multiple/set_multiple callbacks with for_each_set_clump8 macro to simplify code and improve clarity. Signed-off-by: William Breathitt Gray --- drivers/gpio/gpio-pci-idio-16.c | 73 - 1 file changed, 26 insertions(+), 47 deletions(-) diff --git a/drivers/gpio/gpio-pci-idio-16.c b/drivers/gpio/gpio-pci-idio-16.c index 6b7349783223..f115b923ae98 100644 --- a/drivers/gpio/gpio-pci-idio-16.c +++ b/drivers/gpio/gpio-pci-idio-16.c @@ -108,45 +108,23 @@ static int idio_16_gpio_get_multiple(struct gpio_chip *chip, unsigned long *mask, unsigned long *bits) { struct idio_16_gpio *const idio16gpio = gpiochip_get_data(chip); - size_t i; - const unsigned int gpio_reg_size = 8; - unsigned int bits_offset; - size_t word_index; - unsigned int word_offset; - unsigned long word_mask; - const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0); - unsigned long port_state; + unsigned int offset; + unsigned long gpio_mask; void __iomem *ports[] = { &idio16gpio->reg->out0_7, &idio16gpio->reg->out8_15, &idio16gpio->reg->in0_7, &idio16gpio->reg->in8_15, }; + void __iomem *port_addr; + unsigned long port_state; /* clear bits array to a clean slate */ bitmap_zero(bits, chip->ngpio); - /* get bits are evaluated a gpio port register at a time */ - for (i = 0; i < ARRAY_SIZE(ports); i++) { - /* gpio offset in bits array */ - bits_offset = i * gpio_reg_size; - - /* word index for bits array */ - word_index = BIT_WORD(bits_offset); - - /* gpio offset within current word of bits array */ - word_offset = bits_offset % BITS_PER_LONG; + for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) { + port_addr = ports[offset / 8]; + port_state = ioread8(port_addr) & gpio_mask; - /* mask of get bits for current gpio within current word */ - word_mask = mask[word_index] & (port_mask << word_offset); - if (!word_mask) { - /* no get bits in this port so skip to next one */ - continue; - } - - /* read bits from current gpio port */ - port_state = ioread8(ports[i]); - - /* store acquired bits at respective bits array offset */ - bits[word_index] |= (port_state << word_offset) & word_mask; + bitmap_set_value8(bits, port_state, offset); } return 0; @@ -186,30 +164,31 @@ static void idio_16_gpio_set_multiple(struct gpio_chip *chip, unsigned long *mask, unsigned long *bits) { struct idio_16_gpio *const idio16gpio = gpiochip_get_data(chip); + unsigned int offset; + unsigned long gpio_mask; + void __iomem *ports[] = { + &idio16gpio->reg->out0_7, &idio16gpio->reg->out8_15, + }; + size_t index; + void __iomem *port_addr; + unsigned int bitmask; unsigned long flags; unsigned int out_state; - raw_spin_lock_irqsave(&idio16gpio->lock, flags); + for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) { + index = offset / 8; + port_addr = ports[index]; - /* process output lines 0-7 */ - if (*mask & 0xFF) { - out_state = ioread8(&idio16gpio->reg->out0_7) & ~*mask; - out_state |= *mask & *bits; - iowrite8(out_state, &idio16gpio->reg->out0_7); - } + bitmask = bitmap_get_value8(bits, offset) & gpio_mask; + + raw_spin_lock_irqsave(&idio16gpio->lock, flags); - /* shift to next output line word */ - *mask >>= 8; + out_state = ioread8(port_addr) & ~gpio_mask; + out_state |= bitmask; + iowrite8(out_state, port_addr); - /* process output lines 8-15 */ - if (*mask & 0xFF) { - *bits >>= 8; - out_state = ioread8(&idio16gpio->reg->out8_15) & ~*mask; - out_state |= *mask & *bits; - iowrite8(out_state, &idio16gpio->reg->out8_15); + raw_spin_unlock_irqrestore(&idio16gpio->lock, flags); } - - raw_spin_unlock_irqrestore(&idio16gpio->lock, flags); } static void idio_16_irq_ack(struct irq_data *data) -- 2.20.1
Re: [PATCH] vhost/vsock: fix vhost vsock cid hashing inconsistent
From: Zha Bin Date: Tue, 8 Jan 2019 16:07:03 +0800 > The vsock core only supports 32bit CID, but the Virtio-vsock spec define > CID (dst_cid and src_cid) as u64 and the upper 32bits is reserved as > zero. This inconsistency causes one bug in vhost vsock driver. The > scenarios is: > > 0. A hash table (vhost_vsock_hash) is used to map an CID to a vsock > object. And hash_min() is used to compute the hash key. hash_min() is > defined as: > (sizeof(val) <= 4 ? hash_32(val, bits) : hash_long(val, bits)). > That means the hash algorithm has dependency on the size of macro > argument 'val'. > 0. In function vhost_vsock_set_cid(), a 64bit CID is passed to > hash_min() to compute the hash key when inserting a vsock object into > the hash table. > 0. In function vhost_vsock_get(), a 32bit CID is passed to hash_min() > to compute the hash key when looking up a vsock for an CID. > > Because the different size of the CID, hash_min() returns different hash > key, thus fails to look up the vsock object for an CID. > > To fix this bug, we keep CID as u64 in the IOCTLs and virtio message > headers, but explicitly convert u64 to u32 when deal with the hash table > and vsock core. > > Fixes: 834e772c8db0 ("vhost/vsock: fix use-after-free in network stack > callers") > Link: https://github.com/stefanha/virtio/blob/vsock/trunk/content.tex > Signed-off-by: Zha Bin > Reviewed-by: Liu Jiang Applied, thank you.
[PATCH v7 5/8] gpio: gpio-mm: Utilize for_each_set_clump8 macro
Replace verbose implementation in get_multiple/set_multiple callbacks with for_each_set_clump8 macro to simplify code and improve clarity. Signed-off-by: William Breathitt Gray --- drivers/gpio/gpio-gpio-mm.c | 71 +++-- 1 file changed, 20 insertions(+), 51 deletions(-) diff --git a/drivers/gpio/gpio-gpio-mm.c b/drivers/gpio/gpio-gpio-mm.c index 8c150fd68d9d..ad80f874e004 100644 --- a/drivers/gpio/gpio-gpio-mm.c +++ b/drivers/gpio/gpio-gpio-mm.c @@ -172,46 +172,25 @@ static int gpiomm_gpio_get(struct gpio_chip *chip, unsigned int offset) return !!(port_state & mask); } +static const size_t ports[] = { 0, 1, 2, 4, 5, 6 }; + static int gpiomm_gpio_get_multiple(struct gpio_chip *chip, unsigned long *mask, unsigned long *bits) { struct gpiomm_gpio *const gpiommgpio = gpiochip_get_data(chip); - size_t i; - static const size_t ports[] = { 0, 1, 2, 4, 5, 6 }; - const unsigned int gpio_reg_size = 8; - unsigned int bits_offset; - size_t word_index; - unsigned int word_offset; - unsigned long word_mask; - const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0); + unsigned int offset; + unsigned long gpio_mask; + unsigned int port_addr; unsigned long port_state; /* clear bits array to a clean slate */ bitmap_zero(bits, chip->ngpio); - /* get bits are evaluated a gpio port register at a time */ - for (i = 0; i < ARRAY_SIZE(ports); i++) { - /* gpio offset in bits array */ - bits_offset = i * gpio_reg_size; - - /* word index for bits array */ - word_index = BIT_WORD(bits_offset); - - /* gpio offset within current word of bits array */ - word_offset = bits_offset % BITS_PER_LONG; - - /* mask of get bits for current gpio within current word */ - word_mask = mask[word_index] & (port_mask << word_offset); - if (!word_mask) { - /* no get bits in this port so skip to next one */ - continue; - } - - /* read bits from current gpio port */ - port_state = inb(gpiommgpio->base + ports[i]); + for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) { + port_addr = gpiommgpio->base + ports[offset / 8]; + port_state = inb(port_addr) & gpio_mask; - /* store acquired bits at respective bits array offset */ - bits[word_index] |= (port_state << word_offset) & word_mask; + bitmap_set_value8(bits, port_state, offset); } return 0; @@ -242,37 +221,27 @@ static void gpiomm_gpio_set_multiple(struct gpio_chip *chip, unsigned long *mask, unsigned long *bits) { struct gpiomm_gpio *const gpiommgpio = gpiochip_get_data(chip); - unsigned int i; - const unsigned int gpio_reg_size = 8; - unsigned int port; - unsigned int out_port; + unsigned int offset; + unsigned long gpio_mask; + size_t index; + unsigned int port_addr; unsigned int bitmask; unsigned long flags; - /* set bits are evaluated a gpio register size at a time */ - for (i = 0; i < chip->ngpio; i += gpio_reg_size) { - /* no more set bits in this mask word; skip to the next word */ - if (!mask[BIT_WORD(i)]) { - i = (BIT_WORD(i) + 1) * BITS_PER_LONG - gpio_reg_size; - continue; - } + for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) { + index = offset / 8; + port_addr = gpiommgpio->base + ports[index]; - port = i / gpio_reg_size; - out_port = (port > 2) ? port + 1 : port; - bitmask = mask[BIT_WORD(i)] & bits[BIT_WORD(i)]; + bitmask = bitmap_get_value8(bits, offset) & gpio_mask; spin_lock_irqsave(&gpiommgpio->lock, flags); /* update output state data and set device gpio register */ - gpiommgpio->out_state[port] &= ~mask[BIT_WORD(i)]; - gpiommgpio->out_state[port] |= bitmask; - outb(gpiommgpio->out_state[port], gpiommgpio->base + out_port); + gpiommgpio->out_state[index] &= ~gpio_mask; + gpiommgpio->out_state[index] |= bitmask; + outb(gpiommgpio->out_state[index], port_addr); spin_unlock_irqrestore(&gpiommgpio->lock, flags); - - /* prepare for next gpio register set */ - mask[BIT_WORD(i)] >>= gpio_reg_size; - bits[BIT_WORD(i)] >>= gpio_reg_size; } } -- 2.20.1
[PATCH v7 4/8] gpio: 104-idi-48: Utilize for_each_set_clump8 macro
Replace verbose implementation in get_multiple/set_multiple callbacks with for_each_set_clump8 macro to simplify code and improve clarity. Signed-off-by: William Breathitt Gray --- drivers/gpio/gpio-104-idi-48.c | 36 +++--- 1 file changed, 7 insertions(+), 29 deletions(-) diff --git a/drivers/gpio/gpio-104-idi-48.c b/drivers/gpio/gpio-104-idi-48.c index 88dc6f2449f6..d58382958bd5 100644 --- a/drivers/gpio/gpio-104-idi-48.c +++ b/drivers/gpio/gpio-104-idi-48.c @@ -93,42 +93,20 @@ static int idi_48_gpio_get_multiple(struct gpio_chip *chip, unsigned long *mask, unsigned long *bits) { struct idi_48_gpio *const idi48gpio = gpiochip_get_data(chip); - size_t i; + unsigned int offset; + unsigned long gpio_mask; static const size_t ports[] = { 0, 1, 2, 4, 5, 6 }; - const unsigned int gpio_reg_size = 8; - unsigned int bits_offset; - size_t word_index; - unsigned int word_offset; - unsigned long word_mask; - const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0); + unsigned int port_addr; unsigned long port_state; /* clear bits array to a clean slate */ bitmap_zero(bits, chip->ngpio); - /* get bits are evaluated a gpio port register at a time */ - for (i = 0; i < ARRAY_SIZE(ports); i++) { - /* gpio offset in bits array */ - bits_offset = i * gpio_reg_size; + for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) { + port_addr = idi48gpio->base + ports[offset / 8]; + port_state = inb(port_addr) & gpio_mask; - /* word index for bits array */ - word_index = BIT_WORD(bits_offset); - - /* gpio offset within current word of bits array */ - word_offset = bits_offset % BITS_PER_LONG; - - /* mask of get bits for current gpio within current word */ - word_mask = mask[word_index] & (port_mask << word_offset); - if (!word_mask) { - /* no get bits in this port so skip to next one */ - continue; - } - - /* read bits from current gpio port */ - port_state = inb(idi48gpio->base + ports[i]); - - /* store acquired bits at respective bits array offset */ - bits[word_index] |= (port_state << word_offset) & word_mask; + bitmap_set_value8(bits, port_state, offset); } return 0; -- 2.20.1
[PATCH v7 6/8] gpio: ws16c48: Utilize for_each_set_clump8 macro
Replace verbose implementation in get_multiple/set_multiple callbacks with for_each_set_clump8 macro to simplify code and improve clarity. Signed-off-by: William Breathitt Gray --- drivers/gpio/gpio-ws16c48.c | 71 ++--- 1 file changed, 19 insertions(+), 52 deletions(-) diff --git a/drivers/gpio/gpio-ws16c48.c b/drivers/gpio/gpio-ws16c48.c index 5cf3697bfb15..48cb9aa740b5 100644 --- a/drivers/gpio/gpio-ws16c48.c +++ b/drivers/gpio/gpio-ws16c48.c @@ -134,42 +134,19 @@ static int ws16c48_gpio_get_multiple(struct gpio_chip *chip, unsigned long *mask, unsigned long *bits) { struct ws16c48_gpio *const ws16c48gpio = gpiochip_get_data(chip); - const unsigned int gpio_reg_size = 8; - size_t i; - const size_t num_ports = chip->ngpio / gpio_reg_size; - unsigned int bits_offset; - size_t word_index; - unsigned int word_offset; - unsigned long word_mask; - const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0); + unsigned int offset; + unsigned long gpio_mask; + unsigned int port_addr; unsigned long port_state; /* clear bits array to a clean slate */ bitmap_zero(bits, chip->ngpio); - /* get bits are evaluated a gpio port register at a time */ - for (i = 0; i < num_ports; i++) { - /* gpio offset in bits array */ - bits_offset = i * gpio_reg_size; + for_each_set_clump8(offset, gpio_mask, mask, chip->ngpio) { + port_addr = ws16c48gpio->base + offset / 8; + port_state = inb(port_addr) & gpio_mask; - /* word index for bits array */ - word_index = BIT_WORD(bits_offset); - - /* gpio offset within current word of bits array */ - word_offset = bits_offset % BITS_PER_LONG; - - /* mask of get bits for current gpio within current word */ - word_mask = mask[word_index] & (port_mask << word_offset); - if (!word_mask) { - /* no get bits in this port so skip to next one */ - continue; - } - - /* read bits from current gpio port */ - port_state = inb(ws16c48gpio->base + i); - - /* store acquired bits at respective bits array offset */ - bits[word_index] |= (port_state << word_offset) & word_mask; + bitmap_set_value8(bits, port_state, offset); } return 0; @@ -203,39 +180,29 @@ static void ws16c48_gpio_set_multiple(struct gpio_chip *chip, unsigned long *mask, unsigned long *bits) { struct ws16c48_gpio *const ws16c48gpio = gpiochip_get_data(chip); - unsigned int i; - const unsigned int gpio_reg_size = 8; - unsigned int port; - unsigned int iomask; + unsigned int offset; + unsigned long gpio_mask; + size_t index; + unsigned int port_addr; unsigned int bitmask; unsigned long flags; - /* set bits are evaluated a gpio register size at a time */ - for (i = 0; i < chip->ngpio; i += gpio_reg_size) { - /* no more set bits in this mask word; skip to the next word */ - if (!mask[BIT_WORD(i)]) { - i = (BIT_WORD(i) + 1) * BITS_PER_LONG - gpio_reg_size; - continue; - } - - port = i / gpio_reg_size; + for_each_set_clump8(offset, gpio_mask, mask, chip->ngpio) { + index = offset / 8; + port_addr = ws16c48gpio->base + index; /* mask out GPIO configured for input */ - iomask = mask[BIT_WORD(i)] & ~ws16c48gpio->io_state[port]; - bitmask = iomask & bits[BIT_WORD(i)]; + gpio_mask &= ~ws16c48gpio->io_state[index]; + bitmask = bitmap_get_value8(bits, offset) & gpio_mask; raw_spin_lock_irqsave(&ws16c48gpio->lock, flags); /* update output state data and set device gpio register */ - ws16c48gpio->out_state[port] &= ~iomask; - ws16c48gpio->out_state[port] |= bitmask; - outb(ws16c48gpio->out_state[port], ws16c48gpio->base + port); + ws16c48gpio->out_state[index] &= ~gpio_mask; + ws16c48gpio->out_state[index] |= bitmask; + outb(ws16c48gpio->out_state[index], port_addr); raw_spin_unlock_irqrestore(&ws16c48gpio->lock, flags); - - /* prepare for next gpio register set */ - mask[BIT_WORD(i)] >>= gpio_reg_size; - bits[BIT_WORD(i)] >>= gpio_reg_size; } } -- 2.20.1
[PATCH v7 3/8] gpio: 104-dio-48e: Utilize for_each_set_clump8 macro
Replace verbose implementation in get_multiple/set_multiple callbacks with for_each_set_clump8 macro to simplify code and improve clarity. Signed-off-by: William Breathitt Gray --- drivers/gpio/gpio-104-dio-48e.c | 71 ++--- 1 file changed, 20 insertions(+), 51 deletions(-) diff --git a/drivers/gpio/gpio-104-dio-48e.c b/drivers/gpio/gpio-104-dio-48e.c index 92c8f944bf64..1093a1403d77 100644 --- a/drivers/gpio/gpio-104-dio-48e.c +++ b/drivers/gpio/gpio-104-dio-48e.c @@ -183,46 +183,25 @@ static int dio48e_gpio_get(struct gpio_chip *chip, unsigned offset) return !!(port_state & mask); } +static const size_t ports[] = { 0, 1, 2, 4, 5, 6 }; + static int dio48e_gpio_get_multiple(struct gpio_chip *chip, unsigned long *mask, unsigned long *bits) { struct dio48e_gpio *const dio48egpio = gpiochip_get_data(chip); - size_t i; - static const size_t ports[] = { 0, 1, 2, 4, 5, 6 }; - const unsigned int gpio_reg_size = 8; - unsigned int bits_offset; - size_t word_index; - unsigned int word_offset; - unsigned long word_mask; - const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0); + unsigned int offset; + unsigned long gpio_mask; + unsigned int port_addr; unsigned long port_state; /* clear bits array to a clean slate */ bitmap_zero(bits, chip->ngpio); - /* get bits are evaluated a gpio port register at a time */ - for (i = 0; i < ARRAY_SIZE(ports); i++) { - /* gpio offset in bits array */ - bits_offset = i * gpio_reg_size; - - /* word index for bits array */ - word_index = BIT_WORD(bits_offset); - - /* gpio offset within current word of bits array */ - word_offset = bits_offset % BITS_PER_LONG; - - /* mask of get bits for current gpio within current word */ - word_mask = mask[word_index] & (port_mask << word_offset); - if (!word_mask) { - /* no get bits in this port so skip to next one */ - continue; - } - - /* read bits from current gpio port */ - port_state = inb(dio48egpio->base + ports[i]); + for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) { + port_addr = dio48egpio->base + ports[offset / 8]; + port_state = inb(port_addr) & gpio_mask; - /* store acquired bits at respective bits array offset */ - bits[word_index] |= (port_state << word_offset) & word_mask; + bitmap_set_value8(bits, port_state, offset); } return 0; @@ -252,37 +231,27 @@ static void dio48e_gpio_set_multiple(struct gpio_chip *chip, unsigned long *mask, unsigned long *bits) { struct dio48e_gpio *const dio48egpio = gpiochip_get_data(chip); - unsigned int i; - const unsigned int gpio_reg_size = 8; - unsigned int port; - unsigned int out_port; + unsigned int offset; + unsigned long gpio_mask; + size_t index; + unsigned int port_addr; unsigned int bitmask; unsigned long flags; - /* set bits are evaluated a gpio register size at a time */ - for (i = 0; i < chip->ngpio; i += gpio_reg_size) { - /* no more set bits in this mask word; skip to the next word */ - if (!mask[BIT_WORD(i)]) { - i = (BIT_WORD(i) + 1) * BITS_PER_LONG - gpio_reg_size; - continue; - } + for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) { + index = offset / 8; + port_addr = dio48egpio->base + ports[index]; - port = i / gpio_reg_size; - out_port = (port > 2) ? port + 1 : port; - bitmask = mask[BIT_WORD(i)] & bits[BIT_WORD(i)]; + bitmask = bitmap_get_value8(bits, offset) & gpio_mask; raw_spin_lock_irqsave(&dio48egpio->lock, flags); /* update output state data and set device gpio register */ - dio48egpio->out_state[port] &= ~mask[BIT_WORD(i)]; - dio48egpio->out_state[port] |= bitmask; - outb(dio48egpio->out_state[port], dio48egpio->base + out_port); + dio48egpio->out_state[index] &= ~gpio_mask; + dio48egpio->out_state[index] |= bitmask; + outb(dio48egpio->out_state[index], port_addr); raw_spin_unlock_irqrestore(&dio48egpio->lock, flags); - - /* prepare for next gpio register set */ - mask[BIT_WORD(i)] >>= gpio_reg_size; - bits[BIT_WORD(i)] >>= gpio_reg_size; } } -- 2.20.1
Re: [PATCH] Input: tca6416-keypad: use struct_size() in kzalloc()
On Tue, Jan 08, 2019 at 09:28:16AM -0600, Gustavo A. R. Silva wrote: > One of the more common cases of allocation size calculations is finding the > size of a structure that has a zero-sized array at the end, along with memory > for some number of elements for that array. For example: > > struct foo { > int stuff; > void *entry[]; > }; > > instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL); > > Instead of leaving these open-coded and prone to type mistakes, we can now > use the new struct_size() helper: > > instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL); > > This code was detected with the help of Coccinelle. > > Signed-off-by: Gustavo A. R. Silva Applied, thank you. > --- > drivers/input/keyboard/tca6416-keypad.c | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/drivers/input/keyboard/tca6416-keypad.c > b/drivers/input/keyboard/tca6416-keypad.c > index dc983ab6c0ad..cdeef180aead 100644 > --- a/drivers/input/keyboard/tca6416-keypad.c > +++ b/drivers/input/keyboard/tca6416-keypad.c > @@ -219,9 +219,7 @@ static int tca6416_keypad_probe(struct i2c_client *client, > return -EINVAL; > } > > - chip = kzalloc(sizeof(struct tca6416_keypad_chip) + > -pdata->nbuttons * sizeof(struct tca6416_button), > -GFP_KERNEL); > + chip = kzalloc(struct_size(chip, buttons, pdata->nbuttons), GFP_KERNEL); > input = input_allocate_device(); > if (!chip || !input) { > error = -ENOMEM; > -- > 2.20.1 > -- Dmitry
[PATCH v7 2/8] lib/test_bitmap.c: Add for_each_set_clump8 test cases
The introduction of the for_each_set_clump8 macro warrants test cases to verify the implementation. This patch adds test case checks for whether an out-of-bounds clump index is returned, a zero clump is returned, or the returned clump value differs from the expected clump value. Cc: Andy Shevchenko Cc: Andrew Morton Cc: Rasmus Villemoes Signed-off-by: William Breathitt Gray --- lib/test_bitmap.c | 65 +++ 1 file changed, 65 insertions(+) diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c index 6cd7d0740005..66ddb3fb98cb 100644 --- a/lib/test_bitmap.c +++ b/lib/test_bitmap.c @@ -88,6 +88,36 @@ __check_eq_u32_array(const char *srcfile, unsigned int line, return true; } +static bool __init __check_eq_clump8(const char *srcfile, unsigned int line, + const unsigned int offset, + const unsigned int size, + const unsigned char *const clump_exp, + const unsigned long *const clump) +{ + unsigned long exp; + + if (offset >= size) { + pr_warn("[%s:%u] bit offset for clump out-of-bounds: expected less than %u, got %u\n", + srcfile, line, size, offset); + return false; + } + + exp = clump_exp[offset / 8]; + if (!exp) { + pr_warn("[%s:%u] bit offset for zero clump: expected nonzero clump, got bit offset %u with clump value 0", + srcfile, line, offset); + return false; + } + + if (*clump != exp) { + pr_warn("[%s:%u] expected clump value of 0x%lX, got clump value of 0x%lX", + srcfile, line, exp, *clump); + return false; + } + + return true; +} + #define __expect_eq(suffix, ...) \ ({ \ int result = 0; \ @@ -104,6 +134,7 @@ __check_eq_u32_array(const char *srcfile, unsigned int line, #define expect_eq_bitmap(...) __expect_eq(bitmap, ##__VA_ARGS__) #define expect_eq_pbl(...) __expect_eq(pbl, ##__VA_ARGS__) #define expect_eq_u32_array(...) __expect_eq(u32_array, ##__VA_ARGS__) +#define expect_eq_clump8(...) __expect_eq(clump8, ##__VA_ARGS__) static void __init test_zero_clear(void) { @@ -361,6 +392,39 @@ static void noinline __init test_mem_optimisations(void) } } +static const unsigned char clump_exp[] __initconst = { + 0x01, /* 1 bit set */ + 0x02, /* non-edge 1 bit set */ + 0x00, /* zero bits set */ + 0x28, /* 3 bits set across 4-bit boundary */ + 0x28, /* Repeated clump */ + 0x0F, /* 4 bits set */ + 0xFF, /* all bits set */ + 0x05, /* non-adjacent 2 bits set */ +}; + +static void __init test_for_each_set_clump8(void) +{ +#define CLUMP_EXP_NUMBITS 64 + DECLARE_BITMAP(bits, CLUMP_EXP_NUMBITS); + unsigned int start; + unsigned long clump; + + /* set bitmap to test case */ + bitmap_zero(bits, CLUMP_EXP_NUMBITS); + bitmap_set(bits, 0, 1); /* 0x01 */ + bitmap_set(bits, 8, 1); /* 0x02 */ + bitmap_set(bits, 27, 3);/* 0x28 */ + bitmap_set(bits, 35, 3);/* 0x28 */ + bitmap_set(bits, 40, 4);/* 0x0F */ + bitmap_set(bits, 48, 8);/* 0xFF */ + bitmap_set(bits, 56, 1);/* 0x05 - part 1 */ + bitmap_set(bits, 58, 1);/* 0x05 - part 2 */ + + for_each_set_clump8(start, clump, bits, CLUMP_EXP_NUMBITS) + expect_eq_clump8(start, CLUMP_EXP_NUMBITS, clump_exp, &clump); +} + static int __init test_bitmap_init(void) { test_zero_clear(); @@ -369,6 +433,7 @@ static int __init test_bitmap_init(void) test_bitmap_arr32(); test_bitmap_parselist(); test_mem_optimisations(); + test_for_each_set_clump8(); if (failed_tests == 0) pr_info("all %u tests passed\n", total_tests); -- 2.20.1
[PATCH v7 1/8] bitops: Introduce the for_each_set_clump8 macro
This macro iterates for each 8-bit group of bits (clump) with set bits, within a bitmap memory region. For each iteration, "start" is set to the bit offset of the found clump, while the respective clump value is stored to the location pointed by "clump". Additionally, the bitmap_get_value8 and bitmap_set_value8 functions are introduced to respectively get and set an 8-bit value in a bitmap memory region. Suggested-by: Andy Shevchenko Suggested-by: Rasmus Villemoes Cc: Arnd Bergmann Cc: Andrew Morton Signed-off-by: William Breathitt Gray --- include/asm-generic/bitops/find.h | 14 +++ include/linux/bitops.h| 5 +++ lib/find_bit.c| 63 +++ 3 files changed, 82 insertions(+) diff --git a/include/asm-generic/bitops/find.h b/include/asm-generic/bitops/find.h index 8a1ee10014de..61c496ae94cf 100644 --- a/include/asm-generic/bitops/find.h +++ b/include/asm-generic/bitops/find.h @@ -80,4 +80,18 @@ extern unsigned long find_first_zero_bit(const unsigned long *addr, #endif /* CONFIG_GENERIC_FIND_FIRST_BIT */ +unsigned int bitmap_get_value8(const unsigned long *const bitmap, + const unsigned int start); + +void bitmap_set_value8(unsigned long *const bitmap, + const unsigned long value, + const unsigned int start); + +unsigned int find_next_clump8(unsigned long *const clump, + const unsigned long *const addr, + unsigned int offset, const unsigned int size); + +#define find_first_clump8(clump, bits, size) \ + find_next_clump8((clump), (bits), 0, (size)) + #endif /*_ASM_GENERIC_BITOPS_FIND_H_ */ diff --git a/include/linux/bitops.h b/include/linux/bitops.h index 705f7c442691..61c10f20079e 100644 --- a/include/linux/bitops.h +++ b/include/linux/bitops.h @@ -40,6 +40,11 @@ extern unsigned long __sw_hweight64(__u64 w); (bit) < (size);\ (bit) = find_next_zero_bit((addr), (size), (bit) + 1)) +#define for_each_set_clump8(start, clump, bits, size) \ + for ((start) = find_first_clump8(&(clump), (bits), (size)); \ +(start) < (size); \ +(start) = find_next_clump8(&(clump), (bits), (start) + 8, (size))) + static inline int get_bitmask_order(unsigned int count) { int order; diff --git a/lib/find_bit.c b/lib/find_bit.c index ee3df93ba69a..97d96e65a3d9 100644 --- a/lib/find_bit.c +++ b/lib/find_bit.c @@ -218,3 +218,66 @@ EXPORT_SYMBOL(find_next_bit_le); #endif #endif /* __BIG_ENDIAN */ + +/** + * bitmap_get_value8 - get an 8-bit value within a memory region + * @bitmap: address to the bitmap memory region + * @start: bit offset of the 8-bit value + * + * Returns the 8-bit value located at the @start bit offset within the @bitmap + * memory region. + */ +unsigned int bitmap_get_value8(const unsigned long *const bitmap, + const unsigned int start) +{ + const size_t index = BIT_WORD(start); + const unsigned int offset = start % BITS_PER_LONG; + + return (bitmap[index] >> offset) & 0xFF; +} +EXPORT_SYMBOL(bitmap_get_value8); + +/** + * bitmap_set_value8 - set an 8-bit value within a memory region + * @bitmap: address to the bitmap memory region + * @value: the 8-bit value + * @start: bit offset of the 8-bit value + */ +void bitmap_set_value8(unsigned long *const bitmap, + const unsigned long value, + const unsigned int start) +{ + const size_t index = BIT_WORD(start); + const unsigned int offset = start % BITS_PER_LONG; + const unsigned long mask = GENMASK(7, offset); + + bitmap[index] &= ~mask; + bitmap[index] |= value << offset; +} +EXPORT_SYMBOL(bitmap_set_value8); + +/** + * find_next_clump8 - find next 8-bit clump with set bits in a memory region + * @clump: location to store copy of found clump + * @addr: address to base the search on + * @offset: bit offset at which to start searching + * @size: bitmap size in number of bits + * + * Returns the bit offset for the next set clump; the found clump value is + * copied to the location pointed by @clump. If no bits are set, returns @size. + */ +unsigned int find_next_clump8(unsigned long *const clump, + const unsigned long *const addr, + unsigned int offset, const unsigned int size) +{ + for (; offset < size; offset += 8) { + *clump = bitmap_get_value8(addr, offset); + if (!*clump) + continue; + + return offset; + } + + return size; +} +EXPORT_SYMBOL(find_next_clump8); -- 2.20.1
[PATCH v7 0/8] Introduce the for_each_set_clump8 macro
Changes in v7: - Pass bitmap_set_value8 'value' parameter by value - Remove bitmap_set_value8 final mask operation; users beware: values greater than 8 bits may clobber bitmap While adding GPIO get_multiple/set_multiple callback support for various drivers, I noticed a pattern of looping manifesting that would be useful standardized as a macro. This patchset introduces the for_each_set_clump8 macro and utilizes it in several GPIO drivers. The for_each_set_clump macro8 facilitates a for-loop syntax that iterates over a memory region entire groups of set bits at a time. For example, suppose you would like to iterate over a 32-bit integer 8 bits at a time, skipping over 8-bit groups with no set bit, where represents the current 8-bit group: Example:1010 00110011 First loop: 1010 Second loop:1010 00110011 Third loop: 00110011 Each iteration of the loop returns the next 8-bit group that has at least one set bit. The for_each_set_clump8 macro has four parameters: * start: set to the bit offset of the current clump * clump: set to the current clump value * bits: bitmap to search within * size: bitmap size in number of bits In this version of the patchset, the for_each_set_clump macro has been reimplemented and simplified based on the suggestions provided by Rasmus Villemoes and Andy Shevchenko in the version 4 submission. In particular, the function of the for_each_set_clump macro has been restricted to handle only 8-bit clumps; the drivers that use the for_each_set_clump macro only handle 8-bit ports so a generic for_each_set_clump implementation is not necessary. Thus, a solution for odd-sized clumps (e.g. 3-bit, 7-bit, etc.) mismatching word boundaries can be postponed until a driver appears that actually requires a generic for_each_set_clump implementation. In addition, the bitmap_get_value8 and bitmap_set_value8 functions are introduced to get and set 8-bit values respectively. Their use is based on the behavior suggested in the patchset version 4 review. Similarly, the implementation of the find_next_clump function has been simplified in order for the function to match the syntax and use of the find_next_bit function. William Breathitt Gray (8): bitops: Introduce the for_each_set_clump8 macro lib/test_bitmap.c: Add for_each_set_clump8 test cases gpio: 104-dio-48e: Utilize for_each_set_clump8 macro gpio: 104-idi-48: Utilize for_each_set_clump8 macro gpio: gpio-mm: Utilize for_each_set_clump8 macro gpio: ws16c48: Utilize for_each_set_clump8 macro gpio: pci-idio-16: Utilize for_each_set_clump8 macro gpio: pcie-idio-24: Utilize for_each_set_clump8 macro drivers/gpio/gpio-104-dio-48e.c | 71 ++- drivers/gpio/gpio-104-idi-48.c| 36 ++ drivers/gpio/gpio-gpio-mm.c | 71 ++- drivers/gpio/gpio-pci-idio-16.c | 73 +++- drivers/gpio/gpio-pcie-idio-24.c | 109 +++--- drivers/gpio/gpio-ws16c48.c | 71 ++- include/asm-generic/bitops/find.h | 14 include/linux/bitops.h| 5 ++ lib/find_bit.c| 63 + lib/test_bitmap.c | 65 ++ 10 files changed, 279 insertions(+), 299 deletions(-) -- 2.20.1
Re: WARNING in apparmor_cred_free
On 1/11/2019 3:20 PM, Casey Schaufler wrote: > On 1/11/2019 2:43 PM, Casey Schaufler wrote: >> On 1/11/2019 2:30 PM, John Johansen wrote: >>> On 1/11/19 2:11 PM, Casey Schaufler wrote: >>>> On 1/11/2019 1:43 AM, syzbot wrote: >>>>> Hello, >>>>> >>>>> syzbot found the following crash on: >>>>> >>>>> HEAD commit: b808822a75a3 Add linux-next specific files for 20190111 >>>>> git tree: linux-next >>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=179c22f740 >>>>> kernel config: https://syzkaller.appspot.com/x/.config?x=c052ead0aed5001b >>>>> dashboard link: >>>>> https://syzkaller.appspot.com/bug?extid=69ca07954461f189e808 >>>>> compiler: gcc (GCC) 9.0.0 20181231 (experimental) >>>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=162d947f40 >>>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=139f6c3740 >>>>> >>>>> IMPORTANT: if you fix the bug, please add the following tag to the commit: >>>>> Reported-by: syzbot+69ca07954461f189e...@syzkaller.appspotmail.com >>>>> >>>>> [ cut here ] >>>>> AppArmor WARN cred_label: ((!blob)): >>>>> WARNING: CPU: 0 PID: 0 at security/apparmor/include/cred.h:30 cred_label >>>>> security/apparmor/include/cred.h:30 [inline] >>>>> WARNING: CPU: 0 PID: 0 at security/apparmor/include/cred.h:30 >>>>> apparmor_cred_free+0x12f/0x1a0 security/apparmor/lsm.c:62 >>>>> Kernel panic - not syncing: panic_on_warn set ... >>>>> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.0.0-rc1-next-20190111 #10 >>>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS >>>>> Google 01/01/2011 >>>>> Call Trace: >>>>> >>>>> __dump_stack lib/dump_stack.c:77 [inline] >>>>> dump_stack+0x1db/0x2d0 lib/dump_stack.c:113 >>>>> panic+0x2cb/0x65c kernel/panic.c:214 >>>>> __warn.cold+0x20/0x48 kernel/panic.c:571 >>>>> report_bug+0x263/0x2b0 lib/bug.c:186 >>>>> fixup_bug arch/x86/kernel/traps.c:178 [inline] >>>>> fixup_bug arch/x86/kernel/traps.c:173 [inline] >>>>> do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271 >>>>> do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:290 >>>>> invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973 >>>>> RIP: 0010:cred_label security/apparmor/include/cred.h:30 [inline] >>>>> RIP: 0010:apparmor_cred_free+0x12f/0x1a0 security/apparmor/lsm.c:62 >>>>> Code: 7c 88 48 c7 c7 00 d0 7c 88 e8 fd 70 f2 fd 0f 0b eb a9 e8 54 3f 29 >>>>> fe 48 c7 c6 c0 df 7c 88 48 c7 c7 00 d0 7c 88 e8 e1 70 f2 fd <0f> 0b 48 b8 >>>>> 00 00 00 00 00 fc ff df 80 38 00 75 4a 4c 8b 2c 25 00 >>>>> RSP: 0018:8880ae6079f8 EFLAGS: 00010286 >>>>> RAX: RBX: RCX: >>>>> RDX: 0100 RSI: 81687fa6 RDI: 0006 >>>>> RBP: 8880ae607a18 R08: 8987dec0 R09: >>>>> R10: R11: R12: 8880a86b3100 >>>>> R13: 8880a86b3100 R14: 8880a86b3188 R15: dc00 >>>>> security_cred_free+0x4b/0xf0 security/security.c:1490 >>>> The obvious thing to do is put a check in security_cred_free >>>> for a NULL cred->security, in which case the LSM hooks >>>> wouldn't get called. >>> Right, but the question is should we? To my thinking we shouldn't >>> ever have a cred without cred->security, unless the cred was >>> allocated but a later step in its construction, say allocating >>> ->security failed. >> If allocating ->security fails in security_cred_alloc_blank() >> or security_prepare_creds() you don't have to do anything but >> fail because the LSM hooks are not called before the allocation. >> >>> In which case I'd rather see the cred directly freed and not >>> call into security_cred_free() as I like being able to detect >>> corrupt creds. >> I think we need to look for some bit of code that's setting >> cred->security to NULL inappropriately. > If security_cred_alloc_blank() fails for lack of memory > in cred_alloc_blank() abort_creds() will be called. This > in turn calls put_cred() and put_cred_rcu()
Re: [PATCH 1/2] arm64: dts: layerscape: Add incr-burst-type-adjustment property to USB3 node
On Wed, Dec 19, 2018 at 05:41:08PM +0800, Ran Wang wrote: > Add this property to all layerscape platforms to improve USB read write > performance. > > Signed-off-by: Ran Wang Applied both, thanks.
Re: [PATCH] Input: synaptics - add PNP IDs for Dell XPS models to forcepad
Hi Kim, On Fri, Jan 11, 2019 at 02:54:30PM -0600, Kim Phillips wrote: > This patch is the result of seeing this message: > > psmouse serio1: synaptics: Your touchpad (PNP: DLL087c PNP0f13) says it can > support a different bus. If i2c-hid and hid-rmi are not used, you might want > to try setting psmouse.synaptics_intertouch to 1 and report this to > linux-in...@vger.kernel.org. > > If I set psmouse.synaptics_intertouch=1, or add the PNP ID to > smbus_pnp_ids, the touchpad continues to work, and the above message > goes away, but we now get: > > psmouse serio1: synaptics: Trying to set up SMBus access > psmouse serio1: synaptics: SMbus companion is not ready yet > > With this patch applied, i.e., the PNP IDs are added to the forcepad > array, the touchpad continues to work and all of the above messages > disappear. Are you sure the touchpad in XPSes is a forcepad (i.e. it does not have physical button underneath it)? As far as I know there were only couple of HP laptops with forcepads and when switching to RMI mode forcepads need F21 handler that we do not currently have in the kernel. > > Tested on a Dell XPS 15 9570. The other IDs - for Dell XPS 13 > 9350/9360/9370 and XPS 15 9550/9560 - were obtained by searching > for the model numbers and "says it can support a different bus". > E.g, this is one such instance: > > https://lkml.org/lkml/2018/2/15/52 > > Cc: Paul Menzel > Cc: Dmitry Torokhov > Cc: Benjamin Tissoires > Cc: linux-in...@vger.kernel.org > Signed-off-by: Kim Phillips > --- > With or without this patch, I'm seeing a problem where when the XPS 15 > comes out of a resume, and without even touching the touchpad, I notice > about 600 interrupts per second firing on the "IR-IO-APIC 51-fasteoi > SYNA2393:00" line in /proc/interrupts. If I start using the touchpad, > then leave it alone, I check if there are still interrupts firing, and > they have indeed stopped. This adversely affects my battery life when > using an external mouse. Any ideas on how to debug the situation? > Could it be related to the 'vdd not found' messages?: > > $ dmesg | grep -C 1 -i syna > probe of 1-12 returned 1 after 2343 usecs > psmouse serio1: synaptics: queried max coordinates: x [..5664], y [..4646] > psmouse serio1: synaptics: queried min coordinates: x [1278..], y [1206..] > psmouse serio1: synaptics: Touchpad model: 1, fw: 8.2, id: 0x1e2b1, caps: > 0xf00123/0x840300/0x12e800/0x0, board id: 3125, fw id: 2378871 > input: SynPS/2 Synaptics TouchPad as > /devices/platform/i8042/serio1/input/input6 > probe of serio1 returned 1 after 939332 usecs > -- > probe of idma64.1 returned 1 after 72 usecs > i2c_hid i2c-SYNA2393:00: i2c-SYNA2393:00 supply vdd not found, using dummy > regulator > i2c_hid i2c-SYNA2393:00: Linked as a consumer to regulator.0 > i2c_hid i2c-SYNA2393:00: i2c-SYNA2393:00 supply vddl not found, using dummy > regulator > ath10k_pci :3b:00.0: qca6174 hw3.2 target 0x0503 chip_id 0x00340aff > sub 1a56:1535 > -- > ath10k_pci :3b:00.0: firmware ver WLAN.RM.4.4.1-00079-QCARMSWPZ-1 api 6 > features wowlan,ignore-otp crc32 fd869beb > probe of i2c-SYNA2393:00 returned 1 after 23978 usecs > ath10k_pci :3b:00.0: board_file api 2 bmi_id N/A crc32 20d869c3 > -- > probe of 0018:056A:488F.0001 returned 1 after 1366 usecs > input: SYNA2393:00 06CB:7A13 Mouse as > /devices/pci:00/:00:15.1/i2c_designware.1/i2c-10/i2c-SYNA2393:00/0018:06CB:7A13.0002/input/input23 > input: SYNA2393:00 06CB:7A13 Touchpad as > /devices/pci:00/:00:15.1/i2c_designware.1/i2c-10/i2c-SYNA2393:00/0018:06CB:7A13.0002/input/input24 > hid-generic 0018:06CB:7A13.0002: input,hidraw1: I2C HID v1.00 Mouse > [SYNA2393:00 06CB:7A13] on i2c-SYNA2393:00 > probe of 0018:06CB:7A13.0002 returned 1 after 320 usecs > -- > probe of 0018:06CB:7A13.0002 returned 0 after 5 usecs > input: SYNA2393:00 06CB:7A13 Touchpad as > /devices/pci:00/:00:15.1/i2c_designware.1/i2c-10/i2c-SYNA2393:00/0018:06CB:7A13.0002/input/input27 > hid-multitouch 0018:06CB:7A13.0002: input,hidraw1: I2C HID v1.00 Mouse > [SYNA2393:00 06CB:7A13] on i2c-SYNA2393:00 > probe of 0018:06CB:7A13.0002 returned 1 after 25104 usecs > > drivers/input/mouse/synaptics.c | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/drivers/input/mouse/synaptics.c b/drivers/input/mouse/synaptics.c > index b6da0c1267e3..e3fb4b9346c0 100644 > --- a/drivers/input/mouse/synaptics.c > +++ b/drivers/input/mouse/synaptics.c > @@ -185,6 +185,12 @@ static const char * const smbus_pnp_ids[] = { > }; > > static const char * const forcepad_pnp_ids[] = { > + "DLL06e4", /* Dell XPS 15 9550 */ > + "DLL0704", /* Dell XPS 13 9350 */ > + "DLL075b", /* Dell XPS 13 9360 */ > + "DLL07be", /* Dell XPS 15 9560 */ > + "DLL07e6", /* Dell XPS 13 9370 */ > + "DLL087c", /* Dell XPS 15 9570 */ > "SYN300D", > "SYN3014", > NULL > -- > 2.20.1 > Thanks. -- Dmitry
Re: [PATCH v3 2/6] spmi: pmic-arb: convert to v2 irq interfaces to support hierarchical IRQ chips
On Fri, Jan 11, 2019 at 03:39:27PM -0800, Stephen Boyd wrote: > Ok sounds good. Can you also fix all the spmi-gpio specifying dts files? > There are more than just two PMIC dts files that need changes (per my > grep results sent to the list earlier). Yes, I'll take care of all of them in the next version. Brian
Re: linux-next: build failure after merge of the imx-mxs tree
On Fri, Jan 11, 2019 at 12:15:59AM +0100, Lukasz Majewski wrote: > Hi Stephen, > > > Hi all, > > > > On Fri, 11 Jan 2019 09:30:03 +1100 Stephen Rothwell > > wrote: > > > > > > After merging the imx-mxs tree, today's linux-next build (arm > > > multi_v7_defconfig) failed like this: > > > > Actually this is only treated as a warning. Thanks for reporting, Stephen. > > Thanks for the notification. I will look on that (soon). I just sent a fix for it. https://patchwork.kernel.org/patch/10760827/ > > > > > > arch/arm/boot/dts/vfxxx.dtsi:550.24-563.6: Warning > > > (spi_bus_bridge): /soc/aips-bus@4008/spi@400ad000: incorrect > > > #address-cells for SPI bus also defined at > > > arch/arm/boot/dts/vf610-bk4.dts:130.8-142.3 > > > arch/arm/boot/dts/vf610-bk4.dtb: Warning (spi_bus_reg): Failed > > > prerequisite 'spi_bus_bridge' > > > > > > > > > Caused by commit > > > > > > cf91ce9696a0 ("ARM: dts: vf610-bk4: Provide support for reading > > > ID code from MVB device") I found that the warning exists even on v5.0-rc1, so it's not caused by Lukasz's change. Shawn
Re: ppc64le reliable stack unwinder and scheduled tasks
On Thu, Jan 10, 2019 at 04:14:00PM -0500, Joe Lawrence wrote: > Hi all, > > tl;dr: On ppc64le, what is top-most stack frame for scheduled tasks >about? > > I am looking at a bug in which ~10% of livepatch tests on RHEL-7 and > RHEL-8 distro kernels, the ppc64le reliable stack unwinder consistently > reports an unreliable stack for a given task. This condition can > eventually resolve itself, but sometimes this state remains wedged for > hours and I can live-debug the system with crash-utility. > In summary, you think the reliable stack tracer is being conservative? xmon (built in debugger) also allows for live-debugging and might be more easier to get some of this done. > I have tried reproducing with upstream 4.20 kernel, but haven't seen > this exact condition yet. More on upstream in a bit. > > That said, I have a question about the ppc64 stack frame layout with > regard to scheduled tasks. In each sticky "unreliable" stack trace > instance that I've looked at, the task's stack has an interesting > frame at the top: > Could you please define interesting frame on top a bit more? Usually the topmost return address is in LR > > Example 1 (RHEL-7) > == > > crash> struct task_struct.thread c0022fd015c0 | grep ksp > ksp = 0xc000288af9c0 > > crash> rd 0xc000288af9c0 -e 0xc000288b > > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > sp[0]: > > c000288af9c0: c000288afb90 00dd ...( > c000288af9d0: c0002a94 c1c60a00 .*.. > > crash> sym c0002a94 > c0002a94 (T) hardware_interrupt_common+0x114 What does bt look like in this case? what does r1 point to? I would look at xmon and see what the "t" (stack trace) looks like, I think it's a more familiar tool. > > c000288af9e0: c1c60a80 > c000288af9f0: c000288afbc0 00dd ...( > c000288afa00: c14322e0 c1c60a00 ."C. > c000288afa10: c002303ae380 c002303ae380 ..:0..:0 > c000288afa20: 7265677368657265 2200 erehsger.".. > > Uh-oh... > > /* Mark stacktraces with exception frames as unreliable. */ > stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER > > c000288afa30: c01240ac c000288afcb0 .@.( > c000288afa40: c13e4d00 .M>. > c000288afa50: 0001 0001 > c000288afa60: c14322e0 0804 ."C. > c000288afa70: c000288ac080 c0022fd015c0 ...(.../ > c000288afa80: c000288afae0 ...( > c000288afa90: c000288afae0 c7b80900 ...( > c000288afaa0: c0e90a00 c0e90a00 > c000288afab0: c01240ac c0e90a00 .@.. > c000288afac0: c0e90a00 c4790580 ..y. > c000288afad0: 0001 c0e90a00 > c000288afae0: 0004 > c000288afaf0: c0022fd01ad0 c0e73be0 .../.;.. > c000288afb00: c0023900f450 c1488190 P..9..H. > c000288afb10: 00ad c0023900ef40 @..9 > c000288afb20: c000288ac000 c0e73be0 ...(.;.. > c000288afb30: c001b514 c0022fd01628 (../ > c000288afb40: c000288afbb0 c0008800 ...( > c000288afb50: c0162880 .(.. > c000288afb60: 240f3022 0004 "0.$ > c000288afb70: c0e90a00 c0022fd01a20 ../ > c000288afb80: c000288afbf0 c14322e0 ...(."C. > > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > sp[1]: > > c000288afb90: c000288afbf0 c1960a00 ...( > c000288afba0: c001b514 0004 > > crash> sym c001b514 > c001b514 (T) __switch_to+0x264 > > c000288afbb0: c0e90a00 > c000288afbc0: c000288ac000 c14322e0 ...(."C. > c000288afbd0: c0e90a00 c1960a00 > c000288afbe0: c0022fd015c0 c0023900ef40 .../@..9 > > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > sp[2]: > > c000288afbf0: c000288afcd0 c0003300 ...(.3.. > c000288afc00: c0a83918 c13e4d00 .9...M>. > > crash> sym c0a83918 > c0a83918 (t) __schedule+0x428 > > - - - -
Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions
On 1/11/19 8:51 AM, Jerome Glisse wrote: > On Thu, Jan 10, 2019 at 06:59:31PM -0800, John Hubbard wrote: >> On 1/3/19 6:44 AM, Jerome Glisse wrote: >>> On Thu, Jan 03, 2019 at 10:26:54AM +0100, Jan Kara wrote: On Wed 02-01-19 20:55:33, Jerome Glisse wrote: > On Wed, Dec 19, 2018 at 12:08:56PM +0100, Jan Kara wrote: >> On Tue 18-12-18 21:07:24, Jerome Glisse wrote: >>> On Tue, Dec 18, 2018 at 03:29:34PM -0800, John Hubbard wrote: > > [...] > > Now page_mkclean: > > int page_mkclean(struct page *page) > { > int cleaned = 0; > + int real_mapcount = 0; > struct address_space *mapping; > struct rmap_walk_control rwc = { > .arg = (void *)&cleaned, > .rmap_one = page_mkclean_one, > .invalid_vma = invalid_mkclean_vma, > + .mapcount = &real_mapcount, > }; > + int mapcount1, mapcount2; > > BUG_ON(!PageLocked(page)); > > if (!page_mapped(page)) > return 0; > > mapping = page_mapping(page); > if (!mapping) > return 0; > > + mapcount1 = page_mapcount(page); > // rmap_walk need to change to count mapping and return value > // in .mapcount easy one > rmap_walk(page, &rwc); So what prevents GUP_fast() to grab reference here and the test below would think the page is not pinned? Or do you assume that every page_mkclean() call will be protected by PageWriteback (currently it is not) so that GUP_fast() blocks / bails out? >> >> Continuing this thread, still focusing only on the "how to maintain a >> PageDmaPinned >> for each page" question (ignoring, for now, what to actually *do* in >> response to >> that flag being set): >> >> 1. Jan's point above is still a problem: PageWriteback != "page_mkclean is >> happening". >> This is probably less troubling than the next point, but it does undermine >> all the >> complicated schemes involving PageWriteback, that try to synchronize gup() >> with >> page_mkclean(). >> >> 2. Also, the mapcount approach here still does not reliably avoid false >> negatives >> (that is, a page may have been gup'd, but page_mkclean could miss that): >> gup() >> can always jump in and increment the mapcount, while page_mkclean is in the >> middle >> of making (wrong) decisions based on that mapcount. There's no lock to >> prevent that. >> >> Again: mapcount can go up *or* down, so I'm not seeing a true solution yet. > > Both point is address by the solution at the end of this email. > >>> >>> So GUP_fast() becomes: >>> >>> GUP_fast_existing() { ... } >>> GUP_fast() >>> { >>> GUP_fast_existing(); >>> >>> for (i = 0; i < npages; ++i) { >>> if (PageWriteback(pages[i])) { >>> // need to force slow path for this page >>> } else { >>> SetPageDmaPinned(pages[i]); >>> atomic_inc(pages[i]->mapcount); >>> } >>> } >>> } >>> >>> This is a minor slow down for GUP fast and it takes care of a >>> write back race on behalf of caller. This means that page_mkclean >>> can not see a mapcount value that increase. This simplify thing >>> we can relax that. Note that what this is doing is making sure >>> that GUP_fast never get lucky :) ie never GUP a page that is in >>> the process of being write back but has not yet had its pte >>> updated to reflect that. >>> >>> But I think that detecting pinned pages with small false positive rate is OK. The extra page bouncing will cost some performance but if it is rare, then we are OK. So I think we can go for the simple version of detecting pinned pages as you mentioned in some earlier email. We just have to be sure there are no false negatives. >>> >> >> Agree with that sentiment, but there are still false negatives and I'm not >> yet seeing any solutions for that. > > So here is the solution: > > > Is a page pin ? With no false negative: > === > > get_user_page*() aka GUP: > if (!PageAnon(page)) { > bool write_back = PageWriteback(page); > bool page_is_pin = PagePin(page); > if (write_back && !page_is_pin) { > /* Wait for write back a re-try GUP */ > ... > goto retry; > } > [G1]smp_rmb(); > [G2]atomic_inc(&page->_mapcount) > [G3]smp_wmb(); > [G4]SetPagePin(page); > [G5]smp_wmb(); > [G6]if (!write_back && !page_is_pin && PageWriteback(page)) { > /* Back-off as write back might have miss us */ > atomic_dec(&page->_mapcount); > /* Wait for write back a re-try GUP */ > ... > goto retry; > } > } > > put_user_page() aka PUP: > [P1] if (!PageAnon(page)) atomic_dec(&page->_mapcount); > [P2] put_page(page); > > page_mkclean(): > [C1] pined = TestClearPagePin(page); > [C2] smp_mb(); > [C3] map_and_pin_count = atomic_read(&page->
ARM: config issue with ftrace function graph tracer
I'm having a problem with the ftrace function graph tracer on a 32 bit arm board (orangepi pc). A bisect points to the following commit: f9b58e8c7d03 ("ARM: 8800/1: use choice for kernel unwinders") Before this commit, if I use sunxi_defconfig and then menuconfig to enable FTRACE and FUNCTION_TRACER then the function graph tracer works. With this commit, and as of v5.0-rc1, doing the same as above results in a broken function graph tracer and often an oops as well. The commit introduces a choice group and it looks like it should default to UNWINDER_FRAME_POINTER if FUNCTION_GRAPH_TRACER is enabled. FUNCTION_GRAPH_TRACER is enabled by default when I enable FUNCTION_TRACER but this has no effect on the choice. The choice always defaults to the other option which is UNWINDER_ARM. If I manually choose UNWINDER_FRAME_POINTER then the function graph tracer works fine. Any idea why the default behaviour has changed? Thanks, Jeremy
Re: [PATCH v1] clk: qcom: clk-rpmh: Add IPA clock support
On 1/9/2019 11:28 AM, Stephen Boyd wrote: Quoting David Dai (2018-12-13 18:35:04) The current clk-rpmh driver only supports on and off RPMh clock resources, let's extend the current driver by add support for clocks that are managed by a different type of RPMh resource known as Bus Clock Manager(BCM). The BCM is a configurable shared resource aggregator that scales performance based on a preset of frequency points. The Qualcomm IP Accelerator (IPA) clock is an example of a resource that is managed by the BCM and this a requirement from the IPA driver in order to scale its core clock. Please use this commit text instead: The clk-rpmh driver only supports on and off RPMh clock resources. Let's extend the driver by add support for clocks that are managed by a different type of RPMh resource known as Bus Clock Manager(BCM). The BCM is a configurable shared resource aggregator that scales performance based on a set of frequency points. The Qualcomm IP Accelerator (IPA) clock is an example of a resource that is managed by the BCM and this a requirement from the IPA driver in order to scale its core clock. Ok. Signed-off-by: David Dai --- drivers/clk/qcom/clk-rpmh.c | 141 ++ include/dt-bindings/clock/qcom,rpmh.h | 1 + 2 files changed, 142 insertions(+) diff --git a/drivers/clk/qcom/clk-rpmh.c b/drivers/clk/qcom/clk-rpmh.c index 9f4fc77..ee78c4b 100644 --- a/drivers/clk/qcom/clk-rpmh.c +++ b/drivers/clk/qcom/clk-rpmh.c @@ -18,6 +18,31 @@ #define CLK_RPMH_ARC_EN_OFFSET 0 #define CLK_RPMH_VRM_EN_OFFSET 4 +#define BCM_TCS_CMD_COMMIT_MASK0x4000 +#define BCM_TCS_CMD_VALID_SHIFT29 +#define BCM_TCS_CMD_VOTE_MASK 0x3fff +#define BCM_TCS_CMD_VOTE_SHIFT 0 + +#define BCM_TCS_CMD(valid, vote) \ + (BCM_TCS_CMD_COMMIT_MASK | \ + ((valid) << BCM_TCS_CMD_VALID_SHIFT) | \ + ((cpu_to_le32(vote) & \ Why? Sorry, could you clarify this question? If you're referring to the cpu_to_le32, shouldn't we be explicit about converting endianness even if it might be redundant for this particular qcom platform? + BCM_TCS_CMD_VOTE_MASK) << BCM_TCS_CMD_VOTE_SHIFT)) + +/** + * struct bcm_db - Auxiliary data pertaining to each Bus Clock Manager(BCM) + * @unit: divisor used to convert Hz value to an RPMh msg + * @width: multiplier used to convert Hz value to an RPMh msg + * @vcd: virtual clock domain that this bcm belongs to + * @reserved: reserved to pad the struct + */ +struct bcm_db { + __le32 unit; + __le16 width; + u8 vcd; + u8 reserved; +}; + /** * struct clk_rpmh - individual rpmh clock data structure * @hw:handle between common and hardware-specific interfaces @@ -29,6 +54,7 @@ * @aggr_state:rpmh clock aggregated state * @last_sent_aggr_state: rpmh clock last aggr state sent to RPMh * @valid_state_mask: mask to determine the state of the rpmh clock + * @aux_data: data specific to the bcm rpmh resource But there isn't aux_data. Is this supposed to be unit? And what is unit going to be? 1000? Supposed to be unit, the value is on the order of Khz. * @dev: device to which it is attached * @peer: pointer to the clock rpmh sibling */ @@ -42,6 +68,7 @@ struct clk_rpmh { u32 aggr_state; u32 last_sent_aggr_state; u32 valid_state_mask; + u32 unit; struct device *dev; struct clk_rpmh *peer; }; @@ -210,6 +248,91 @@ static unsigned long clk_rpmh_recalc_rate(struct clk_hw *hw, .recalc_rate= clk_rpmh_recalc_rate, }; +static int clk_rpmh_bcm_send_cmd(struct clk_rpmh *c, bool enable) +{ + struct tcs_cmd cmd = { 0 }; + u32 cmd_state; + int ret; + + mutex_lock(&rpmh_clk_lock); + + cmd_state = enable ? (c->aggr_state ? c->aggr_state : 1) : 0; Make this into an if statement instead of double ternary? Ok. cmd_state = 0; if (enable) { cmd_state = 1; if (c->aggr_state) cmd_state = c->aggr_state; } + + if (c->last_sent_aggr_state == cmd_state) + return 0; We just returned with a mutex held. Oops! Oops, will fix. + + cmd.addr = c->res_addr; + cmd.data = BCM_TCS_CMD(enable, cmd_state); + + ret = rpmh_write_async(c->dev, RPMH_ACTIVE_ONLY_STATE, &cmd, 1); + if (ret) { + dev_err(c->dev, "set active state of %s failed: (%d)\n", + c->res_name, ret); + return ret; Again! + } + + c->last_sent_aggr_state = cmd_state; + + mutex_unlock(&rpmh_clk_lock); + + return 0; +} + +static int clk_rpmh_bcm_prepare(struct clk_hw *hw) +{ + struct clk_rpmh *c = to_clk_rpmh(
Bug (since v4.20): integer underflow in known_siginfo_layout() when sig=0
Hi Eric, The following commit, which went into v4.20, introduced undefined behavior when sys_rt_sigqueueinfo() is called with sig=0: commit 4ce5f9c9e7546915c559ffae594e6d73f918db00 Author: Eric W. Biederman Date: Tue Sep 25 12:59:31 2018 +0200 signal: Use a smaller struct siginfo in the kernel In sig_specific_sicodes(), used from known_siginfo_layout(), the expression '1ULL << ((sig)-1)' is undefined as it evaluates to 1ULL << 4294967295. Reproducer: #include #include #include int main(void) { siginfo_t si = { .si_code = 1 }; syscall(__NR_rt_sigqueueinfo, 0, 0, &si); } UBSAN report for v5.0-rc1: UBSAN: Undefined behaviour in kernel/signal.c:2946:7 shift exponent 4294967295 is too large for 64-bit type 'long unsigned int' CPU: 2 PID: 346 Comm: syz_signal Not tainted 5.0.0-rc1 #25 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x70/0xa5 lib/dump_stack.c:113 ubsan_epilogue+0xd/0x40 lib/ubsan.c:159 __ubsan_handle_shift_out_of_bounds+0x12c/0x170 lib/ubsan.c:425 known_siginfo_layout+0xae/0xe0 kernel/signal.c:2946 post_copy_siginfo_from_user kernel/signal.c:3009 [inline] __copy_siginfo_from_user+0x35/0x60 kernel/signal.c:3035 __do_sys_rt_sigqueueinfo kernel/signal.c:3553 [inline] __se_sys_rt_sigqueueinfo kernel/signal.c:3549 [inline] __x64_sys_rt_sigqueueinfo+0x31/0x70 kernel/signal.c:3549 do_syscall_64+0x4c/0x1b0 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x433639 Code: c4 18 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b 27 00 00 c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:7fffcb289fc8 EFLAGS: 0246 ORIG_RAX: 0081 RAX: ffda RBX: 004002e0 RCX: 00433639 RDX: 7fffcb289fd0 RSI: RDI: RBP: 006b2018 R08: 004d R09: R10: R11: 0246 R12: 00401560 R13: 004015f0 R14: R15:
[PATCH v4] RISC-V: defconfig: Enable Generic PCIE by default
Enable generic PCIe by default in the RISC-V defconfig, this allows us to use QEMU's PCIe support out of the box. Signed-off-by: Alistair Francis --- arch/riscv/configs/defconfig | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/riscv/configs/defconfig b/arch/riscv/configs/defconfig index f399659d3b8d..7c86bd71ae68 100644 --- a/arch/riscv/configs/defconfig +++ b/arch/riscv/configs/defconfig @@ -14,6 +14,8 @@ CONFIG_EXPERT=y CONFIG_BPF_SYSCALL=y CONFIG_SMP=y CONFIG_PCI=y +CONFIG_PCIEPORTBUS=y +CONFIG_PCI_HOST_GENERIC=y CONFIG_PCIE_XILINX=y CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y @@ -63,7 +65,6 @@ CONFIG_USB_STORAGE=y CONFIG_USB_UAS=y CONFIG_VIRTIO_MMIO=y CONFIG_SIFIVE_PLIC=y -CONFIG_RAS=y CONFIG_EXT4_FS=y CONFIG_EXT4_FS_POSIX_ACL=y CONFIG_AUTOFS4_FS=y @@ -77,5 +78,6 @@ CONFIG_NFS_V4_1=y CONFIG_NFS_V4_2=y CONFIG_ROOT_NFS=y CONFIG_CRYPTO_USER_API_HASH=y +CONFIG_CRYPTO_DEV_VIRTIO=y CONFIG_PRINTK_TIME=y # CONFIG_RCU_TRACE is not set -- 2.19.1
Re: [PATCH v2] rbtree: fix the red root
I've been out today but return home tomorrow and can test any suggested fixes, or with different kernel settings. Just let me know. Esme Sent with ProtonMail Secure Email. ‐‐‐ Original Message ‐‐‐ On Friday, January 11, 2019 7:18 PM, Qian Cai wrote: > On 1/11/19 6:16 PM, Matthew Wilcox wrote: > > > On Fri, Jan 11, 2019 at 03:58:43PM -0500, Qian Cai wrote: > > > > > diff --git a/lib/rbtree_test.c b/lib/rbtree_test.c > > > index b7055b2a07d3..afad0213a117 100644 > > > --- a/lib/rbtree_test.c > > > +++ b/lib/rbtree_test.c > > > @@ -345,6 +345,17 @@ static int __init rbtree_test_init(void) > > > check(0); > > > } > > > > > > - /* > > > - - a little regression test to catch a bug may be introduced by > > > - - 6d58452dc06 (rbtree: adjust root color in rb_insert_color() only > > > when > > > - - necessary) > > > - */ > > > - insert(nodes, &root); > > > - nodes->rb.__rb_parent_color = RB_RED; > > > - insert(nodes + 1, &root); > > > - erase(nodes + 1, &root); > > > - erase(nodes, &root); > > > > That's not a fair test! You're poking around in the data structure to > > create the situation. This test would have failed before 6d58452dc06 too. > > How do we create a tree that has a red parent at root, only using insert() > > and erase()? > > If only I knew how to reproduce this myself, I might be able to figure out how > it ends up with the red root in the first place.
Re: [Potential Spoof] Re: [Potential Spoof] Re: [PATCH v2 4/4] ARM: dts: aspeed: Add lpc ctrl for Facebook
Joel, Please merge these patches as it is required by facebook platform. Regards -Vijay On 1/7/19, 11:25 AM, "Linux-aspeed on behalf of Vijay Khemka" wrote: Please merge these patches in upstream kernel. Regards -Vijay On 12/20/18, 10:06 AM, "Linux-aspeed on behalf of Vijay Khemka" wrote: Joel, Can you please take care of these patches merge. On 12/17/18, 12:04 PM, "Vijay Khemka" wrote: Added lpc ctrl device to enable LPC clock in Facebook Tiogapass device tree. Signed-off-by: Vijay Khemka --- .../boot/dts/aspeed-bmc-facebook-tiogapass.dts | 17 + 1 file changed, 17 insertions(+) diff --git a/arch/arm/boot/dts/aspeed-bmc-facebook-tiogapass.dts b/arch/arm/boot/dts/aspeed-bmc-facebook-tiogapass.dts index 73e58a821613..ef7875b54562 100644 --- a/arch/arm/boot/dts/aspeed-bmc-facebook-tiogapass.dts +++ b/arch/arm/boot/dts/aspeed-bmc-facebook-tiogapass.dts @@ -22,6 +22,17 @@ reg = <0x8000 0x2000>; }; + reserved-memory { + #address-cells = <1>; + #size-cells = <1>; + ranges; + + flash_memory: region@9800 { + no-map; + reg = <0x9800 0x1000>; /* 4K */ + }; + }; + iio-hwmon { compatible = "iio-hwmon"; io-channels = <&adc 0>, <&adc 1>, <&adc 2>, <&adc 3>, @@ -54,6 +65,12 @@ }; }; +&lpc_ctrl { + status = "okay"; + memory-region = <&flash_memory>; + flash = <&spi1>; +}; + &uart1 { // Host Console status = "okay"; -- 2.17.1
Re: [PATCH v2] rbtree: fix the red root
On 1/11/19 6:16 PM, Matthew Wilcox wrote: > On Fri, Jan 11, 2019 at 03:58:43PM -0500, Qian Cai wrote: >> diff --git a/lib/rbtree_test.c b/lib/rbtree_test.c >> index b7055b2a07d3..afad0213a117 100644 >> --- a/lib/rbtree_test.c >> +++ b/lib/rbtree_test.c >> @@ -345,6 +345,17 @@ static int __init rbtree_test_init(void) >> check(0); >> } >> >> +/* >> + * a little regression test to catch a bug may be introduced by >> + * 6d58452dc06 (rbtree: adjust root color in rb_insert_color() only when >> + * necessary) >> + */ >> +insert(nodes, &root); >> +nodes->rb.__rb_parent_color = RB_RED; >> +insert(nodes + 1, &root); >> +erase(nodes + 1, &root); >> +erase(nodes, &root); > > That's not a fair test! You're poking around in the data structure to > create the situation. This test would have failed before 6d58452dc06 too. > How do we create a tree that has a red parent at root, only using insert() > and erase()? > If only I knew how to reproduce this myself, I might be able to figure out how it ends up with the red root in the first place.
Re: [PATCH] staging: vchiq: Fix local event signalling
> Phil Elwell hat am 11. Januar 2019 um 12:34 > geschrieben: > > > Prior to the recent event reworking (see Fixes), thread synchronisation > was implemented using completions, the worker thread being woken with > a call to complete(). The replacement uses waitqueues, which are more > like condition variables in that the waiting thread is only woken if > the condition is true. > > When the VPU signals the ARM, it first sets the event's fired flag to > indicate which event is being signalled, but the places in the > ARM-side code where the worker thread is being woken - > remote_event_signal_local via request_poll - did not do so as it > wasn't previously necessary, and since the armed flag was being > cleared this lead to a deadlock. > > Fixes: 852b2876a8a8 ("staging: vchiq: rework remove_event handling") > Signed-off-by: Phil Elwell Tested-by: Stefan Wahren Thanks
Re: [PATCH v3] RISC-V: defconfig: Enable Generic PCIE by default
On Sat, 2019-01-12 at 00:03 +, Alistair Francis wrote: > Enable generic PCIe by default in the RISC-V defconfig, this allows > us > to use QEMU's PCIe support out of the box. > > Signed-off-by: Alistair Francis > --- > arch/riscv/configs/defconfig | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/arch/riscv/configs/defconfig > b/arch/riscv/configs/defconfig > index f399659d3b8d..18b50bf41b67 100644 > --- a/arch/riscv/configs/defconfig > +++ b/arch/riscv/configs/defconfig > @@ -14,6 +14,8 @@ CONFIG_EXPERT=y > CONFIG_BPF_SYSCALL=y > CONFIG_SMP=y > CONFIG_PCI=y > +CONFIG_PCIEPORTBUS=y > +CONFIG_PCI_HOST_GENERIC=y > CONFIG_PCIE_XILINX=y > CONFIG_MODULES=y > CONFIG_MODULE_UNLOAD=y > @@ -46,7 +48,6 @@ CONFIG_INPUT_MOUSEDEV=y > CONFIG_SERIAL_8250=y > CONFIG_SERIAL_8250_CONSOLE=y > CONFIG_SERIAL_OF_PLATFORM=y > -CONFIG_SERIAL_EARLYCON_RISCV_SBI=y Ah, this shouldn't be removed. Sorry for the spam, I'll send a v4. Alistair > CONFIG_HVC_RISCV_SBI=y > # CONFIG_PTP_1588_CLOCK is not set > CONFIG_DRM=y > @@ -63,7 +64,6 @@ CONFIG_USB_STORAGE=y > CONFIG_USB_UAS=y > CONFIG_VIRTIO_MMIO=y > CONFIG_SIFIVE_PLIC=y > -CONFIG_RAS=y > CONFIG_EXT4_FS=y > CONFIG_EXT4_FS_POSIX_ACL=y > CONFIG_AUTOFS4_FS=y > @@ -77,5 +77,6 @@ CONFIG_NFS_V4_1=y > CONFIG_NFS_V4_2=y > CONFIG_ROOT_NFS=y > CONFIG_CRYPTO_USER_API_HASH=y > +CONFIG_CRYPTO_DEV_VIRTIO=y > CONFIG_PRINTK_TIME=y > # CONFIG_RCU_TRACE is not set
[PATCH] acpi/nfit: Fix command-supported detection
The _DSM function number validation only happens to succeed when the generic Linux command number translation corresponds with a DSM-family-specific function number. This breaks NVDIMM-N implementations that correctly implement _LSR, _LSW, and _LSI, but do not happen to publish support for DSM function numbers 4, 5, and 6. Recall that the support for _LS{I,R,W} family of methods results in the DIMM being marked as supporting those command numbers at acpi_nfit_register_dimms() time. The DSM function mask is only used for ND_CMD_CALL support of non-NVDIMM_FAMILY_INTEL devices. Fixes: 31eca76ba2fc ("nfit, libnvdimm: limited/whitelisted dimm command...") Cc: Link: https://github.com/pmem/ndctl/issues/78 Reported-by: Sujith Pandel Signed-off-by: Dan Williams --- Sujith, this is a larger change than what you originally tested, but it should behave the same. I wanted to consolidate all the code that handles Linux command number to DIMM _DSM function number translation. If you have a chance to re-test with this it would be much appreciated. Thanks for the report! drivers/acpi/nfit/core.c | 43 +-- 1 file changed, 29 insertions(+), 14 deletions(-) diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c index 790691d9a982..d5d64e90ae71 100644 --- a/drivers/acpi/nfit/core.c +++ b/drivers/acpi/nfit/core.c @@ -409,6 +409,29 @@ static bool payload_dumpable(struct nvdimm *nvdimm, unsigned int func) return true; } +static int cmd_to_func(struct nvdimm *nvdimm, unsigned int cmd, + struct nd_cmd_pkg *call_pkg) +{ + struct nfit_mem *nfit_mem = nvdimm_provider_data(nvdimm); + + if (cmd == ND_CMD_CALL) { + int i; + + if (call_pkg && nfit_mem->family != call_pkg->nd_family) + return -ENOTTY; + + for (i = 0; i < ARRAY_SIZE(call_pkg->nd_reserved2); i++) + if (call_pkg->nd_reserved2[i]) + return -EINVAL; + return call_pkg->nd_command; + } + + /* Linux ND commands == NVDIMM_FAMILY_INTEL function numbers */ + if (nfit_mem->family == NVDIMM_FAMILY_INTEL) + return cmd; + return 0; +} + int acpi_nfit_ctl(struct nvdimm_bus_descriptor *nd_desc, struct nvdimm *nvdimm, unsigned int cmd, void *buf, unsigned int buf_len, int *cmd_rc) { @@ -422,30 +445,21 @@ int acpi_nfit_ctl(struct nvdimm_bus_descriptor *nd_desc, struct nvdimm *nvdimm, unsigned long cmd_mask, dsm_mask; u32 offset, fw_status = 0; acpi_handle handle; - unsigned int func; const guid_t *guid; - int rc, i; + int func, rc, i; if (cmd_rc) *cmd_rc = -EINVAL; - func = cmd; - if (cmd == ND_CMD_CALL) { - call_pkg = buf; - func = call_pkg->nd_command; - - for (i = 0; i < ARRAY_SIZE(call_pkg->nd_reserved2); i++) - if (call_pkg->nd_reserved2[i]) - return -EINVAL; - } if (nvdimm) { struct acpi_device *adev = nfit_mem->adev; if (!adev) return -ENOTTY; - if (call_pkg && nfit_mem->family != call_pkg->nd_family) - return -ENOTTY; + func = cmd_to_func(nvdimm, cmd, buf); + if (func < 0) + return func; dimm_name = nvdimm_name(nvdimm); cmd_name = nvdimm_cmd_name(cmd); cmd_mask = nvdimm_cmd_mask(nvdimm); @@ -456,6 +470,7 @@ int acpi_nfit_ctl(struct nvdimm_bus_descriptor *nd_desc, struct nvdimm *nvdimm, } else { struct acpi_device *adev = to_acpi_dev(acpi_desc); + func = cmd; cmd_name = nvdimm_bus_cmd_name(cmd); cmd_mask = nd_desc->cmd_mask; dsm_mask = cmd_mask; @@ -470,7 +485,7 @@ int acpi_nfit_ctl(struct nvdimm_bus_descriptor *nd_desc, struct nvdimm *nvdimm, if (!desc || (cmd && (desc->out_num + desc->in_num == 0))) return -ENOTTY; - if (!test_bit(cmd, &cmd_mask) || !test_bit(func, &dsm_mask)) + if (!test_bit(cmd, &cmd_mask) && !test_bit(func, &dsm_mask)) return -ENOTTY; in_obj.type = ACPI_TYPE_PACKAGE;
[PATCH 3/3] autofs: add ignore mount option
Add an autofs file system mount option that can be used to provide a generic indicator to applications that the mount entry should be ignored when displaying mount information. Signed-off-by: Ian Kent --- fs/autofs/autofs_i.h |1 + fs/autofs/inode.c|9 - include/uapi/linux/auto_fs.h |2 +- 3 files changed, 10 insertions(+), 2 deletions(-) diff --git a/fs/autofs/autofs_i.h b/fs/autofs/autofs_i.h index 3e59f0ed777b..b735f2b1e462 100644 --- a/fs/autofs/autofs_i.h +++ b/fs/autofs/autofs_i.h @@ -105,6 +105,7 @@ struct autofs_wait_queue { #define AUTOFS_SBI_CATATONIC 0x0001 #define AUTOFS_SBI_STRICTEXPIRE 0x0002 +#define AUTOFS_SBI_IGNORE 0x0004 struct autofs_sb_info { u32 magic; diff --git a/fs/autofs/inode.c b/fs/autofs/inode.c index 078992eee299..8647ecaa89fc 100644 --- a/fs/autofs/inode.c +++ b/fs/autofs/inode.c @@ -89,6 +89,8 @@ static int autofs_show_options(struct seq_file *m, struct dentry *root) seq_printf(m, ",indirect"); if (sbi->flags & AUTOFS_SBI_STRICTEXPIRE) seq_printf(m, ",strictexpire"); + if (sbi->flags & AUTOFS_SBI_IGNORE) + seq_printf(m, ",ignore"); #ifdef CONFIG_CHECKPOINT_RESTORE if (sbi->pipe) seq_printf(m, ",pipe_ino=%ld", file_inode(sbi->pipe)->i_ino); @@ -111,7 +113,8 @@ static const struct super_operations autofs_sops = { }; enum {Opt_err, Opt_fd, Opt_uid, Opt_gid, Opt_pgrp, Opt_minproto, Opt_maxproto, - Opt_indirect, Opt_direct, Opt_offset, Opt_strictexpire}; + Opt_indirect, Opt_direct, Opt_offset, Opt_strictexpire, + Opt_ignore}; static const match_table_t tokens = { {Opt_fd, "fd=%u"}, @@ -124,6 +127,7 @@ static const match_table_t tokens = { {Opt_direct, "direct"}, {Opt_offset, "offset"}, {Opt_strictexpire, "strictexpire"}, + {Opt_ignore, "ignore"}, {Opt_err, NULL} }; @@ -206,6 +210,9 @@ static int parse_options(char *options, case Opt_strictexpire: sbi->flags |= AUTOFS_SBI_STRICTEXPIRE; break; + case Opt_ignore: + sbi->flags |= AUTOFS_SBI_IGNORE; + break; default: return 1; } diff --git a/include/uapi/linux/auto_fs.h b/include/uapi/linux/auto_fs.h index 082119630b49..1f7925afad2d 100644 --- a/include/uapi/linux/auto_fs.h +++ b/include/uapi/linux/auto_fs.h @@ -23,7 +23,7 @@ #define AUTOFS_MIN_PROTO_VERSION 3 #define AUTOFS_MAX_PROTO_VERSION 5 -#define AUTOFS_PROTO_SUBVERSION4 +#define AUTOFS_PROTO_SUBVERSION5 /* * The wait_queue_token (autofs_wqt_t) is part of a structure which is passed
[PATCH 2/3] autofs - fix error return in autofs_fill_super()
In autofs_fill_super() on error of get inode/make root dentry the return should be ENOMEM as this is the only failure case of the called functions. Signed-off-by: Ian Kent --- fs/autofs/inode.c |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/fs/autofs/inode.c b/fs/autofs/inode.c index 0e8ea2d9a2bb..078992eee299 100644 --- a/fs/autofs/inode.c +++ b/fs/autofs/inode.c @@ -266,8 +266,10 @@ int autofs_fill_super(struct super_block *s, void *data, int silent) } root_inode = autofs_get_inode(s, S_IFDIR | 0755); root = d_make_root(root_inode); - if (!root) + if (!root) { + ret = -ENOMEM; goto fail_ino; + } pipe = NULL; root->d_fsdata = ino;
4.19.{12,[13],14}: RIP: 0010:nf_conncount_cache_free+0x26/0x2f [nf_conncount]
Hello. this is my first post to a Linux list, and i am not subscribed. [Used it from 01-11-1999 to about 2001, then happily went to FreeBSD. ^_^ But Linux again since 2015, on bare metal since last October/November. Many thanks -- working Unix/POSIX on a Laptop. Fantastic improvements on the documentation side, and upsetting technologies wherever i look and as far as i can penetrate that, sched, namespaces, filesystems .. Grazy! At the moment i am running AlpineLinux [edge] on both rented Linux VM and Lpatop(s). (Note well: i love CRUX Linux!)] They have updated from flawless 4.14.xx to 4.19.xx series in December. I went with 4.19.12, but after some hours there were problems, the first time the server VM got completely stuck and i had to force a hard shutdown -- the first in 606 days. In the following twelve hours i saw two automatic reboots, i went back to 4.14.89. 4.19.13: i saw announcement with nothing mentioned, went to the net stuff git repo and saw some commits mid december which could address the problem of 4.19.12 Dec 28 12:20:48 kernel: [34107.761146] RIP: 0010:__list_del_entry_valid+0x7f/0x86 (conn_free+0x36/0x86 [nf_conncount]), so i decided to wait for 4.19.14. Today it came in, but after exactly 10 hours this: crit: Jan 12 00:15:00 kernel: [36690.017115] BUG: unable to handle kernel NULL pointer dereference at warn: Jan 12 00:15:00 kernel: [36690.023028] Oops: [#1] SMP PTI Jan 12 00:15:00 kernel: [36690.024368] CPU: 0 PID: 3708 Comm: iptables Not tainted 4.19.14-0-vanilla #1-Alpine Jan 12 00:15:00 kernel: [36690.025679] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 Jan 12 00:15:00 kernel: [36690.027056] RIP: 0010:nf_conncount_cache_free+0x26/0p messages: Jan 12 00:15:00 crond[2046]: USER root pid 3677 cmd run-parts /etc/periodic/12hourly Jan 12 00:15:00 kernel: [36690.021645] PGD 0 P4D 0 That periodic script outputs sort(1)ed entries from xt_recent, and shows the state of the firewall. Here is the full warn entry: Jan 12 00:15:00 kernel: [36690.023028] Oops: [#1] SMP PTI Jan 12 00:15:00 kernel: [36690.024368] CPU: 0 PID: 3708 Comm: iptables Not tainted 4.19.14-0-vanilla #1-Alpine Jan 12 00:15:00 kernel: [36690.025679] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 Jan 12 00:15:00 kernel: [36690.027056] RIP: 0010:nf_conncount_cache_free+0x26/0x2f [nf_conncount] Jan 12 00:15:00 kernel: [36690.028459] Code: 19 9f c5 ed 66 66 66 66 90 55 53 48 8b 77 08 48 8d 5f 08 48 8b 2e 48 39 de 74 15 48 8b 3d 05 20 00 00 e8 f6 9e c5 ed 48 89 ee <48> 8b 6d 00 eb e6 5b 5d c3 66 66 66 66 90 41 55 41 54 55 53 83 7f Jan 12 00:15:00 kernel: [36690.030482] RSP: 0018:b5f5c055fd28 EFLAGS: 00010202 Jan 12 00:15:00 kernel: [36690.031411] RAX: 9c3bf96b4c01 RBX: 9c3bf7c7e680 RCX: 802e002c Jan 12 00:15:00 kernel: [36690.032376] RDX: 802e002d RSI: RDI: 9c3bfbc17080 Jan 12 00:15:00 kernel: [36690.033374] RBP: R08: 0001 R09: 9c3bfb5028a8 Jan 12 00:15:00 kernel: [36690.034372] R10: 0401 R11: b5f5c0374001 R12: 9c3bf7c7e660 Jan 12 00:15:00 kernel: [36690.035374] R13: 9c3bf5eb1808 R14: aeea3e40 R15: 9c3bf96fa118 Jan 12 00:15:00 kernel: [36690.036406] FS: 7f9ca8e25b68() GS:9c3bfc20() knlGS: Jan 12 00:15:00 kernel: [36690.037472] CS: 0010 DS: ES: CR0: 80050033 Jan 12 00:15:00 kernel: [36690.038532] CR2: CR3: 79ae CR4: 06b0 Jan 12 00:15:00 kernel: [36690.039628] DR0: DR1: DR2: Jan 12 00:15:00 kernel: [36690.040724] DR3: DR6: fffe0ff0 DR7: 0400 Jan 12 00:15:00 kernel: [36690.041791] Call Trace: Jan 12 00:15:00 kernel: [36690.042919] nf_conncount_destroy+0x5a/0x82 [nf_conncount] Jan 12 00:15:00 kernel: [36690.044035] cleanup_match+0x45/0x6d [ip_tables] Jan 12 00:15:00 kernel: [36690.045175] cleanup_entry+0x3e/0xa8 [ip_tables] Jan 12 00:15:00 kernel: [36690.046305] __do_replace+0x171/0x203 [ip_tables] Jan 12 00:15:00 kernel: [36690.047421] do_ipt_set_ctl+0x133/0x195 [ip_tables] Jan 12 00:15:00 kernel: [36690.048643] nf_setsockopt+0x4b/0x64 Jan 12 00:15:00 kernel: [36690.049809] __sys_setsockopt+0x8b/0xc1 Jan 12 00:15:00 kernel: [36690.050957] __x64_sys_setsockopt+0x20/0x23 Jan 12 00:15:00 kernel: [36690.052110] do_syscall_64+0x55/0xe4 Jan 12 00:15:00 kernel: [36690.053329] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Jan 12 00:15:00 kernel: [36690.054516] RIP: 0033:0x7f9ca8dd08d3 Jan 12 00:15:00 kernel: [36690.055666] Code: 83 c4 18 48 89 c7 e9 0c 3b fe ff c3 c3 31 c0 c3 49 89 ca 50 48 63 d2 48 63 f6 48 63 ff 45 89 c0 45 31 c9 b8 36 00 00 00 0f 05 <48> 89 c7 e8 e5 3a fe ff 5a c3 48 63 f6 50 4
[PATCH 1/3] autofs: drop dentry reference only when it is never used
From: Pan Bian The function autofs_expire_run calls dput(dentry) to drop the reference count of dentry. However, dentry is read via autofs_dentry_ino(dentry) after that. This may result in a use-free-bug. The patch drops the reference count of dentry only when it is never used. Signed-off-by: Pan Bian Acked-by: Ian Kent --- fs/autofs/expire.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/autofs/expire.c b/fs/autofs/expire.c index d441244b79df..28d9c2b1b3bb 100644 --- a/fs/autofs/expire.c +++ b/fs/autofs/expire.c @@ -596,7 +596,6 @@ int autofs_expire_run(struct super_block *sb, pkt.len = dentry->d_name.len; memcpy(pkt.name, dentry->d_name.name, pkt.len); pkt.name[pkt.len] = '\0'; - dput(dentry); if (copy_to_user(pkt_p, &pkt, sizeof(struct autofs_packet_expire))) ret = -EFAULT; @@ -609,6 +608,8 @@ int autofs_expire_run(struct super_block *sb, complete_all(&ino->expire_complete); spin_unlock(&sbi->fs_lock); + dput(dentry); + return ret; }
[PATCH] binderfs: handle !CONFIG_IPC_NS builds
kbuild reported a build faile in [1]. This is triggered when CONFIG_IPC_NS is not set. So let's make the use of init_ipc_ns conditional on CONFIG_IPC_NS being set. [1]: https://lists.01.org/pipermail/kbuild-all/2019-January/056903.html Signed-off-by: Christian Brauner --- drivers/android/binderfs.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/android/binderfs.c b/drivers/android/binderfs.c index ad3ad2f7f9f4..9518e2e7da05 100644 --- a/drivers/android/binderfs.c +++ b/drivers/android/binderfs.c @@ -129,7 +129,11 @@ static int binderfs_binder_device_create(struct inode *ref_inode, struct inode *inode = NULL; struct super_block *sb = ref_inode->i_sb; struct binderfs_info *info = sb->s_fs_info; +#if defined(CONFIG_IPC_NS) bool use_reserve = (info->ipc_ns == &init_ipc_ns); +#else + bool use_reserve = true; +#endif /* Reserve new minor number for the new device. */ mutex_lock(&binderfs_minors_mutex); -- 2.19.1
[PATCH v3] RISC-V: defconfig: Enable Generic PCIE by default
Enable generic PCIe by default in the RISC-V defconfig, this allows us to use QEMU's PCIe support out of the box. Signed-off-by: Alistair Francis --- arch/riscv/configs/defconfig | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/riscv/configs/defconfig b/arch/riscv/configs/defconfig index f399659d3b8d..18b50bf41b67 100644 --- a/arch/riscv/configs/defconfig +++ b/arch/riscv/configs/defconfig @@ -14,6 +14,8 @@ CONFIG_EXPERT=y CONFIG_BPF_SYSCALL=y CONFIG_SMP=y CONFIG_PCI=y +CONFIG_PCIEPORTBUS=y +CONFIG_PCI_HOST_GENERIC=y CONFIG_PCIE_XILINX=y CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y @@ -46,7 +48,6 @@ CONFIG_INPUT_MOUSEDEV=y CONFIG_SERIAL_8250=y CONFIG_SERIAL_8250_CONSOLE=y CONFIG_SERIAL_OF_PLATFORM=y -CONFIG_SERIAL_EARLYCON_RISCV_SBI=y CONFIG_HVC_RISCV_SBI=y # CONFIG_PTP_1588_CLOCK is not set CONFIG_DRM=y @@ -63,7 +64,6 @@ CONFIG_USB_STORAGE=y CONFIG_USB_UAS=y CONFIG_VIRTIO_MMIO=y CONFIG_SIFIVE_PLIC=y -CONFIG_RAS=y CONFIG_EXT4_FS=y CONFIG_EXT4_FS_POSIX_ACL=y CONFIG_AUTOFS4_FS=y @@ -77,5 +77,6 @@ CONFIG_NFS_V4_1=y CONFIG_NFS_V4_2=y CONFIG_ROOT_NFS=y CONFIG_CRYPTO_USER_API_HASH=y +CONFIG_CRYPTO_DEV_VIRTIO=y CONFIG_PRINTK_TIME=y # CONFIG_RCU_TRACE is not set -- 2.19.1
Re: [PATCH] drm/msm: Fix A6XX support for opp-level
On Fri, Jan 11, 2019 at 02:27:21PM -0800, Douglas Anderson wrote: > The bindings for Qualcomm opp levels changed after being Acked but > before landing. Thus the code in the GPU that was relying on the old > bindings is now broken. > > While we could just change the string 'qcom,level' to the string > 'opp-level', it actually seems better to use the newly-introduced > dev_pm_opp_get_level(). > > This patch thus has a hard dependency on the outstanding patch ("OPP: > Add support for parsing the 'opp-level' property") and will need to > land in a tree that contains that patch. > > This patch needs to land before the patch ("arm64: dts: sdm845: Add > gpu and gmu device nodes") since if a tree contains the device tree > patch but not this one you'll get a crash at bootup. > > Signed-off-by: Douglas Anderson Reviewed-by: Jordan Crouse > --- > > drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 17 ++--- > 1 file changed, 6 insertions(+), 11 deletions(-) > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c > b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c > index 5beb83d1cf87..900f18dc1577 100644 > --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c > @@ -928,25 +928,20 @@ static int a6xx_gmu_memory_probe(struct a6xx_gmu *gmu) > } > > /* Return the 'arc-level' for the given frequency */ > -static u32 a6xx_gmu_get_arc_level(struct device *dev, unsigned long freq) > +static unsigned int a6xx_gmu_get_arc_level(struct device *dev, > +unsigned long freq) > { > struct dev_pm_opp *opp; > - struct device_node *np; > - u32 val = 0; > + unsigned int val; > > if (!freq) > return 0; > > - opp = dev_pm_opp_find_freq_exact(dev, freq, true); > + opp = dev_pm_opp_find_freq_exact(dev, freq, true); > if (IS_ERR(opp)) > return 0; > > - np = dev_pm_opp_get_of_node(opp); > - > - if (np) { > - of_property_read_u32(np, "qcom,level", &val); > - of_node_put(np); > - } > + val = dev_pm_opp_get_level(opp); > > dev_pm_opp_put(opp); > > @@ -982,7 +977,7 @@ static int a6xx_gmu_rpmh_arc_votes_init(struct device > *dev, u32 *votes, > /* Construct a vote for each frequency */ > for (i = 0; i < freqs_count; i++) { > u8 pindex = 0, sindex = 0; > - u32 level = a6xx_gmu_get_arc_level(dev, freqs[i]); > + unsigned int level = a6xx_gmu_get_arc_level(dev, freqs[i]); > > /* Get the primary index that matches the arc level */ > for (j = 0; j < pri_count; j++) { > -- > 2.20.1.97.g81188d93c3-goog > -- The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Re: [PATCH INTERNAL V3 0/3] Add support for PWM Configure and stablize for PWM kona
On 1/11/2019 3:58 PM, Ray Jui wrote: > Please remove 'INTERNAL' in all of your patches and cover letter. > Never mind, saw your next patchset with this issue fixed > On 1/10/2019 9:08 PM, Sheetal Tigadoli wrote: >> Hi, >> This patchset contain support to make PWM changes configure >> and stablize >> Following are brief changes done >> a. Add support for version2 compatible string >> b. Change PWM config and stablize delay in PWM Kona >> >> Praveen Kumar B (3): >> dt-bindings: pwm: kona: Add new compatible for new version pwm-kona >> drivers: pwm: pwm-bcm-kona: Add pwm-kona-v2 support >> ARM: dts: cygnus: Change pwm compatible to new version >> >> .../devicetree/bindings/pwm/brcm,kona-pwm.txt | 2 +- >> arch/arm/boot/dts/bcm-cygnus.dtsi | 2 +- >> drivers/pwm/pwm-bcm-kona.c | 128 >> - >> 3 files changed, 100 insertions(+), 32 deletions(-) >>
Re: [PATCH INTERNAL V3 0/3] Add support for PWM Configure and stablize for PWM kona
Please remove 'INTERNAL' in all of your patches and cover letter. On 1/10/2019 9:08 PM, Sheetal Tigadoli wrote: > Hi, > This patchset contain support to make PWM changes configure > and stablize > Following are brief changes done > a. Add support for version2 compatible string > b. Change PWM config and stablize delay in PWM Kona > > Praveen Kumar B (3): > dt-bindings: pwm: kona: Add new compatible for new version pwm-kona > drivers: pwm: pwm-bcm-kona: Add pwm-kona-v2 support > ARM: dts: cygnus: Change pwm compatible to new version > > .../devicetree/bindings/pwm/brcm,kona-pwm.txt | 2 +- > arch/arm/boot/dts/bcm-cygnus.dtsi | 2 +- > drivers/pwm/pwm-bcm-kona.c | 128 > - > 3 files changed, 100 insertions(+), 32 deletions(-) >