Re: [PATCHv9 1/3] rdmacg: Added rdma cgroup controller
On 03/03/2016 04:49, Parav Pandit wrote: > Hi Tejun, Haggai, > > On Thu, Mar 3, 2016 at 1:28 AM, Parav Panditwrote: + rpool->refcnt--; + if (rpool->refcnt == 0 && rpool->num_max_cnt == pool_info->table_len) { >>> >>> If the caller charges 2 and then uncharges 1 two times, the refcnt >>> underflows? Why not just track how many usages are zero? >>> >> This is certainly must fix bug. Changed refcnt to usage_sum and changed to do >> usage_sum -= num during uncharging >> and >> usage_sum += num during charing. > > This is not sufficient as css_get() and put are done only once per > call, which leads to similar problem as of refcnt. Are css_get_many() and css_put_many() relevant here? > As I think more, I realised that this particular test is missing that > resulted in this related bug, I realize that we don't have use case to > have "num" field from the IB stack side. > For bulk free IB stack will have to keep track of different or same > rdmacg returned values to call uncharge() with right number of > resources, all of that complexity just doesn't make sense and not > required. > So as first step to further simplify this, I am removing "num" input > field from charge and uncharge API. IIRC there are no instances in the RDMA subsystem today where userspace allocates more than one resource at a time. Yishai, in your proposed RSS patchset did you have a verb to allocate multiple work queues at once? Haggai -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv9 1/3] rdmacg: Added rdma cgroup controller
On Thu, Mar 3, 2016 at 1:28 AM, Parav Panditwrote: > On Wed, Mar 2, 2016 at 11:09 PM, Tejun Heo wrote: >> Nothing seems to prevent @cg from going away if this races with >> @current being migrated to a different cgroup. Have you run this with >> lockdep and rcu debugging enabled? This should have triggered a >> warning. I am able to reproduce this race. Looking into how to address it. -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v6 01/11] drm/hisilicon: Add device tree binding for hi6220 display subsystem
Hi, On 3 March 2016 at 02:29, Rob Herringwrote: > On Fri, Feb 26, 2016 at 04:40:18PM +0800, Xinliang Liu wrote: >> Add ADE display controller binding doc. >> Add DesignWare DSI Host Controller v1.20a binding doc. >> >> v6: >> - Cleanup values part of reg and clocks properties. >> - Change "pclk_dsi" clock name to "pclk". >> v5: >> - Remove endpoint unit address of dsi output port. >> - Add "hisilicon,noc-syscon" property for ADE NOC QoS syscon. >> - Add "resets" property for ADE reset. >> v4: >> - Describe more specific of clocks and ports. >> - Fix indentation. >> v3: >> - Make ade as the drm master node. >> - Use assigned-clocks to set clock rate. >> - Use ports to connect display relavant nodes. >> v2: >> - Move dt binding docs to bindings/display/hisilicon directory. >> >> Signed-off-by: Xinliang Liu >> --- >> .../bindings/display/hisilicon/dw-dsi.txt | 72 >> ++ >> .../bindings/display/hisilicon/hisi-ade.txt| 64 +++ >> 2 files changed, 136 insertions(+) >> create mode 100644 >> Documentation/devicetree/bindings/display/hisilicon/dw-dsi.txt >> create mode 100644 >> Documentation/devicetree/bindings/display/hisilicon/hisi-ade.txt > > Acked-by: Rob Herring Thanks :-) -xinliang -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] sparc64: Add support for Application Data Integrity (ADI)
Hi Khalid, On Thu, Mar 3, 2016 at 7:39 AM, Khalid Azizwrote: > > Enable Application Data Integrity (ADI) support in the sparc > kernel for applications to use ADI in userspace. ADI is a new > feature supported on sparc M7 and newer processors. ADI is supported > for data fetches only and not instruction fetches. This patch adds > prctl commands to enable and disable ADI (TSTATE.mcde), return ADI > parameters to userspace, enable/disable MCD (Memory Corruption > Detection) on selected memory ranges and enable TTE.mcd in PTEs. It > also adds handlers for all traps related to MCD. ADI is not enabled > by default for any task and a task must explicitly enable ADI > (TSTATE.mcde), turn MCD on on a memory range and set version tag > for ADI to be effective for the task. This patch adds support for > ADI for hugepages only. Addresses passed into system calls must be > non-ADI tagged addresses. I can't comment on the actual functionality here, but I do see a few minor style issues in your patch. My big concern is that you're defining a lot of new code that is ADI specific but isn't inside a CONFIG_SPARC_ADI ifdef. (That said, handling ADI specific traps if ADI isn't enabled looks like a good idea to me, however most of the other stuff is just dead code if CONFIG_SPARC_ADI isn't enabled.) > Signed-off-by: Khalid Aziz > --- > NOTES: ADI is a new feature added to M7 processor to allow hardware > to catch rogue accesses to memory. An app can enable ADI on > its data pages, set version tags on them and use versioned > addresses (bits 63-60 of the address contain a version tag) > to access the data pages. If a rogue app attempts to access > ADI enabled data pages, its access is blocked and processor > generates an exception. Enabling this functionality for all > data pages of an app requires adding infrastructure to save > version tags for any data pages that get swapped out and > restoring those tags when pages are swapped back in. In this > first implementation I am enabling ADI for hugepages only > since these pages are locked in memory and hence avoid the > issue of saving and restoring tags. Once this core functionality > is stable, ADI for other memory pages can be enabled more > easily. > > v2: > - Fixed a build error > > Documentation/prctl/sparc_adi.txt | 62 ++ > Documentation/sparc/adi.txt | 206 +++ > arch/sparc/Kconfig| 12 ++ > arch/sparc/include/asm/hugetlb.h | 14 +++ > arch/sparc/include/asm/hypervisor.h | 2 + > arch/sparc/include/asm/mmu_64.h | 1 + > arch/sparc/include/asm/pgtable_64.h | 15 +++ > arch/sparc/include/asm/processor_64.h | 19 +++ > arch/sparc/include/asm/ttable.h | 10 ++ > arch/sparc/include/uapi/asm/asi.h | 3 + > arch/sparc/include/uapi/asm/pstate.h | 10 ++ > arch/sparc/kernel/entry.h | 3 + > arch/sparc/kernel/head_64.S | 1 + > arch/sparc/kernel/mdesc.c | 81 + > arch/sparc/kernel/process_64.c| 222 > ++ > arch/sparc/kernel/sun4v_mcd.S | 16 +++ > arch/sparc/kernel/traps_64.c | 96 ++- > arch/sparc/kernel/ttable_64.S | 6 +- > include/linux/mm.h| 2 + > include/uapi/asm-generic/siginfo.h| 5 +- > include/uapi/linux/prctl.h| 16 +++ > kernel/sys.c | 30 + > 22 files changed, 826 insertions(+), 6 deletions(-) > create mode 100644 Documentation/prctl/sparc_adi.txt > create mode 100644 Documentation/sparc/adi.txt > create mode 100644 arch/sparc/kernel/sun4v_mcd.S I must admit that I'm slightly impressed that the documentation is over a quarter of the lines added. =) > diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig > index 56442d2..0aac0ae 100644 > --- a/arch/sparc/Kconfig > +++ b/arch/sparc/Kconfig > @@ -80,6 +80,7 @@ config SPARC64 > select NO_BOOTMEM > select HAVE_ARCH_AUDITSYSCALL > select ARCH_SUPPORTS_ATOMIC_RMW > + select SPARC_ADI This doesn't look right. > config ARCH_DEFCONFIG > string > @@ -314,6 +315,17 @@ if SPARC64 > source "kernel/power/Kconfig" > endif > > +config SPARC_ADI > + bool "Application Data Integrity support" > + def_bool y if SPARC64 def_bool is for config options without names (i.e. "this is a boolean value and it's default is...") So if you want people to be able to disable this option, then you should remove the select above and just have: bool "Application Data Integrity support" default y if SPARC64 If you don't want people disabling it, then there's no point in having a separate Kconfig symbol. > + help > + Support for Application Data Integrity (ADI). ADI feature allows > + a
Re: [PATCH v10 07/12] task_isolation: add debug boot flag
On 3/2/2016 3:37 PM, Peter Zijlstra wrote: On Wed, Mar 02, 2016 at 03:09:31PM -0500, Chris Metcalf wrote: +void task_isolation_debug(int cpu) +{ + struct task_struct *p; + + if (!task_isolation_possible(cpu)) + return; + + rcu_read_lock(); + p = cpu_curr(cpu); + get_task_struct(p); As I think Oleg keeps reminding me, this is not actually a safe thing to do. So what's the right solution? The fast path in task_isolation_debug_task basically just uses the new "task_isolation_flags", and "pid" and "comm". I would think those would all have to be safe because of the get_task_struct(). The piece that might be problematic is the eventual call to send_sig_info() using the task_struct pointer (called via task_isolation_debug_task -> task_isolation_interrupt). Clearly this is safe at some level, since that's more or less what sys_kill() does and the process could similarly evaporate half way through sending the signal. Suggestions? Thanks! -- Chris Metcalf, Mellanox Technologies http://www.mellanox.com -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v10 07/12] task_isolation: add debug boot flag
The new "task_isolation_debug" flag simplifies debugging of TASK_ISOLATION kernels when processes are running in PR_TASK_ISOLATION_ENABLE mode. Such processes should get no interrupts from the kernel, and if they do, when this boot flag is specified a kernel stack dump on the console is generated. It's possible to use ftrace to simply detect whether a task_isolation core has unexpectedly entered the kernel. But what this boot flag does is allow the kernel to provide better diagnostics, e.g. by reporting in the IPI-generating code what remote core and context is preparing to deliver an interrupt to a task_isolation core. It may be worth considering other ways to generate useful debugging output rather than console spew, but for now that is simple and direct. Signed-off-by: Chris Metcalf--- Documentation/kernel-parameters.txt| 8 include/linux/context_tracking_state.h | 6 +++ include/linux/isolation.h | 5 +++ kernel/irq_work.c | 5 ++- kernel/isolation.c | 77 ++ kernel/sched/core.c| 18 kernel/signal.c| 5 +++ kernel/smp.c | 6 ++- kernel/softirq.c | 33 +++ 9 files changed, 161 insertions(+), 2 deletions(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index c8d0b42d984a..ea0434fa906e 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -3755,6 +3755,14 @@ bytes respectively. Such letter suffixes can also be entirely omitted. also sets up nohz_full and isolcpus mode for the listed set of cpus. + task_isolation_debug[KNL] + In kernels built with CONFIG_TASK_ISOLATION + and booted in task_isolation= mode, this + setting will generate console backtraces when + the kernel is about to interrupt a task that + has requested PR_TASK_ISOLATION_ENABLE and is + running on a task_isolation core. + tcpmhash_entries= [KNL,NET] Set the number of tcp_metrics_hash slots. Default value is 8192 or 16384 depending on total diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h index 1d34fe68f48a..4e2c4b900b82 100644 --- a/include/linux/context_tracking_state.h +++ b/include/linux/context_tracking_state.h @@ -39,8 +39,14 @@ static inline bool context_tracking_in_user(void) { return __this_cpu_read(context_tracking.state) == CONTEXT_USER; } + +static inline bool context_tracking_cpu_in_user(int cpu) +{ + return per_cpu(context_tracking.state, cpu) == CONTEXT_USER; +} #else static inline bool context_tracking_in_user(void) { return false; } +static inline bool context_tracking_cpu_in_user(int cpu) { return false; } static inline bool context_tracking_active(void) { return false; } static inline bool context_tracking_is_enabled(void) { return false; } static inline bool context_tracking_cpu_is_enabled(void) { return false; } diff --git a/include/linux/isolation.h b/include/linux/isolation.h index ba6c4d510db8..f1ae7b663746 100644 --- a/include/linux/isolation.h +++ b/include/linux/isolation.h @@ -45,6 +45,9 @@ static inline void task_isolation_enter(void) extern bool task_isolation_syscall(int nr); extern void task_isolation_exception(const char *fmt, ...); extern void task_isolation_interrupt(struct task_struct *, const char *buf); +extern void task_isolation_debug(int cpu); +extern void task_isolation_debug_cpumask(const struct cpumask *); +extern void task_isolation_debug_task(int cpu, struct task_struct *p); static inline bool task_isolation_strict(void) { @@ -73,6 +76,8 @@ static inline bool task_isolation_ready(void) { return true; } static inline void task_isolation_enter(void) { } static inline bool task_isolation_check_syscall(int nr) { return false; } static inline void task_isolation_check_exception(const char *fmt, ...) { } +static inline void task_isolation_debug(int cpu) { } +#define task_isolation_debug_cpumask(mask) do {} while (0) #endif #endif diff --git a/kernel/irq_work.c b/kernel/irq_work.c index bcf107ce0854..a9b95ce00667 100644 --- a/kernel/irq_work.c +++ b/kernel/irq_work.c @@ -17,6 +17,7 @@ #include #include #include +#include #include @@ -75,8 +76,10 @@ bool irq_work_queue_on(struct irq_work *work, int cpu) if (!irq_work_claim(work)) return false; - if (llist_add(>llnode, _cpu(raised_list, cpu))) + if (llist_add(>llnode, _cpu(raised_list, cpu))) { + task_isolation_debug(cpu); arch_send_call_function_single_ipi(cpu); + } return true; } diff --git a/kernel/isolation.c
Re: [PATCH] sparc64: Add support for Application Data Integrity (ADI)
On 03/02/2016 01:26 PM, kbuild test robot wrote: Hi Khalid, [auto build test ERROR on sparc/master] [also build test ERROR on v4.5-rc6] [cannot apply to next-20160302] [if your patch is applied to the wrong git tree, please drop us a note to help improving the system] url: https://github.com/0day-ci/linux/commits/Khalid-Aziz/sparc64-Add-support-for-Application-Data-Integrity-ADI/20160303-025709 base: https://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc.git master config: sparc64-allnoconfig (attached as .config) reproduce: wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=sparc64 All errors (new ones prefixed by >>): arch/sparc/kernel/process_64.c: In function 'disable_sparc_adi': arch/sparc/kernel/process_64.c:961:6: error: implicit declaration of function 'vma_policy' [-Werror=implicit-function-declaration] vma_policy(vma), vma->vm_userfaultfd_ctx); ^ arch/sparc/kernel/process_64.c:959:10: error: passing argument 9 of 'vma_merge' makes pointer from integer without a cast [-Werror] prev = vma_merge(mm, prev, addr, end, vma->vm_flags, ^ In file included from arch/sparc/kernel/process_64.c:18:0: include/linux/mm.h:1922:31: note: expected 'struct mempolicy *' but argument is of type 'int' extern struct vm_area_struct *vma_merge(struct mm_struct *, ^ Not sure why it built without errors on my system. I will #include and send updated patch. -- Khalid -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v10 07/12] task_isolation: add debug boot flag
On Wed, Mar 02, 2016 at 03:09:31PM -0500, Chris Metcalf wrote: > +void task_isolation_debug(int cpu) > +{ > + struct task_struct *p; > + > + if (!task_isolation_possible(cpu)) > + return; > + > + rcu_read_lock(); > + p = cpu_curr(cpu); > + get_task_struct(p); As I think Oleg keeps reminding me, this is not actually a safe thing to do. > + rcu_read_unlock(); > + task_isolation_debug_task(cpu, p); > + put_task_struct(p); > +} -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v10 05/12] task_isolation: support CONFIG_TASK_ISOLATION_ALL
This option, similar to NO_HZ_FULL_ALL, simplifies configuring a system to boot by default with all cores except the boot core running in task isolation mode. --- init/Kconfig | 10 ++ kernel/isolation.c | 6 ++ 2 files changed, 16 insertions(+) diff --git a/init/Kconfig b/init/Kconfig index 6cab348fe454..314b09347fba 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -802,6 +802,16 @@ config TASK_ISOLATION You should say "N" unless you are intending to run a high-performance userspace driver or similar task. +config TASK_ISOLATION_ALL + bool "Provide task isolation on all CPUs by default (except CPU 0)" + depends on TASK_ISOLATION + help +If the user doesn't pass the task_isolation boot option to +define the range of task isolation CPUs, consider that all +CPUs in the system are task isolation by default. +Note the boot CPU will still be kept outside the range to +handle timekeeping duty, etc. + config BUILD_BIN2C bool default n diff --git a/kernel/isolation.c b/kernel/isolation.c index e954afd8cce8..42ad7a746a1e 100644 --- a/kernel/isolation.c +++ b/kernel/isolation.c @@ -40,8 +40,14 @@ int __init task_isolation_init(void) { /* For offstack cpumask, ensure we allocate an empty cpumask early. */ if (!saw_boot_arg) { +#ifdef CONFIG_TASK_ISOLATION_ALL + alloc_cpumask_var(_isolation_map, GFP_KERNEL); + cpumask_copy(task_isolation_map, cpu_possible_mask); + cpumask_clear_cpu(smp_processor_id(), task_isolation_map); +#else zalloc_cpumask_var(_isolation_map, GFP_KERNEL); return 0; +#endif } /* -- 2.1.2 -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] sparc64: Add support for Application Data Integrity (ADI)
Hi Khalid, [auto build test ERROR on sparc/master] [also build test ERROR on v4.5-rc6] [cannot apply to next-20160302] [if your patch is applied to the wrong git tree, please drop us a note to help improving the system] url: https://github.com/0day-ci/linux/commits/Khalid-Aziz/sparc64-Add-support-for-Application-Data-Integrity-ADI/20160303-025709 base: https://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc.git master config: sparc64-allnoconfig (attached as .config) reproduce: wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=sparc64 All errors (new ones prefixed by >>): arch/sparc/kernel/process_64.c: In function 'disable_sparc_adi': >> arch/sparc/kernel/process_64.c:961:6: error: implicit declaration of >> function 'vma_policy' [-Werror=implicit-function-declaration] vma_policy(vma), vma->vm_userfaultfd_ctx); ^ arch/sparc/kernel/process_64.c:959:10: error: passing argument 9 of 'vma_merge' makes pointer from integer without a cast [-Werror] prev = vma_merge(mm, prev, addr, end, vma->vm_flags, ^ In file included from arch/sparc/kernel/process_64.c:18:0: include/linux/mm.h:1922:31: note: expected 'struct mempolicy *' but argument is of type 'int' extern struct vm_area_struct *vma_merge(struct mm_struct *, ^ cc1: all warnings being treated as errors vim +/vma_policy +961 arch/sparc/kernel/process_64.c 955 /* Update the ADI info in vma and check if this vma can 956 * be merged with adjacent ones 957 */ 958 pgoff = vma->vm_pgoff + ((addr - vma->vm_start) >> PAGE_SHIFT); 959 prev = vma_merge(mm, prev, addr, end, vma->vm_flags, 960 vma->anon_vma, vma->vm_file, pgoff, > 961 vma_policy(vma), > vma->vm_userfaultfd_ctx); 962 if (prev) 963 vma = prev; 964 --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: Binary data
[PATCH v10 06/12] task_isolation: support PR_TASK_ISOLATION_STRICT mode
With task_isolation mode, the task is in principle guaranteed not to be interrupted by the kernel, but only if it behaves. In particular, if it enters the kernel via system call, page fault, or any of a number of other synchronous traps, it may be unexpectedly exposed to long latencies. Add a simple flag that puts the process into a state where any such kernel entry is fatal; this is defined as happening immediately before the SECCOMP test. By default, the task is signalled with SIGKILL, but we add prctl() bits to support requesting a specific signal instead. To allow the state to be entered and exited, we ignore the prctl() syscall so that we can clear the bit again later, and we ignore exit/exit_group to allow exiting the task without a pointless signal killing you as you try to do so. Signed-off-by: Chris Metcalf--- include/linux/isolation.h | 25 +++ include/uapi/linux/prctl.h | 3 +++ kernel/isolation.c | 60 ++ 3 files changed, 88 insertions(+) diff --git a/include/linux/isolation.h b/include/linux/isolation.h index c564cf1886bb..ba6c4d510db8 100644 --- a/include/linux/isolation.h +++ b/include/linux/isolation.h @@ -42,12 +42,37 @@ static inline void task_isolation_enter(void) _task_isolation_enter(); } +extern bool task_isolation_syscall(int nr); +extern void task_isolation_exception(const char *fmt, ...); +extern void task_isolation_interrupt(struct task_struct *, const char *buf); + +static inline bool task_isolation_strict(void) +{ + return ((current->task_isolation_flags & +(PR_TASK_ISOLATION_ENABLE | PR_TASK_ISOLATION_STRICT)) == + (PR_TASK_ISOLATION_ENABLE | PR_TASK_ISOLATION_STRICT)) && + task_isolation_possible(raw_smp_processor_id()); +} + +static inline bool task_isolation_check_syscall(int nr) +{ + return task_isolation_strict() && task_isolation_syscall(nr); +} + +#define task_isolation_check_exception(fmt, ...) \ + do {\ + if (task_isolation_strict())\ + task_isolation_exception(fmt, ## __VA_ARGS__); \ + } while (0) + #else static inline void task_isolation_init(void) { } static inline bool task_isolation_possible(int cpu) { return false; } static inline bool task_isolation_enabled(void) { return false; } static inline bool task_isolation_ready(void) { return true; } static inline void task_isolation_enter(void) { } +static inline bool task_isolation_check_syscall(int nr) { return false; } +static inline void task_isolation_check_exception(const char *fmt, ...) { } #endif #endif diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h index 67224df4b559..a5582ace987f 100644 --- a/include/uapi/linux/prctl.h +++ b/include/uapi/linux/prctl.h @@ -201,5 +201,8 @@ struct prctl_mm_map { #define PR_SET_TASK_ISOLATION 48 #define PR_GET_TASK_ISOLATION 49 # define PR_TASK_ISOLATION_ENABLE (1 << 0) +# define PR_TASK_ISOLATION_STRICT (1 << 1) +# define PR_TASK_ISOLATION_SET_SIG(sig)(((sig) & 0x7f) << 8) +# define PR_TASK_ISOLATION_GET_SIG(bits) (((bits) >> 8) & 0x7f) #endif /* _LINUX_PRCTL_H */ diff --git a/kernel/isolation.c b/kernel/isolation.c index 42ad7a746a1e..5621fdf15b17 100644 --- a/kernel/isolation.c +++ b/kernel/isolation.c @@ -11,6 +11,7 @@ #include #include #include +#include #include "time/tick-sched.h" cpumask_var_t task_isolation_map; @@ -122,3 +123,62 @@ void _task_isolation_enter(void) if (!tick_nohz_tick_stopped()) set_tsk_need_resched(current); } + +void task_isolation_interrupt(struct task_struct *task, const char *buf) +{ + siginfo_t info = {}; + int sig; + + pr_warn("%s/%d: task_isolation strict mode violated by %s\n", + task->comm, task->pid, buf); + + /* +* Turn off task isolation mode entirely to avoid spamming +* the process with signals. It can re-enable task isolation +* mode in the signal handler if it wants to. +*/ + task->task_isolation_flags = 0; + + sig = PR_TASK_ISOLATION_GET_SIG(task->task_isolation_flags); + if (sig == 0) + sig = SIGKILL; + info.si_signo = sig; + send_sig_info(sig, , task); +} + +/* + * This routine is called from any userspace exception if the _STRICT + * flag is set. + */ +void task_isolation_exception(const char *fmt, ...) +{ + va_list args; + char buf[100]; + + /* RCU should have been enabled prior to this point. */ + RCU_LOCKDEP_WARN(!rcu_is_watching(), "kernel entry without RCU"); + + va_start(args, fmt); + vsnprintf(buf, sizeof(buf), fmt, args); + va_end(args); + + task_isolation_interrupt(current, buf); +} + +/* + * This routine is called from syscall
[PATCH v10 00/12] support "task_isolation" mode
Here is the latest version of the task-isolation patch set, adopting various suggestions made about the v9 patch series, including some feedback from testing on the new EZchip NPS ARC platform (although the new arch/arc support is not included in this patch series). All of the suggestions were relatively non-controversial. Perhaps we are getting close to being able to merge this. :-) Changes since v9: - task_isolation is now set up by adding its required cpus to both the nohz_full and isolcpus cpumasks. This allows users to separately specify all three flags, if so desired, and still get reasonably sane semantics. This is done with a new tick_nohz_full_add_cpus() method for nohz, and just directly updating the isolcpus cpumask. - We add a /sys/devices/system/cpu/task_isolation file to parallel the equivalent nohz_full file. (This should have been in v8 since once task_isolation isn't equivalent to nohz_full, it needs its own way to let userspace know where to run.) - We add a new Kconfig option, TASK_ISOLATION_ALL, which sets all but the boot processor to run in task isolation mode. This parallels the existing NO_HZ_FULL_ALL and works around the fact that you can't easily specify a boot argument with the desired semantics. - For task_isolation_debug, we add a check of the context_tracking state of the remote cpu before issuing a warning; if the remote cpu is actually in the kernel, we don't need to warn. - A cloned child of a task_isolation task is not enabled for task_isolation, since otherwise they would both fight over who could safely return to userspace without requiring scheduling interrupts. - The quiet_vmstat() function's semantics was changed since the v9 patch series, so I introduce a quiet_vmstat_sync() for isolation. - The lru_add_drain_needed() function is updated to know about the new lru_deactivate_pvecs variable. - The arm64 patch factoring assembly into C has been modified based on an earlier patch by Mark Rutland. - I simplified the enabling patch for arm64 by realizing we could just test TIF_NOHZ as the only bit for TIF_WORK_MASK for task isolation, so I didn't have to renumber all the TIF_xxx bits. - Small fixes to avoid preemption warnings. - Rebased on v4.5-rc5 For changes in earlier versions of the patch series, please see: http://lkml.kernel.org/r/1451936091-29247-1-git-send-email-cmetc...@ezchip.com A couple of the tile patches that refactored the context tracking code were taken into 4.5 so are no longer present in this series. This version of the patch series has been tested on arm64 and tile, and build-tested on x86. It remains true that the 1 Hz tick needs to be disabled for this patch series to be able to achieve its primary goal of enabling truly tick-free operation, but that is ongoing orthogonal work. The series is available at: git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile.git dataplane Chris Metcalf (12): vmstat: add quiet_vmstat_sync function vmstat: add vmstat_idle function lru_add_drain_all: factor out lru_add_drain_needed task_isolation: add initial support task_isolation: support CONFIG_TASK_ISOLATION_ALL task_isolation: support PR_TASK_ISOLATION_STRICT mode task_isolation: add debug boot flag arm, tile: turn off timer tick for oneshot_stopped state arch/x86: enable task isolation functionality arch/tile: enable task isolation functionality arm64: factor work_pending state machine to C arch/arm64: enable task isolation functionality Documentation/kernel-parameters.txt| 16 ++ arch/arm64/include/asm/thread_info.h | 8 +- arch/arm64/kernel/entry.S | 12 +- arch/arm64/kernel/ptrace.c | 12 +- arch/arm64/kernel/signal.c | 34 - arch/arm64/kernel/smp.c| 2 + arch/arm64/mm/fault.c | 4 + arch/tile/kernel/process.c | 6 +- arch/tile/kernel/ptrace.c | 6 + arch/tile/kernel/single_step.c | 5 + arch/tile/kernel/smp.c | 28 ++-- arch/tile/kernel/time.c| 1 + arch/tile/kernel/unaligned.c | 3 + arch/tile/mm/fault.c | 3 + arch/tile/mm/homecache.c | 2 + arch/x86/entry/common.c| 18 ++- arch/x86/kernel/traps.c| 2 + arch/x86/mm/fault.c| 2 + drivers/base/cpu.c | 18 +++ drivers/clocksource/arm_arch_timer.c | 2 + include/linux/context_tracking_state.h | 6 + include/linux/isolation.h | 83 +++ include/linux/sched.h | 3 + include/linux/swap.h | 1 + include/linux/tick.h | 1 + include/linux/vmstat.h | 4 + include/uapi/linux/prctl.h | 8 + init/Kconfig | 30 kernel/Makefile| 1 + kernel/fork.c
Re: [PATCHv9 1/3] rdmacg: Added rdma cgroup controller
On Wed, Mar 2, 2016 at 11:09 PM, Tejun Heowrote: > Hello, > >> +struct rdma_cgroup { >> + struct cgroup_subsys_state css; >> + >> + spinlock_t rpool_list_lock; /* protects resource pool list */ >> + struct list_head rpool_head;/* head to keep track of all resource >> + * pools that belongs to this cgroup. >> + */ > > I think we usually don't tail wing these comments. ok. I will put comments in separate line. > >> +}; >> + >> +struct rdmacg_pool_info { >> + const char **resource_name_table; >> + int table_len; > > Align fields consistently? I've said this multiple times now but > please make the available resources constant and document them in > Documentation/cgroup-v2.txt. o.k. I will align them. o.k. I will document the resource constants defined by IB stack in cgroup-v2.txt. >> + >> +/* APIs for RDMA/IB stack to publish when a device wants to >> + * participate in resource accounting >> + */ > > Please follow CodingStyle. Wasn't this pointed out a couple times > already? Yes. You did. I fixed at few places. I missed this out. Sorry. I am doing now. > >> +enum rdmacg_file_type { >> + RDMACG_RESOURCE_MAX, >> + RDMACG_RESOURCE_STAT, >> +}; > > Heh, usually not a good sign. If you're using this to call into a > common function and then switch out on the file type, just switch out > at the method level and factor out common part into helpers. > Methods for both the constants are same. Code changes between two file type is hardly 4 lines of code. So there is no need of additional helper functions. So in currently defined functions rdmacg_resource_read() and rdmacg_resource_set_max() works on file type. >> +/* resource tracker per resource for rdma cgroup */ >> +struct rdmacg_resource { >> + int max; >> + int usage; >> +}; > > Align fields? Above one seems to be aligned. Not sure what am I missing. I am aligning all the structures anyways. > >> +/** > > The above indicates kerneldoc comments, which this isn't. Fixed for this comment. > >> + * resource pool object which represents, per cgroup, per device >> + * resources. There are multiple instance >> + * of this object per cgroup, therefore it cannot be embedded within >> + * rdma_cgroup structure. It is maintained as list. > > Consistent paragraph fill? Fixed. > >> + */ >> +struct rdmacg_resource_pool { >> + struct list_head cg_list; >> + struct list_head dev_list; >> + >> + struct rdmacg_device *device; >> + struct rdmacg_resource *resources; >> + struct rdma_cgroup *cg; /* owner cg used during device cleanup */ >> + >> + int refcnt; /* count active user tasks of this pool */ >> + int num_max_cnt;/* total number counts which are set to max */ >> +}; > > Formatting. Aligning all the fields now in next patch. > >> +static inline void set_resource_limit(struct rdmacg_resource_pool *rpool, >> + int index, int new_max) > > Is inline necessary? Compiler can figure out these. Yes. Removed. > >> +static struct rdmacg_resource_pool* > ^ > space > >> +find_cg_rpool_locked(struct rdma_cgroup *cg, >> + struct rdmacg_device *device) > ... >> +static void uncharge_cg_resource(struct rdma_cgroup *cg, >> + struct rdmacg_device *device, >> + int index, int num) >> +{ > ... >> + rpool->refcnt--; >> + if (rpool->refcnt == 0 && rpool->num_max_cnt == pool_info->table_len) { > > If the caller charges 2 and then uncharges 1 two times, the refcnt > underflows? Why not just track how many usages are zero? > This is certainly must fix bug. Changed refcnt to usage_sum and changed to do usage_sum -= num during uncharging and usage_sum += num during charing. > ... >> +void rdmacg_uncharge(struct rdma_cgroup *cg, >> + struct rdmacg_device *device, >> + int index, int num) >> +{ >> + struct rdma_cgroup *p; >> + >> + for (p = cg; p; p = parent_rdmacg(p)) >> + uncharge_cg_resource(p, device, index, num); >> + >> + css_put(>css); >> +} >> +EXPORT_SYMBOL(rdmacg_uncharge); > > So, I suppose the code is trying to save lock contention with > fine-grained locking; Yes. > however, as the code is currently structured, > it's just increasing the number of lock ops and it'd be highly likely > to cheaper to simply use a single lock. Single lock per subsystem? I understood that you were ok to have per cgroup fine grain lock. > If you're working up the tree > grabbing lock at each level, per-node locking doesn't save you > anything. Also, it introduces conditions where charges are spuriously > denied when there are racing requestors. If scalability becomes an > issue, the right way to address is adding percpu front cache. > >> +void
Re: [PATCH v8 06/10] watchdog: dw_wdt: Convert to use watchdog infrastructure
On Tue, Mar 01, 2016 at 02:59:06PM -0800, Doug Anderson wrote: > Guenter, > > On Sun, Feb 28, 2016 at 1:12 PM, Guenter Roeckwrote: > > Convert driver to use watchdog infrastructure. This includes > > infrastructure support to handle watchdog keepalive if the watchdog > > is running while the watchdog device is closed. > > > > Signed-off-by: Guenter Roeck > > > > --- > > v8: max_hw_timeout_ms -> max_hw_heartbeat_ms > > Rebased to v4.5-rc5 > > v7: Set max_hw_timeout_ms > > Rebased to v4.5-rc1 > > v6: Added patch > > --- > > drivers/watchdog/Kconfig | 1 + > > drivers/watchdog/dw_wdt.c | 323 > > +- > > 2 files changed, 117 insertions(+), 207 deletions(-) > > I re-ran the same set of tests I ran against v7. They all still pass. > > Tested-by: Douglas Anderson Thanks a lot! Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv9 0/3] rdmacg: IB/core: rdma controller support
Hi Tejun, On Wed, Mar 2, 2016 at 11:10 PM, Tejun Heowrote: > Hello, Parav. > > It doesn't look like my reviews are getting through. For now, > I have addressed all the review comments that you provided in v5 patch. I admit that few comments have not followed CodingStyle and I owe to fix it, which is bad on my part. I have few 2 questions on your comments, I will ask there. For cgroup lock - out of 3 proposals, you acknowledged that you are ok with cgroup specific spin_lock, though you don't see that as big saving. Even though it can be made course gained by having single lock for the whole rdma cgroup subsystem it raises contention among processes of different cgroups, which is preferable to avoid. So I have continued to have cgroup specific fine grain lock, which I believe is ok as this is little efficient for first cut. Including above one we agreed on almost all the points at design and implementation level in previous patches. If you see any issue, I am open to resolve them. I will address comments given in patch v9. > Nacked-by: Tejun Heo > > Thanks. > > -- > tejun -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] isdn: i4l: move active-isdn drivers to staging
The icn, act2000 and pcbit drivers are all for very old hardware, and it is highly unlikely that anyone is actually still using them on modern kernels, if at all. All three drivers apparently are for hardware that predates PCI being the common connector, as they are ISA-only and active PCI ISDN cards were widely available in the 1990s. Looking through the git logs, it I cannot find any indication of a patch to any of these drivers that has been tested on real hardware, only cleanups or global API changes. Signed-off-by: Arnd Bergmann--- Documentation/isdn/00-INDEX | 8 drivers/isdn/Makefile | 3 --- drivers/isdn/i4l/Kconfig| 10 -- drivers/staging/Kconfig | 2 ++ drivers/staging/Makefile| 1 + .../staging/i4l/Documentation}/README.act2000 | 0 .../isdn => drivers/staging/i4l/Documentation}/README.icn | 0 .../isdn => drivers/staging/i4l/Documentation}/README.pcbit | 0 .../isdn => drivers/staging/i4l/Documentation}/README.sc| 0 drivers/staging/i4l/Kconfig | 13 + drivers/staging/i4l/Makefile| 5 + drivers/staging/i4l/TODO| 3 +++ drivers/{isdn => staging/i4l}/act2000/Kconfig | 0 drivers/{isdn => staging/i4l}/act2000/Makefile | 0 drivers/{isdn => staging/i4l}/act2000/act2000.h | 0 drivers/{isdn => staging/i4l}/act2000/act2000_isa.c | 0 drivers/{isdn => staging/i4l}/act2000/act2000_isa.h | 0 drivers/{isdn => staging/i4l}/act2000/capi.c| 0 drivers/{isdn => staging/i4l}/act2000/capi.h| 0 drivers/{isdn => staging/i4l}/act2000/module.c | 0 drivers/{isdn => staging/i4l}/icn/Kconfig | 0 drivers/{isdn => staging/i4l}/icn/Makefile | 0 drivers/{isdn => staging/i4l}/icn/icn.c | 0 drivers/{isdn => staging/i4l}/icn/icn.h | 0 drivers/{isdn => staging/i4l}/pcbit/Kconfig | 0 drivers/{isdn => staging/i4l}/pcbit/Makefile| 0 drivers/{isdn => staging/i4l}/pcbit/callbacks.c | 0 drivers/{isdn => staging/i4l}/pcbit/callbacks.h | 0 drivers/{isdn => staging/i4l}/pcbit/capi.c | 0 drivers/{isdn => staging/i4l}/pcbit/capi.h | 0 drivers/{isdn => staging/i4l}/pcbit/drv.c | 0 drivers/{isdn => staging/i4l}/pcbit/edss1.c | 0 drivers/{isdn => staging/i4l}/pcbit/edss1.h | 0 drivers/{isdn => staging/i4l}/pcbit/layer2.c| 0 drivers/{isdn => staging/i4l}/pcbit/layer2.h| 0 drivers/{isdn => staging/i4l}/pcbit/module.c| 0 drivers/{isdn => staging/i4l}/pcbit/pcbit.h | 0 37 files changed, 24 insertions(+), 21 deletions(-) rename {Documentation/isdn => drivers/staging/i4l/Documentation}/README.act2000 (100%) rename {Documentation/isdn => drivers/staging/i4l/Documentation}/README.icn (100%) rename {Documentation/isdn => drivers/staging/i4l/Documentation}/README.pcbit (100%) rename {Documentation/isdn => drivers/staging/i4l/Documentation}/README.sc (100%) create mode 100644 drivers/staging/i4l/Kconfig create mode 100644 drivers/staging/i4l/Makefile create mode 100644 drivers/staging/i4l/TODO rename drivers/{isdn => staging/i4l}/act2000/Kconfig (100%) rename drivers/{isdn => staging/i4l}/act2000/Makefile (100%) rename drivers/{isdn => staging/i4l}/act2000/act2000.h (100%) rename drivers/{isdn => staging/i4l}/act2000/act2000_isa.c (100%) rename drivers/{isdn => staging/i4l}/act2000/act2000_isa.h (100%) rename drivers/{isdn => staging/i4l}/act2000/capi.c (100%) rename drivers/{isdn => staging/i4l}/act2000/capi.h (100%) rename drivers/{isdn => staging/i4l}/act2000/module.c (100%) rename drivers/{isdn => staging/i4l}/icn/Kconfig (100%) rename drivers/{isdn => staging/i4l}/icn/Makefile (100%) rename drivers/{isdn => staging/i4l}/icn/icn.c (100%) rename drivers/{isdn => staging/i4l}/icn/icn.h (100%) rename drivers/{isdn => staging/i4l}/pcbit/Kconfig (100%) rename drivers/{isdn => staging/i4l}/pcbit/Makefile (100%) rename drivers/{isdn => staging/i4l}/pcbit/callbacks.c (100%) rename drivers/{isdn => staging/i4l}/pcbit/callbacks.h (100%) rename drivers/{isdn => staging/i4l}/pcbit/capi.c (100%) rename drivers/{isdn => staging/i4l}/pcbit/capi.h (100%) rename drivers/{isdn => staging/i4l}/pcbit/drv.c (100%) rename drivers/{isdn => staging/i4l}/pcbit/edss1.c (100%) rename drivers/{isdn => staging/i4l}/pcbit/edss1.h (100%) rename drivers/{isdn => staging/i4l}/pcbit/layer2.c (100%) rename drivers/{isdn => staging/i4l}/pcbit/layer2.h (100%) rename
[PATCH] sparc64: Add support for Application Data Integrity (ADI)
Enable Application Data Integrity (ADI) support in the sparc kernel for applications to use ADI in userspace. ADI is a new feature supported on sparc M7 and newer processors. ADI is supported for data fetches only and not instruction fetches. This patch adds prctl commands to enable and disable ADI (TSTATE.mcde), return ADI parameters to userspace, enable/disable MCD (Memory Corruption Detection) on selected memory ranges and enable TTE.mcd in PTEs. It also adds handlers for all traps related to MCD. ADI is not enabled by default for any task and a task must explicitly enable ADI (TSTATE.mcde), turn MCD on on a memory range and set version tag for ADI to be effective for the task. This patch adds support for ADI for hugepages only. Addresses passed into system calls must be non-ADI tagged addresses. Signed-off-by: Khalid Aziz--- NOTES: ADI is a new feature added to M7 processor to allow hardware to catch rogue accesses to memory. An app can enable ADI on its data pages, set version tags on them and use versioned addresses (bits 63-60 of the address contain a version tag) to access the data pages. If a rogue app attempts to access ADI enabled data pages, its access is blocked and processor generates an exception. Enabling this functionality for all data pages of an app requires adding infrastructure to save version tags for any data pages that get swapped out and restoring those tags when pages are swapped back in. In this first implementation I am enabling ADI for hugepages only since these pages are locked in memory and hence avoid the issue of saving and restoring tags. Once this core functionality is stable, ADI for other memory pages can be enabled more easily. Documentation/prctl/sparc_adi.txt | 62 ++ Documentation/sparc/adi.txt | 206 +++ arch/sparc/Kconfig| 12 ++ arch/sparc/include/asm/hugetlb.h | 14 +++ arch/sparc/include/asm/hypervisor.h | 2 + arch/sparc/include/asm/mmu_64.h | 1 + arch/sparc/include/asm/pgtable_64.h | 15 +++ arch/sparc/include/asm/processor_64.h | 19 +++ arch/sparc/include/asm/ttable.h | 10 ++ arch/sparc/include/uapi/asm/asi.h | 3 + arch/sparc/include/uapi/asm/pstate.h | 10 ++ arch/sparc/kernel/entry.h | 3 + arch/sparc/kernel/head_64.S | 1 + arch/sparc/kernel/mdesc.c | 81 + arch/sparc/kernel/process_64.c| 221 ++ arch/sparc/kernel/sun4v_mcd.S | 16 +++ arch/sparc/kernel/traps_64.c | 96 ++- arch/sparc/kernel/ttable_64.S | 6 +- include/linux/mm.h| 2 + include/uapi/asm-generic/siginfo.h| 5 +- include/uapi/linux/prctl.h| 16 +++ kernel/sys.c | 30 + 22 files changed, 825 insertions(+), 6 deletions(-) create mode 100644 Documentation/prctl/sparc_adi.txt create mode 100644 Documentation/sparc/adi.txt create mode 100644 arch/sparc/kernel/sun4v_mcd.S diff --git a/Documentation/prctl/sparc_adi.txt b/Documentation/prctl/sparc_adi.txt new file mode 100644 index 000..9cbdcae --- /dev/null +++ b/Documentation/prctl/sparc_adi.txt @@ -0,0 +1,62 @@ + +Overview + + +SPARC M7 processor includes the feature Application Data Integrity (ADI). +ADI allows a tag to be associated with a virtual memory address range +and a process must access that memory range with the correct tag. ADI +tag is embedded in bits 63-60 of virtual address. Once ADI is enabled +on a range of memory addresses, the process can set a tag for blocks +in this memory range n the cache using ASI_MCD_PRIMARY or +ASI_MCD_ST_BLKINIT_PRIMARY. This tag is set for ADI block sized blocks +which is provided to the kernel by machine description table. + +Linux kernel supports an application enabling and setting the ADI tag +for a subset of its data pages. Those data pages have to be locked in +memory since saving ADI tags to swap is not supported. + + +New prctl options for ADI +- + +Following new options to prctl() have been added to support ADI. + + PR_GET_SPARC_ADICAPS - Get ADI capabilities for the processor. + These capabilities are used to set up ADI correctly + from userspace. Machine description table provides all + of the ADI capabilities information. arg2 to prctl() is + a pointer to struct adi_caps which is defined in + linux/prctl.h. + + + PR_SET_SPARC_ADI - Set the state of ADI in a user thread by + setting PSTATE.mcde bit in the user mode PSTATE register + of the calling thread based on the value passed in arg2: + 1 == enable, 0 == disable, other == no change + Return the
Re: [PATCHv9 0/3] rdmacg: IB/core: rdma controller support
Hello, Parav. It doesn't look like my reviews are getting through. For now, Nacked-by: Tejun HeoThanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv9 1/3] rdmacg: Added rdma cgroup controller
Hello, On Wed, Mar 02, 2016 at 12:35:35AM +0530, Parav Pandit wrote: > diff --git a/include/linux/cgroup_rdma.h b/include/linux/cgroup_rdma.h > new file mode 100644 > index 000..2da3d6c > --- /dev/null > +++ b/include/linux/cgroup_rdma.h > @@ -0,0 +1,50 @@ > +#ifndef _CGROUP_RDMA_H > +#define _CGROUP_RDMA_H > + > +#include > + > +#ifdef CONFIG_CGROUP_RDMA > + > +struct rdma_cgroup { > + struct cgroup_subsys_state css; > + > + spinlock_t rpool_list_lock; /* protects resource pool list */ > + struct list_head rpool_head;/* head to keep track of all resource > + * pools that belongs to this cgroup. > + */ I think we usually don't tail wing these comments. > +}; > + > +struct rdmacg_pool_info { > + const char **resource_name_table; > + int table_len; Align fields consistently? I've said this multiple times now but please make the available resources constant and document them in Documentation/cgroup-v2.txt. > +}; > + > +struct rdmacg_device { > + struct rdmacg_pool_info pool_info; > + struct list_headrdmacg_list; > + struct list_headrpool_head; > + /* protects resource pool list */ > + spinlock_t rpool_lock; > + char*name; > +}; > + > +/* APIs for RDMA/IB stack to publish when a device wants to > + * participate in resource accounting > + */ Please follow CodingStyle. Wasn't this pointed out a couple times already? > +enum rdmacg_file_type { > + RDMACG_RESOURCE_MAX, > + RDMACG_RESOURCE_STAT, > +}; Heh, usually not a good sign. If you're using this to call into a common function and then switch out on the file type, just switch out at the method level and factor out common part into helpers. > +/* resource tracker per resource for rdma cgroup */ > +struct rdmacg_resource { > + int max; > + int usage; > +}; Align fields? > +/** The above indicates kerneldoc comments, which this isn't. > + * resource pool object which represents, per cgroup, per device > + * resources. There are multiple instance > + * of this object per cgroup, therefore it cannot be embedded within > + * rdma_cgroup structure. It is maintained as list. Consistent paragraph fill? > + */ > +struct rdmacg_resource_pool { > + struct list_head cg_list; > + struct list_head dev_list; > + > + struct rdmacg_device *device; > + struct rdmacg_resource *resources; > + struct rdma_cgroup *cg; /* owner cg used during device cleanup */ > + > + int refcnt; /* count active user tasks of this pool */ > + int num_max_cnt;/* total number counts which are set to max */ > +}; Formatting. > +static inline void set_resource_limit(struct rdmacg_resource_pool *rpool, > + int index, int new_max) Is inline necessary? Compiler can figure out these. > +static struct rdmacg_resource_pool* ^ space > +find_cg_rpool_locked(struct rdma_cgroup *cg, > + struct rdmacg_device *device) ... > +static void uncharge_cg_resource(struct rdma_cgroup *cg, > + struct rdmacg_device *device, > + int index, int num) > +{ ... > + rpool->refcnt--; > + if (rpool->refcnt == 0 && rpool->num_max_cnt == pool_info->table_len) { If the caller charges 2 and then uncharges 1 two times, the refcnt underflows? Why not just track how many usages are zero? ... > +void rdmacg_uncharge(struct rdma_cgroup *cg, > + struct rdmacg_device *device, > + int index, int num) > +{ > + struct rdma_cgroup *p; > + > + for (p = cg; p; p = parent_rdmacg(p)) > + uncharge_cg_resource(p, device, index, num); > + > + css_put(>css); > +} > +EXPORT_SYMBOL(rdmacg_uncharge); So, I suppose the code is trying to save lock contention with fine-grained locking; however, as the code is currently structured, it's just increasing the number of lock ops and it'd be highly likely to cheaper to simply use a single lock. If you're working up the tree grabbing lock at each level, per-node locking doesn't save you anything. Also, it introduces conditions where charges are spuriously denied when there are racing requestors. If scalability becomes an issue, the right way to address is adding percpu front cache. > +void rdmacg_query_limit(struct rdmacg_device *device, > + int *limits) > +{ > + struct rdma_cgroup *cg, *p; > + struct rdmacg_resource_pool *rpool; > + struct rdmacg_pool_info *pool_info; > + int i; > + > + cg = task_rdmacg(current); > + pool_info = >pool_info; Nothing seems to prevent @cg from going away if this races with @current being migrated to a different cgroup. Have you run this with lockdep and rcu debugging enabled? This should have triggered a warning. ... >
Re: [PATCH] can: rcar_canfd: Add Renesas R-Car CAN FD driver
On 03/02/2016 11:08 AM, Ramesh Shanmugasundaram wrote: I see no locking for the tx-path. >>> >>> I am not sure why locking is needed in tx-path? >> >> If the tx-path is shared between both channels you need locking as the >> networking subsystem may send one frame to each controller at the same >> time. > > Yes. Tx completion interrupt is shared by both channels but the Tx > FIFO, echo skbs, head, tail are maintained on a per channel basis. > Yes, net subsystem can send one frame to each channel at the same > time and the tx completion interrupt handler will traverse each > channel & update per channel context accordingly. Ok. Which instruction starts the transmit? Please add a comment to the code. You stop the tx_queue after this, so your code is racy, as the interrupt might happen in between. >>> However, looking at it again, I should move the incrementing of head >>> after the "sts" handing to be apt. What do you think? >> >> With one producer (one SW instance) and one consumer (the HW) you can >> write lockless code (if the HW allows it), but with two producers it's not >> possible. > > For a channel represented as netdev, there is still one producer (one > SW instance) isn't it? net acquires lock before calling xmit. The > consumer is tx completion interrupt handler which updates per channel > attributes only. Ok - I haven't looked at the code in depth yet. Now it's clear, that just the tx interrupt is shared. > Sorry if I am annoying. I have tested this with two channels > transmitting & receiving at the same time and it works fine. If I am > missing lock, I would like to understand it thoroughly before making > the change. Marc -- Pengutronix e.K. | Marc Kleine-Budde | Industrial Linux Solutions| Phone: +49-231-2826-924 | Vertretung West/Dortmund | Fax: +49-5121-206917- | Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de | signature.asc Description: OpenPGP digital signature
RE: [PATCH] can: rcar_canfd: Add Renesas R-Car CAN FD driver
Hi Marc, > On 03/02/2016 09:41 AM, Ramesh Shanmugasundaram wrote: > >> Is the channel loopback mode configurable per channel? If so, please > >> remove the module parameter and make use of CAN_CTRLMODE_LOOPBACK to > >> configure it. > > > > The loopback setting is not truly a per channel attribute. It requires > > touching Rx rules which can be done only when the controller's global > > state is reset (during probe). > > CAN_CTRLMODE_LOOPBACK config option is possible only through .ndo_open > > of that channel but the global controller state needs to be > > operational by this time. As it is a global attribute & for the above > > reason, I choose the module option. > > I see, ok then. > > >>> CAN FD mode supports both Classical CAN & CAN FD frame formats. The > >>> controller supports ISO 11898-1:2015 CAN FD format only. > >>> > >>> This controller supports two channels and the driver can enable > >>> either or both of the channels. > >>> > >>> Driver uses Rx FIFOs (one per channel) for reception & Common FIFOs > >>> (one per channel) for transmission. Rx filter rules are configured > >>> to the minimum (one per channel) and it accepts Standard, Extended, > >>> Data & Remote Frame combinations. > >> > >> I see no locking for the tx-path. > > > > I am not sure why locking is needed in tx-path? > > If the tx-path is shared between both channels you need locking as the > networking subsystem may send one frame to each controller at the same > time. Yes. Tx completion interrupt is shared by both channels but the Tx FIFO, echo skbs, head, tail are maintained on a per channel basis. Yes, net subsystem can send one frame to each channel at the same time and the tx completion interrupt handler will traverse each channel & update per channel context accordingly. > > However, looking at it again, I should move the incrementing of head > > after the "sts" handing to be apt. What do you think? > > With one producer (one SW instance) and one consumer (the HW) you can > write lockless code (if the HW allows it), but with two producers it's not > possible. For a channel represented as netdev, there is still one producer (one SW instance) isn't it? net acquires lock before calling xmit. The consumer is tx completion interrupt handler which updates per channel attributes only. Sorry if I am annoying. I have tested this with two channels transmitting & receiving at the same time and it works fine. If I am missing lock, I would like to understand it thoroughly before making the change. Thanks for the help, Ramesh. N�r��yb�X��ǧv�^�){.n�+{�v�"��^n�r���z���h�&���G���h�(�階�ݢj"���m��z�ޖ���f���h���~�m�
RE: [PATCH] can: rcar_canfd: Add Renesas R-Car CAN FD driver
Hi Marc, Thanks for the review. > On 03/01/2016 10:34 AM, Ramesh Shanmugasundaram wrote: > > This patch adds support for the CAN FD controller found in Renesas > > R-Car SoCs. The controller operates in CAN FD mode by default. Two > > test modes are available and can be enabled by the > > "rcar_canfd.testmode" module parameter. Refer to Documentation/kernel- > parameters.txt. > > Is the channel loopback mode configurable per channel? If so, please > remove the module parameter and make use of CAN_CTRLMODE_LOOPBACK to > configure it. The loopback setting is not truly a per channel attribute. It requires touching Rx rules which can be done only when the controller's global state is reset (during probe). CAN_CTRLMODE_LOOPBACK config option is possible only through .ndo_open of that channel but the global controller state needs to be operational by this time. As it is a global attribute & for the above reason, I choose the module option. > > > CAN FD mode supports both Classical CAN & CAN FD frame formats. The > > controller supports ISO 11898-1:2015 CAN FD format only. > > > > This controller supports two channels and the driver can enable either > > or both of the channels. > > > > Driver uses Rx FIFOs (one per channel) for reception & Common FIFOs > > (one per channel) for transmission. Rx filter rules are configured to > > the minimum (one per channel) and it accepts Standard, Extended, Data > > & Remote Frame combinations. > > I see no locking for the tx-path. I am not sure why locking is needed in tx-path? However, looking at it again, I should move the incrementing of head after the "sts" handing to be apt. What do you think? Thanks, Ramesh N�r��yb�X��ǧv�^�){.n�+{�v�"��^n�r���z���h�&���G���h�(�階�ݢj"���m��z�ޖ���f���h���~�m�
Re: [PATCH v3] tty: serial: meson: Implement earlycon support
On Wed, Mar 2, 2016 at 3:49 AM, Andreas Färberwrote: > Split off the bulk of the existing meson_serial_console_write() > implementation into meson_serial_port_write() for implementing > meson_serial_early_console_write(). > > Use "meson" as the earlycon driver name, courtesy of Nicolas. > > Signed-off-by: Nicolas Saenz Julienne > Signed-off-by: Andreas Färber Acked-by: Carlo Caione Thanks, -- Carlo Caione -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html