Re: [PATCHv9 1/3] rdmacg: Added rdma cgroup controller

2016-03-02 Thread Haggai Eran
On 03/03/2016 04:49, Parav Pandit wrote:
> Hi Tejun, Haggai,
> 
> On Thu, Mar 3, 2016 at 1:28 AM, Parav Pandit  wrote:
 + rpool->refcnt--;
 + if (rpool->refcnt == 0 && rpool->num_max_cnt == 
 pool_info->table_len) {
>>>
>>> If the caller charges 2 and then uncharges 1 two times, the refcnt
>>> underflows?  Why not just track how many usages are zero?
>>>
>> This is certainly must fix bug. Changed refcnt to usage_sum and changed to do
>> usage_sum -= num during uncharging
>> and
>> usage_sum += num during charing.
> 
> This is not sufficient as css_get() and put are done only once per
> call, which leads to similar problem as of refcnt.
Are css_get_many() and css_put_many() relevant here?

> As I think more, I realised that this particular test is missing that
> resulted in this related bug, I realize that we don't have use case to
> have "num" field from the IB stack side.
> For bulk free IB stack will have to keep track of different or same
> rdmacg returned values to call uncharge() with right number of
> resources, all of that complexity just doesn't make sense and not
> required.
> So as first step to further simplify this, I am removing "num" input
> field from charge and uncharge API.

IIRC there are no instances in the RDMA subsystem today where userspace
allocates more than one resource at a time.

Yishai, in your proposed RSS patchset did you have a verb to allocate
multiple work queues at once?

Haggai

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv9 1/3] rdmacg: Added rdma cgroup controller

2016-03-02 Thread Parav Pandit
On Thu, Mar 3, 2016 at 1:28 AM, Parav Pandit  wrote:
> On Wed, Mar 2, 2016 at 11:09 PM, Tejun Heo  wrote:
>> Nothing seems to prevent @cg from going away if this races with
>> @current being migrated to a different cgroup.  Have you run this with
>> lockdep and rcu debugging enabled?  This should have triggered a
>> warning.
I am able to reproduce this race. Looking into how to address it.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6 01/11] drm/hisilicon: Add device tree binding for hi6220 display subsystem

2016-03-02 Thread Xinliang Liu
Hi,

On 3 March 2016 at 02:29, Rob Herring  wrote:
> On Fri, Feb 26, 2016 at 04:40:18PM +0800, Xinliang Liu wrote:
>> Add ADE display controller binding doc.
>> Add DesignWare DSI Host Controller v1.20a binding doc.
>>
>> v6:
>> - Cleanup values part of reg and clocks properties.
>> - Change "pclk_dsi" clock name to "pclk".
>> v5:
>> - Remove endpoint unit address of dsi output port.
>> - Add "hisilicon,noc-syscon" property for ADE NOC QoS syscon.
>> - Add "resets" property for ADE reset.
>> v4:
>> - Describe more specific of clocks and ports.
>> - Fix indentation.
>> v3:
>> - Make ade as the drm master node.
>> - Use assigned-clocks to set clock rate.
>> - Use ports to connect display relavant nodes.
>> v2:
>> - Move dt binding docs to bindings/display/hisilicon directory.
>>
>> Signed-off-by: Xinliang Liu 
>> ---
>>  .../bindings/display/hisilicon/dw-dsi.txt  | 72 
>> ++
>>  .../bindings/display/hisilicon/hisi-ade.txt| 64 +++
>>  2 files changed, 136 insertions(+)
>>  create mode 100644 
>> Documentation/devicetree/bindings/display/hisilicon/dw-dsi.txt
>>  create mode 100644 
>> Documentation/devicetree/bindings/display/hisilicon/hisi-ade.txt
>
> Acked-by: Rob Herring 

Thanks :-)
-xinliang
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] sparc64: Add support for Application Data Integrity (ADI)

2016-03-02 Thread Julian Calaby
Hi Khalid,

On Thu, Mar 3, 2016 at 7:39 AM, Khalid Aziz  wrote:
>
> Enable Application Data Integrity (ADI) support in the sparc
> kernel for applications to use ADI in userspace. ADI is a new
> feature supported on sparc M7 and newer processors. ADI is supported
> for data fetches only and not instruction fetches. This patch adds
> prctl commands to enable and disable ADI (TSTATE.mcde), return ADI
> parameters to userspace, enable/disable MCD (Memory Corruption
> Detection) on selected memory ranges and enable TTE.mcd in PTEs. It
> also adds handlers for all traps related to MCD. ADI is not enabled
> by default for any task and a task must explicitly enable ADI
> (TSTATE.mcde), turn MCD on on a memory range and set version tag
> for ADI to be effective for the task. This patch adds support for
> ADI for hugepages only. Addresses passed into system calls must be
> non-ADI tagged addresses.

I can't comment on the actual functionality here, but I do see a few
minor style issues in your patch.

My big concern is that you're defining a lot of new code that is ADI
specific but isn't inside a CONFIG_SPARC_ADI ifdef. (That said,
handling ADI specific traps if ADI isn't enabled looks like a good
idea to me, however most of the other stuff is just dead code if
CONFIG_SPARC_ADI isn't enabled.)

> Signed-off-by: Khalid Aziz 
> ---
> NOTES: ADI is a new feature added to M7 processor to allow hardware
> to catch rogue accesses to memory. An app can enable ADI on
> its data pages, set version tags on them and use versioned
> addresses (bits 63-60 of the address contain a version tag)
> to access the data pages. If a rogue app attempts to access
> ADI enabled data pages, its access is blocked and processor
> generates an exception. Enabling this functionality for all
> data pages of an app requires adding infrastructure to save
> version tags for any data pages that get swapped out and
> restoring those tags when pages are swapped back in. In this
> first implementation I am enabling ADI for hugepages only
> since these pages are locked in memory and hence avoid the
> issue of saving and restoring tags. Once this core functionality
> is stable, ADI for other memory pages can be enabled more
> easily.
>
> v2:
> - Fixed a build error
>
>  Documentation/prctl/sparc_adi.txt |  62 ++
>  Documentation/sparc/adi.txt   | 206 +++
>  arch/sparc/Kconfig|  12 ++
>  arch/sparc/include/asm/hugetlb.h  |  14 +++
>  arch/sparc/include/asm/hypervisor.h   |   2 +
>  arch/sparc/include/asm/mmu_64.h   |   1 +
>  arch/sparc/include/asm/pgtable_64.h   |  15 +++
>  arch/sparc/include/asm/processor_64.h |  19 +++
>  arch/sparc/include/asm/ttable.h   |  10 ++
>  arch/sparc/include/uapi/asm/asi.h |   3 +
>  arch/sparc/include/uapi/asm/pstate.h  |  10 ++
>  arch/sparc/kernel/entry.h |   3 +
>  arch/sparc/kernel/head_64.S   |   1 +
>  arch/sparc/kernel/mdesc.c |  81 +
>  arch/sparc/kernel/process_64.c| 222 
> ++
>  arch/sparc/kernel/sun4v_mcd.S |  16 +++
>  arch/sparc/kernel/traps_64.c  |  96 ++-
>  arch/sparc/kernel/ttable_64.S |   6 +-
>  include/linux/mm.h|   2 +
>  include/uapi/asm-generic/siginfo.h|   5 +-
>  include/uapi/linux/prctl.h|  16 +++
>  kernel/sys.c  |  30 +
>  22 files changed, 826 insertions(+), 6 deletions(-)
>  create mode 100644 Documentation/prctl/sparc_adi.txt
>  create mode 100644 Documentation/sparc/adi.txt
>  create mode 100644 arch/sparc/kernel/sun4v_mcd.S

I must admit that I'm slightly impressed that the documentation is
over a quarter of the lines added. =)

> diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
> index 56442d2..0aac0ae 100644
> --- a/arch/sparc/Kconfig
> +++ b/arch/sparc/Kconfig
> @@ -80,6 +80,7 @@ config SPARC64
> select NO_BOOTMEM
> select HAVE_ARCH_AUDITSYSCALL
> select ARCH_SUPPORTS_ATOMIC_RMW
> +   select SPARC_ADI

This doesn't look right.

>  config ARCH_DEFCONFIG
> string
> @@ -314,6 +315,17 @@ if SPARC64
>  source "kernel/power/Kconfig"
>  endif
>
> +config SPARC_ADI
> +   bool "Application Data Integrity support"
> +   def_bool y if SPARC64

def_bool is for config options without names (i.e. "this is a boolean
value and it's default is...")

So if you want people to be able to disable this option, then you
should remove the select above and just have:

bool "Application Data Integrity support"
default y if SPARC64

If you don't want people disabling it, then there's no point in having
a separate Kconfig symbol.

> +   help
> + Support for Application Data Integrity (ADI). ADI feature allows
> + a 

Re: [PATCH v10 07/12] task_isolation: add debug boot flag

2016-03-02 Thread Chris Metcalf

On 3/2/2016 3:37 PM, Peter Zijlstra wrote:

On Wed, Mar 02, 2016 at 03:09:31PM -0500, Chris Metcalf wrote:

+void task_isolation_debug(int cpu)
+{
+   struct task_struct *p;
+
+   if (!task_isolation_possible(cpu))
+   return;
+
+   rcu_read_lock();
+   p = cpu_curr(cpu);
+   get_task_struct(p);

As I think Oleg keeps reminding me, this is not actually a safe thing to
do.


So what's the right solution?  The fast path in task_isolation_debug_task 
basically
just uses the new "task_isolation_flags", and "pid" and "comm".  I would think 
those
would all have to be safe because of the get_task_struct().

The piece that might be problematic is the eventual call to send_sig_info() 
using the
task_struct pointer (called via task_isolation_debug_task -> 
task_isolation_interrupt).
Clearly this is safe at some level, since that's more or less what sys_kill() 
does and the
process could similarly evaporate half way through sending the signal.

Suggestions?  Thanks!

--
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v10 07/12] task_isolation: add debug boot flag

2016-03-02 Thread Chris Metcalf
The new "task_isolation_debug" flag simplifies debugging
of TASK_ISOLATION kernels when processes are running in
PR_TASK_ISOLATION_ENABLE mode.  Such processes should get no
interrupts from the kernel, and if they do, when this boot flag is
specified a kernel stack dump on the console is generated.

It's possible to use ftrace to simply detect whether a task_isolation
core has unexpectedly entered the kernel.  But what this boot flag
does is allow the kernel to provide better diagnostics, e.g. by
reporting in the IPI-generating code what remote core and context
is preparing to deliver an interrupt to a task_isolation core.

It may be worth considering other ways to generate useful debugging
output rather than console spew, but for now that is simple and direct.

Signed-off-by: Chris Metcalf 
---
 Documentation/kernel-parameters.txt|  8 
 include/linux/context_tracking_state.h |  6 +++
 include/linux/isolation.h  |  5 +++
 kernel/irq_work.c  |  5 ++-
 kernel/isolation.c | 77 ++
 kernel/sched/core.c| 18 
 kernel/signal.c|  5 +++
 kernel/smp.c   |  6 ++-
 kernel/softirq.c   | 33 +++
 9 files changed, 161 insertions(+), 2 deletions(-)

diff --git a/Documentation/kernel-parameters.txt 
b/Documentation/kernel-parameters.txt
index c8d0b42d984a..ea0434fa906e 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -3755,6 +3755,14 @@ bytes respectively. Such letter suffixes can also be 
entirely omitted.
also sets up nohz_full and isolcpus mode for the
listed set of cpus.
 
+   task_isolation_debug[KNL]
+   In kernels built with CONFIG_TASK_ISOLATION
+   and booted in task_isolation= mode, this
+   setting will generate console backtraces when
+   the kernel is about to interrupt a task that
+   has requested PR_TASK_ISOLATION_ENABLE and is
+   running on a task_isolation core.
+
tcpmhash_entries= [KNL,NET]
Set the number of tcp_metrics_hash slots.
Default value is 8192 or 16384 depending on total
diff --git a/include/linux/context_tracking_state.h 
b/include/linux/context_tracking_state.h
index 1d34fe68f48a..4e2c4b900b82 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -39,8 +39,14 @@ static inline bool context_tracking_in_user(void)
 {
return __this_cpu_read(context_tracking.state) == CONTEXT_USER;
 }
+
+static inline bool context_tracking_cpu_in_user(int cpu)
+{
+   return per_cpu(context_tracking.state, cpu) == CONTEXT_USER;
+}
 #else
 static inline bool context_tracking_in_user(void) { return false; }
+static inline bool context_tracking_cpu_in_user(int cpu) { return false; }
 static inline bool context_tracking_active(void) { return false; }
 static inline bool context_tracking_is_enabled(void) { return false; }
 static inline bool context_tracking_cpu_is_enabled(void) { return false; }
diff --git a/include/linux/isolation.h b/include/linux/isolation.h
index ba6c4d510db8..f1ae7b663746 100644
--- a/include/linux/isolation.h
+++ b/include/linux/isolation.h
@@ -45,6 +45,9 @@ static inline void task_isolation_enter(void)
 extern bool task_isolation_syscall(int nr);
 extern void task_isolation_exception(const char *fmt, ...);
 extern void task_isolation_interrupt(struct task_struct *, const char *buf);
+extern void task_isolation_debug(int cpu);
+extern void task_isolation_debug_cpumask(const struct cpumask *);
+extern void task_isolation_debug_task(int cpu, struct task_struct *p);
 
 static inline bool task_isolation_strict(void)
 {
@@ -73,6 +76,8 @@ static inline bool task_isolation_ready(void) { return true; }
 static inline void task_isolation_enter(void) { }
 static inline bool task_isolation_check_syscall(int nr) { return false; }
 static inline void task_isolation_check_exception(const char *fmt, ...) { }
+static inline void task_isolation_debug(int cpu) { }
+#define task_isolation_debug_cpumask(mask) do {} while (0)
 #endif
 
 #endif
diff --git a/kernel/irq_work.c b/kernel/irq_work.c
index bcf107ce0854..a9b95ce00667 100644
--- a/kernel/irq_work.c
+++ b/kernel/irq_work.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 
@@ -75,8 +76,10 @@ bool irq_work_queue_on(struct irq_work *work, int cpu)
if (!irq_work_claim(work))
return false;
 
-   if (llist_add(>llnode, _cpu(raised_list, cpu)))
+   if (llist_add(>llnode, _cpu(raised_list, cpu))) {
+   task_isolation_debug(cpu);
arch_send_call_function_single_ipi(cpu);
+   }
 
return true;
 }
diff --git a/kernel/isolation.c 

Re: [PATCH] sparc64: Add support for Application Data Integrity (ADI)

2016-03-02 Thread Khalid Aziz

On 03/02/2016 01:26 PM, kbuild test robot wrote:

Hi Khalid,

[auto build test ERROR on sparc/master]
[also build test ERROR on v4.5-rc6]
[cannot apply to next-20160302]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improving the system]

url:
https://github.com/0day-ci/linux/commits/Khalid-Aziz/sparc64-Add-support-for-Application-Data-Integrity-ADI/20160303-025709
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc.git master
config: sparc64-allnoconfig (attached as .config)
reproduce:
 wget 
https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross
 -O ~/bin/make.cross
 chmod +x ~/bin/make.cross
 # save the attached .config to linux build tree
 make.cross ARCH=sparc64

All errors (new ones prefixed by >>):

arch/sparc/kernel/process_64.c: In function 'disable_sparc_adi':

arch/sparc/kernel/process_64.c:961:6: error: implicit declaration of function 
'vma_policy' [-Werror=implicit-function-declaration]

  vma_policy(vma), vma->vm_userfaultfd_ctx);
  ^
arch/sparc/kernel/process_64.c:959:10: error: passing argument 9 of 
'vma_merge' makes pointer from integer without a cast [-Werror]
   prev = vma_merge(mm, prev, addr, end, vma->vm_flags,
  ^
In file included from arch/sparc/kernel/process_64.c:18:0:
include/linux/mm.h:1922:31: note: expected 'struct mempolicy *' but 
argument is of type 'int'
 extern struct vm_area_struct *vma_merge(struct mm_struct *,
   ^


Not sure why it built without errors on my system. I will #include 
 and send updated patch.


--
Khalid
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v10 07/12] task_isolation: add debug boot flag

2016-03-02 Thread Peter Zijlstra
On Wed, Mar 02, 2016 at 03:09:31PM -0500, Chris Metcalf wrote:
> +void task_isolation_debug(int cpu)
> +{
> + struct task_struct *p;
> +
> + if (!task_isolation_possible(cpu))
> + return;
> +
> + rcu_read_lock();
> + p = cpu_curr(cpu);
> + get_task_struct(p);

As I think Oleg keeps reminding me, this is not actually a safe thing to
do.

> + rcu_read_unlock();
> + task_isolation_debug_task(cpu, p);
> + put_task_struct(p);
> +}
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v10 05/12] task_isolation: support CONFIG_TASK_ISOLATION_ALL

2016-03-02 Thread Chris Metcalf
This option, similar to NO_HZ_FULL_ALL, simplifies configuring
a system to boot by default with all cores except the boot core
running in task isolation mode.
---
 init/Kconfig   | 10 ++
 kernel/isolation.c |  6 ++
 2 files changed, 16 insertions(+)

diff --git a/init/Kconfig b/init/Kconfig
index 6cab348fe454..314b09347fba 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -802,6 +802,16 @@ config TASK_ISOLATION
 You should say "N" unless you are intending to run a
 high-performance userspace driver or similar task.
 
+config TASK_ISOLATION_ALL
+   bool "Provide task isolation on all CPUs by default (except CPU 0)"
+   depends on TASK_ISOLATION
+   help
+If the user doesn't pass the task_isolation boot option to
+define the range of task isolation CPUs, consider that all
+CPUs in the system are task isolation by default.
+Note the boot CPU will still be kept outside the range to
+handle timekeeping duty, etc.
+
 config BUILD_BIN2C
bool
default n
diff --git a/kernel/isolation.c b/kernel/isolation.c
index e954afd8cce8..42ad7a746a1e 100644
--- a/kernel/isolation.c
+++ b/kernel/isolation.c
@@ -40,8 +40,14 @@ int __init task_isolation_init(void)
 {
/* For offstack cpumask, ensure we allocate an empty cpumask early. */
if (!saw_boot_arg) {
+#ifdef CONFIG_TASK_ISOLATION_ALL
+   alloc_cpumask_var(_isolation_map, GFP_KERNEL);
+   cpumask_copy(task_isolation_map, cpu_possible_mask);
+   cpumask_clear_cpu(smp_processor_id(), task_isolation_map);
+#else
zalloc_cpumask_var(_isolation_map, GFP_KERNEL);
return 0;
+#endif
}
 
/*
-- 
2.1.2

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] sparc64: Add support for Application Data Integrity (ADI)

2016-03-02 Thread kbuild test robot
Hi Khalid,

[auto build test ERROR on sparc/master]
[also build test ERROR on v4.5-rc6]
[cannot apply to next-20160302]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improving the system]

url:
https://github.com/0day-ci/linux/commits/Khalid-Aziz/sparc64-Add-support-for-Application-Data-Integrity-ADI/20160303-025709
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc.git master
config: sparc64-allnoconfig (attached as .config)
reproduce:
wget 
https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross
 -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=sparc64 

All errors (new ones prefixed by >>):

   arch/sparc/kernel/process_64.c: In function 'disable_sparc_adi':
>> arch/sparc/kernel/process_64.c:961:6: error: implicit declaration of 
>> function 'vma_policy' [-Werror=implicit-function-declaration]
 vma_policy(vma), vma->vm_userfaultfd_ctx);
 ^
   arch/sparc/kernel/process_64.c:959:10: error: passing argument 9 of 
'vma_merge' makes pointer from integer without a cast [-Werror]
  prev = vma_merge(mm, prev, addr, end, vma->vm_flags,
 ^
   In file included from arch/sparc/kernel/process_64.c:18:0:
   include/linux/mm.h:1922:31: note: expected 'struct mempolicy *' but argument 
is of type 'int'
extern struct vm_area_struct *vma_merge(struct mm_struct *,
  ^
   cc1: all warnings being treated as errors

vim +/vma_policy +961 arch/sparc/kernel/process_64.c

   955  /* Update the ADI info in vma and check if this vma can
   956   * be merged with adjacent ones
   957   */
   958  pgoff = vma->vm_pgoff + ((addr - vma->vm_start) >> 
PAGE_SHIFT);
   959  prev = vma_merge(mm, prev, addr, end, vma->vm_flags,
   960   vma->anon_vma, vma->vm_file, pgoff,
 > 961   vma_policy(vma), 
 > vma->vm_userfaultfd_ctx);
   962  if (prev)
   963  vma = prev;
   964  

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: Binary data


[PATCH v10 06/12] task_isolation: support PR_TASK_ISOLATION_STRICT mode

2016-03-02 Thread Chris Metcalf
With task_isolation mode, the task is in principle guaranteed not to
be interrupted by the kernel, but only if it behaves.  In particular,
if it enters the kernel via system call, page fault, or any of a
number of other synchronous traps, it may be unexpectedly exposed
to long latencies.  Add a simple flag that puts the process into
a state where any such kernel entry is fatal; this is defined as
happening immediately before the SECCOMP test.

By default, the task is signalled with SIGKILL, but we add prctl()
bits to support requesting a specific signal instead.

To allow the state to be entered and exited, we ignore the prctl()
syscall so that we can clear the bit again later, and we ignore
exit/exit_group to allow exiting the task without a pointless signal
killing you as you try to do so.

Signed-off-by: Chris Metcalf 
---
 include/linux/isolation.h  | 25 +++
 include/uapi/linux/prctl.h |  3 +++
 kernel/isolation.c | 60 ++
 3 files changed, 88 insertions(+)

diff --git a/include/linux/isolation.h b/include/linux/isolation.h
index c564cf1886bb..ba6c4d510db8 100644
--- a/include/linux/isolation.h
+++ b/include/linux/isolation.h
@@ -42,12 +42,37 @@ static inline void task_isolation_enter(void)
_task_isolation_enter();
 }
 
+extern bool task_isolation_syscall(int nr);
+extern void task_isolation_exception(const char *fmt, ...);
+extern void task_isolation_interrupt(struct task_struct *, const char *buf);
+
+static inline bool task_isolation_strict(void)
+{
+   return ((current->task_isolation_flags &
+(PR_TASK_ISOLATION_ENABLE | PR_TASK_ISOLATION_STRICT)) ==
+   (PR_TASK_ISOLATION_ENABLE | PR_TASK_ISOLATION_STRICT)) &&
+   task_isolation_possible(raw_smp_processor_id());
+}
+
+static inline bool task_isolation_check_syscall(int nr)
+{
+   return task_isolation_strict() && task_isolation_syscall(nr);
+}
+
+#define task_isolation_check_exception(fmt, ...)   \
+   do {\
+   if (task_isolation_strict())\
+   task_isolation_exception(fmt, ## __VA_ARGS__);  \
+   } while (0)
+
 #else
 static inline void task_isolation_init(void) { }
 static inline bool task_isolation_possible(int cpu) { return false; }
 static inline bool task_isolation_enabled(void) { return false; }
 static inline bool task_isolation_ready(void) { return true; }
 static inline void task_isolation_enter(void) { }
+static inline bool task_isolation_check_syscall(int nr) { return false; }
+static inline void task_isolation_check_exception(const char *fmt, ...) { }
 #endif
 
 #endif
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 67224df4b559..a5582ace987f 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -201,5 +201,8 @@ struct prctl_mm_map {
 #define PR_SET_TASK_ISOLATION  48
 #define PR_GET_TASK_ISOLATION  49
 # define PR_TASK_ISOLATION_ENABLE  (1 << 0)
+# define PR_TASK_ISOLATION_STRICT  (1 << 1)
+# define PR_TASK_ISOLATION_SET_SIG(sig)(((sig) & 0x7f) << 8)
+# define PR_TASK_ISOLATION_GET_SIG(bits) (((bits) >> 8) & 0x7f)
 
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/isolation.c b/kernel/isolation.c
index 42ad7a746a1e..5621fdf15b17 100644
--- a/kernel/isolation.c
+++ b/kernel/isolation.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "time/tick-sched.h"
 
 cpumask_var_t task_isolation_map;
@@ -122,3 +123,62 @@ void _task_isolation_enter(void)
if (!tick_nohz_tick_stopped())
set_tsk_need_resched(current);
 }
+
+void task_isolation_interrupt(struct task_struct *task, const char *buf)
+{
+   siginfo_t info = {};
+   int sig;
+
+   pr_warn("%s/%d: task_isolation strict mode violated by %s\n",
+   task->comm, task->pid, buf);
+
+   /*
+* Turn off task isolation mode entirely to avoid spamming
+* the process with signals.  It can re-enable task isolation
+* mode in the signal handler if it wants to.
+*/
+   task->task_isolation_flags = 0;
+
+   sig = PR_TASK_ISOLATION_GET_SIG(task->task_isolation_flags);
+   if (sig == 0)
+   sig = SIGKILL;
+   info.si_signo = sig;
+   send_sig_info(sig, , task);
+}
+
+/*
+ * This routine is called from any userspace exception if the _STRICT
+ * flag is set.
+ */
+void task_isolation_exception(const char *fmt, ...)
+{
+   va_list args;
+   char buf[100];
+
+   /* RCU should have been enabled prior to this point. */
+   RCU_LOCKDEP_WARN(!rcu_is_watching(), "kernel entry without RCU");
+
+   va_start(args, fmt);
+   vsnprintf(buf, sizeof(buf), fmt, args);
+   va_end(args);
+
+   task_isolation_interrupt(current, buf);
+}
+
+/*
+ * This routine is called from syscall 

[PATCH v10 00/12] support "task_isolation" mode

2016-03-02 Thread Chris Metcalf
Here is the latest version of the task-isolation patch set, adopting
various suggestions made about the v9 patch series, including some
feedback from testing on the new EZchip NPS ARC platform (although the
new arch/arc support is not included in this patch series).  All of the
suggestions were relatively non-controversial.

Perhaps we are getting close to being able to merge this. :-)

Changes since v9:

- task_isolation is now set up by adding its required cpus to both the
  nohz_full and isolcpus cpumasks.  This allows users to separately
  specify all three flags, if so desired, and still get reasonably
  sane semantics.  This is done with a new tick_nohz_full_add_cpus()
  method for nohz, and just directly updating the isolcpus cpumask.

- We add a /sys/devices/system/cpu/task_isolation file to parallel
  the equivalent nohz_full file.  (This should have been in v8 since
  once task_isolation isn't equivalent to nohz_full, it needs its
  own way to let userspace know where to run.)

- We add a new Kconfig option, TASK_ISOLATION_ALL, which sets all but
  the boot processor to run in task isolation mode.  This parallels
  the existing NO_HZ_FULL_ALL and works around the fact that you can't
  easily specify a boot argument with the desired semantics.

- For task_isolation_debug, we add a check of the context_tracking
  state of the remote cpu before issuing a warning; if the remote cpu
  is actually in the kernel, we don't need to warn.

- A cloned child of a task_isolation task is not enabled for
  task_isolation, since otherwise they would both fight over who could
  safely return to userspace without requiring scheduling interrupts.

- The quiet_vmstat() function's semantics was changed since the v9
  patch series, so I introduce a quiet_vmstat_sync() for isolation.

- The lru_add_drain_needed() function is updated to know about the new
  lru_deactivate_pvecs variable.

- The arm64 patch factoring assembly into C has been modified based
  on an earlier patch by Mark Rutland.

- I simplified the enabling patch for arm64 by realizing we could just
  test TIF_NOHZ as the only bit for TIF_WORK_MASK for task isolation,
  so I didn't have to renumber all the TIF_xxx bits.

- Small fixes to avoid preemption warnings.

- Rebased on v4.5-rc5

For changes in earlier versions of the patch series, please see:

http://lkml.kernel.org/r/1451936091-29247-1-git-send-email-cmetc...@ezchip.com

A couple of the tile patches that refactored the context tracking
code were taken into 4.5 so are no longer present in this series.

This version of the patch series has been tested on arm64 and tile,
and build-tested on x86.

It remains true that the 1 Hz tick needs to be disabled for this
patch series to be able to achieve its primary goal of enabling
truly tick-free operation, but that is ongoing orthogonal work.

The series is available at:

  git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile.git 
dataplane

Chris Metcalf (12):
  vmstat: add quiet_vmstat_sync function
  vmstat: add vmstat_idle function
  lru_add_drain_all: factor out lru_add_drain_needed
  task_isolation: add initial support
  task_isolation: support CONFIG_TASK_ISOLATION_ALL
  task_isolation: support PR_TASK_ISOLATION_STRICT mode
  task_isolation: add debug boot flag
  arm, tile: turn off timer tick for oneshot_stopped state
  arch/x86: enable task isolation functionality
  arch/tile: enable task isolation functionality
  arm64: factor work_pending state machine to C
  arch/arm64: enable task isolation functionality

 Documentation/kernel-parameters.txt|  16 ++
 arch/arm64/include/asm/thread_info.h   |   8 +-
 arch/arm64/kernel/entry.S  |  12 +-
 arch/arm64/kernel/ptrace.c |  12 +-
 arch/arm64/kernel/signal.c |  34 -
 arch/arm64/kernel/smp.c|   2 +
 arch/arm64/mm/fault.c  |   4 +
 arch/tile/kernel/process.c |   6 +-
 arch/tile/kernel/ptrace.c  |   6 +
 arch/tile/kernel/single_step.c |   5 +
 arch/tile/kernel/smp.c |  28 ++--
 arch/tile/kernel/time.c|   1 +
 arch/tile/kernel/unaligned.c   |   3 +
 arch/tile/mm/fault.c   |   3 +
 arch/tile/mm/homecache.c   |   2 +
 arch/x86/entry/common.c|  18 ++-
 arch/x86/kernel/traps.c|   2 +
 arch/x86/mm/fault.c|   2 +
 drivers/base/cpu.c |  18 +++
 drivers/clocksource/arm_arch_timer.c   |   2 +
 include/linux/context_tracking_state.h |   6 +
 include/linux/isolation.h  |  83 +++
 include/linux/sched.h  |   3 +
 include/linux/swap.h   |   1 +
 include/linux/tick.h   |   1 +
 include/linux/vmstat.h |   4 +
 include/uapi/linux/prctl.h |   8 +
 init/Kconfig   |  30 
 kernel/Makefile|   1 +
 kernel/fork.c 

Re: [PATCHv9 1/3] rdmacg: Added rdma cgroup controller

2016-03-02 Thread Parav Pandit
On Wed, Mar 2, 2016 at 11:09 PM, Tejun Heo  wrote:
> Hello,
>
>> +struct rdma_cgroup {
>> + struct cgroup_subsys_state  css;
>> +
>> + spinlock_t rpool_list_lock; /* protects resource pool list */
>> + struct list_head rpool_head;/* head to keep track of all resource
>> +  * pools that belongs to this cgroup.
>> +  */
>
> I think we usually don't tail wing these comments.

ok. I will put comments in separate line.

>
>> +};
>> +
>> +struct rdmacg_pool_info {
>> + const char **resource_name_table;
>> + int table_len;
>
> Align fields consistently?  I've said this multiple times now but
> please make the available resources constant and document them in
> Documentation/cgroup-v2.txt.

o.k. I will align them.
o.k. I will document the resource constants defined by IB stack in
cgroup-v2.txt.

>> +
>> +/* APIs for RDMA/IB stack to publish when a device wants to
>> + * participate in resource accounting
>> + */
>
> Please follow CodingStyle.  Wasn't this pointed out a couple times
> already?
Yes. You did. I fixed at few places. I missed this out. Sorry. I am doing now.

>
>> +enum rdmacg_file_type {
>> + RDMACG_RESOURCE_MAX,
>> + RDMACG_RESOURCE_STAT,
>> +};
>
> Heh, usually not a good sign.  If you're using this to call into a
> common function and then switch out on the file type, just switch out
> at the method level and factor out common part into helpers.
>
Methods for both the constants are same. Code changes between two file
type is hardly 4 lines of code.
So there is no need of additional helper functions.
So in currently defined functions rdmacg_resource_read() and
rdmacg_resource_set_max() works on file type.

>> +/* resource tracker per resource for rdma cgroup */
>> +struct rdmacg_resource {
>> + int max;
>> + int usage;
>> +};
>
> Align fields?

Above one seems to be aligned. Not sure what am I missing.
I am aligning all the structures anyways.

>
>> +/**
>
> The above indicates kerneldoc comments, which this isn't.
Fixed for this comment.

>
>> + * resource pool object which represents, per cgroup, per device
>> + * resources. There are multiple instance
>> + * of this object per cgroup, therefore it cannot be embedded within
>> + * rdma_cgroup structure. It is maintained as list.
>
> Consistent paragraph fill?
Fixed.

>

>> + */
>> +struct rdmacg_resource_pool {
>> + struct list_head cg_list;
>> + struct list_head dev_list;
>> +
>> + struct rdmacg_device *device;
>> + struct rdmacg_resource *resources;
>> + struct rdma_cgroup *cg; /* owner cg used during device cleanup */
>> +
>> + int refcnt; /* count active user tasks of this pool */
>> + int num_max_cnt;/* total number counts which are set to max */
>> +};
>
> Formatting.

Aligning all the fields now in next patch.

>
>> +static inline void set_resource_limit(struct rdmacg_resource_pool *rpool,
>> +   int index, int new_max)
>
> Is inline necessary?  Compiler can figure out these.
Yes. Removed.

>
>> +static struct rdmacg_resource_pool*
>  ^
>  space
>
>> +find_cg_rpool_locked(struct rdma_cgroup *cg,
>> +  struct rdmacg_device *device)
> ...
>> +static void uncharge_cg_resource(struct rdma_cgroup *cg,
>> +  struct rdmacg_device *device,
>> +  int index, int num)
>> +{
> ...
>> + rpool->refcnt--;
>> + if (rpool->refcnt == 0 && rpool->num_max_cnt == pool_info->table_len) {
>
> If the caller charges 2 and then uncharges 1 two times, the refcnt
> underflows?  Why not just track how many usages are zero?
>
This is certainly must fix bug. Changed refcnt to usage_sum and changed to do
usage_sum -= num during uncharging
and
usage_sum += num during charing.

> ...
>> +void rdmacg_uncharge(struct rdma_cgroup *cg,
>> +  struct rdmacg_device *device,
>> +  int index, int num)
>> +{
>> + struct rdma_cgroup *p;
>> +
>> + for (p = cg; p; p = parent_rdmacg(p))
>> + uncharge_cg_resource(p, device, index, num);
>> +
>> + css_put(>css);
>> +}
>> +EXPORT_SYMBOL(rdmacg_uncharge);
>
> So, I suppose the code is trying to save lock contention with
> fine-grained locking;
Yes.
> however, as the code is currently structured,
> it's just increasing the number of lock ops and it'd be highly likely
> to cheaper to simply use a single lock.

Single lock per subsystem? I understood that you were ok to have per
cgroup fine grain lock.

> If you're working up the tree
> grabbing lock at each level, per-node locking doesn't save you
> anything.  Also, it introduces conditions where charges are spuriously
> denied when there are racing requestors.  If scalability becomes an
> issue, the right way to address is adding percpu front cache.
>
>> +void 

Re: [PATCH v8 06/10] watchdog: dw_wdt: Convert to use watchdog infrastructure

2016-03-02 Thread Guenter Roeck
On Tue, Mar 01, 2016 at 02:59:06PM -0800, Doug Anderson wrote:
> Guenter,
> 
> On Sun, Feb 28, 2016 at 1:12 PM, Guenter Roeck  wrote:
> > Convert driver to use watchdog infrastructure. This includes
> > infrastructure support to handle watchdog keepalive if the watchdog
> > is running while the watchdog device is closed.
> >
> > Signed-off-by: Guenter Roeck 
> >
> > ---
> > v8: max_hw_timeout_ms -> max_hw_heartbeat_ms
> > Rebased to v4.5-rc5
> > v7: Set max_hw_timeout_ms
> > Rebased to v4.5-rc1
> > v6: Added patch
> > ---
> >  drivers/watchdog/Kconfig  |   1 +
> >  drivers/watchdog/dw_wdt.c | 323 
> > +-
> >  2 files changed, 117 insertions(+), 207 deletions(-)
> 
> I re-ran the same set of tests I ran against v7.  They all still pass.
> 
> Tested-by: Douglas Anderson 

Thanks a lot!

Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv9 0/3] rdmacg: IB/core: rdma controller support

2016-03-02 Thread Parav Pandit
Hi Tejun,

On Wed, Mar 2, 2016 at 11:10 PM, Tejun Heo  wrote:
> Hello, Parav.
>
> It doesn't look like my reviews are getting through.  For now,
>
I have addressed all the review comments that you provided in v5 patch.
I admit that few comments have not followed CodingStyle and I owe to
fix it, which is bad on my part.
I have few 2 questions on your comments, I will ask there.

For cgroup lock - out of 3 proposals, you acknowledged that you are ok
with cgroup specific spin_lock, though you don't see that as big
saving.
Even though it can be made course gained by having single lock for the
whole rdma cgroup subsystem it raises contention among processes of
different cgroups, which is preferable to avoid.
So I have continued to have cgroup specific fine grain lock, which I
believe is ok as this is little efficient for first cut.

Including above one we agreed on almost all the points at design and
implementation level in previous patches.
If you see any issue, I am open to resolve them.

I will address comments given in patch v9.

> Nacked-by: Tejun Heo 
>
> Thanks.
>
> --
> tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] isdn: i4l: move active-isdn drivers to staging

2016-03-02 Thread Arnd Bergmann
The icn, act2000 and pcbit drivers are all for very old hardware,
and it is highly unlikely that anyone is actually still using them
on modern kernels, if at all.

All three drivers apparently are for hardware that predates PCI
being the common connector, as they are ISA-only and active
PCI ISDN cards were widely available in the 1990s.

Looking through the git logs, it I cannot find any indication of a
patch to any of these drivers that has been tested on real hardware,
only cleanups or global API changes.

Signed-off-by: Arnd Bergmann 
---
 Documentation/isdn/00-INDEX |  8 
 drivers/isdn/Makefile   |  3 ---
 drivers/isdn/i4l/Kconfig| 10 --
 drivers/staging/Kconfig |  2 ++
 drivers/staging/Makefile|  1 +
 .../staging/i4l/Documentation}/README.act2000   |  0
 .../isdn => drivers/staging/i4l/Documentation}/README.icn   |  0
 .../isdn => drivers/staging/i4l/Documentation}/README.pcbit |  0
 .../isdn => drivers/staging/i4l/Documentation}/README.sc|  0
 drivers/staging/i4l/Kconfig | 13 +
 drivers/staging/i4l/Makefile|  5 +
 drivers/staging/i4l/TODO|  3 +++
 drivers/{isdn => staging/i4l}/act2000/Kconfig   |  0
 drivers/{isdn => staging/i4l}/act2000/Makefile  |  0
 drivers/{isdn => staging/i4l}/act2000/act2000.h |  0
 drivers/{isdn => staging/i4l}/act2000/act2000_isa.c |  0
 drivers/{isdn => staging/i4l}/act2000/act2000_isa.h |  0
 drivers/{isdn => staging/i4l}/act2000/capi.c|  0
 drivers/{isdn => staging/i4l}/act2000/capi.h|  0
 drivers/{isdn => staging/i4l}/act2000/module.c  |  0
 drivers/{isdn => staging/i4l}/icn/Kconfig   |  0
 drivers/{isdn => staging/i4l}/icn/Makefile  |  0
 drivers/{isdn => staging/i4l}/icn/icn.c |  0
 drivers/{isdn => staging/i4l}/icn/icn.h |  0
 drivers/{isdn => staging/i4l}/pcbit/Kconfig |  0
 drivers/{isdn => staging/i4l}/pcbit/Makefile|  0
 drivers/{isdn => staging/i4l}/pcbit/callbacks.c |  0
 drivers/{isdn => staging/i4l}/pcbit/callbacks.h |  0
 drivers/{isdn => staging/i4l}/pcbit/capi.c  |  0
 drivers/{isdn => staging/i4l}/pcbit/capi.h  |  0
 drivers/{isdn => staging/i4l}/pcbit/drv.c   |  0
 drivers/{isdn => staging/i4l}/pcbit/edss1.c |  0
 drivers/{isdn => staging/i4l}/pcbit/edss1.h |  0
 drivers/{isdn => staging/i4l}/pcbit/layer2.c|  0
 drivers/{isdn => staging/i4l}/pcbit/layer2.h|  0
 drivers/{isdn => staging/i4l}/pcbit/module.c|  0
 drivers/{isdn => staging/i4l}/pcbit/pcbit.h |  0
 37 files changed, 24 insertions(+), 21 deletions(-)
 rename {Documentation/isdn => 
drivers/staging/i4l/Documentation}/README.act2000 (100%)
 rename {Documentation/isdn => drivers/staging/i4l/Documentation}/README.icn 
(100%)
 rename {Documentation/isdn => drivers/staging/i4l/Documentation}/README.pcbit 
(100%)
 rename {Documentation/isdn => drivers/staging/i4l/Documentation}/README.sc 
(100%)
 create mode 100644 drivers/staging/i4l/Kconfig
 create mode 100644 drivers/staging/i4l/Makefile
 create mode 100644 drivers/staging/i4l/TODO
 rename drivers/{isdn => staging/i4l}/act2000/Kconfig (100%)
 rename drivers/{isdn => staging/i4l}/act2000/Makefile (100%)
 rename drivers/{isdn => staging/i4l}/act2000/act2000.h (100%)
 rename drivers/{isdn => staging/i4l}/act2000/act2000_isa.c (100%)
 rename drivers/{isdn => staging/i4l}/act2000/act2000_isa.h (100%)
 rename drivers/{isdn => staging/i4l}/act2000/capi.c (100%)
 rename drivers/{isdn => staging/i4l}/act2000/capi.h (100%)
 rename drivers/{isdn => staging/i4l}/act2000/module.c (100%)
 rename drivers/{isdn => staging/i4l}/icn/Kconfig (100%)
 rename drivers/{isdn => staging/i4l}/icn/Makefile (100%)
 rename drivers/{isdn => staging/i4l}/icn/icn.c (100%)
 rename drivers/{isdn => staging/i4l}/icn/icn.h (100%)
 rename drivers/{isdn => staging/i4l}/pcbit/Kconfig (100%)
 rename drivers/{isdn => staging/i4l}/pcbit/Makefile (100%)
 rename drivers/{isdn => staging/i4l}/pcbit/callbacks.c (100%)
 rename drivers/{isdn => staging/i4l}/pcbit/callbacks.h (100%)
 rename drivers/{isdn => staging/i4l}/pcbit/capi.c (100%)
 rename drivers/{isdn => staging/i4l}/pcbit/capi.h (100%)
 rename drivers/{isdn => staging/i4l}/pcbit/drv.c (100%)
 rename drivers/{isdn => staging/i4l}/pcbit/edss1.c (100%)
 rename drivers/{isdn => staging/i4l}/pcbit/edss1.h (100%)
 rename drivers/{isdn => staging/i4l}/pcbit/layer2.c (100%)
 rename drivers/{isdn => staging/i4l}/pcbit/layer2.h (100%)
 rename 

[PATCH] sparc64: Add support for Application Data Integrity (ADI)

2016-03-02 Thread Khalid Aziz

Enable Application Data Integrity (ADI) support in the sparc
kernel for applications to use ADI in userspace. ADI is a new
feature supported on sparc M7 and newer processors. ADI is supported
for data fetches only and not instruction fetches. This patch adds
prctl commands to enable and disable ADI (TSTATE.mcde), return ADI
parameters to userspace, enable/disable MCD (Memory Corruption
Detection) on selected memory ranges and enable TTE.mcd in PTEs. It
also adds handlers for all traps related to MCD. ADI is not enabled
by default for any task and a task must explicitly enable ADI
(TSTATE.mcde), turn MCD on on a memory range and set version tag
for ADI to be effective for the task. This patch adds support for
ADI for hugepages only. Addresses passed into system calls must be
non-ADI tagged addresses.

Signed-off-by: Khalid Aziz 
---
NOTES: ADI is a new feature added to M7 processor to allow hardware
to catch rogue accesses to memory. An app can enable ADI on
its data pages, set version tags on them and use versioned
addresses (bits 63-60 of the address contain a version tag)
to access the data pages. If a rogue app attempts to access
ADI enabled data pages, its access is blocked and processor
generates an exception. Enabling this functionality for all
data pages of an app requires adding infrastructure to save
version tags for any data pages that get swapped out and
restoring those tags when pages are swapped back in. In this
first implementation I am enabling ADI for hugepages only
since these pages are locked in memory and hence avoid the
issue of saving and restoring tags. Once this core functionality
is stable, ADI for other memory pages can be enabled more
easily.

 Documentation/prctl/sparc_adi.txt |  62 ++
 Documentation/sparc/adi.txt   | 206 +++
 arch/sparc/Kconfig|  12 ++
 arch/sparc/include/asm/hugetlb.h  |  14 +++
 arch/sparc/include/asm/hypervisor.h   |   2 +
 arch/sparc/include/asm/mmu_64.h   |   1 +
 arch/sparc/include/asm/pgtable_64.h   |  15 +++
 arch/sparc/include/asm/processor_64.h |  19 +++
 arch/sparc/include/asm/ttable.h   |  10 ++
 arch/sparc/include/uapi/asm/asi.h |   3 +
 arch/sparc/include/uapi/asm/pstate.h  |  10 ++
 arch/sparc/kernel/entry.h |   3 +
 arch/sparc/kernel/head_64.S   |   1 +
 arch/sparc/kernel/mdesc.c |  81 +
 arch/sparc/kernel/process_64.c| 221 ++
 arch/sparc/kernel/sun4v_mcd.S |  16 +++
 arch/sparc/kernel/traps_64.c  |  96 ++-
 arch/sparc/kernel/ttable_64.S |   6 +-
 include/linux/mm.h|   2 +
 include/uapi/asm-generic/siginfo.h|   5 +-
 include/uapi/linux/prctl.h|  16 +++
 kernel/sys.c  |  30 +
 22 files changed, 825 insertions(+), 6 deletions(-)
 create mode 100644 Documentation/prctl/sparc_adi.txt
 create mode 100644 Documentation/sparc/adi.txt
 create mode 100644 arch/sparc/kernel/sun4v_mcd.S

diff --git a/Documentation/prctl/sparc_adi.txt 
b/Documentation/prctl/sparc_adi.txt
new file mode 100644
index 000..9cbdcae
--- /dev/null
+++ b/Documentation/prctl/sparc_adi.txt
@@ -0,0 +1,62 @@
+
+Overview
+
+
+SPARC M7 processor includes the feature Application Data Integrity (ADI).
+ADI allows a tag to be associated with a virtual memory address range
+and a process must access that memory range with the correct tag. ADI
+tag is embedded in bits 63-60 of virtual address. Once ADI is enabled
+on a range of memory addresses, the process can set a tag for blocks
+in this memory range n the cache using ASI_MCD_PRIMARY or
+ASI_MCD_ST_BLKINIT_PRIMARY. This tag is set for ADI block sized blocks
+which is provided to the kernel by machine description table.
+
+Linux kernel supports an application enabling and setting the ADI tag
+for a subset of its data pages. Those data pages have to be locked in
+memory since saving ADI tags to swap is not supported.
+
+
+New prctl options for ADI
+-
+
+Following new options to prctl() have been added to support ADI.
+
+   PR_GET_SPARC_ADICAPS - Get ADI capabilities for the processor.
+   These capabilities are used to set up ADI correctly
+   from userspace. Machine description table provides all
+   of the ADI capabilities information. arg2 to prctl() is
+   a pointer to struct adi_caps which is defined in
+   linux/prctl.h.
+
+
+   PR_SET_SPARC_ADI - Set the state of ADI in a user thread by
+   setting PSTATE.mcde bit in the user mode PSTATE register
+   of the calling thread based on the value passed in arg2:
+   1 == enable, 0 == disable, other == no change
+   Return the 

Re: [PATCHv9 0/3] rdmacg: IB/core: rdma controller support

2016-03-02 Thread Tejun Heo
Hello, Parav.

It doesn't look like my reviews are getting through.  For now,

Nacked-by: Tejun Heo 

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv9 1/3] rdmacg: Added rdma cgroup controller

2016-03-02 Thread Tejun Heo
Hello,

On Wed, Mar 02, 2016 at 12:35:35AM +0530, Parav Pandit wrote:
> diff --git a/include/linux/cgroup_rdma.h b/include/linux/cgroup_rdma.h
> new file mode 100644
> index 000..2da3d6c
> --- /dev/null
> +++ b/include/linux/cgroup_rdma.h
> @@ -0,0 +1,50 @@
> +#ifndef _CGROUP_RDMA_H
> +#define _CGROUP_RDMA_H
> +
> +#include 
> +
> +#ifdef CONFIG_CGROUP_RDMA
> +
> +struct rdma_cgroup {
> + struct cgroup_subsys_state  css;
> +
> + spinlock_t rpool_list_lock; /* protects resource pool list */
> + struct list_head rpool_head;/* head to keep track of all resource
> +  * pools that belongs to this cgroup.
> +  */

I think we usually don't tail wing these comments.

> +};
> +
> +struct rdmacg_pool_info {
> + const char **resource_name_table;
> + int table_len;

Align fields consistently?  I've said this multiple times now but
please make the available resources constant and document them in
Documentation/cgroup-v2.txt.

> +};
> +
> +struct rdmacg_device {
> + struct rdmacg_pool_info pool_info;
> + struct list_headrdmacg_list;
> + struct list_headrpool_head;
> + /* protects resource pool list */
> + spinlock_t  rpool_lock;
> + char*name;
> +};
> +
> +/* APIs for RDMA/IB stack to publish when a device wants to
> + * participate in resource accounting
> + */

Please follow CodingStyle.  Wasn't this pointed out a couple times
already?

> +enum rdmacg_file_type {
> + RDMACG_RESOURCE_MAX,
> + RDMACG_RESOURCE_STAT,
> +};

Heh, usually not a good sign.  If you're using this to call into a
common function and then switch out on the file type, just switch out
at the method level and factor out common part into helpers.

> +/* resource tracker per resource for rdma cgroup */
> +struct rdmacg_resource {
> + int max;
> + int usage;
> +};

Align fields?

> +/**

The above indicates kerneldoc comments, which this isn't.

> + * resource pool object which represents, per cgroup, per device
> + * resources. There are multiple instance
> + * of this object per cgroup, therefore it cannot be embedded within
> + * rdma_cgroup structure. It is maintained as list.

Consistent paragraph fill?

> + */
> +struct rdmacg_resource_pool {
> + struct list_head cg_list;
> + struct list_head dev_list;
> +
> + struct rdmacg_device *device;
> + struct rdmacg_resource *resources;
> + struct rdma_cgroup *cg; /* owner cg used during device cleanup */
> +
> + int refcnt; /* count active user tasks of this pool */
> + int num_max_cnt;/* total number counts which are set to max */
> +};

Formatting.

> +static inline void set_resource_limit(struct rdmacg_resource_pool *rpool,
> +   int index, int new_max)

Is inline necessary?  Compiler can figure out these.

> +static struct rdmacg_resource_pool*
 ^
 space

> +find_cg_rpool_locked(struct rdma_cgroup *cg,
> +  struct rdmacg_device *device)
...
> +static void uncharge_cg_resource(struct rdma_cgroup *cg,
> +  struct rdmacg_device *device,
> +  int index, int num)
> +{
...
> + rpool->refcnt--;
> + if (rpool->refcnt == 0 && rpool->num_max_cnt == pool_info->table_len) {

If the caller charges 2 and then uncharges 1 two times, the refcnt
underflows?  Why not just track how many usages are zero?

...
> +void rdmacg_uncharge(struct rdma_cgroup *cg,
> +  struct rdmacg_device *device,
> +  int index, int num)
> +{
> + struct rdma_cgroup *p;
> +
> + for (p = cg; p; p = parent_rdmacg(p))
> + uncharge_cg_resource(p, device, index, num);
> +
> + css_put(>css);
> +}
> +EXPORT_SYMBOL(rdmacg_uncharge);

So, I suppose the code is trying to save lock contention with
fine-grained locking; however, as the code is currently structured,
it's just increasing the number of lock ops and it'd be highly likely
to cheaper to simply use a single lock.  If you're working up the tree
grabbing lock at each level, per-node locking doesn't save you
anything.  Also, it introduces conditions where charges are spuriously
denied when there are racing requestors.  If scalability becomes an
issue, the right way to address is adding percpu front cache.

> +void rdmacg_query_limit(struct rdmacg_device *device,
> + int *limits)
> +{
> + struct rdma_cgroup *cg, *p;
> + struct rdmacg_resource_pool *rpool;
> + struct rdmacg_pool_info *pool_info;
> + int i;
> +
> + cg = task_rdmacg(current);
> + pool_info = >pool_info;

Nothing seems to prevent @cg from going away if this races with
@current being migrated to a different cgroup.  Have you run this with
lockdep and rcu debugging enabled?  This should have triggered a
warning.

...
> 

Re: [PATCH] can: rcar_canfd: Add Renesas R-Car CAN FD driver

2016-03-02 Thread Marc Kleine-Budde
On 03/02/2016 11:08 AM, Ramesh Shanmugasundaram wrote:
 I see no locking for the tx-path.
>>>
>>> I am not sure why locking is needed in tx-path?
>>
>> If the tx-path is shared between both channels you need locking as the
>> networking subsystem may send one frame to each controller at the same
>> time.
> 
> Yes. Tx completion interrupt is shared by both channels but the Tx
> FIFO, echo skbs, head, tail are maintained on a per channel basis. 
> Yes, net subsystem can send one frame to each channel at the same
> time and the tx completion interrupt handler will traverse each
> channel & update per channel context accordingly.

Ok. Which instruction starts the transmit? Please add a comment to the
code. You stop the tx_queue after this, so your code is racy, as the
interrupt might happen in between.

>>> However, looking at it again, I should move the incrementing of head
>>> after the "sts" handing to be apt. What do you think?
>>
>> With one producer (one SW instance) and one consumer (the HW) you can
>> write lockless code (if the HW allows it), but with two producers it's not
>> possible.
> 
> For a channel represented as netdev, there is still one producer (one
> SW instance) isn't it? net acquires lock before calling xmit. The
> consumer is tx completion interrupt handler which updates per channel
> attributes only.

Ok - I haven't looked at the code in depth yet. Now it's clear, that
just the tx interrupt is shared.

> Sorry if I am annoying. I have tested this with two channels
> transmitting & receiving at the same time and it works fine. If I am
> missing lock, I would like to understand it thoroughly before making
> the change.

Marc

-- 
Pengutronix e.K.  | Marc Kleine-Budde   |
Industrial Linux Solutions| Phone: +49-231-2826-924 |
Vertretung West/Dortmund  | Fax:   +49-5121-206917- |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |



signature.asc
Description: OpenPGP digital signature


RE: [PATCH] can: rcar_canfd: Add Renesas R-Car CAN FD driver

2016-03-02 Thread Ramesh Shanmugasundaram
Hi Marc,

> On 03/02/2016 09:41 AM, Ramesh Shanmugasundaram wrote:
> >> Is the channel loopback mode configurable per channel? If so, please
> >> remove the module parameter and make use of CAN_CTRLMODE_LOOPBACK to
> >> configure it.
> >
> > The loopback setting is not truly a per channel attribute. It requires
> > touching Rx rules which can be done only when the controller's global
> > state is reset (during probe).
> > CAN_CTRLMODE_LOOPBACK config option is possible only through .ndo_open
> > of that channel but the global controller state needs to be
> > operational by this time. As it is a global attribute & for the above
> > reason, I choose the module option.
> 
> I see, ok then.
> 
> >>> CAN FD mode supports both Classical CAN & CAN FD frame formats. The
> >>> controller supports ISO 11898-1:2015 CAN FD format only.
> >>>
> >>> This controller supports two channels and the driver can enable
> >>> either or both of the channels.
> >>>
> >>> Driver uses Rx FIFOs (one per channel) for reception & Common FIFOs
> >>> (one per channel) for transmission. Rx filter rules are configured
> >>> to the minimum (one per channel) and it accepts Standard, Extended,
> >>> Data & Remote Frame combinations.
> >>
> >> I see no locking for the tx-path.
> >
> > I am not sure why locking is needed in tx-path?
> 
> If the tx-path is shared between both channels you need locking as the
> networking subsystem may send one frame to each controller at the same
> time.

Yes. Tx completion interrupt is shared by both channels but the Tx FIFO, echo 
skbs, head, tail are maintained on a per channel basis. 
Yes, net subsystem can send one frame to each channel at the same time and the 
tx completion interrupt handler will traverse each channel & update per channel 
context accordingly.

> > However, looking at it again, I should move the incrementing of head
> > after the "sts" handing to be apt. What do you think?
> 
> With one producer (one SW instance) and one consumer (the HW) you can
> write lockless code (if the HW allows it), but with two producers it's not
> possible.

For a channel represented as netdev, there is still one producer (one SW 
instance) isn't it? net acquires lock before calling xmit. The consumer is tx 
completion interrupt handler which updates per channel attributes only.

Sorry if I am annoying. I have tested this with two channels transmitting & 
receiving at the same time and it works fine. If I am missing lock, I would 
like to understand it thoroughly before making the change.

Thanks for the help,
Ramesh.
N�r��yb�X��ǧv�^�)޺{.n�+{�v�"��^n�r���z���h�&���G���h�(�階�ݢj"���m��z�ޖ���f���h���~�m�

RE: [PATCH] can: rcar_canfd: Add Renesas R-Car CAN FD driver

2016-03-02 Thread Ramesh Shanmugasundaram
Hi Marc,

Thanks for the review.

> On 03/01/2016 10:34 AM, Ramesh Shanmugasundaram wrote:
> > This patch adds support for the CAN FD controller found in Renesas
> > R-Car SoCs. The controller operates in CAN FD mode by default. Two
> > test modes are available and can be enabled by the
> > "rcar_canfd.testmode" module parameter. Refer to Documentation/kernel-
> parameters.txt.
> 
> Is the channel loopback mode configurable per channel? If so, please
> remove the module parameter and make use of CAN_CTRLMODE_LOOPBACK to
> configure it.

The loopback setting is not truly a per channel attribute. It requires touching 
Rx rules which can be done only when the controller's global state is reset 
(during probe). CAN_CTRLMODE_LOOPBACK config option is possible only through 
.ndo_open of that channel but the global controller state needs to be 
operational by this time. As it is a global attribute & for the above reason, I 
choose the module option.

> 
> > CAN FD mode supports both Classical CAN & CAN FD frame formats. The
> > controller supports ISO 11898-1:2015 CAN FD format only.
> >
> > This controller supports two channels and the driver can enable either
> > or both of the channels.
> >
> > Driver uses Rx FIFOs (one per channel) for reception & Common FIFOs
> > (one per channel) for transmission. Rx filter rules are configured to
> > the minimum (one per channel) and it accepts Standard, Extended, Data
> > & Remote Frame combinations.
> 
> I see no locking for the tx-path.
 
I am not sure why locking is needed in tx-path? 
However, looking at it again, I should move the incrementing of head after the 
"sts" handing to be apt. What do you think?

Thanks,
Ramesh
N�r��yb�X��ǧv�^�)޺{.n�+{�v�"��^n�r���z���h�&���G���h�(�階�ݢj"���m��z�ޖ���f���h���~�m�

Re: [PATCH v3] tty: serial: meson: Implement earlycon support

2016-03-02 Thread Carlo Caione
On Wed, Mar 2, 2016 at 3:49 AM, Andreas Färber  wrote:
> Split off the bulk of the existing meson_serial_console_write()
> implementation into meson_serial_port_write() for implementing
> meson_serial_early_console_write().
>
> Use "meson" as the earlycon driver name, courtesy of Nicolas.
>
> Signed-off-by: Nicolas Saenz Julienne 
> Signed-off-by: Andreas Färber 

Acked-by: Carlo Caione 

Thanks,

-- 
Carlo Caione
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html