Re: [PATCH trivial] UAPI: Kbuild: add/modify comments for "uapi/Kbuild" and "uapi/linux/Kbuild"

2013-09-02 Thread Chen Gang
Hello Maintainers:

Maybe I am still missing some important mail addresses, or this patch is
not suitable for applying?

Still, I want to try one last time: please help check this patch when
you have time.

After this, I will not send further follow-up mail, since that really
would be spam.


Thanks.

On 08/23/2013 06:30 PM, Chen Gang wrote:
> Hello Maintainers:
> 
> Is this patch suitable for applying?  Does it belong to 'trivial' (or
> 'Documentation', or something else)?
> 
> 
> And sorry for missing some important mail addresses when I sent the
> original patch (I got them from "./scripts/get_maintainer.pl" and
> did not give them more consideration).
> 
> So I append my original patch below; if necessary, please help check
> it when you have time, thanks.
> 
> 
> --patch begin---
> 
> "include/uapi/" is the whole Linux kernel API, so it is important enough
> to deserve more general explanation in comments.
> 
> In "include/uapi/Kbuild", the "Makefile..." and "non-arch..." comments are
> meaningless for the current 'Kbuild', so delete them.
> 
> Also add more explanation of "include/uapi/" to "include/uapi/Kbuild",
> and more explanation of "include/uapi/linux/" to
> "include/uapi/linux/Kbuild".
> 
> 
> Signed-off-by: Chen Gang 
> ---
>  include/uapi/Kbuild   |5 ++---
>  include/uapi/linux/Kbuild |2 ++
>  2 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/include/uapi/Kbuild b/include/uapi/Kbuild
> index 81d2106..c682891 100644
> --- a/include/uapi/Kbuild
> +++ b/include/uapi/Kbuild
> @@ -1,7 +1,6 @@
>  # UAPI Header export list
> -# Top-level Makefile calls into asm-$(ARCH)
> -# List only non-arch directories below
> -
> +# Except "linux/", UAPI means Universal API.
> +# For "linux/", UAPI means User API which can be used by user mode.
> 
>  header-y += asm-generic/
>  header-y += linux/
> diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
> index 997f9f2..0025e07 100644
> --- a/include/uapi/linux/Kbuild
> +++ b/include/uapi/linux/Kbuild
> @@ -1,4 +1,6 @@
>  # UAPI Header export list
> +# UAPI is User API which can be used by user mode.
> +
>  header-y += byteorder/
>  header-y += can/
>  header-y += caif/
> 


-- 
Chen Gang
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH v3 06/35] mm: Add helpers to retrieve node region and zone region for a given page

2013-09-02 Thread Yasuaki Ishimatsu

(2013/08/30 22:15), Srivatsa S. Bhat wrote:

Given a page, we would like to have an efficient mechanism to find out
the node memory region and the zone memory region to which it belongs.

Since the node is assumed to be divided into equal-sized node memory
regions, the node memory region can be obtained by simply right-shifting
the page's pfn by 'MEM_REGION_SHIFT'.

But finding the corresponding zone memory region's index in the zone is
not that straightforward. To have an O(1) algorithm to find it, define a
zone_region_idx[] array to store the zone memory region indices for every
node memory region.

To illustrate, consider the following example:

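|<----------------------Node---------------------->|
 __________________________________________________
|     Node mem reg 0     |     Node mem reg 1      |  (Absolute region
|________________________|_________________________|   boundaries)

 __________________________________________________
|   ZONE_DMA   |            ZONE_NORMAL            |
|              |                                   |
|<--- ZMR 0 -->|<-ZMR 0->|<-------- ZMR 1 -------->|
|______________|_________|_________________________|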

In the above figure,

Node mem region 0:
------------------
This region corresponds to the first zone mem region in ZONE_DMA and also
the first zone mem region in ZONE_NORMAL. Hence its index array would look
like this:
 node_regions[0].zone_region_idx[ZONE_DMA] == 0
 node_regions[0].zone_region_idx[ZONE_NORMAL]  == 0


Node mem region 1:
------------------
This region corresponds to the second zone mem region in ZONE_NORMAL. Hence
its index array would look like this:
 node_regions[1].zone_region_idx[ZONE_NORMAL]  == 1


Using this index array, we can quickly obtain the zone memory region to
which a given page belongs.

Signed-off-by: Srivatsa S. Bhat 
---

  include/linux/mm.h |   24 
  include/linux/mmzone.h |7 +++
  mm/page_alloc.c|1 +
  3 files changed, 32 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 18fdec4..52329d1 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -723,6 +723,30 @@ static inline struct zone *page_zone(const struct page *page)
        return &NODE_DATA(page_to_nid(page))->node_zones[page_zonenum(page)];
  }

+static inline int page_node_region_id(const struct page *page,
+ const pg_data_t *pgdat)
+{
+   return (page_to_pfn(page) - pgdat->node_start_pfn) >> MEM_REGION_SHIFT;
+}
+
+/**
+ * Return the index of the zone memory region to which the page belongs.
+ *
+ * Given a page, find the absolute (node) memory region as well as the zone to
+ * which it belongs. Then find the region within the zone that corresponds to
+ * that node memory region, and return its index.
+ */
+static inline int page_zone_region_id(const struct page *page)
+{
+   pg_data_t *pgdat = NODE_DATA(page_to_nid(page));
+   enum zone_type z_num = page_zonenum(page);
+   unsigned long node_region_idx;
+
+   node_region_idx = page_node_region_id(page, pgdat);
+
+   return pgdat->node_regions[node_region_idx].zone_region_idx[z_num];
+}
+
  #ifdef SECTION_IN_PAGE_FLAGS
  static inline void set_page_section(struct page *page, unsigned long section)
  {
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 010ab5b..76d9ed2 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -726,6 +726,13 @@ struct node_mem_region {
unsigned long end_pfn;
unsigned long present_pages;
unsigned long spanned_pages;



+
+   /*
+* A physical (node) region could be split across multiple zones.
+* Store the indices of the corresponding regions of each such
+* zone for this physical (node) region.
+*/
+   int zone_region_idx[MAX_NR_ZONES];


You should initialize zone_region_idx[] to a negative value.
If zone_region_idx[] is initialized to 0, an entry that was never set is
indistinguishable from a valid index 0, so region 0 appears to belong to
all zones.
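
The suggestion above could be sketched in plain user-space C as follows.
This is only an illustration under stated assumptions, not the patch's
code: NO_ZONE_REGION, init_node_region() and zone_region_lookup() are
hypothetical names, and the small enum is a stand-in for the kernel's
zone types.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in zone list for illustration; the kernel's enum zone_type differs. */
enum zone_type { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, MAX_NR_ZONES };

/* Hypothetical sentinel: "this node region has no region in that zone". */
#define NO_ZONE_REGION (-1)

struct node_mem_region {
	int zone_region_idx[MAX_NR_ZONES];
};

/* Mark every entry invalid so an unset slot cannot be read as index 0. */
static void init_node_region(struct node_mem_region *r)
{
	for (int z = 0; z < MAX_NR_ZONES; z++)
		r->zone_region_idx[z] = NO_ZONE_REGION;
}

/* O(1) lookup: node region index and zone number -> zone region index. */
static int zone_region_lookup(const struct node_mem_region *regions,
			      size_t nr_regions, size_t node_region_idx,
			      enum zone_type z)
{
	assert(node_region_idx < nr_regions);
	return regions[node_region_idx].zone_region_idx[z];
}
```

With the layout from the figure in the changelog, node region 1 has no
part of ZONE_DMA, so a lookup for that pair would return the sentinel
instead of silently reporting zone region 0.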

Thanks,
Yasuaki Ishimatsu



struct pglist_data *pgdat;
  };

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 05cedbb..8ffd47b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4877,6 +4877,7 @@ static void __meminit init_zone_memory_regions(struct pglist_data *pgdat)
zone_region->present_pages =
zone_region->spanned_pages - absent;

+   node_region->zone_region_idx[zone_idx(z)] = idx;
idx++;
}







Re: [PATCH] rbtree: Add some necessary condition checks

2013-09-02 Thread Michel Lespinasse
On Mon, Sep 2, 2013 at 9:45 PM, Zhi Yong Wu  wrote:
> On Mon, Sep 2, 2013 at 4:57 PM, Michel Lespinasse  wrote:
>> Thanks for the link - I now better understand where you are coming
>> from with these fixes.
>>
>> Going back to the original message:
>>
>>> diff --git a/include/linux/rbtree_augmented.h b/include/linux/rbtree_augmented.h
>>> index fea49b5..7d19770 100644
>>> --- a/include/linux/rbtree_augmented.h
>>> +++ b/include/linux/rbtree_augmented.h
>>> @@ -199,7 +199,8 @@ __rb_erase_augmented(struct rb_node *node, struct rb_root *root,
>>> }
>>>
>>> successor->rb_left = tmp = node->rb_left;
>>> -   rb_set_parent(tmp, successor);
>>> +   if (tmp)
>>> +   rb_set_parent(tmp, successor);
>>>
>>> pc = node->__rb_parent_color;
>>> tmp = __rb_parent(pc);
>>
>> Note that node->rb_left was already fetched at the top of
>> __rb_erase_augmented(), and was checked to be non-NULL at the time -
>> otherwise we would have executed 'Case 1' in that function. So, you
> If 'Case 1' is executed, this line of code is also done, how about the result?
> 'Case 1' seems *not* to change node->rb_left at all.

Wait, I believe this line of code is executed only in Case 2 and Case 3 ?

>>> diff --git a/lib/rbtree.c b/lib/rbtree.c
>>> index c0e31fe..2cb01ba 100644
>>> --- a/lib/rbtree.c
>>> +++ b/lib/rbtree.c
>>> @@ -214,7 +214,7 @@ rb_erase_color(struct rb_node *parent, struct rb_root *root,
>>>  */
>>> sibling = parent->rb_right;
>>> if (node != sibling) {  /* node == parent->rb_left */
>>> -   if (rb_is_red(sibling)) {
>>> +   if (sibling && rb_is_red(sibling)) {
>>> /*
>>>  * Case 1 - left rotate at parent
>>>  *
>>
>> Note the loop invariants quoted just above:
>>
>> /*
>>  * Loop invariants:
>>  * - node is black (or NULL on first iteration)
>>  * - node is not the root (parent is not NULL)
>>  * - All leaf paths going through parent and node have a
>>  *   black node count that is 1 lower than other leaf paths.
>>  */
>>
>> Because of these, each path from sibling to a leaf must include at
>> least one black node, which implies that sibling can't be NULL - or to
>> put it another way, if sibling is null then the expected invariants
>> were violated before we even got there.
> In theory, I can understand what you mean, but I don't know why and
> where it got violated.

Same here. My point is, I don't think we can fix the issue without
answering that question.
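
As a rough illustration of the argument above, a stand-alone checker can
show that a black node sitting opposite a NULL sibling already breaks
the equal-black-count property. This is a user-space sketch, not the
code in lib/rbtree.c: struct rbnode, black_height() and check_tree() are
hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

enum color { COLOR_RED, COLOR_BLACK };

struct rbnode {
	struct rbnode *left, *right;
	enum color color;
};

/*
 * Return the number of black nodes on every path from n down to a NULL
 * leaf, or -1 if two paths disagree or a red node has a red child.
 */
static int black_height(const struct rbnode *n)
{
	int lh, rh;

	if (!n)
		return 0;
	lh = black_height(n->left);
	rh = black_height(n->right);
	if (lh < 0 || rh < 0 || lh != rh)
		return -1;
	if (n->color == COLOR_RED &&
	    ((n->left && n->left->color == COLOR_RED) ||
	     (n->right && n->right->color == COLOR_RED)))
		return -1;
	return lh + (n->color == COLOR_BLACK ? 1 : 0);
}

/* Abort if the tree is invalid; meant to run before and after updates. */
static void check_tree(const struct rbnode *root)
{
	assert(!root || root->color == COLOR_BLACK);
	assert(black_height(root) >= 0);
}
```

Calling such a check around each insertion and removal would narrow down
which operation first leaves the tree in an invalid state.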

>> Now I had a quick look at your code and I couldn't tell at which point
>> the invariants are violated. However I did notice a couple suspicious
>> things in the very first patch
>> (f5c8f2b256d87ac0bf789a787e6b795ac0c736e8):
>>
>> 1- In both hot_range_tree_free() and hot_tree_exit(), you try to
>> destroy rb trees by iterating on each node with rb_next() and then
> Yes, but an item may not be freed immediately; each item has its
> own ref count.

Are items guaranteed to have another refcount than the one we're dropping ?

>> freeing them. Note that rb_next() can reference prior nodes, which
>> have already been freed in your scheme, so that seems quite unsafe.
> I checked rb_next() function, and found that if its prior nodes are
> freed, is this node's parent  not NULL?

No, if the parent was freed with just a put() operation, the child
will still have a pointer to it. This is why I suggested using
rb_erase() on each node before freeing them, so that we don't keep
pointers to freed nodes.

>> The simplest fix would be to do a full rb_erase() on each node before
> full rb_erase()? sorry, i don't get what you mean. Do you mean we
> should erase all nodes from rbtree, then begin to free them? If yes,
> how to iterate them? If no, can you elaborate it?

No, I meant to call rb_erase() on each individual node right before
the corresponding put() operation.
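
The ordering I have in mind can be sketched in user space. To keep the
example self-contained it uses a doubly linked list as a stand-in for
the rbtree, so container_erase() plays the role of rb_erase() and
item_put() the role of the put() operation; all names here are
hypothetical and none of this is the hot tracking code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Refcounted item linked into a container (list stands in for the rbtree). */
struct item {
	struct item *prev, *next;
	int refcount;
};

struct container {
	struct item *head;
	size_t count;
};

static struct item *item_new(void)
{
	struct item *it = calloc(1, sizeof(*it));

	if (it)
		it->refcount = 1;
	return it;
}

static void container_insert(struct container *c, struct item *it)
{
	it->prev = NULL;
	it->next = c->head;
	if (c->head)
		c->head->prev = it;
	c->head = it;
	c->count++;
}

/* Analog of rb_erase(): afterwards no other item points at 'it'. */
static void container_erase(struct container *c, struct item *it)
{
	assert(c->count > 0);
	if (it->prev)
		it->prev->next = it->next;
	else
		c->head = it->next;
	if (it->next)
		it->next->prev = it->prev;
	it->prev = it->next = NULL;
	c->count--;
}

/* Analog of put(): free on the last reference; return 1 if freed. */
static int item_put(struct item *it)
{
	if (--it->refcount > 0)
		return 0;
	free(it);
	return 1;
}

/* Teardown: read the successor first, unlink, and only then put. */
static size_t container_destroy(struct container *c)
{
	size_t freed = 0;
	struct item *it = c->head;

	while (it) {
		struct item *next = it->next;	/* saved before 'it' can go away */

		container_erase(c, it);
		freed += item_put(it);
		it = next;
	}
	return freed;
}
```

The point is only the order of operations: the successor is read while
the current item is still valid, and the item is unlinked before its
reference is dropped, so no remaining item points at freed memory.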

>> 2- I did not look long enough to understand the locking, but it wasn't
>> clear to me if you lock the rbtrees when doing rb_erase() on them
>> (while I could more clearly see that you do it for insertions).
> Yes, it takes the lock when doing rb_erase() or rb_insert(). Multiple
> functions may access the rbtree at the same time, so to synchronize
> them we need to lock the rbtree.

Yes, agree we need to lock rbtree in all such operations. I just
wasn't able to determine if it's done around rb_erase() calls, but it
definitely needs to be.

>> I'm really not sure if either of these will fix the issues you're
>> seeing, though. What I would try next would be to add explicit rbtree
>> invariant checks before and after rbtree manipulations, like what the
>> check() function does in 

Re: [GIT PULL 00/10] perf/core improvements and fixes

2013-09-02 Thread Ingo Molnar

* Arnaldo Carvalho de Melo  wrote:

> From: Arnaldo Carvalho de Melo 
> 
> Hi Ingo,
> 
>   Please consider pulling,
> 
> - Arnaldo
> 
> The following changes since commit 7bfb7e6bdd906f11ee9e751b3fec4f4fc728e818:
> 
>   perf: Convert kmalloc_node(...GFP_ZERO...) to kzalloc_node() (2013-09-02 08:42:49 +0200)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux tags/perf-core-for-mingo
> 
> for you to fetch changes up to 31cd3855c98119cae287b761d8d2e75018714c5d:
> 
>   perf trace: Tell arg formatters the arg index (2013-09-02 16:40:40 -0300)
> 
> 
> perf/core improvements and fixes:
> 
> . 'perf trace' arg formatting improvements to allow masking arguments
>   in syscalls such as futex and open, where some arguments are
>   ignored and thus should not be printed depending on other args.
> 
> . Beautify open, openat, open_by_handle_at, lseek and futex syscalls.
> 
> . Add dummy software event to use when wanting just to keep receiving
>   PERF_RECORD_{MMAP,COMM,etc}, add test for it, from Adrian Hunter.
> 
> . Fix symbol offset computation for some dsos in 'perf script', from
>   David Ahern.
> 
> . Skip unsupported hardware events in 'perf list', from Namhyung Kim.
> 
> Signed-off-by: Arnaldo Carvalho de Melo 
> 
> 
> Adrian Hunter (3):
>   perf: Add a dummy software event to keep tracking
>   perf tools: Add support for PERF_COUNT_SW_DUMMY
>   perf tests: Add 'keep tracking' test
> 
> Arnaldo Carvalho de Melo (5):
>   perf trace: Allow syscall arg formatters to mask args
>   perf trace: Add beautifier for futex 'operation' parm
>   perf trace: Add beautifier for lseek's whence arg
>   perf trace: Add beautifier for open's flags arg
>   perf trace: Tell arg formatters the arg index
> 
> David Ahern (1):
>   perf tools: Fix symbol offset computation for some dsos
> 
> Namhyung Kim (1):
>   perf list: Skip unsupported events
> 
>  include/uapi/linux/perf_event.h  |   1 +
>  tools/perf/Makefile  |   1 +
>  tools/perf/builtin-trace.c   | 180 
> ---
>  tools/perf/tests/builtin-test.c  |   4 +
>  tools/perf/tests/keep-tracking.c | 154 +
>  tools/perf/tests/tests.h |   1 +
>  tools/perf/util/evlist.c |  42 -
>  tools/perf/util/evlist.h |   5 ++
>  tools/perf/util/evsel.c  |   1 +
>  tools/perf/util/parse-events.c   |  45 +-
>  tools/perf/util/parse-events.l   |   1 +
>  tools/perf/util/python.c |   1 +
>  tools/perf/util/session.c|   1 +
>  tools/perf/util/symbol.c |   5 +-
>  14 files changed, 424 insertions(+), 18 deletions(-)
>  create mode 100644 tools/perf/tests/keep-tracking.c

Pulled, thanks Arnaldo!

Ingo


[PATCH 03/13] tracing/kprobes: Move fetch functions to trace_kprobe.c

2013-09-02 Thread Namhyung Kim
From: Hyeoncheol Lee 

Move kprobes-specific fetch functions to the trace_kprobe.c file.
Also define kprobes_fetch_type_table in the .c file.  This table is
shared with uprobes for now, but uprobes will get its own table
in a later patch.

This is preparation for supporting more fetch functions in uprobes;
no functional changes are intended.

Acked-by: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Hyeoncheol Lee 
[namhy...@kernel.org: Split original patch into pieces as requested]
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_kprobe.c | 169 +
 kernel/trace/trace_probe.c  | 299 +---
 kernel/trace/trace_probe.h  | 132 +++
 3 files changed, 335 insertions(+), 265 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 243f6834d026..1eff166990c2 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -754,6 +754,175 @@ static const struct file_operations kprobe_profile_ops = {
.release= seq_release,
 };
 
+/*
+ * kprobes-specific fetch functions
+ */
+#define DEFINE_FETCH_stack(type)   \
+__kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs,  \
+ void *offset, void *dest) \
+{  \
+   *(type *)dest = (type)regs_get_kernel_stack_nth(regs,   \
+   (unsigned int)((unsigned long)offset)); \
+}
+DEFINE_BASIC_FETCH_FUNCS(stack)
+/* No string on the stack entry */
+#define fetch_stack_string NULL
+#define fetch_stack_string_size NULL
+
+#define DEFINE_FETCH_memory(type)  \
+__kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs, \
+ void *addr, void *dest)   \
+{  \
+   type retval;\
+   if (probe_kernel_address(addr, retval)) \
+   *(type *)dest = 0;  \
+   else\
+   *(type *)dest = retval; \
+}
+DEFINE_BASIC_FETCH_FUNCS(memory)
+/*
+ * Fetch a null-terminated string. Caller MUST set *(u32 *)dest with max
+ * length and relative data location.
+ */
+__kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
+  void *addr, void *dest)
+{
+   long ret;
+   int maxlen = get_rloc_len(*(u32 *)dest);
+   u8 *dst = get_rloc_data(dest);
+   u8 *src = addr;
+   mm_segment_t old_fs = get_fs();
+
+   if (!maxlen)
+   return;
+
+   /*
+* Try to get string again, since the string can be changed while
+* probing.
+*/
+   set_fs(KERNEL_DS);
+   pagefault_disable();
+
+   do
+   ret = __copy_from_user_inatomic(dst++, src++, 1);
+   while (dst[-1] && ret == 0 && src - (u8 *)addr < maxlen);
+
+   dst[-1] = '\0';
+   pagefault_enable();
+   set_fs(old_fs);
+
+   if (ret < 0) {  /* Failed to fetch string */
+   ((u8 *)get_rloc_data(dest))[0] = '\0';
+   *(u32 *)dest = make_data_rloc(0, get_rloc_offs(*(u32 *)dest));
+   } else {
+   *(u32 *)dest = make_data_rloc(src - (u8 *)addr,
+ get_rloc_offs(*(u32 *)dest));
+   }
+}
+
+/* Return the length of string -- including null terminal byte */
+__kprobes void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
+   void *addr, void *dest)
+{
+   mm_segment_t old_fs;
+   int ret, len = 0;
+   u8 c;
+
+   old_fs = get_fs();
+   set_fs(KERNEL_DS);
+   pagefault_disable();
+
+   do {
+   ret = __copy_from_user_inatomic(&c, (u8 *)addr + len, 1);
+   len++;
+   } while (c && ret == 0 && len < MAX_STRING_SIZE);
+
+   pagefault_enable();
+   set_fs(old_fs);
+
+   if (ret < 0)/* Failed to check the length */
+   *(u32 *)dest = 0;
+   else
+   *(u32 *)dest = len;
+}
+
+/* Memory fetching by symbol */
+struct symbol_cache {
+   char            *symbol;
+   long            offset;
+   unsigned long   addr;
+};
+
+unsigned long update_symbol_cache(struct symbol_cache *sc)
+{
+   sc->addr = (unsigned long)kallsyms_lookup_name(sc->symbol);
+
+   if (sc->addr)
+   sc->addr += sc->offset;
+
+   return sc->addr;
+}
+
+void free_symbol_cache(struct symbol_cache *sc)
+{
+   kfree(sc->symbol);
+   kfree(sc);
+}
+
+struct 

[PATCH 04/13] tracing/kprobes: Add fetch{,_size} member into deref fetch method

2013-09-02 Thread Namhyung Kim
From: Hyeoncheol Lee 

The deref fetch methods access a memory region but assume that it is
kernel memory, since uprobes does not support them.

Add ->fetch and ->fetch_size members in order to provide proper
access methods for supporting uprobes.
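
The idea can be sketched in user-space C as a dereference helper that
calls whichever reader was installed instead of a hard-coded one. This
is a simplified illustration, not the kernel code: fetch_u32_fn,
struct deref_param, fetch_direct() and deref_u32() are hypothetical
names, and only a u32 reader is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reader callback: copy one u32 from addr into dest. */
typedef void (*fetch_u32_fn)(const void *addr, uint32_t *dest);

struct deref_param {
	long offset;		/* byte offset added to the base address */
	fetch_u32_fn fetch;	/* backend chosen when the probe is parsed */
};

/* Hypothetical backend for directly readable memory. */
static void fetch_direct(const void *addr, uint32_t *dest)
{
	memcpy(dest, addr, sizeof(*dest));
}

/* Dereference base + offset through whichever backend was installed. */
static void deref_u32(const struct deref_param *p, const void *base,
		      uint32_t *dest)
{
	assert(p && dest);
	if (base && p->fetch)
		p->fetch((const char *)base + p->offset, dest);
	else
		*dest = 0;	/* a NULL base (or no backend) yields 0 */
}
```

A second backend with the same signature could then read the target
memory in a different way without touching deref_u32() itself.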

Acked-by: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Hyeoncheol Lee 
[namhy...@kernel.org: Split original patch into pieces as requested]
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_probe.c | 22 --
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 41f654d24cd9..b7b8bda02d6e 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -97,6 +97,8 @@ DEFINE_BASIC_FETCH_FUNCS(retval)
 struct deref_fetch_param {
struct fetch_param  orig;
long                offset;
+   fetch_func_t    fetch;
+   fetch_func_t    fetch_size;
 };
 
 #define DEFINE_FETCH_deref(type)   \
@@ -108,13 +110,26 @@ __kprobes void FETCH_FUNC_NAME(deref, type)(struct pt_regs *regs, \
call_fetch(&dprm->orig, regs, &addr);   \
if (addr) { \
addr += dprm->offset;   \
-   fetch_memory_##type(regs, (void *)addr, dest);  \
+   dprm->fetch(regs, (void *)addr, dest);  \
} else  \
*(type *)dest = 0;  \
 }
 DEFINE_BASIC_FETCH_FUNCS(deref)
 DEFINE_FETCH_deref(string)
-DEFINE_FETCH_deref(string_size)
+
+__kprobes void FETCH_FUNC_NAME(deref, string_size)(struct pt_regs *regs,
+  void *data, void *dest)
+{
+   struct deref_fetch_param *dprm = data;
+   unsigned long addr;
+
+   call_fetch(&dprm->orig, regs, &addr);
+   if (addr && dprm->fetch_size) {
+   addr += dprm->offset;
+   dprm->fetch_size(regs, (void *)addr, dest);
+   } else
+   *(string_size *)dest = 0;
+}
 
 static __kprobes void update_deref_fetch_param(struct deref_fetch_param *data)
 {
@@ -394,6 +409,9 @@ static int parse_probe_arg(char *arg, const struct fetch_type *t,
return -ENOMEM;
 
dprm->offset = offset;
+   dprm->fetch = t->fetch[FETCH_MTD_memory];
+   dprm->fetch_size = get_fetch_size_function(t,
+   dprm->fetch, ttbl);
ret = parse_probe_arg(arg, t2, &dprm->orig, is_return,
is_kprobe);
if (ret)
-- 
1.7.11.7



[PATCH 02/13] tracing/probes: Fix basic print type functions

2013-09-02 Thread Namhyung Kim
From: Namhyung Kim 

The print format of the s32 type was "ld" and the value was cast to
"long".  As a result, "-1" was printed as 4294967295 on 64-bit systems.
Not sure whether it worked well on 32-bit systems.

Anyway, it'd be better to have an exact format and type cast for each
type on both 32- and 64-bit systems.  In fact, the only difference
is in the s64/u64 types.
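
One way such a value can show up is sketched below in user-space C. It
assumes the 4-byte field gets widened to 64 bits without sign extension
somewhere before it is printed; the changelog does not say exactly where
that happens, and print_zero_extended() and print_as_s32() are
hypothetical helpers written for this example.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Widen the raw 32-bit field as unsigned before printing (zero extension). */
static void print_zero_extended(char *buf, size_t len, const void *field)
{
	uint32_t raw;

	assert(buf && field);
	memcpy(&raw, field, sizeof(raw));
	snprintf(buf, len, "%" PRId64, (int64_t)raw);
}

/* Read the field as a signed 32-bit value and use a matching format. */
static void print_as_s32(char *buf, size_t len, const void *field)
{
	int32_t val;

	assert(buf && field);
	memcpy(&val, field, sizeof(val));
	snprintf(buf, len, "%" PRId32, val);
}
```

For a field holding -1, the first helper produces "4294967295" and the
second produces "-1", which is why a format and cast that match the
field width exactly avoid the problem.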

Acked-by: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_probe.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 412e959709b4..b571e4de0769 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -49,14 +49,19 @@ static __kprobes int PRINT_TYPE_FUNC_NAME(type)(struct trace_seq *s,\
 }  \
 static const char PRINT_TYPE_FMT_NAME(type)[] = fmt;
 
-DEFINE_BASIC_PRINT_TYPE_FUNC(u8, "%x", unsigned int)
-DEFINE_BASIC_PRINT_TYPE_FUNC(u16, "%x", unsigned int)
-DEFINE_BASIC_PRINT_TYPE_FUNC(u32, "%lx", unsigned long)
+DEFINE_BASIC_PRINT_TYPE_FUNC(u8 , "%x", unsigned char)
+DEFINE_BASIC_PRINT_TYPE_FUNC(u16, "%x", unsigned short)
+DEFINE_BASIC_PRINT_TYPE_FUNC(u32, "%x", unsigned int)
+DEFINE_BASIC_PRINT_TYPE_FUNC(s8,  "%d", signed char)
+DEFINE_BASIC_PRINT_TYPE_FUNC(s16, "%d", short)
+DEFINE_BASIC_PRINT_TYPE_FUNC(s32, "%d", int)
+#if BITS_PER_LONG == 32
 DEFINE_BASIC_PRINT_TYPE_FUNC(u64, "%llx", unsigned long long)
-DEFINE_BASIC_PRINT_TYPE_FUNC(s8, "%d", int)
-DEFINE_BASIC_PRINT_TYPE_FUNC(s16, "%d", int)
-DEFINE_BASIC_PRINT_TYPE_FUNC(s32, "%ld", long)
 DEFINE_BASIC_PRINT_TYPE_FUNC(s64, "%lld", long long)
+#else /* BITS_PER_LONG == 64 */
+DEFINE_BASIC_PRINT_TYPE_FUNC(u64, "%lx", unsigned long)
+DEFINE_BASIC_PRINT_TYPE_FUNC(s64, "%ld", long)
+#endif
 
 static inline void *get_rloc_data(u32 *dl)
 {
-- 
1.7.11.7



[PATCH 01/13] tracing/uprobes: Fix documentation of uprobe registration syntax

2013-09-02 Thread Namhyung Kim
From: Namhyung Kim 

The uprobe syntax requires an offset after a file path, not a symbol.

Reviewed-by: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 Documentation/trace/uprobetracer.txt | 10 +-
 kernel/trace/trace_uprobe.c  |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/Documentation/trace/uprobetracer.txt b/Documentation/trace/uprobetracer.txt
index d9c3e682312c..8f1a8b8956fc 100644
--- a/Documentation/trace/uprobetracer.txt
+++ b/Documentation/trace/uprobetracer.txt
@@ -19,15 +19,15 @@ user to calculate the offset of the probepoint in the object.
 
 Synopsis of uprobe_tracer
 -------------------------
-  p[:[GRP/]EVENT] PATH:SYMBOL[+offs] [FETCHARGS] : Set a uprobe
-  r[:[GRP/]EVENT] PATH:SYMBOL[+offs] [FETCHARGS] : Set a return uprobe (uretprobe)
-  -:[GRP/]EVENT                                  : Clear uprobe or uretprobe event
+  p[:[GRP/]EVENT] PATH:OFFSET [FETCHARGS] : Set a uprobe
+  r[:[GRP/]EVENT] PATH:OFFSET [FETCHARGS] : Set a return uprobe (uretprobe)
+  -:[GRP/]EVENT   : Clear uprobe or uretprobe event
 
   GRP   : Group name. If omitted, "uprobes" is the default value.
   EVENT : Event name. If omitted, the event name is generated based
-  on SYMBOL+offs.
+  on PATH+OFFSET.
   PATH  : Path to an executable or a library.
-  SYMBOL[+offs] : Symbol+offset where the probe is inserted.
+  OFFSET: Offset where the probe is inserted.
 
   FETCHARGS : Arguments. Each probe can have up to 128 args.
%REG : Fetch register REG
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index 272261b5f94f..a415c5867ec5 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -210,7 +210,7 @@ end:
 
 /*
  * Argument syntax:
- *  - Add uprobe: p|r[:[GRP/]EVENT] PATH:SYMBOL [FETCHARGS]
+ *  - Add uprobe: p|r[:[GRP/]EVENT] PATH:OFFSET [FETCHARGS]
  *
  *  - Remove uprobe: -:[GRP/]EVENT
  */
-- 
1.7.11.7



[PATCH 10/13] tracing/uprobes: Fetch args before reserving a ring buffer

2013-09-02 Thread Namhyung Kim
From: Namhyung Kim 

Fetching from user space should be done in a non-atomic context.  So
use a per-cpu buffer and copy its content to the ring buffer
atomically.  Note that we can migrate while accessing user memory,
thus use a per-cpu mutex to protect against concurrent accesses.

This is needed since we'll be able to fetch args from user memory,
which can be swapped out.  Before that, uprobes could fetch args only
from registers, which are saved in kernel space.

While at it, use __get_data_size() and store_trace_args() to reduce
code duplication.

Cc: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_uprobe.c | 97 +
 1 file changed, 81 insertions(+), 16 deletions(-)

diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index 9f2d12d2311d..9ede401759ab 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -530,21 +530,46 @@ static const struct file_operations uprobe_profile_ops = {
.release= seq_release,
 };
 
+static atomic_t uprobe_buffer_ref = ATOMIC_INIT(0);
+static void __percpu *uprobe_cpu_buffer;
+static DEFINE_PER_CPU(struct mutex, uprobe_cpu_mutex);
+
 static void uprobe_trace_print(struct trace_uprobe *tu,
unsigned long func, struct pt_regs *regs)
 {
struct uprobe_trace_entry_head *entry;
struct ring_buffer_event *event;
struct ring_buffer *buffer;
-   void *data;
-   int size, i;
+   struct mutex *mutex;
+   void *data, *arg_buf;
+   int size, dsize, esize;
+   int cpu;
struct ftrace_event_call *call = &tu->p.call;
 
-   size = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));
+   dsize = __get_data_size(&tu->p, regs);
+   esize = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));
+
+   if (WARN_ON_ONCE(!uprobe_cpu_buffer || tu->p.size + dsize > PAGE_SIZE))
+   return;
+
+   cpu = raw_smp_processor_id();
+   mutex = &per_cpu(uprobe_cpu_mutex, cpu);
+   arg_buf = per_cpu_ptr(uprobe_cpu_buffer, cpu);
+
+   /*
+* Use per-cpu buffers for fastest access, but we might migrate
+* so the mutex makes sure we have sole access to it.
+*/
+   mutex_lock(mutex);
+   store_trace_args(esize, &tu->p, regs, arg_buf, dsize);
+
+   size = esize + tu->p.size + dsize;
event = trace_current_buffer_lock_reserve(&buffer, call->event.type,
- size + tu->p.size, 0, 0);
-   if (!event)
+ size, 0, 0);
+   if (!event) {
+   mutex_unlock(mutex);
return;
+   }
 
entry = ring_buffer_event_data(event);
if (is_ret_probe(tu)) {
@@ -556,13 +581,12 @@ static void uprobe_trace_print(struct trace_uprobe *tu,
data = DATAOF_TRACE_ENTRY(entry, false);
}
 
-   for (i = 0; i < tu->p.nr_args; i++) {
-   call_fetch(&tu->p.args[i].fetch, regs,
-  data + tu->p.args[i].offset);
-   }
+   memcpy(data, arg_buf, tu->p.size + dsize);
 
if (!filter_current_check_discard(buffer, call, entry, event))
trace_buffer_unlock_commit(buffer, event, 0, 0);
+
+   mutex_unlock(mutex);
 }
 
 /* uprobe handler */
@@ -630,6 +654,17 @@ probe_event_enable(struct trace_uprobe *tu, int flag, filter_func_t filter)
if (trace_probe_is_enabled(&tu->p))
return -EINTR;
 
+   if (atomic_inc_return(&uprobe_buffer_ref) == 1) {
+   int cpu;
+
+   uprobe_cpu_buffer = __alloc_percpu(PAGE_SIZE, PAGE_SIZE);
+   if (uprobe_cpu_buffer == NULL)
+   return -ENOMEM;
+
+   for_each_possible_cpu(cpu)
+   mutex_init(&per_cpu(uprobe_cpu_mutex, cpu));
+   }
+
WARN_ON(!uprobe_filter_is_empty(&tu->filter));
 
tu->p.flags |= flag;
@@ -646,6 +681,11 @@ static void probe_event_disable(struct trace_uprobe *tu, int flag)
if (!trace_probe_is_enabled(&tu->p))
return;
 
+   if (atomic_dec_and_test(&uprobe_buffer_ref)) {
+   free_percpu(uprobe_cpu_buffer);
+   uprobe_cpu_buffer = NULL;
+   }
+
WARN_ON(!uprobe_filter_is_empty(&tu->filter));
 
uprobe_unregister(tu->inode, tu->offset, &tu->consumer);
@@ -776,11 +816,33 @@ static void uprobe_perf_print(struct trace_uprobe *tu,
struct ftrace_event_call *call = &tu->p.call;
struct uprobe_trace_entry_head *entry;
struct hlist_head *head;
-   void *data;
-   int size, rctx, i;
+   struct mutex *mutex;
+   void *data, *arg_buf;
+   int size, dsize, esize;
+   int cpu;
+   int rctx;
 
-   size = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));
-   size = ALIGN(size + tu->p.size + sizeof(u32), sizeof(u64)) - sizeof(u32);
+   dsize = __get_data_size(&tu->p, regs);
+   

[PATCH 05/13] tracing/kprobes: Staticize stack and memory fetch functions

2013-09-02 Thread Namhyung Kim
From: Namhyung Kim 

Those fetch functions need to be implemented differently for kprobes
and uprobes.  Since the deref fetch functions don't call those
directly anymore, we can make them static and implement them
separately.

Acked-by: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_kprobe.c | 8 
 kernel/trace/trace_probe.h  | 8 
 2 files changed, 4 insertions(+), 12 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 1eff166990c2..fdb6dec11592 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -758,7 +758,7 @@ static const struct file_operations kprobe_profile_ops = {
  * kprobes-specific fetch functions
  */
 #define DEFINE_FETCH_stack(type)   \
-__kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs,  \
+static __kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs,\
  void *offset, void *dest) \
 {  \
*(type *)dest = (type)regs_get_kernel_stack_nth(regs,   \
@@ -770,7 +770,7 @@ DEFINE_BASIC_FETCH_FUNCS(stack)
 #define fetch_stack_string_size NULL
 
 #define DEFINE_FETCH_memory(type)  \
-__kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs, \
+static __kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs,\
  void *addr, void *dest)   \
 {  \
type retval;\
@@ -784,7 +784,7 @@ DEFINE_BASIC_FETCH_FUNCS(memory)
  * Fetch a null-terminated string. Caller MUST set *(u32 *)dest with max
  * length and relative data location.
  */
-__kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
+static __kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
   void *addr, void *dest)
 {
long ret;
@@ -821,7 +821,7 @@ __kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
 }
 
 /* Return the length of string -- including null terminal byte */
-__kprobes void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
+static __kprobes void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
void *addr, void *dest)
 {
mm_segment_t old_fs;
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index 8c62746e5419..9ac7bdf607cc 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -177,18 +177,10 @@ DECLARE_BASIC_FETCH_FUNCS(reg);
 #define fetch_reg_string   NULL
 #define fetch_reg_string_size  NULL
 
-DECLARE_BASIC_FETCH_FUNCS(stack);
-#define fetch_stack_string NULL
-#define fetch_stack_string_sizeNULL
-
 DECLARE_BASIC_FETCH_FUNCS(retval);
 #define fetch_retval_stringNULL
 #define fetch_retval_string_size   NULL
 
-DECLARE_BASIC_FETCH_FUNCS(memory);
-DECLARE_FETCH_FUNC(memory, string);
-DECLARE_FETCH_FUNC(memory, string_size);
-
 DECLARE_BASIC_FETCH_FUNCS(symbol);
 DECLARE_FETCH_FUNC(symbol, string);
 DECLARE_FETCH_FUNC(symbol, string_size);
-- 
1.7.11.7

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 11/13] tracing/kprobes: Add priv argument to fetch functions

2013-09-02 Thread Namhyung Kim
From: Namhyung Kim 

This argument is for passing private data structure to each fetch
function and will be used by uprobes.

Acked-by: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_kprobe.c | 32 ++--
 kernel/trace/trace_probe.c  | 24 
 kernel/trace/trace_probe.h  | 19 ++-
 kernel/trace/trace_uprobe.c |  8 
 4 files changed, 44 insertions(+), 39 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 3159b114f215..c0f4c2dbdbb1 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -745,7 +745,7 @@ static const struct file_operations kprobe_profile_ops = {
  */
 #define DEFINE_FETCH_stack(type)   \
 static __kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs,\
- void *offset, void *dest) \
+ void *offset, void *dest, void *priv) \
 {  \
*(type *)dest = (type)regs_get_kernel_stack_nth(regs,   \
(unsigned int)((unsigned long)offset)); \
@@ -757,7 +757,7 @@ DEFINE_BASIC_FETCH_FUNCS(stack)
 
 #define DEFINE_FETCH_memory(type)  \
 static __kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs,\
- void *addr, void *dest)   \
+   void *addr, void *dest, void *priv) \
 {  \
type retval;\
if (probe_kernel_address(addr, retval)) \
@@ -771,7 +771,7 @@ DEFINE_BASIC_FETCH_FUNCS(memory)
  * length and relative data location.
  */
 static __kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
-  void *addr, void *dest)
+   void *addr, void *dest, void *priv)
 {
long ret;
int maxlen = get_rloc_len(*(u32 *)dest);
@@ -808,7 +808,7 @@ static __kprobes void FETCH_FUNC_NAME(memory, 
string)(struct pt_regs *regs,
 
 /* Return the length of string -- including null terminal byte */
 static __kprobes void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs 
*regs,
-   void *addr, void *dest)
+  void *addr, void *dest, void *priv)
 {
mm_segment_t old_fs;
int ret, len = 0;
@@ -879,11 +879,11 @@ struct symbol_cache *alloc_symbol_cache(const char *sym, 
long offset)
 
 #define DEFINE_FETCH_symbol(type)  \
 __kprobes void FETCH_FUNC_NAME(symbol, type)(struct pt_regs *regs, \
- void *data, void *dest)   \
+   void *data, void *dest, void *priv) \
 {  \
struct symbol_cache *sc = data; \
if (sc->addr)   \
-   fetch_memory_##type(regs, (void *)sc->addr, dest);  \
+   fetch_memory_##type(regs, (void *)sc->addr, dest, priv);\
else\
*(type *)dest = 0;  \
 }
@@ -929,7 +929,7 @@ __kprobe_trace_func(struct trace_kprobe *tp, struct pt_regs 
*regs,
local_save_flags(irq_flags);
pc = preempt_count();
 
-   dsize = __get_data_size(&tp->p, regs);
+   dsize = __get_data_size(&tp->p, regs, NULL);
size = sizeof(*entry) + tp->p.size + dsize;
 
event = trace_event_buffer_lock_reserve(&buffer, ftrace_file,
@@ -940,7 +940,8 @@ __kprobe_trace_func(struct trace_kprobe *tp, struct pt_regs 
*regs,
 
entry = ring_buffer_event_data(event);
entry->ip = (unsigned long)tp->rp.kp.addr;
-   store_trace_args(sizeof(*entry), &tp->p, regs, (u8 *)&entry[1], dsize);
+   store_trace_args(sizeof(*entry), &tp->p, regs, (u8 *)&entry[1], dsize,
+NULL);
 
if (!filter_current_check_discard(buffer, call, entry, event))
trace_buffer_unlock_commit_regs(buffer, event,
@@ -977,7 +978,7 @@ __kretprobe_trace_func(struct trace_kprobe *tp, struct 
kretprobe_instance *ri,
local_save_flags(irq_flags);
pc = preempt_count();
 
-   dsize = __get_data_size(&tp->p, regs);
+   dsize = __get_data_size(&tp->p, regs, NULL);
size = sizeof(*entry) + tp->p.size + dsize;
 
event = trace_event_buffer_lock_reserve(&buffer, ftrace_file,
@@ -989,7 +990,8 @@ __kretprobe_trace_func(struct trace_kprobe *tp, struct 

[PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v5)

2013-09-02 Thread Namhyung Kim
Hello,

This patchset implements memory (address), stack[N], dereference,
bitfield and retval (which needs uretprobe, though) fetch methods for
uprobes.  It's based on the previous work [1] done by Hyeoncheol Lee.

Now kprobes and uprobes have their own fetch_type_tables and, in turn,
memory and stack access methods.  Other fetch methods are shared.

For the dereference method, I added a new argument to fetch functions.
This is because uprobes need to know whether the given address is
a file offset or a virtual address in a user process.  For instance,
when fetching from memory directly (like @offset), the fetch function
should convert the address (offset) to a virtual address of the process,
but for a dereference, the given address is already a virtual
address.

To determine this in a fetch function, I passed a pointer to
trace_uprobe for direct fetch, and passed NULL for dereference.
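
The distinction above can be sketched in plain userspace C.  This is only
an illustration, not the code from the series: struct probe_priv, its
map_base field and resolve_fetch_addr() are hypothetical names, and the
real translation goes through the process's mappings rather than one base
address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the private data handed to a fetch function. */
struct probe_priv {
	unsigned long map_base;	/* assumed load address of the probed file */
};

/*
 * Resolve the address a memory fetch should read from.
 * priv != NULL: direct fetch (like @offset); addr is a file offset and
 *               has to be turned into a virtual address first.
 * priv == NULL: dereference; addr already is a virtual address.
 */
unsigned long resolve_fetch_addr(unsigned long addr,
				 const struct probe_priv *priv)
{
	if (priv == NULL)
		return addr;

	/* The translated address must not wrap around. */
	assert(priv->map_base + addr >= priv->map_base);
	return priv->map_base + addr;
}
```

With an assumed map_base of 0x600000, the offset 0x8a8 resolves to
0x6008a8, while a dereferenced pointer is passed through unchanged.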

Patches 1-2 are bug fixes and can be applied independently.

Please look at patch 10 that uses per-cpu buffer for accessing user
memory as suggested by Steven.  While I tried hard not to mess things
up there might be a chance I did something horrible.  It'd be great if
you guys take a look and give comments.


 * v5 changes:
  - use user_stack_pointer() instead of GET_USP()
  - fix a bug in 'stack' fetch method of uprobes

 * v4 changes:
  - add Ack's from Masami
  - rearrange patches to make it easy for simple fixes to be applied
  - update documentation
  - use per-cpu buffer for storing args (thanks to Steve!)


[1] https://lkml.org/lkml/2012/11/14/84

A simple example:

  # cat foo.c
  int glob = -1;
  char str[] = "hello uprobe.";

  struct foo {
    unsigned int unused: 2;
    unsigned int foo: 20;
    unsigned int bar: 10;
  } foo = {
    .foo = 5,
  };

  int main(int argc, char *argv[])
  {
    long local = 0x1234;

    return 127;
  }

  # gcc -o foo -g foo.c

  # objdump -d foo | grep -A9 -F '<main>'
  004004b0 <main>:
    4004b0: 55                      push   %rbp
    4004b1: 48 89 e5                mov    %rsp,%rbp
    4004b4: 89 7d ec                mov    %edi,-0x14(%rbp)
    4004b7: 48 89 75 e0             mov    %rsi,-0x20(%rbp)
    4004bb: 48 c7 45 f8 34 12 00    movq   $0x1234,-0x8(%rbp)
    4004c2: 00
    4004c3: b8 7f 00 00 00          mov    $0x7f,%eax
    4004c8: 5d                      pop    %rbp
    4004c9: c3                      retq

  # nm foo | grep -e glob$ -e str -e foo
  006008bc D foo
  006008a8 D glob
  006008ac D str

  # perf probe -x /home/namhyung/tmp/foo -a 'foo=main+0x13 glob=@0x8a8:s32 \
  > str=@0x8ac:string bit=@0x8bc:b10@2/32 argc=%di local=-0x8(%bp)'
  Added new event:
probe_foo:foo  (on 0x4c3 with glob=@0x8a8:s32 str=@0x8ac:string 
 bit=@0x8bc:b10@2/32 argc=%di local=-0x8(%bp))

  You can now use it in all perf tools, such as:

  perf record -e probe_foo:foo -aR sleep 1

  # perf record -e probe_foo:foo ./foo
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.001 MB perf.data (~33 samples) ]

  # perf script | grep -v ^#
   foo  2008 [002]  2199.867154: probe_foo:foo (4004c3)
   glob=-1 str="hello uprobe." bit=5 argc=1 local=1234
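
For readers wondering why bit=@0x8bc:b10@2/32 prints 5 here: the spec
asks for 10 bits starting 2 bits above the least significant bit of a
32-bit container, which on this little-endian layout is the low part of
the 20-bit 'foo' member.  A minimal userspace sketch of such an extraction
follows; extract_bitfield32() is a hypothetical helper written for this
note, and the two-shift approach is an assumption about one reasonable way
to do it.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Extract `width` bits that start `offset` bits above the least
 * significant bit of a 32-bit container (the b<width>@<offset>/32 spec).
 */
uint32_t extract_bitfield32(uint32_t container, unsigned int width,
			    unsigned int offset)
{
	unsigned int hi_shift, low_shift;

	assert(width >= 1 && offset + width <= 32);

	hi_shift = 32 - (width + offset);	/* drops the bits above the field */
	low_shift = hi_shift + offset;		/* then drops the bits below it */

	container <<= hi_shift;
	container >>= low_shift;
	return container;
}
```

In the example, 'unused' occupies bits 0-1 and .foo = 5 is stored from
bit 2 upward, so the container holds 5 << 2 and the extraction returns 5.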


This patchset is based on the current for-next branch of Steven
Rostedt's linux-trace tree.  I also put this on the 'uprobe/fetch-v5'
branch in my tree:

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git


Any comments are welcome, thanks.
Namhyung


Cc: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 


Hyeoncheol Lee (2):
  tracing/kprobes: Move fetch functions to trace_kprobe.c
  tracing/kprobes: Add fetch{,_size} member into deref fetch method

Namhyung Kim (11):
  tracing/uprobes: Fix documentation of uprobe registration syntax
  tracing/probes: Fix basic print type functions
  tracing/kprobes: Staticize stack and memory fetch functions
  tracing/kprobes: Factor out struct trace_probe
  tracing/uprobes: Convert to struct trace_probe
  tracing/kprobes: Move common functions to trace_probe.h
  tracing/kprobes: Integrate duplicate set_print_fmt()
  tracing/uprobes: Fetch args before reserving a ring buffer
  tracing/kprobes: Add priv argument to fetch functions
  tracing/uprobes: Add more fetch functions
  tracing/uprobes: Add support for full argument access methods

 Documentation/trace/uprobetracer.txt |  35 +-
 kernel/trace/trace_kprobe.c  | 642 +++
 kernel/trace/trace_probe.c   | 453 +---
 kernel/trace/trace_probe.h   | 202 ++-
 kernel/trace/trace_uprobe.c  | 457 +
 5 files changed, 1062 insertions(+), 727 deletions(-)

-- 
1.7.11.7


[PATCH 07/13] tracing/uprobes: Convert to struct trace_probe

2013-09-02 Thread Namhyung Kim
From: Namhyung Kim 

Convert struct trace_uprobe to make use of the common trace_probe
structure.

Reviewed-by: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_uprobe.c | 151 ++--
 1 file changed, 75 insertions(+), 76 deletions(-)

diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index a415c5867ec5..abb95529d851 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -51,22 +51,17 @@ struct trace_uprobe_filter {
  */
 struct trace_uprobe {
struct list_head list;
-   struct ftrace_event_class   class;
-   struct ftrace_event_call call;
struct trace_uprobe_filter  filter;
struct uprobe_consumer  consumer;
struct inode *inode;
char *filename;
unsigned long   offset;
unsigned long   nhit;
-   unsigned int flags;  /* For TP_FLAG_* */
-   ssize_t size;   /* trace entry size */
-   unsigned int nr_args;
-   struct probe_arg args[];
+   struct trace_probe  p;
 };
 
-#define SIZEOF_TRACE_UPROBE(n) \
-   (offsetof(struct trace_uprobe, args) +  \
+#define SIZEOF_TRACE_UPROBE(n) \
+   (offsetof(struct trace_uprobe, p.args) +\
(sizeof(struct probe_arg) * (n)))
 
 static int register_uprobe_event(struct trace_uprobe *tu);
@@ -114,13 +109,13 @@ alloc_trace_uprobe(const char *group, const char *event, 
int nargs, bool is_ret)
if (!tu)
return ERR_PTR(-ENOMEM);
 
-   tu->call.class = &tu->class;
-   tu->call.name = kstrdup(event, GFP_KERNEL);
-   if (!tu->call.name)
+   tu->p.call.class = &tu->p.class;
+   tu->p.call.name = kstrdup(event, GFP_KERNEL);
+   if (!tu->p.call.name)
goto error;
 
-   tu->class.system = kstrdup(group, GFP_KERNEL);
-   if (!tu->class.system)
+   tu->p.class.system = kstrdup(group, GFP_KERNEL);
+   if (!tu->p.class.system)
goto error;
 
INIT_LIST_HEAD(&tu->list);
@@ -131,7 +126,7 @@ alloc_trace_uprobe(const char *group, const char *event, 
int nargs, bool is_ret)
return tu;
 
 error:
-   kfree(tu->call.name);
+   kfree(tu->p.call.name);
kfree(tu);
 
return ERR_PTR(-ENOMEM);
@@ -141,12 +136,12 @@ static void free_trace_uprobe(struct trace_uprobe *tu)
 {
int i;
 
-   for (i = 0; i < tu->nr_args; i++)
-   traceprobe_free_probe_arg(&tu->args[i]);
+   for (i = 0; i < tu->p.nr_args; i++)
+   traceprobe_free_probe_arg(&tu->p.args[i]);
 
iput(tu->inode);
-   kfree(tu->call.class->system);
-   kfree(tu->call.name);
+   kfree(tu->p.call.class->system);
+   kfree(tu->p.call.name);
kfree(tu->filename);
kfree(tu);
 }
@@ -156,8 +151,8 @@ static struct trace_uprobe *find_probe_event(const char 
*event, const char *grou
struct trace_uprobe *tu;
 
list_for_each_entry(tu, &uprobe_list, list)
-   if (strcmp(tu->call.name, event) == 0 &&
-   strcmp(tu->call.class->system, group) == 0)
+   if (strcmp(tu->p.call.name, event) == 0 &&
+   strcmp(tu->p.call.class->system, group) == 0)
return tu;
 
return NULL;
@@ -186,7 +181,7 @@ static int register_trace_uprobe(struct trace_uprobe *tu)
mutex_lock(&uprobe_lock);
 
/* register as an event */
-   old_tp = find_probe_event(tu->call.name, tu->call.class->system);
+   old_tp = find_probe_event(tu->p.call.name, tu->p.call.class->system);
if (old_tp) {
/* delete old event */
ret = unregister_trace_uprobe(old_tp);
@@ -359,34 +354,36 @@ static int create_trace_uprobe(int argc, char **argv)
/* parse arguments */
ret = 0;
for (i = 0; i < argc && i < MAX_TRACE_ARGS; i++) {
+   struct probe_arg *parg = &tu->p.args[i];
+
/* Increment count for freeing args in error case */
-   tu->nr_args++;
+   tu->p.nr_args++;
 
/* Parse argument name */
arg = strchr(argv[i], '=');
if (arg) {
*arg++ = '\0';
-   tu->args[i].name = kstrdup(argv[i], GFP_KERNEL);
+   parg->name = kstrdup(argv[i], GFP_KERNEL);
} else {
arg = argv[i];
/* If argument name is omitted, set "argN" */
snprintf(buf, MAX_EVENT_NAME_LEN, "arg%d", i + 1);
-   tu->args[i].name = kstrdup(buf, GFP_KERNEL);
+   parg->name = kstrdup(buf, 

[PATCH 12/13] tracing/uprobes: Add more fetch functions

2013-09-02 Thread Namhyung Kim
From: Namhyung Kim 

Implement uprobe-specific stack and memory fetch functions and add
them to the uprobes_fetch_type_table.  Other fetch functions will be
shared with kprobes.
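
The address arithmetic behind the stack and memory fetches can be tried
out in userspace.  The sketch below is illustrative only: struct mapping
is a hypothetical stand-in for the few vm_area_struct fields involved,
SKETCH_PAGE_SHIFT assumes 4 KiB pages, the stack is assumed to grow down,
and the bounds check is a little stricter than strictly necessary because
it requires the whole word to fit in the mapping.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical, minimal stand-in for the mapping fields used here. */
struct mapping {
	unsigned long vm_start;	/* first virtual address of the mapping */
	unsigned long vm_end;	/* first address past the mapping */
	unsigned long vm_pgoff;	/* file offset of vm_start, in pages */
};

/* Translate a file offset into a virtual address inside the mapping. */
unsigned long offset_to_vaddr(const struct mapping *m, unsigned long offset)
{
	assert(m != NULL);
	return m->vm_start + offset - (m->vm_pgoff << SKETCH_PAGE_SHIFT);
}

/* Address of the Nth long on a downward-growing stack (entry 0 is at sp). */
unsigned long nth_stack_addr(unsigned long sp, unsigned int n)
{
	return sp + n * sizeof(long);
}

/* The Nth entry is readable only if it lies entirely inside the mapping. */
int within_stack(const struct mapping *m, unsigned long sp, unsigned int n)
{
	assert(m != NULL);
	return sp >= m->vm_start &&
	       nth_stack_addr(sp, n) + sizeof(long) <= m->vm_end;
}
```

For a mapping that starts at 0x600000 with vm_pgoff 0, the file offset
0x8a8 maps to 0x6008a8.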

Original-patch-by: Hyeoncheol Lee 
Reviewed-by: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_probe.c  |   9 ++-
 kernel/trace/trace_probe.h  |   1 +
 kernel/trace/trace_uprobe.c | 188 +++-
 3 files changed, 192 insertions(+), 6 deletions(-)

diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index eaee44d5d9d1..70cd3bfde5a6 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -101,6 +101,10 @@ struct deref_fetch_param {
fetch_func_tfetch_size;
 };
 
+/*
+ * For uprobes, it'll get a vaddr from first call_fetch() so pass NULL
+ * as a priv on the second dprm->fetch() not to translate it to vaddr again.
+ */
 #define DEFINE_FETCH_deref(type)   \
 __kprobes void FETCH_FUNC_NAME(deref, type)(struct pt_regs *regs,  \
void *data, void *dest, void *priv) \
@@ -110,13 +114,14 @@ __kprobes void FETCH_FUNC_NAME(deref, type)(struct 
pt_regs *regs, \
call_fetch(&dprm->orig, regs, &addr, priv); \
if (addr) { \
addr += dprm->offset;   \
-   dprm->fetch(regs, (void *)addr, dest, priv);\
+   dprm->fetch(regs, (void *)addr, dest, NULL);\
} else  \
*(type *)dest = 0;  \
 }
 DEFINE_BASIC_FETCH_FUNCS(deref)
 DEFINE_FETCH_deref(string)
 
+/* Same as above */
 __kprobes void FETCH_FUNC_NAME(deref, string_size)(struct pt_regs *regs,
void *data, void *dest, void *priv)
 {
@@ -126,7 +131,7 @@ __kprobes void FETCH_FUNC_NAME(deref, string_size)(struct 
pt_regs *regs,
call_fetch(&dprm->orig, regs, &addr, priv);
if (addr && dprm->fetch_size) {
addr += dprm->offset;
-   dprm->fetch_size(regs, (void *)addr, dest, priv);
+   dprm->fetch_size(regs, (void *)addr, dest, NULL);
} else
*(string_size *)dest = 0;
 }
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index fc7edf3749ef..b1e7d722c354 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -263,6 +263,7 @@ ASSIGN_FETCH_FUNC(bitfield, ftype), \
 #define NR_FETCH_TYPES 10
 
 extern const struct fetch_type kprobes_fetch_type_table[];
+extern const struct fetch_type uprobes_fetch_type_table[];
 
 static inline __kprobes void call_fetch(struct fetch_param *fprm,
 struct pt_regs *regs, void *dest, void *priv)
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index fc5f8aa62156..1b778bbf5c70 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -530,6 +530,186 @@ static const struct file_operations uprobe_profile_ops = {
.release= seq_release,
 };
 
+#ifdef CONFIG_STACK_GROWSUP
+static unsigned long adjust_stack_addr(unsigned long addr, unsigned n)
+{
+   return addr - (n * sizeof(long));
+}
+
+static bool within_user_stack(struct vm_area_struct *vma, unsigned long addr,
+ unsigned int n)
+{
+   return vma->vm_start <= adjust_stack_addr(addr, n);
+}
+#else
+static unsigned long adjust_stack_addr(unsigned long addr, unsigned n)
+{
+   return addr + (n * sizeof(long));
+}
+
+static bool within_user_stack(struct vm_area_struct *vma, unsigned long addr,
+ unsigned int n)
+{
+   return vma->vm_end >= adjust_stack_addr(addr, n);
+}
+#endif
+
+static unsigned long get_user_stack_nth(struct pt_regs *regs, unsigned int n)
+{
+   struct vm_area_struct *vma;
+   unsigned long addr = user_stack_pointer(regs);
+   bool valid = false;
+   unsigned long ret = 0;
+
+   down_read(&current->mm->mmap_sem);
+   vma = find_vma(current->mm, addr);
+   if (vma && vma->vm_start <= addr) {
+   if (within_user_stack(vma, addr, n))
+   valid = true;
+   }
+   up_read(&current->mm->mmap_sem);
+
+   addr = adjust_stack_addr(addr, n);
+
+   if (valid && copy_from_user(&ret, (void __force __user *)addr,
+   sizeof(ret)) == 0)
+   return ret;
+   return 0;
+}
+
+static unsigned long offset_to_vaddr(struct vm_area_struct *vma,
+unsigned long offset)
+{
+   return vma->vm_start + offset - ((loff_t)vma->vm_pgoff << PAGE_SHIFT);
+}
+
+static void __user *get_user_vaddr(unsigned long addr, struct 

[PATCH 13/13] tracing/uprobes: Add support for full argument access methods

2013-09-02 Thread Namhyung Kim
From: Namhyung Kim 

Enable fetching other types of arguments for uprobes.  IOW, we can
now access stack, memory, deref, bitfield and retval from uprobes.

The format of the argument types is the same as for kprobes (but the
@SYMBOL type is not supported for uprobes), i.e.:

  @ADDR   : Fetch memory at ADDR
  $stackN : Fetch Nth entry of stack (N >= 0)
  $stack  : Fetch stack address
  $retval : Fetch return value
  +|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address

Note that $retval can only be used with uretprobes.
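
To make the '$' variable rules concrete, here is a small userspace sketch
of a parser for $stack, $stackN and $retval.  It is not the kernel's
parse_probe_vars(); parse_probe_var() and enum var_kind are hypothetical
names, and it takes the text after the '$' sign.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

enum var_kind { VAR_RETVAL, VAR_STACK_ADDR, VAR_STACK_NTH };

/*
 * Parse the text after '$' in a fetch argument.
 * Returns 0 and fills *kind (and *nth for stackN), or -EINVAL.
 */
int parse_probe_var(const char *arg, int is_return,
		    enum var_kind *kind, unsigned long *nth)
{
	char *end;

	assert(arg != NULL && kind != NULL && nth != NULL);

	if (strcmp(arg, "retval") == 0) {
		if (!is_return)
			return -EINVAL;	/* $retval needs a return probe */
		*kind = VAR_RETVAL;
		return 0;
	}
	if (strncmp(arg, "stack", 5) != 0)
		return -EINVAL;
	if (arg[5] == '\0') {		/* plain $stack: the stack address */
		*kind = VAR_STACK_ADDR;
		return 0;
	}
	if (!isdigit((unsigned char)arg[5]))
		return -EINVAL;

	errno = 0;
	*nth = strtoul(arg + 5, &end, 10);	/* $stackN: Nth stack entry */
	if (errno != 0 || *end != '\0')
		return -EINVAL;
	*kind = VAR_STACK_NTH;
	return 0;
}
```

So "stack3" yields VAR_STACK_NTH with N = 3, and "retval" is rejected
unless the probe is a return probe.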

Original-patch-by: Hyeoncheol Lee 
Cc: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 Documentation/trace/uprobetracer.txt | 25 +
 kernel/trace/trace_probe.c   | 36 +++-
 2 files changed, 48 insertions(+), 13 deletions(-)

diff --git a/Documentation/trace/uprobetracer.txt 
b/Documentation/trace/uprobetracer.txt
index 8f1a8b8956fc..6e5cff263e2b 100644
--- a/Documentation/trace/uprobetracer.txt
+++ b/Documentation/trace/uprobetracer.txt
@@ -31,6 +31,31 @@ Synopsis of uprobe_tracer
 
   FETCHARGS : Arguments. Each probe can have up to 128 args.
%REG : Fetch register REG
+   @ADDR   : Fetch memory at ADDR (ADDR should be in userspace)
+   $stackN : Fetch Nth entry of stack (N >= 0)
+   $stack  : Fetch stack address.
+   $retval : Fetch return value.(*)
+   +|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address.(**)
+   NAME=FETCHARG : Set NAME as the argument name of FETCHARG.
+   FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types
+  (u8/u16/u32/u64/s8/s16/s32/s64), "string" and bitfield
+  are supported.
+
+  (*) only for return probe.
+  (**) this is useful for fetching a field of data structures.
+
+Types
+-
+Several types are supported for fetch-args. Uprobe tracer will access memory
+by given type. Prefix 's' and 'u' means those types are signed and unsigned
+respectively. Traced arguments are shown in decimal (signed) or hex (unsigned).
+String type is a special type, which fetches a "null-terminated" string from
+user space.
+Bitfield is another special type, which takes 3 parameters, bit-width, bit-
+offset, and container-size (usually 32). The syntax is;
+
+ b<bit-width>@<bit-offset>/<container-size>
+
 
 Event Profiling
 ---
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 70cd3bfde5a6..8c77825e87e6 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -253,12 +253,18 @@ fail:
 }
 
 /* Special function : only accept unsigned long */
-static __kprobes void fetch_stack_address(struct pt_regs *regs,
+static __kprobes void fetch_kernel_stack_address(struct pt_regs *regs,
  void *dummy, void *dest, void *priv)
 {
*(unsigned long *)dest = kernel_stack_pointer(regs);
 }
 
+static __kprobes void fetch_user_stack_address(struct pt_regs *regs,
+ void *dummy, void *dest, void *priv)
+{
+   *(unsigned long *)dest = user_stack_pointer(regs);
+}
+
 static fetch_func_t get_fetch_size_function(const struct fetch_type *type,
fetch_func_t orig_fn,
const struct fetch_type *ttbl)
@@ -303,7 +309,8 @@ int traceprobe_split_symbol_offset(char *symbol, unsigned 
long *offset)
 #define PARAM_MAX_STACK (THREAD_SIZE / sizeof(unsigned long))
 
 static int parse_probe_vars(char *arg, const struct fetch_type *t,
-   struct fetch_param *f, bool is_return)
+   struct fetch_param *f, bool is_return,
+   bool is_kprobe)
 {
int ret = 0;
unsigned long param;
@@ -315,13 +322,16 @@ static int parse_probe_vars(char *arg, const struct 
fetch_type *t,
ret = -EINVAL;
} else if (strncmp(arg, "stack", 5) == 0) {
if (arg[5] == '\0') {
-   if (strcmp(t->name, DEFAULT_FETCH_TYPE_STR) == 0)
-   f->fn = fetch_stack_address;
+   if (strcmp(t->name, DEFAULT_FETCH_TYPE_STR))
+   return -EINVAL;
+
+   if (is_kprobe)
+   f->fn = fetch_kernel_stack_address;
else
-   ret = -EINVAL;
+   f->fn = fetch_user_stack_address;
} else if (isdigit(arg[5])) {
ret = kstrtoul(arg + 5, 10, &param);
-   if (ret || param > PARAM_MAX_STACK)
+   if (ret || (is_kprobe && param > PARAM_MAX_STACK))
ret = -EINVAL;
else {
f->fn = t->fetch[FETCH_MTD_stack];
@@ -345,17 +355,13 @@ 

[PATCH 06/13] tracing/kprobes: Factor out struct trace_probe

2013-09-02 Thread Namhyung Kim
From: Namhyung Kim 

There are functions that can be shared by both kprobes and uprobes.
Separate the common data structure into struct trace_probe and use it
from the shared functions.
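
The layout idea can be shown with a stripped-down userspace sketch.  The
field sets below are deliberately reduced, TP_FLAG_TRACE has a made-up
value, and SIZEOF_TRACE_KPROBE() and alloc_trace_kprobe() are simplified
stand-ins rather than the functions in the patch.  Note that embedding a
struct that ends in a flexible array member as the last member of another
struct relies on a GNU C extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct probe_arg {
	const char *name;
	unsigned int offset;
};

#define TP_FLAG_TRACE 1	/* assumption: value chosen for this sketch only */

/* Common part shared by both probe flavours; args[] has to stay last. */
struct trace_probe {
	unsigned int flags;
	unsigned int nr_args;
	struct probe_arg args[];
};

/* Flavour-specific wrapper; the common part is embedded as the last member. */
struct trace_kprobe {
	unsigned long nhit;
	const char *symbol;
	struct trace_probe p;
};

/* Bytes needed for a wrapper whose embedded probe carries n arguments. */
#define SIZEOF_TRACE_KPROBE(n) \
	(offsetof(struct trace_kprobe, p.args) + sizeof(struct probe_arg) * (n))

/* A shared helper only needs the common part, whatever wraps it. */
int trace_probe_is_enabled(const struct trace_probe *tp)
{
	assert(tp != NULL);
	return (tp->flags & TP_FLAG_TRACE) != 0;
}

struct trace_kprobe *alloc_trace_kprobe(unsigned int nargs)
{
	struct trace_kprobe *tk = calloc(1, SIZEOF_TRACE_KPROBE(nargs));

	if (tk)
		tk->p.nr_args = nargs;
	return tk;
}
```

A uprobe wrapper would embed the same struct trace_probe and pass &tu->p
to the same helpers.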

Acked-by: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_kprobe.c | 396 +---
 kernel/trace/trace_probe.h  |  20 +++
 2 files changed, 213 insertions(+), 203 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index fdb6dec11592..6d33cfee9448 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -27,18 +27,12 @@
 /**
  * Kprobe event core functions
  */
-struct trace_probe {
+struct trace_kprobe {
struct list_head list;
struct kretprobe rp; /* Use rp.kp for kprobe use */
unsigned long   nhit;
-   unsigned int flags;  /* For TP_FLAG_* */
const char  *symbol;/* symbol name */
-   struct ftrace_event_class   class;
-   struct ftrace_event_call call;
-   struct list_head files;
-   ssize_t size;   /* trace entry size */
-   unsigned int nr_args;
-   struct probe_arg args[];
+   struct trace_probe  p;
 };
 
 struct event_file_link {
@@ -46,56 +40,46 @@ struct event_file_link {
struct list_headlist;
 };
 
-#define SIZEOF_TRACE_PROBE(n)  \
-   (offsetof(struct trace_probe, args) +   \
+#define SIZEOF_TRACE_PROBE(n)  \
+   (offsetof(struct trace_kprobe, p.args) +\
(sizeof(struct probe_arg) * (n)))
 
 
-static __kprobes bool trace_probe_is_return(struct trace_probe *tp)
+static __kprobes bool trace_kprobe_is_return(struct trace_kprobe *tk)
 {
-   return tp->rp.handler != NULL;
+   return tk->rp.handler != NULL;
 }
 
-static __kprobes const char *trace_probe_symbol(struct trace_probe *tp)
+static __kprobes const char *trace_kprobe_symbol(struct trace_kprobe *tk)
 {
-   return tp->symbol ? tp->symbol : "unknown";
+   return tk->symbol ? tk->symbol : "unknown";
 }
 
-static __kprobes unsigned long trace_probe_offset(struct trace_probe *tp)
+static __kprobes unsigned long trace_kprobe_offset(struct trace_kprobe *tk)
 {
-   return tp->rp.kp.offset;
+   return tk->rp.kp.offset;
 }
 
-static __kprobes bool trace_probe_is_enabled(struct trace_probe *tp)
+static __kprobes bool trace_kprobe_has_gone(struct trace_kprobe *tk)
 {
-   return !!(tp->flags & (TP_FLAG_TRACE | TP_FLAG_PROFILE));
+   return !!(kprobe_gone(&tk->rp.kp));
 }
 
-static __kprobes bool trace_probe_is_registered(struct trace_probe *tp)
-{
-   return !!(tp->flags & TP_FLAG_REGISTERED);
-}
-
-static __kprobes bool trace_probe_has_gone(struct trace_probe *tp)
-{
-   return !!(kprobe_gone(&tp->rp.kp));
-}
-
-static __kprobes bool trace_probe_within_module(struct trace_probe *tp,
-   struct module *mod)
+static __kprobes bool trace_kprobe_within_module(struct trace_kprobe *tk,
+struct module *mod)
 {
int len = strlen(mod->name);
-   const char *name = trace_probe_symbol(tp);
+   const char *name = trace_kprobe_symbol(tk);
return strncmp(mod->name, name, len) == 0 && name[len] == ':';
 }
 
-static __kprobes bool trace_probe_is_on_module(struct trace_probe *tp)
+static __kprobes bool trace_kprobe_is_on_module(struct trace_kprobe *tk)
 {
-   return !!strchr(trace_probe_symbol(tp), ':');
+   return !!strchr(trace_kprobe_symbol(tk), ':');
 }
 
-static int register_probe_event(struct trace_probe *tp);
-static int unregister_probe_event(struct trace_probe *tp);
+static int register_kprobe_event(struct trace_kprobe *tk);
+static int unregister_kprobe_event(struct trace_kprobe *tk);
 
 static DEFINE_MUTEX(probe_lock);
 static LIST_HEAD(probe_list);
@@ -107,14 +91,14 @@ static int kretprobe_dispatcher(struct kretprobe_instance 
*ri,
 /*
  * Allocate new trace_probe and initialize it (including kprobes).
  */
-static struct trace_probe *alloc_trace_probe(const char *group,
+static struct trace_kprobe *alloc_trace_kprobe(const char *group,
 const char *event,
 void *addr,
 const char *symbol,
 unsigned long offs,
 int nargs, bool is_return)
 {
-   struct trace_probe *tp;
+   struct trace_kprobe *tp;
int ret = -ENOMEM;
 
tp = kzalloc(SIZEOF_TRACE_PROBE(nargs), GFP_KERNEL);
@@ -140,9 +124,9 @@ static struct trace_probe *alloc_trace_probe(const char 
*group,
goto error;
}
 
-   tp->call.class = 

[PATCH 08/13] tracing/kprobes: Move common functions to trace_probe.h

2013-09-02 Thread Namhyung Kim
From: Namhyung Kim 

The __get_data_size() and store_trace_args() will be used by uprobes
too.  Move them to a common location.
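
store_trace_args() depends on a packed "relative data location" word for
dynamic data such as strings.  The sketch below shows the encoding in
userspace, assuming the length sits in the upper 16 bits and the offset in
the lower 16 bits of a u32.  The names follow the helpers used in the
diff, but the bodies here are written for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a length (upper 16 bits) and a relative offset (lower 16 bits). */
uint32_t make_data_rloc(uint32_t len, uint32_t roffs)
{
	assert(len <= 0xffff);
	return (len << 16) | (roffs & 0xffff);
}

uint32_t get_rloc_len(uint32_t dl)
{
	return dl >> 16;
}

uint32_t get_rloc_offs(uint32_t dl)
{
	return dl & 0xffff;
}

/*
 * Rebase a location that is relative to the argument's own field so that
 * it becomes relative to the start of the trace entry.
 */
uint32_t convert_rloc_to_loc(uint32_t dl, uint32_t offs)
{
	return dl + offs;
}
```

For a 14-byte string stored 8 bytes after its field, which itself starts
24 bytes into the entry, the final word decodes to length 14 and offset 32.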

Acked-by: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_kprobe.c | 48 -
 kernel/trace/trace_probe.h  | 48 +
 2 files changed, 48 insertions(+), 48 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 6d33cfee9448..2a668516f0e4 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -909,54 +909,6 @@ const struct fetch_type kprobes_fetch_type_table[] = {
ASSIGN_FETCH_TYPE(s64, u64, 1),
 };
 
-/* Sum up total data length for dynamic arraies (strings) */
-static __kprobes int __get_data_size(struct trace_probe *tp,
-struct pt_regs *regs)
-{
-   int i, ret = 0;
-   u32 len;
-
-   for (i = 0; i < tp->nr_args; i++)
-   if (unlikely(tp->args[i].fetch_size.fn)) {
-   call_fetch(&tp->args[i].fetch_size, regs, &len);
-   ret += len;
-   }
-
-   return ret;
-}
-
-/* Store the value of each argument */
-static __kprobes void store_trace_args(int ent_size, struct trace_probe *tp,
-  struct pt_regs *regs,
-  u8 *data, int maxlen)
-{
-   int i;
-   u32 end = tp->size;
-   u32 *dl;/* Data (relative) location */
-
-   for (i = 0; i < tp->nr_args; i++) {
-   if (unlikely(tp->args[i].fetch_size.fn)) {
-   /*
-* First, we set the relative location and
-* maximum data length to *dl
-*/
-   dl = (u32 *)(data + tp->args[i].offset);
-   *dl = make_data_rloc(maxlen, end - tp->args[i].offset);
-   /* Then try to fetch string or dynamic array data */
-   call_fetch(&tp->args[i].fetch, regs, dl);
-   /* Reduce maximum length */
-   end += get_rloc_len(*dl);
-   maxlen -= get_rloc_len(*dl);
-   /* Trick here, convert data_rloc to data_loc */
-   *dl = convert_rloc_to_loc(*dl,
-ent_size + tp->args[i].offset);
-   } else
-   /* Just fetching data normally */
-   call_fetch(&tp->args[i].fetch, regs,
-  data + tp->args[i].offset);
-   }
-}
-
 /* Kprobe handler */
 static __kprobes void
 __kprobe_trace_func(struct trace_kprobe *tp, struct pt_regs *regs,
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index 63e5da4e3073..189a40baea98 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -302,3 +302,51 @@ extern ssize_t traceprobe_probes_write(struct file *file,
int (*createfn)(int, char**));
 
 extern int traceprobe_command(const char *buf, int (*createfn)(int, char**));
+
+/* Sum up total data length for dynamic arraies (strings) */
+static inline __kprobes int
+__get_data_size(struct trace_probe *tp, struct pt_regs *regs)
+{
+   int i, ret = 0;
+   u32 len;
+
+   for (i = 0; i < tp->nr_args; i++)
+   if (unlikely(tp->args[i].fetch_size.fn)) {
+   call_fetch(&tp->args[i].fetch_size, regs, &len);
+   ret += len;
+   }
+
+   return ret;
+}
+
+/* Store the value of each argument */
+static inline __kprobes void
+store_trace_args(int ent_size, struct trace_probe *tp, struct pt_regs *regs,
+u8 *data, int maxlen)
+{
+   int i;
+   u32 end = tp->size;
+   u32 *dl;/* Data (relative) location */
+
+   for (i = 0; i < tp->nr_args; i++) {
+   if (unlikely(tp->args[i].fetch_size.fn)) {
+   /*
+* First, we set the relative location and
+* maximum data length to *dl
+*/
+   dl = (u32 *)(data + tp->args[i].offset);
+   *dl = make_data_rloc(maxlen, end - tp->args[i].offset);
+   /* Then try to fetch string or dynamic array data */
+   call_fetch(&tp->args[i].fetch, regs, dl);
+   /* Reduce maximum length */
+   end += get_rloc_len(*dl);
+   maxlen -= get_rloc_len(*dl);
+   /* Trick here, convert data_rloc to data_loc */
+   *dl = convert_rloc_to_loc(*dl,
+ent_size + tp->args[i].offset);
+   } else
+   /* Just fetching data normally */
+   

RE: [PATCH v4 3/3] dma: Add Freescale eDMA engine driver support

2013-09-02 Thread Lu Jingchang-B35083
> > How about change the filter_fn to follow:
> > static bool fsl_edma_filter_fn(struct dma_chan *chan, void *fn_param)
> > {
> > struct fsl_edma_filter_param *fparam = fn_param;
> > struct fsl_edma_chan *fsl_chan = to_fsl_edma_chan(chan);
> > unsigned char val;
> >
> > if (fsl_chan->edmamux->mux_id != fparam->mux_id)
> > return false;
> >
> > val = EDMAMUX_CHCFG_ENBL | EDMAMUX_CHCFG_SOURCE(fparam-
> >slot_id);
> > fsl_edmamux_config_chan(fsl_chan, val);
> > return true;
> > }
> > In fact the slot_id isn't need elsewhere, and if the filter return true,
> > This channel should be to this request. So no need to save the slave id,
> Right?
> something like
> 
> static bool fsl_edma_filter_fn(struct dma_chan *chan, void *fn_param)
> {
>   struct fsl_edma_filter_param *fparam = fn_param;
>   struct fsl_edma_chan *fsl_chan = to_fsl_edma_chan(chan);
> 
>   if (fsl_chan->edmamux->mux_id != fparam->mux_id)
>   return false;
>   return true;
> }
> 
> in thedriver which calls this:
> 
> before prep:
> 
>   config->slave_id = val;
> 
>   dma_set_slave_config(chan, slave);
> 
  Do you mean the DMA_SLAVE_CONFIG device_control? Yes, the slave driver could
pass the slave_id. But DMA_SLAVE_CONFIG may be called more than once, and the
eDMA driver needs to set the slave id only once for any given channel; after
that the transfer is transparent to the device.
  On the other hand, the DMAMUX setting procedure requires disabling the dmamux
before setting it, so if it is set in DMA_SLAVE_CONFIG, the repeated setting
may be complex and unnecessary. The channel is occupied exclusively by the
peripheral.
  So, given this HW feature, I think the eDMA driver needs to set the slave id
only once, and since the of_dma helper has passed the slave id in on xlate, we
can get and set the slave id here. What do you think about this?
  Thanks!









Best Regards,
Jingchang
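
The "set the slave id once per channel" idea in the reply above can be sketched as a small userspace model. The struct, the register bit values and the function name below are assumptions made for illustration; this is not the eDMA driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed bit layout, for illustration only; check the reference manual. */
#define CHCFG_ENBL		0x80u
#define CHCFG_SOURCE(n)		((n) & 0x3fu)

/* Hypothetical model of one DMAMUX channel configuration register. */
struct mux_chan {
	unsigned char chcfg;	/* modelled register value */
	bool configured;	/* set after the first setup */
	int writes;		/* number of modelled register writes */
};

/* Route request source slot_id to the channel, at most once. */
void mux_chan_setup(struct mux_chan *c, unsigned int slot_id)
{
	assert(c != NULL);
	if (c->configured)
		return;		/* the channel is owned by one peripheral */

	c->chcfg = 0;		/* the mux is disabled before it is changed */
	c->writes++;
	c->chcfg = (unsigned char)(CHCFG_ENBL | CHCFG_SOURCE(slot_id));
	c->writes++;
	c->configured = true;
}
```

A second call for the same channel is a no-op in this model, which is the property the reply argues for: repeated configuration requests do not have to repeat the disable-then-set sequence.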





[PATCH 09/13] tracing/kprobes: Integrate duplicate set_print_fmt()

2013-09-02 Thread Namhyung Kim
From: Namhyung Kim 

The set_print_fmt() functions are implemented almost identically for
[ku]probes.  Move them to a common place and get rid of the duplication.

Acked-by: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_kprobe.c | 63 +
 kernel/trace/trace_probe.c  | 62 
 kernel/trace/trace_probe.h  |  2 ++
 kernel/trace/trace_uprobe.c | 55 +--
 4 files changed, 66 insertions(+), 116 deletions(-)
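
The two-pass pattern used by set_print_fmt() (call the formatter once with length 0 to measure, allocate, then call it again to write) can be sketched in userspace C as follows. The names format_args() and build_fmt() are hypothetical, and malloc() stands in for kmalloc(); unlike the kernel code, this sketch passes NULL rather than buf + pos during the measuring pass.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: writes a quoted format string with one " name=%lx"
 * entry per argument, or only computes the needed length when len is 0.
 */
static int format_args(char *buf, int len, const char *const *names, int n)
{
	int pos = 0;
	int i;

/* With a size of 0, snprintf() writes nothing and only reports the length. */
#define LEN_OR_ZERO (len ? len - pos : 0)
#define BUF_OR_NULL (len ? buf + pos : NULL)
	pos += snprintf(BUF_OR_NULL, LEN_OR_ZERO, "\"%s", "(%lx)");
	for (i = 0; i < n; i++)
		pos += snprintf(BUF_OR_NULL, LEN_OR_ZERO, " %s=%%lx", names[i]);
	pos += snprintf(BUF_OR_NULL, LEN_OR_ZERO, "\"");
#undef BUF_OR_NULL
#undef LEN_OR_ZERO

	return pos;	/* length without the terminating NUL */
}

/* Returns a newly allocated format string, or NULL on allocation failure. */
char *build_fmt(const char *const *names, int n)
{
	int len;
	char *fmt;

	assert(names != NULL || n == 0);
	len = format_args(NULL, 0, names, n);	/* first pass: measure */
	fmt = malloc(len + 1);
	if (!fmt)
		return NULL;
	format_args(fmt, len + 1, names, n);	/* second pass: write */
	return fmt;
}
```

Because both passes run the same code, the measured length and the written length cannot drift apart, which is presumably why the kernel helper is structured this way.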

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 2a668516f0e4..3159b114f215 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1133,67 +1133,6 @@ static int kretprobe_event_define_fields(struct 
ftrace_event_call *event_call)
return 0;
 }
 
-static int __set_print_fmt(struct trace_kprobe *tp, char *buf, int len)
-{
-   int i;
-   int pos = 0;
-
-   const char *fmt, *arg;
-
-   if (!trace_kprobe_is_return(tp)) {
-   fmt = "(%lx)";
-   arg = "REC->" FIELD_STRING_IP;
-   } else {
-   fmt = "(%lx <- %lx)";
-   arg = "REC->" FIELD_STRING_FUNC ", REC->" FIELD_STRING_RETIP;
-   }
-
-   /* When len=0, we just calculate the needed length */
-#define LEN_OR_ZERO (len ? len - pos : 0)
-
-   pos += snprintf(buf + pos, LEN_OR_ZERO, "\"%s", fmt);
-
-   for (i = 0; i < tp->p.nr_args; i++) {
-   pos += snprintf(buf + pos, LEN_OR_ZERO, " %s=%s",
-   tp->p.args[i].name, tp->p.args[i].type->fmt);
-   }
-
-   pos += snprintf(buf + pos, LEN_OR_ZERO, "\", %s", arg);
-
-   for (i = 0; i < tp->p.nr_args; i++) {
-   if (strcmp(tp->p.args[i].type->name, "string") == 0)
-   pos += snprintf(buf + pos, LEN_OR_ZERO,
-   ", __get_str(%s)",
-   tp->p.args[i].name);
-   else
-   pos += snprintf(buf + pos, LEN_OR_ZERO, ", REC->%s",
-   tp->p.args[i].name);
-   }
-
-#undef LEN_OR_ZERO
-
-   /* return the length of print_fmt */
-   return pos;
-}
-
-static int set_print_fmt(struct trace_kprobe *tp)
-{
-   int len;
-   char *print_fmt;
-
-   /* First: called with 0 length to calculate the needed length */
-   len = __set_print_fmt(tp, NULL, 0);
-   print_fmt = kmalloc(len + 1, GFP_KERNEL);
-   if (!print_fmt)
-   return -ENOMEM;
-
-   /* Second: actually write the @print_fmt */
-   __set_print_fmt(tp, print_fmt, len + 1);
-   tp->p.call.print_fmt = print_fmt;
-
-   return 0;
-}
-
 #ifdef CONFIG_PERF_EVENTS
 
 /* Kprobe profile handler */
@@ -1344,7 +1283,7 @@ static int register_kprobe_event(struct trace_kprobe *tp)
call->event.funcs = _funcs;
call->class->define_fields = kprobe_event_define_fields;
}
-   if (set_print_fmt(tp) < 0)
+   if (set_print_fmt(>p, trace_kprobe_is_return(tp)) < 0)
return -ENOMEM;
ret = register_ftrace_event(>event);
if (!ret) {
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index b7b8bda02d6e..1ab83d4c7775 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -629,3 +629,65 @@ out:
 
return ret;
 }
+
+static int __set_print_fmt(struct trace_probe *tp, char *buf, int len,
+  bool is_return)
+{
+   int i;
+   int pos = 0;
+
+   const char *fmt, *arg;
+
+   if (!is_return) {
+   fmt = "(%lx)";
+   arg = "REC->" FIELD_STRING_IP;
+   } else {
+   fmt = "(%lx <- %lx)";
+   arg = "REC->" FIELD_STRING_FUNC ", REC->" FIELD_STRING_RETIP;
+   }
+
+   /* When len=0, we just calculate the needed length */
+#define LEN_OR_ZERO (len ? len - pos : 0)
+
+   pos += snprintf(buf + pos, LEN_OR_ZERO, "\"%s", fmt);
+
+   for (i = 0; i < tp->nr_args; i++) {
+   pos += snprintf(buf + pos, LEN_OR_ZERO, " %s=%s",
+   tp->args[i].name, tp->args[i].type->fmt);
+   }
+
+   pos += snprintf(buf + pos, LEN_OR_ZERO, "\", %s", arg);
+
+   for (i = 0; i < tp->nr_args; i++) {
+   if (strcmp(tp->args[i].type->name, "string") == 0)
+   pos += snprintf(buf + pos, LEN_OR_ZERO,
+   ", __get_str(%s)",
+   tp->args[i].name);
+   else
+   pos += snprintf(buf + pos, LEN_OR_ZERO, ", REC->%s",
+   tp->args[i].name);
+   }
+
+#undef LEN_OR_ZERO
+
+   /* return the length of print_fmt */
+   return pos;
+}
+
+int set_print_fmt(struct trace_probe *tp, 

Re: [GIT PULL rcu/next] RCU commits for 3.12

2013-09-02 Thread Ingo Molnar

* Paul E. McKenney  wrote:

> Hello, Ingo,
> 
> The major changes for this series are:
> 
> 1. Update RCU documentation.  These were posted to LKML at
>    https://lkml.org/lkml/2013/8/19/611.
> 
> 2. Miscellaneous fixes.  These were posted to LKML at
>    https://lkml.org/lkml/2013/8/19/619.
> 
> 3. Full-system idle detection.  This is for use by Frederic
>    Weisbecker's adaptive-ticks mechanism.  Its purpose is
>    to allow the timekeeping CPU to shut off its tick when
>    all other CPUs are idle.  These were posted to LKML at
>    https://lkml.org/lkml/2013/8/19/648.
> 
> 4. Improve rcutorture test coverage.  These were posted to LKML at
>    https://lkml.org/lkml/2013/8/19/675.
> 
> All of these commits have been subjected to -next testing and are
> available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/next
> 
> for you to fetch changes up to 25f27ce4a6a4995c8bdd69b4b2180465ed5ad2b8:
> 
>   Merge branches 'doc.2013.08.19a', 'fixes.2013.08.20a', 
> 'sysidle.2013.08.31a' and 'torture.2013.08.20a' into HEAD (2013-08-31 
> 14:44:45 -0700)
> 
> 
> 
> Borislav Petkov (1):
>   rcu: Expedite grace periods during suspend/resume
> 
> James Hogan (1):
>   rcu: Select IRQ_WORK from TREE_PREEMPT_RCU
> 
> Paul E. McKenney (24):
>   rcu: Fix rcu_barrier() documentation
>   rcu: Simplify debug-objects fixups
>   debugobjects: Make debug_object_activate() return status
>   rcu: Make call_rcu() leak callbacks for debug-object errors
>   rcu: Avoid redundant grace-period kthread wakeups
>   rcu: Eliminate unused APIs intended for adaptive ticks
>   nohz_full: Add testing information to documentation
>   nohz_full: Add Kconfig parameter for scalable detection of all-idle 
> state
>   nohz_full: Add rcu_dyntick data for scalable detection of all-idle state
>   nohz_full: Add per-CPU idle-state tracking
>   nohz_full: Add full-system idle states and variables
>   nohz_full: Add full-system-idle arguments to API
>   rcu: Update RTFP documentation
>   doc: Fix memory-barrier control-dependency example
>   rcu: Add duplicate-callback tests to rcutorture
>   rcu: Increase rcutorture test coverage
>   rcu: Sort rcutorture module parameters
>   rcu: Remove unused variable from rcu_torture_writer()
>   rcu: Make rcutorture emit online failures if verbose
>   rcu: Simplify _rcu_barrier() processing
>   jiffies: Avoid undefined behavior from signed overflow
>   nohz_full: Add full-system-idle state machine
>   nohz_full: Force RCU's grace-period kthreads onto timekeeping CPU
>   Merge branches 'doc.2013.08.19a', 'fixes.2013.08.20a', 
> 'sysidle.2013.08.31a' and 'torture.2013.08.20a' into HEAD
> 
> Tejun Heo (1):
>   rculist: list_first_or_null_rcu() should use list_entry_rcu()
> 
>  Documentation/RCU/RTFP.txt| 858 
> --
>  Documentation/RCU/rcubarrier.txt  |  12 +-
>  Documentation/RCU/torture.txt |  10 +
>  Documentation/memory-barriers.txt |  10 +-
>  Documentation/timers/NO_HZ.txt|  44 +-
>  include/linux/debugobjects.h  |   6 +-
>  include/linux/jiffies.h   |   8 +-
>  include/linux/rculist.h   |   5 +-
>  include/linux/rcupdate.h  |  22 +-
>  init/Kconfig  |   1 +
>  kernel/rcu.h  |  10 +-
>  kernel/rcupdate.c | 100 -
>  kernel/rcutorture.c   | 388 -
>  kernel/rcutree.c  | 150 ---
>  kernel/rcutree.h  |  17 +
>  kernel/rcutree_plugin.h   | 424 ++-
>  kernel/time/Kconfig   |  50 +++
>  lib/debugobjects.c|  20 +-
>  18 files changed, 1418 insertions(+), 717 deletions(-)

Pulled, thanks a lot Paul!

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] kernel/rcutree.c: deem to be lazy if there are no callbacks.

2013-09-02 Thread Chen Gang
Hello Maintainers:

Is this issue resolved?

If you need additional help from me (e.g. testing, or anything else you
have no time for), please let me know and I will try.


Thanks.

On 08/26/2013 10:21 AM, Chen Gang F T wrote:
> 
> Firstly, thank you for your reply with these details. 
> 
> On 08/26/2013 03:18 AM, Paul E. McKenney wrote:
>> On Thu, Aug 22, 2013 at 11:01:53AM +0800, Chen Gang wrote:
>>> On 08/21/2013 10:23 PM, Paul E. McKenney wrote:
>>>> On Wed, Aug 21, 2013 at 01:59:29PM +0800, Chen Gang wrote:
>>
>> [ . . . ]
>>
>>>> Don't get me wrong, I do welcome appropriate patches.  In fact, if
>>>> you look at RCU's git history, you will see that I frequently accept
>>>> patches from a fair number of people.  And if you were willing to
>>>> invest some time and thought, you might eventually be able to generate
>>>> an appropriate (albeit low priority) patch to this function.  However,
>>>> you seem to be motivated to submit small patches with a minimum of
>>>> thought and preparation, perhaps because you need to meet some external
>>>> or self-imposed quota of accepted patches.  And if you are in fact driven
>>>> by a quota that prevents you from taking the time required to carefully
>>>> think things through, you are wasting your time with RCU.
>>>
>>> Hmm... at least some of what you said above is correct for me.
>>>
>>> I do have to provide at least 10 patches per month; it is a basic
>>> requirement for me.
>>
>> OK, that does help explain the otherwise inexplicable approach you have
>> been taking.  Let's see how you have been doing, based on committer date
>> in Linus's tree:
>>
>>   1 2012-11
>>  15 2013-01
>>   7 2013-02
>>  20 2013-03
>>  21 2013-04
>>  12 2013-05
>>  17 2013-06
>>  10 2013-07
>>
>> The last few months might be understated a bit due to patches
>> still being in maintainer trees.  This is a nice contrast from my
>> first impression of you from https://lkml.org/lkml/2013/6/9/64 and
>> https://lkml.org/lkml/2013/8/19/650, neither of which gave me any
>> reason to trust your work, to put it mildly.  And if I cannot trust
>> your work, I obviously cannot accept your patches.
>>
> 
> Hmm... it is better to check patches independently of personal feelings
> (whether one trusts someone or not).
> 
> ;-)
> 
> 
>> You do seem to select for localized bug fixes, which require less work
>> than the performance-motivated patches you were putting forward earlier
>> in this thread.  With a localized bug, you demonstrate the bug, show the
>> fix, and that is that.  From what I can see, part of the problem with
>> your patches in this email thread is that you are trying to move from
>> localized bug fixes to performance issues without doing the additional
>> work required.  Please see below for a rough outline of this additional
>> work.
>>
> 
> Hmm... it seems I need to describe my workflow for fixing bugs in detail.
> 
>   1. Is it a bug?
>      If so, I can be marked as Reported-by and continue to step 2.
>      Otherwise, it is a wasted mail.
> 
>   2. Try to fix it in a simple way (to save the maintainers' time).
>      If the maintainers accept it, fine (I can be Signed-off-by).
>      Otherwise, continue to step 3.
> 
>      Exception: if I cannot find a simple way to fix it, I will send a
>      [Suggestion] mail.
> 
>   3. Do the maintainers know how to fix it?
>      If yes, fix it together with the maintainers (I may be marked only as
>      Reported-by).
>      Otherwise, continue to the last step.
> 
>   Last: I should analyze it and fix it (it is my duty to fix it).
> 
> 
> How do you feel about this workflow? Any suggestions or additions are
> welcome.
> 
> Thanks.
> 
>>> And my focus is efficiency: letting submitters and maintainers work
>>> together to deliver contributions efficiently.
>>
>> Sounds great, but there are many possible definitions of "efficiency".
>> Given your quota, I would expect your definition to involve number of
>> patches accepted.  In contrast, my definition for RCU instead involves
>> maintainability, robustness, scalability, and, for a few critical
>> code paths, performance.  I therefore need you to have thought through
>> and carefully tested your patch.
>>
> 
> Hmm... it seems I need to describe in more detail the 'efficiency' I am
> referring to.
> 
> As long as there is no negative effect on quality, we should try to use
> fewer resources (e.g. time) to provide more contributions (e.g. fixed
> issues).
> 
> 
>>> If you already know about it, why do I need to continue?  But if you
>>> don't know either, I should try.
>>
>> What I need you to do in future RCU performance patch submissions is:
>>
>> 1.   Think through your patch and the code that it is modifying.
>>  If you submit a patch to me, you should be able to answer the
>>  sorts of questions that I was asking in this thread.
>>
>> 2.   Tell me what situations your patch helps and not.
>>
>> 3.   Tell me how much your patch improves performance in the
>>  

Re: [PATCH v2 3/3] mm/vmalloc: move VM_UNINITIALIZED just before show_numa_info

2013-09-02 Thread Zhang Yanfei
On 09/03/2013 11:00 AM, Wanpeng Li wrote:
> The VM_UNINITIALIZED/VM_UNLIST flag introduced by commit f5252e00 (mm: avoid
> null pointer access in vm_struct via /proc/vmallocinfo) is used to avoid
> accessing the pages field with unallocated pages when show_numa_info() is
> called. This patch moves the check to just before show_numa_info() so that
> some messages can still be dumped via /proc/vmallocinfo.
> 
> Signed-off-by: Wanpeng Li 

Hmmm, sorry again. Please revert commit
d157a5581548caec311dfb543ce8a79e283e. That said, we could still
do the check in show_numa_info like before.

> ---
>  mm/vmalloc.c |   10 +-
>  1 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index e3ec8b4..c4720cd 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2590,11 +2590,6 @@ static int s_show(struct seq_file *m, void *p)
>  
>   v = va->vm;
>  
> - /* Pair with smp_wmb() in clear_vm_uninitialized_flag() */
> - smp_rmb();
> - if (v->flags & VM_UNINITIALIZED)
> - return 0;
> -
>   seq_printf(m, "0x%pK-0x%pK %7ld",
>   v->addr, v->addr + v->size, v->size);
>  
> @@ -2622,6 +2617,11 @@ static int s_show(struct seq_file *m, void *p)
>   if (v->flags & VM_VPAGES)
>   seq_printf(m, " vpages");
>  
> + /* Pair with smp_wmb() in clear_vm_uninitialized_flag() */
> + smp_rmb();
> + if (v->flags & VM_UNINITIALIZED)
> + return 0;
> +
>   show_numa_info(m, v);
>   seq_putc(m, '\n');
>   return 0;
> 


-- 
Thanks.
Zhang Yanfei


Re: [PATCH RESEND 2/3] x86, mm: Update min_pfn_mapped in add_pfn_range_mapped().

2013-09-02 Thread Tang Chen

On 09/03/2013 10:48 AM, Yinghai Lu wrote:

On Mon, Sep 2, 2013 at 6:06 PM, Tang Chen  wrote:

Hi Yinghai,

On 09/03/2013 02:41 AM, Yinghai Lu wrote:



How about changing the "for (from low to high)" in init_range_memory_mapping()
to "for_rev (from high to low)"?
Then we can update min_pfn_mapped in add_pfn_range_mapped().

And also, since the outer loop is from high to low, we can change the inner
loop to be from high to low too.


No, there is another reason for doing the local loop from low to high.

kernel_physical_mapping_init() could clear some mappings near the end
of PUD/PMD entries but not the head.


Thanks for your explanation. But sorry, I'd like to understand it more clearly.


Are you talking about the following code?
phys_pud_init()
{
	if (addr >= end) {
		if (!after_bootmem &&
		    !e820_any_mapped(addr & PUD_MASK, next, E820_RAM) &&
		    !e820_any_mapped(addr & PUD_MASK, next, E820_RESERVED_KERN))
			set_pud(pud, __pud(0));
		continue;
	}
}
It will clear the PUD/PMD entries that are out of range.


But,
init_mem_mapping()
{
	while (from high to low) {
		init_range_memory_mapping()
		{
			for (from low to high) {	/* I'm saying changing this loop */
				init_memory_mapping()
				{
					for () {
						/* Not this one */
						kernel_physical_mapping_init();
					}
					add_pfn_range_mapped();
				}
			}
		}
	}
}

I'm talking about changing the outer loop in init_range_memory_mapping(), not
the one in init_memory_mapping().
I think it is OK to call init_memory_mapping() in any order. The loop is
outside init_memory_mapping(), right?


Inside init_memory_mapping(), the order is still from low to high. But once
kernel_physical_mapping_init() has finished, we can update min_pfn_mapped in
add_pfn_range_mapped(), because the outer loop runs from high to low.


Am I missing something here?  Please tell me.





I think updating min_pfn_mapped in init_mem_mapping() is less readable. And
min_pfn_mapped and max_pfn_mapped should be updated together.


min_pfn_mapped is an early local variable that controls allocation in
alloc_low_pages. Putting it in init_mem_mapping is more readable.



But add_pfn_range_mapped() is in the same file as init_mem_mapping(). I think
it is OK to update min_pfn_mapped in it.

Thanks.
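
The proposal discussed above (walk the ranges from high to low and update both bounds in the same helper) can be sketched as a userspace model. All names below are hypothetical stand-ins, and the sketch ignores the page-table allocation constraints that the reply raises; it only shows the bookkeeping.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-ins for min_pfn_mapped / max_pfn_mapped. */
static unsigned long min_pfn_seen = ULONG_MAX;
static unsigned long max_pfn_seen;

struct pfn_range {
	unsigned long start;	/* first pfn in the range */
	unsigned long end;	/* one past the last pfn */
};

/* Record a newly mapped range; both bounds are updated in one place. */
void note_range_mapped(unsigned long start_pfn, unsigned long end_pfn)
{
	assert(start_pfn < end_pfn);
	if (start_pfn < min_pfn_seen)
		min_pfn_seen = start_pfn;
	if (end_pfn > max_pfn_seen)
		max_pfn_seen = end_pfn;
}

/* Walk the ranges from the highest to the lowest. */
void map_ranges_high_to_low(const struct pfn_range *r, int n)
{
	assert(r != NULL || n == 0);
	for (int i = n - 1; i >= 0; i--)
		note_range_mapped(r[i].start, r[i].end);
}
```

In this model the lower bound only becomes visible after a range has been recorded, so with a high-to-low walk it moves downward one range at a time.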


Re: [PATCH v2 2/3] mm/vmalloc: don't warning vmalloc allocation failure twice

2013-09-02 Thread Zhang Yanfei
On 09/03/2013 11:00 AM, Wanpeng Li wrote:
> Don't warn twice, in __vmalloc_area_node and __vmalloc_node_range, if the
> __vmalloc_area_node allocation fails.
> 
> Signed-off-by: Wanpeng Li 

OK, I missed the warning in __vmalloc_area_node(), so you are right.
You can just revert the commit 46c001a2753f47ffa621131baa3409e636515347.

> ---
>  mm/vmalloc.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index d78d117..e3ec8b4 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1635,7 +1635,7 @@ void *__vmalloc_node_range(unsigned long size, unsigned 
> long align,
>  
>   addr = __vmalloc_area_node(area, gfp_mask, prot, node, caller);
>   if (!addr)
> - goto fail;
> + return NULL;
>  
>   /*
>* In this function, newly allocated vm_struct has VM_UNINITIALIZED
> 


-- 
Thanks.
Zhang Yanfei


Re: [PATCH v2 1/3] mm/vmalloc: don't set area->caller twice

2013-09-02 Thread Zhang Yanfei
On 09/03/2013 11:00 AM, Wanpeng Li wrote:
> Changelog:
>  * rebase against mmotm tree
> 
> The caller address has already been set in set_vmalloc_vm(), there's no need
> to set it again in __vmalloc_area_node.
> 
> Signed-off-by: Wanpeng Li 

Reviewed-by: Zhang Yanfei 

> ---
>  mm/vmalloc.c |1 -
>  1 files changed, 0 insertions(+), 1 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 1074543..d78d117 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1566,7 +1566,6 @@ static void *__vmalloc_area_node(struct vm_struct 
> *area, gfp_t gfp_mask,
>   pages = kmalloc_node(array_size, nested_gfp, node);
>   }
>   area->pages = pages;
> - area->caller = caller;
>   if (!area->pages) {
>   remove_vm_area(area->addr);
>   kfree(area);
> 


-- 
Thanks.
Zhang Yanfei


Re: [PATCH] mm/shmem.c: check the return value of mpol_to_str()

2013-09-02 Thread Chen Gang
Hello Maintainers:

Please help check this patch when you have time.

If it needs additional testing, please let me know and I will try (suggestions
for tests are welcome).


Thanks.

On 08/22/2013 09:04 AM, Chen Gang wrote:
> mpol_to_str() may fail without filling the buffer (e.g. with -EINVAL), so its
> return value needs to be checked; otherwise the buffer may not be
> NUL-terminated and the next seq_printf() will cause problems.
> 
> shmem_show_mpol() also needs to return a value, since it may fail.
> 
> Signed-off-by: Chen Gang 
> Reviewed-by: Cyrill Gorcunov 
> ---
>  mm/shmem.c |   16 ++--
>  1 files changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/shmem.c b/mm/shmem.c
> index f00c1c1..b4d44db 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -883,16 +883,20 @@ redirty:
>  
>  #ifdef CONFIG_NUMA
>  #ifdef CONFIG_TMPFS
> -static void shmem_show_mpol(struct seq_file *seq, struct mempolicy *mpol)
> +static int shmem_show_mpol(struct seq_file *seq, struct mempolicy *mpol)
>  {
>   char buffer[64];
> + int ret;
>  
>   if (!mpol || mpol->mode == MPOL_DEFAULT)
> - return; /* show nothing */
> + return 0;   /* show nothing */
>  
> - mpol_to_str(buffer, sizeof(buffer), mpol);
> + ret = mpol_to_str(buffer, sizeof(buffer), mpol);
> + if (ret < 0)
> + return ret;
>  
>   seq_printf(seq, ",mpol=%s", buffer);
> + return 0;
>  }
>  
>  static struct mempolicy *shmem_get_sbmpol(struct shmem_sb_info *sbinfo)
> @@ -951,8 +955,9 @@ static struct page *shmem_alloc_page(gfp_t gfp,
>  }
>  #else /* !CONFIG_NUMA */
>  #ifdef CONFIG_TMPFS
> -static inline void shmem_show_mpol(struct seq_file *seq, struct mempolicy 
> *mpol)
> +static inline int shmem_show_mpol(struct seq_file *seq, struct mempolicy 
> *mpol)
>  {
> + return 0;
>  }
>  #endif /* CONFIG_TMPFS */
>  
> @@ -2555,8 +2560,7 @@ static int shmem_show_options(struct seq_file *seq, 
> struct dentry *root)
>   if (!gid_eq(sbinfo->gid, GLOBAL_ROOT_GID))
>   seq_printf(seq, ",gid=%u",
>   from_kgid_munged(_user_ns, sbinfo->gid));
> - shmem_show_mpol(seq, sbinfo->mpol);
> - return 0;
> + return shmem_show_mpol(seq, sbinfo->mpol);
>  }
>  #endif /* CONFIG_TMPFS */
>  
> 


-- 
Chen Gang
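
The pattern in the patch above (do not print a buffer unless the formatter reported success) can be sketched in userspace C. mode_to_str() and show_mode() are hypothetical stand-ins for mpol_to_str() and shmem_show_mpol(), and the mode table is invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical formatter that can fail without touching buf. */
static int mode_to_str(char *buf, size_t len, int mode)
{
	static const char *const names[] = {
		"default", "prefer", "bind", "interleave"
	};

	if (mode < 0 || mode >= (int)(sizeof(names) / sizeof(names[0])))
		return -EINVAL;
	if (strlen(names[mode]) >= len)
		return -ENOSPC;
	strcpy(buf, names[mode]);
	return 0;
}

/* Writes ",mpol=<mode>" to out; returns 0 or a negative errno value. */
int show_mode(char *out, size_t outlen, int mode)
{
	char buffer[64];
	int ret;

	assert(out != NULL && outlen > 0);
	if (mode == 0)
		return 0;	/* show nothing for the default mode */

	ret = mode_to_str(buffer, sizeof(buffer), mode);
	if (ret < 0)
		return ret;	/* buffer was never filled: do not print it */

	snprintf(out, outlen, ",mpol=%s", buffer);
	return 0;
}
```

The caller can then return show_mode()'s result directly, which mirrors the last hunk of the patch.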


RE: [PATCH 4/4] Documentation: Add device tree bindings for Freescale FTM PWM

2013-09-02 Thread Xiubo Li-B47053
> Subject: Re: [PATCH 4/4] Documentation: Add device tree bindings for
> Freescale FTM PWM
> 
> On 08/30/2013 01:19 PM, Kumar Gala wrote:
> > Should have at least something w/regards to a commit message.
> >
> > On Aug 20, 2013, at 10:07 PM, Xiubo Li wrote:
> >
> >> Signed-off-by: Xiubo Li 
> >> ---
> >> .../devicetree/bindings/pwm/fsl-ftm-pwm.txt| 52
> ++
> >> 1 file changed, 52 insertions(+)
> >> create mode 100644
> >> Documentation/devicetree/bindings/pwm/fsl-ftm-pwm.txt
> >>
> >> diff --git a/Documentation/devicetree/bindings/pwm/fsl-ftm-pwm.txt
> >> b/Documentation/devicetree/bindings/pwm/fsl-ftm-pwm.txt
> >> new file mode 100644
> >> index 000..698965b
> >> --- /dev/null
> >> +++ b/Documentation/devicetree/bindings/pwm/fsl-ftm-pwm.txt
> >> @@ -0,0 +1,52 @@
> >> +Freescale FTM PWM controller
> >> +
> >> +Required properties:
> >> +- compatible: should be "fsl,vf610-ftm-pwm"
> >> +- reg: physical base address and length of the controller's
> >> +registers
> >> +- #pwm-cells: Should be 3. Number of cells being used to specify PWM
> property.
> >> +  First cell specifies the per-chip channel index of the PWM to use,
> >> +the
> >> +  second cell is the period in nanoseconds and bit 0 in the third
> >> +cell is
> >> +  used to encode the polarity of PWM output. Set bit 0 of the third
> >> +in PWM
> >> +  specifier to 1 for inverse polarity & set to 0 for normal polarity.
> >> +- fsl,pwm-clk-ps: the ftm0 pwm clock's prescaler, divide-by 2^n(n = 0
> ~ 7).
> >> +- fsl,pwm-cpwm: Center-Aligned PWM (CPWM) mode.
> >
> > Should describe this in more detail, what does the value actually mean
> for what modes there are?
> 
> Assuming "CPWM" is clearly explained in the HW documentation for this
> chip (I have no idea if that's actually the case), then is it still
> necessary to explain what this means in *detail*? Perhaps simply "see
> section XXX in the TRM" or "see register XXX, bit YYY in the HW
> documentation" would be enough?
>
Explaining the 'CPWM' mode clearly may need many more words, so I think I will
just explain it briefly and, for more detailed information, refer to "see
section XXX in the TRM" or "see register XXX, bit YYY in the HW documentation".


Thanks.

--
Best Regards,
Xiubo




Re: [PATCH] kernel/groups.c: consider about NULL for 'group_info' in all related extern functions

2013-09-02 Thread Chen Gang
Hello Maintainers:

Please help check this patch when you have time.

If it needs a related test, please let me know and I will try (suggestions
for tests are welcome).


Thanks.

On 08/20/2013 11:03 AM, Chen Gang wrote:
> 
> If this patch is correct, the man page also needs to be updated for the
> return value of getgroups().
> 
> Thanks.
> 
> On 08/20/2013 11:01 AM, Chen Gang wrote:
>> groups_alloc() can return NULL for 'group_info', and groups_search()
>> already handles NULL for 'group_info', so we can assume the caller is
>> allowed to use all related extern functions when 'group_info' is NULL.
>>
>> groups_free() needs to check for NULL to match groups_alloc(), just like
>> kmalloc()/kfree().
>>
>> set_groups() can allow the caller to set a NULL parameter in the new
>> 'cred'.
>>
>> For the getgroups() system call, if 'cred->group_info' is NULL, return
>> the related error code (no related data); the related man page
>> ("man 2 getgroups") also needs to be changed to document the return value.
>>
>>
>> Signed-off-by: Chen Gang 
>> ---
>>  kernel/groups.c |   14 +++---
>>  1 files changed, 11 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel/groups.c b/kernel/groups.c
>> index 6b2588d..a21a4ce 100644
>> --- a/kernel/groups.c
>> +++ b/kernel/groups.c
>> @@ -52,6 +52,9 @@ EXPORT_SYMBOL(groups_alloc);
>>
>>  void groups_free(struct group_info *group_info)
>>  {
>> +if (!group_info)
>> +return;
>> +
>>  if (group_info->blocks[0] != group_info->small_block) {
>>  int i;
>>  for (i = 0; i < group_info->nblocks; i++)
>> @@ -163,9 +166,12 @@ int groups_search(const struct group_info
>> *group_info, kgid_t grp)
>>   */
>>  int set_groups(struct cred *new, struct group_info *group_info)
>>  {
>> -put_group_info(new->group_info);
>> -groups_sort(group_info);
>> -get_group_info(group_info);
>> +if (new->group_info)
>> +put_group_info(new->group_info);
>> +if (group_info) {
>> +groups_sort(group_info);
>> +get_group_info(group_info);
>> +}
>>  new->group_info = group_info;
>>  return 0;
>>  }
>> @@ -206,6 +212,8 @@ SYSCALL_DEFINE2(getgroups, int, gidsetsize, gid_t
>> __user *, grouplist)
>>
>>  if (gidsetsize < 0)
>>  return -EINVAL;
>> +if (!cred->group_info)
>> +return -ENODATA;
>>
>>  /* no need to grab task_lock here; it cannot change */
>>  i = cred->group_info->ngroups;
>>
> 
> 


-- 
Chen Gang
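
The convention the patch argues for (a free routine that accepts NULL, as free() itself does) can be sketched in userspace C. struct id_list and both functions are hypothetical; this shows the convention only and says nothing about whether the kernel adopted the change.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container standing in for struct group_info. */
struct id_list {
	int nr;
	int *ids;
};

/* Returns a zero-filled list of nr ids, or NULL on failure. */
struct id_list *id_list_alloc(int nr)
{
	struct id_list *l;

	if (nr < 0)
		return NULL;
	l = malloc(sizeof(*l));
	if (!l)
		return NULL;
	l->ids = calloc(nr ? (size_t)nr : 1, sizeof(*l->ids));
	if (!l->ids) {
		free(l);
		return NULL;
	}
	l->nr = nr;
	return l;
}

/* Mirrors free(NULL): releasing "nothing" is a harmless no-op. */
void id_list_free(struct id_list *l)
{
	if (!l)
		return;
	free(l->ids);
	free(l);
}
```

With this shape, an error path can call id_list_free() unconditionally on whatever id_list_alloc() returned.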


[PATCH] kernel/sysctl.c: check return value after call proc_put_char() in __do_proc_doulongvec_minmax()

2013-09-02 Thread Chen Gang
Check the return value of proc_put_char(), as is already done for the other
calls in __do_proc_doulongvec_minmax().

Signed-off-by: Chen Gang 
---
 kernel/sysctl.c |5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index b2f06f3..7453418 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -2214,8 +2214,11 @@ static int __do_proc_doulongvec_minmax(void *data, 
struct ctl_table *table, int
*i = val;
} else {
val = convdiv * (*i) / convmul;
-   if (!first)
+   if (!first) {
err = proc_put_char(, , '\t');
+   if (err)
+   break;
+   }
err = proc_put_long(, , val, false);
if (err)
break;
-- 
1.7.7.6
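
The loop shape after this patch (stop at the first failed write, whether it is the separator or the value) can be sketched in userspace C. put_char() and put_long() below are hypothetical bounded writers, not the kernel's proc_put_char() and proc_put_long().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical bounded writer: appends one character, or fails if full. */
static int put_char(char **buf, size_t *left, char c)
{
	if (*left < 2)		/* keep room for the terminating NUL */
		return -EINVAL;
	*(*buf)++ = c;
	**buf = '\0';
	(*left)--;
	return 0;
}

/* Hypothetical bounded writer: appends a decimal value, or fails if full. */
static int put_long(char **buf, size_t *left, unsigned long val)
{
	int n = snprintf(*buf, *left, "%lu", val);

	if (n < 0 || (size_t)n >= *left)
		return -EINVAL;
	*buf += n;
	*left -= (size_t)n;
	return 0;
}

/* Writes the values separated by tabs; stops at the first failed write. */
int put_values(char *buf, size_t size, const unsigned long *vals, int n)
{
	size_t left = size;
	int err = 0;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';
	for (int i = 0; i < n; i++) {
		if (i > 0) {
			err = put_char(&buf, &left, '\t');
			if (err)
				break;
		}
		err = put_long(&buf, &left, vals[i]);
		if (err)
			break;
	}
	return err;
}
```

Without the check after put_char(), a failed separator write would be overwritten by the result of the following put_long() call and silently lost.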


Re: Large pastes into readline enabled programs causes breakage from v2.6.31 onwards

2013-09-02 Thread Arkadiusz Miskiewicz
On Sunday 18 of August 2013, Margarita Manterola wrote:
> Hi,
> 
> On Sat, Aug 17, 2013 at 5:28 PM, Pavel Machek  wrote:
> >> diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
> >> index 4bf0fc0..2ba7f4e 100644
> >> --- a/drivers/tty/n_tty.c
> >> +++ b/drivers/tty/n_tty.c
> >> @@ -149,7 +149,8 @@ static int set_room(struct tty_struct *tty)
> >> 
> >>  * characters will be beeped.
> >>  */
> >> 
> >> if (left <= 0)
> >> 
> >> -   left = ldata->icanon && !ldata->canon_data;
> >> +   if (waitqueue_active(>read_wait))
> >> +   left = ldata->icanon && !ldata->canon_data;
> >> 
> >> old_left = tty->receive_room;
> >> tty->receive_room = left;
> > 
> > Was this applied? You may want to cc rjw... it is a regression, it is
> > > not pretty, and it is something I believe I hit but thought it was some
> > kind of "X weirdness".
> 
> There were no replies to the previous mail asking for comments, and as
> far as I can see this has not been applied. I don't know who rjw is,
> could you be a bit more explicit, please?

Hi.

Was there any continuation of this thread, or did it die completely?

-- 
Arkadiusz Miśkiewicz, arekm / maven.pl


Re: [PATCH] kernel/futex.c: notice the return value after rt_mutex_finish_proxy_lock() fails

2013-09-02 Thread Chen Gang
Hello Maintainers:

Please help check this patch, when you have time.


Thanks.

On 08/21/2013 11:48 AM, Chen Gang wrote:
> On 08/21/2013 12:19 AM, Darren Hart wrote:
>> On Tue, 2013-08-20 at 11:07 +0800, Chen Gang wrote:
>>
>>
>> Hi Chen,
>>
>>> rt_mutex_finish_proxy_lock() can return a failure code (e.g. -EINTR,
>>> -ETIMEDOUT).
>>>
>>> The original implementation already notices this, but does not check it
>>> before the next work.
>>>
>>> Also keep comments within 80 columns to pass "./scripts/checkpatch.pl".
>>>
>>>
>>> Signed-off-by: Chen Gang 
>>> ---
>>>  kernel/futex.c |   30 --
>>>  1 files changed, 16 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/kernel/futex.c b/kernel/futex.c
>>> index c3a1a55..1a94e7d 100644
>>> --- a/kernel/futex.c
>>> +++ b/kernel/futex.c
>>> @@ -2373,21 +2373,23 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, 
>>> unsigned int flags,
>>> ret = rt_mutex_finish_proxy_lock(pi_mutex, to, _waiter, 1);
>>> debug_rt_mutex_free_waiter(_waiter);
>>>  
>>> -   spin_lock(q.lock_ptr);
>>> -   /*
>>> -* Fixup the pi_state owner and possibly acquire the lock if we
>>> -* haven't already.
>>> -*/
>>> -   res = fixup_owner(uaddr2, , !ret);
>>
>>
>> This call catches a corner case which appears to be skipped now. Or am I
>> missing how you accounted for that?
>>
>>
> 
> Pardon ?
> 
> Hmm... this patch moves the related code block into "if (!ret) {...}"; it
> should not remove any code.
> 
> Please check again whether what I have done is correct.
> 
> Thanks.
> 
>>> -   /*
>>> -* If fixup_owner() returned an error, proprogate that.  If it
>>> -* acquired the lock, clear -ETIMEDOUT or -EINTR.
>>> -*/
>>> -   if (res)
>>> -   ret = (res < 0) ? res : 0;
>>> +   if (!ret) {
>>> +   spin_lock(q.lock_ptr);
>>> +   /*
>>> +* Fixup the pi_state owner and possibly acquire the
>>> +* lock if we haven't already.
>>> +*/
>>> +   res = fixup_owner(uaddr2, , !ret);
>>> +   /*
>>> +* If fixup_owner() returned an error, proprogate that.
>>> +* If it acquired the lock, clear -ETIMEDOUT or -EINTR.
>>> +*/
>>> +   if (res)
>>> +   ret = (res < 0) ? res : 0;
>>>  
>>> -   /* Unqueue and drop the lock. */
>>> -   unqueue_me_pi();
>>> +   /* Unqueue and drop the lock. */
>>> +   unqueue_me_pi();
>>> +   }
>>> }
>>>  
>>> /*
>>
>> Thanks,
>>
> 
> 


-- 
Chen Gang
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] kernel/taskstats.c: add nla_nest_cancel() for failure processing between nla_nest_start() and nla_nest_end()

2013-09-02 Thread Chen Gang
Hello maintainers:

Please review this patch when you have time.

Thanks.

On 08/20/2013 10:44 AM, Chen Gang wrote:
> When a failure occurs between nla_nest_start() and nla_nest_end(), we
> need to call nla_nest_cancel() to clean up the partially built nest.
> 
> Signed-off-by: Chen Gang 
> ---
>  kernel/taskstats.c |8 ++--
>  1 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> index 145bb4d..1db6808 100644
> --- a/kernel/taskstats.c
> +++ b/kernel/taskstats.c
> @@ -404,11 +404,15 @@ static struct taskstats *mk_reply(struct sk_buff *skb, 
> int type, u32 pid)
>   if (!na)
>   goto err;
>  
> - if (nla_put(skb, type, sizeof(pid), ) < 0)
> + if (nla_put(skb, type, sizeof(pid), ) < 0) {
> + nla_nest_cancel(skb, na);
>   goto err;
> + }
>   ret = nla_reserve(skb, TASKSTATS_TYPE_STATS, sizeof(struct taskstats));
> - if (!ret)
> + if (!ret) {
> + nla_nest_cancel(skb, na);
>   goto err;
> + }
>   nla_nest_end(skb, na);
>  
>   return nla_data(ret);
> 
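
The cancel-on-failure pattern described in the patch can be sketched in
userspace. This is only an illustration under stated assumptions: struct
toy_msg and every toy_*() helper are hypothetical stand-ins, not the kernel's
sk_buff or netlink attribute API. The idea shown is that "cancel" trims the
buffer back to the offset where the nest started, so a failed build leaves
nothing half-written behind.

```c
#include <stddef.h>
#include <string.h>

/* Toy message buffer; a stand-in for an sk_buff, not the kernel type. */
struct toy_msg {
	unsigned char data[64];
	size_t len;
};

/* Append bytes; returns 0 on success, -1 if the buffer is too small. */
static int toy_put(struct toy_msg *m, const void *src, size_t n)
{
	if (n > sizeof(m->data) - m->len)
		return -1;
	memcpy(m->data + m->len, src, n);
	m->len += n;
	return 0;
}

/* Open a nested section: write a 1-byte header and return its offset. */
static long toy_nest_start(struct toy_msg *m)
{
	size_t start = m->len;
	unsigned char hdr = 0;

	if (toy_put(m, &hdr, 1) < 0)
		return -1;
	return (long)start;
}

/* Cancel a nested section: trim the buffer back to where it began. */
static void toy_nest_cancel(struct toy_msg *m, long start)
{
	m->len = (size_t)start;
}

/* Build a nest holding n payload bytes; roll back if anything fails. */
static int toy_build(struct toy_msg *m, size_t n)
{
	unsigned char payload[128] = {0};
	long start = toy_nest_start(m);

	if (start < 0)
		return -1;
	if (n > sizeof(payload) || toy_put(m, payload, n) < 0) {
		toy_nest_cancel(m, start);	/* drop the dangling header */
		return -1;
	}
	return 0;
}
```

Without the toy_nest_cancel() call, the failed build would leave the 1-byte
header in the buffer, which is the toy analogue of the leftover state the
patch wants to clean up.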


-- 
Chen Gang


[RFC/PATCH] ftrace: add set_graph_notrace filter

2013-09-02 Thread Namhyung Kim
From: Namhyung Kim 

The set_graph_notrace filter is analogous to set_ftrace_notrace and
can be used to eliminate uninteresting parts of the function graph trace
output.  It also works nicely with set_graph_function.

  # cd /sys/kernel/debug/tracing/
  # echo do_page_fault > set_graph_function
  # perf ftrace live true
   2)   |  do_page_fault() {
   2)   |__do_page_fault() {
   2)   0.381 us|  down_read_trylock();
   2)   0.055 us|  __might_sleep();
   2)   0.696 us|  find_vma();
   2)   |  handle_mm_fault() {
   2)   |handle_pte_fault() {
   2)   |  __do_fault() {
   2)   |filemap_fault() {
   2)   |  find_get_page() {
   2)   0.033 us|__rcu_read_lock();
   2)   0.035 us|__rcu_read_unlock();
   2)   1.696 us|  }
   2)   0.031 us|  __might_sleep();
   2)   2.831 us|}
   2)   |_raw_spin_lock() {
   2)   0.046 us|  add_preempt_count();
   2)   0.841 us|}
   2)   0.033 us|page_add_file_rmap();
   2)   |_raw_spin_unlock() {
   2)   0.057 us|  sub_preempt_count();
   2)   0.568 us|}
   2)   |unlock_page() {
   2)   0.084 us|  page_waitqueue();
   2)   0.126 us|  __wake_up_bit();
   2)   1.117 us|}
   2)   7.729 us|  }
   2)   8.397 us|}
   2)   8.956 us|  }
   2)   0.085 us|  up_read();
   2) + 12.745 us   |}
   2) + 13.401 us   |  }
  ...

  # echo handle_mm_fault > set_graph_notrace
  # perf ftrace live true
   1)   |  do_page_fault() {
   1)   |__do_page_fault() {
   1)   0.205 us|  down_read_trylock();
   1)   0.041 us|  __might_sleep();
   1)   0.344 us|  find_vma();
   1)   0.069 us|  up_read();
   1)   4.692 us|}
   1)   5.311 us|  }
  ...

Signed-off-by: Namhyung Kim 
---
 include/linux/ftrace.h   |   1 +
 kernel/trace/ftrace.c| 118 ++-
 kernel/trace/trace.h |  23 +++
 kernel/trace/trace_functions_graph.c |  21 ++-
 4 files changed, 159 insertions(+), 4 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 9f15c0064c50..ec85d48619e1 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -721,6 +721,7 @@ ftrace_push_return_trace(unsigned long ret, unsigned long 
func, int *depth,
 extern char __irqentry_text_start[];
 extern char __irqentry_text_end[];
 
+#define FTRACE_NOTRACE_DEPTH 65536
 #define FTRACE_RETFUNC_DEPTH 50
 #define FTRACE_RETSTACK_ALLOC_SIZE 32
 extern int register_ftrace_graph(trace_func_graph_ret_t retfunc,
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index a6d098c6df3f..1b1f3409f788 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -3819,6 +3819,43 @@ static const struct seq_operations ftrace_graph_seq_ops 
= {
.show = g_show,
 };
 
+int ftrace_graph_notrace_count;
+int ftrace_graph_notrace_enabled;
+unsigned long ftrace_graph_notrace_funcs[FTRACE_GRAPH_MAX_FUNCS] __read_mostly;
+
+static void *
+__n_next(struct seq_file *m, loff_t *pos)
+{
+   if (*pos >= ftrace_graph_notrace_count)
+   return NULL;
+   return _graph_notrace_funcs[*pos];
+}
+
+static void *
+n_next(struct seq_file *m, void *v, loff_t *pos)
+{
+   (*pos)++;
+   return __n_next(m, pos);
+}
+
+static void *n_start(struct seq_file *m, loff_t *pos)
+{
+   mutex_lock(_lock);
+
+   /* Nothing, tell g_show to print all functions are enabled */
+   if (!ftrace_graph_notrace_enabled && !*pos)
+   return (void *)1;
+
+   return __n_next(m, pos);
+}
+
+static const struct seq_operations ftrace_graph_notrace_seq_ops = {
+   .start = n_start,
+   .next = n_next,
+   .stop = g_stop,
+   .show = g_show,
+};
+
 static int
 ftrace_graph_open(struct inode *inode, struct file *file)
 {
@@ -3843,6 +3880,30 @@ ftrace_graph_open(struct inode *inode, struct file *file)
 }
 
 static int
+ftrace_graph_notrace_open(struct inode *inode, struct file *file)
+{
+   int ret = 0;
+
+   if (unlikely(ftrace_disabled))
+   return -ENODEV;
+
+   mutex_lock(_lock);
+   if ((file->f_mode & FMODE_WRITE) &&
+   (file->f_flags & O_TRUNC)) {
+   ftrace_graph_notrace_enabled = 0;
+   ftrace_graph_notrace_count = 0;
+   memset(ftrace_graph_notrace_funcs, 0,
+  sizeof(ftrace_graph_notrace_funcs));
+   }
+   mutex_unlock(_lock);
+
+   if (file->f_mode & FMODE_READ)
+   ret = seq_open(file, _graph_notrace_seq_ops);
+
+   return ret;
+}
+
+static int
 

[PATCH] kernel/delayacct.c: remove redundancy checking in __delayacct_add_tsk()

2013-09-02 Thread Chen Gang
The wrapper function delayacct_add_tsk() already checks 'tsk->delays',
and __delayacct_add_tsk() has no other direct callers, so the redundant
check can be removed.

The label 'done' then becomes unused, so remove it, too.


Signed-off-by: Chen Gang 
---
 kernel/delayacct.c |7 ---
 1 files changed, 0 insertions(+), 7 deletions(-)

diff --git a/kernel/delayacct.c b/kernel/delayacct.c
index d473988..54996b7 100644
--- a/kernel/delayacct.c
+++ b/kernel/delayacct.c
@@ -108,12 +108,6 @@ int __delayacct_add_tsk(struct taskstats *d, struct 
task_struct *tsk)
struct timespec ts;
cputime_t utime, stime, stimescaled, utimescaled;
 
-   /* Though tsk->delays accessed later, early exit avoids
-* unnecessary returning of other data
-*/
-   if (!tsk->delays)
-   goto done;
-
tmp = (s64)d->cpu_run_real_total;
task_cputime(tsk, , );
cputime_to_timespec(utime + stime, );
@@ -158,7 +152,6 @@ int __delayacct_add_tsk(struct taskstats *d, struct 
task_struct *tsk)
d->freepages_count += tsk->delays->freepages_count;
spin_unlock_irqrestore(>delays->lock, flags);
 
-done:
return 0;
 }
 
-- 
1.7.7.6


Re: [PATCH 1/8] partitions/efi: use lba-aware partition records

2013-09-02 Thread Davidlohr Bueso
On Mon, 2013-09-02 at 12:10 +0200, Karel Zak wrote:
> On Mon, Aug 05, 2013 at 10:21:09PM -0700, Davidlohr Bueso wrote:
> >  
> > +typedef struct _gpt_record {
> > +u8  boot_indicator; /* unused by EFI, set to 0x80 for bootable 
> > */
> > +u8  start_head; /* unused by EFI, pt start in CHS */
> > +u8  start_sector;   /* unused by EFI, pt start in CHS */
> > +u8  start_track;
> > +u8  os_type;/* EFI and legacy non-EFI OS types */
> > +u8  end_head;   /* unused by EFI, pt end in CHS */
> > +u8  end_sector; /* unused by EFI, pt end in CHS */
> > +u8  end_track;  /* unused by EFI, pt end in CHS */
> > +__le32  starting_lba;   /* used by EFI - start addr of the on disk 
> > pt */
> > +__le32  size_in_lba;/* used by EFI - size of pt in LBA */
> > +} __attribute__ ((packed)) gpt_record;
> > +
> 
>  Maybe it would be better to rename this struct to "gpt_mbr_record" to
>  make it more obvious.

Yes, good idea. I've added the patch below.

8<--
From: Davidlohr Bueso 
Subject: [PATCH] partitions/efi: rename gpt_record structure

Since the gpt_record structure is an MBR-specific
type, rename it to gpt_mbr_record to make that
obvious.

Suggested-by: Karel Zak 
Signed-off-by: Davidlohr Bueso 
---
 block/partitions/efi.c | 2 +-
 block/partitions/efi.h | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/block/partitions/efi.c b/block/partitions/efi.c
index 8e6d77e..9a4eba7 100644
--- a/block/partitions/efi.c
+++ b/block/partitions/efi.c
@@ -152,7 +152,7 @@ static u64 last_lba(struct block_device *bdev)
   bdev_logical_block_size(bdev)) - 1ULL;
 }
 
-static inline int pmbr_part_valid(gpt_record *part)
+static inline int pmbr_part_valid(gpt_mbr_record *part)
 {
 if (part->os_type != EFI_PMBR_OSTYPE_EFI_GPT)
 goto invalid;
diff --git a/block/partitions/efi.h b/block/partitions/efi.h
index 9ab8ee9..54b2687 100644
--- a/block/partitions/efi.h
+++ b/block/partitions/efi.h
@@ -104,7 +104,7 @@ typedef struct _gpt_entry {
efi_char16_t partition_name[72 / sizeof (efi_char16_t)];
 } __attribute__ ((packed)) gpt_entry;
 
-typedef struct _gpt_record {
+typedef struct _gpt_mbr_record {
 u8  boot_indicator; /* unused by EFI, set to 0x80 for bootable */
 u8  start_head; /* unused by EFI, pt start in CHS */
 u8  start_sector;   /* unused by EFI, pt start in CHS */
@@ -115,14 +115,14 @@ typedef struct _gpt_record {
 u8  end_track;  /* unused by EFI, pt end in CHS */
 __le32  starting_lba;   /* used by EFI - start addr of the on disk pt 
*/
 __le32  size_in_lba;/* used by EFI - size of pt in LBA */
-} __attribute__ ((packed)) gpt_record;
+} __attribute__ ((packed)) gpt_mbr_record;
 
 
 typedef struct _legacy_mbr {
u8 boot_code[440];
__le32 unique_mbr_signature;
__le16 unknown;
-   gpt_record partition_record[4];
+   gpt_mbr_record partition_record[4];
__le16 signature;
 } __attribute__ ((packed)) legacy_mbr;
 
-- 
1.7.11.7





Re: [RFC PATCH v3 04/35] mm: Initialize node memory regions during boot

2013-09-02 Thread Yasuaki Ishimatsu

(2013/09/03 2:43), Srivatsa S. Bhat wrote:

On 09/02/2013 11:50 AM, Yasuaki Ishimatsu wrote:

(2013/08/30 22:15), Srivatsa S. Bhat wrote:

Initialize the node's memory-regions structures with the information
about the region-boundaries, at boot time.

Based-on-patch-by: Ankita Garg 
Signed-off-by: Srivatsa S. Bhat 
---

   include/linux/mm.h |4 
   mm/page_alloc.c|   28 
   2 files changed, 32 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index f022460..18fdec4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -627,6 +627,10 @@ static inline pte_t maybe_mkwrite(pte_t pte,
struct vm_area_struct *vma)
   #define LAST_NID_MASK((1UL << LAST_NID_WIDTH) - 1)
   #define ZONEID_MASK((1UL << ZONEID_SHIFT) - 1)

+/* Hard-code memory region size to be 512 MB for now. */
+#define MEM_REGION_SHIFT(29 - PAGE_SHIFT)
+#define MEM_REGION_SIZE(1UL << MEM_REGION_SHIFT)
+
   static inline enum zone_type page_zonenum(const struct page *page)
   {
   return (page->flags >> ZONES_PGSHIFT) & ZONES_MASK;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b86d7e3..bb2d5d4 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4809,6 +4809,33 @@ static void __init_refok
alloc_node_mem_map(struct pglist_data *pgdat)
   #endif /* CONFIG_FLAT_NODE_MEM_MAP */
   }

+static void __meminit init_node_memory_regions(struct pglist_data
*pgdat)
+{
+int nid = pgdat->node_id;
+unsigned long start_pfn = pgdat->node_start_pfn;
+unsigned long end_pfn = start_pfn + pgdat->node_spanned_pages;
+struct node_mem_region *region;
+unsigned long i, absent;
+int idx;
+
+for (i = start_pfn, idx = 0; i < end_pfn;
+i += region->spanned_pages, idx++) {
+



+region = >node_regions[idx];


It seems that overflow easily occurs.
node_regions[] has 256 entries and MEM_REGION_SIZE is 512MiB. So if
the pgdat has more than 128 GiB, overflow will occur. Am I wrong?



No, you are right. It should be made dynamic to accommodate larger
memory. I just used that value as a placeholder, since my focus was to
demonstrate what algorithms and designs could be developed on top of
this infrastructure, to help shape memory allocations. But certainly
this needs to be modified to be flexible enough to work with any memory
size. Thank you for your review!


Thank you for your explanation. I understood it.

Thanks,
Yasuaki Ishimatsu
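
The arithmetic behind the overflow concern above can be checked with a short
sketch. The 256-entry array size and the 512 MiB region size come from the
review; the 4 KiB page size (PAGE_SHIFT of 12) and all SKETCH_* names are
assumptions made for illustration.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT   12				/* assumed: 4 KiB pages */
#define SKETCH_REGION_SHIFT (29 - SKETCH_PAGE_SHIFT)	/* 512 MiB, in pages */
#define SKETCH_MAX_REGIONS  256				/* node_regions[] entries */

/* Pages per region: 1 << 17 = 131072 with 4 KiB pages. */
static uint64_t region_pages(void)
{
	return (uint64_t)1 << SKETCH_REGION_SHIFT;
}

/* Number of regions needed to span node_bytes of memory, rounded up. */
static uint64_t regions_needed(uint64_t node_bytes)
{
	uint64_t region_bytes = region_pages() << SKETCH_PAGE_SHIFT;

	return (node_bytes + region_bytes - 1) / region_bytes;
}

/* Nonzero if a node of this size would index past the last array entry. */
static int would_overflow(uint64_t node_bytes)
{
	return regions_needed(node_bytes) > SKETCH_MAX_REGIONS;
}
```

256 regions of 512 MiB cover exactly 128 GiB, so a node spanning even one
byte more than that needs a 257th entry.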



Regards,
Srivatsa S. Bhat




+region->pgdat = pgdat;
+region->start_pfn = i;
+region->spanned_pages = min(MEM_REGION_SIZE, end_pfn - i);
+region->end_pfn = region->start_pfn + region->spanned_pages;
+
+absent = __absent_pages_in_range(nid, region->start_pfn,
+ region->end_pfn);
+
+region->present_pages = region->spanned_pages - absent;
+}
+
+pgdat->nr_node_regions = idx;
+}
+
   void __paginginit free_area_init_node(int nid, unsigned long
*zones_size,
   unsigned long node_start_pfn, unsigned long *zholes_size)
   {
@@ -4837,6 +4864,7 @@ void __paginginit free_area_init_node(int nid,
unsigned long *zones_size,

   free_area_init_core(pgdat, start_pfn, end_pfn,
   zones_size, zholes_size);
+init_node_memory_regions(pgdat);
   }

   #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP











Re: [PATCH v4 2/6] dma: edma: Write out and handle MAX_NR_SG at a given time

2013-09-02 Thread Vinod Koul
On Thu, Aug 29, 2013 at 06:05:41PM -0500, Joel Fernandes wrote:
> Process SG-elements in batches of MAX_NR_SG if there are more
> than MAX_NR_SG of them. Due to this, at any given time only that many
> slots will be used in the given channel no matter how long the
> scatter list is. We keep track of how much has been written
> in order to process the next batch of elements in the scatter-list
> and detect completion.
> 
> For such intermediate transfer completions (one batch of MAX_NR_SG),
> make use of pause and resume functions instead of start and stop
> when such intermediate transfer is in progress or completed as we
> do not want to clear any pending events.
> 
> Signed-off-by: Joel Fernandes 
> ---
>  drivers/dma/edma.c | 79 
> --
>  1 file changed, 53 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
> index e522ad5..732829b 100644
> --- a/drivers/dma/edma.c
> +++ b/drivers/dma/edma.c
> @@ -56,6 +56,7 @@ struct edma_desc {
>   struct list_headnode;
>   int absync;
>   int pset_nr;
> + int processed;
>   struct edmacc_param pset[0];
>  };
>  
> @@ -104,22 +105,34 @@ static void edma_desc_free(struct virt_dma_desc *vdesc)
>  /* Dispatch a queued descriptor to the controller (caller holds lock) */
>  static void edma_execute(struct edma_chan *echan)
>  {
> - struct virt_dma_desc *vdesc = vchan_next_desc(>vchan);
> + struct virt_dma_desc *vdesc;
>   struct edma_desc *edesc;
> - int i;
> -
> - if (!vdesc) {
> - echan->edesc = NULL;
> - return;
> + struct device *dev = echan->vchan.chan.device->dev;
> + int i, j, left, nslots;
> +
> + /* If either we processed all psets or we're still not started */
> + if (!echan->edesc ||
> + echan->edesc->pset_nr == echan->edesc->processed) {
> + /* Get next vdesc */
> + vdesc = vchan_next_desc(>vchan);
> + if (!vdesc) {
> + echan->edesc = NULL;
> + return;
> + }
> + list_del(>node);
> + echan->edesc = to_edma_desc(>tx);
>   }
>  
> - list_del(>node);
> + edesc = echan->edesc;
>  
> - echan->edesc = edesc = to_edma_desc(>tx);
> + /* Find out how many left */
> + left = edesc->pset_nr - edesc->processed;
> + nslots = min(MAX_NR_SG, left);
>  
>   /* Write descriptor PaRAM set(s) */
> - for (i = 0; i < edesc->pset_nr; i++) {
> - edma_write_slot(echan->slot[i], >pset[i]);
> + for (i = 0; i < nslots; i++) {
> + j = i + edesc->processed;
> + edma_write_slot(echan->slot[i], >pset[j]);
>   dev_dbg(echan->vchan.chan.device->dev,
>   "\n pset[%d]:\n"
>   "  chnum\t%d\n"
> @@ -132,24 +145,31 @@ static void edma_execute(struct edma_chan *echan)
>   "  bidx\t%08x\n"
>   "  cidx\t%08x\n"
>   "  lkrld\t%08x\n",
> - i, echan->ch_num, echan->slot[i],
> - edesc->pset[i].opt,
> - edesc->pset[i].src,
> - edesc->pset[i].dst,
> - edesc->pset[i].a_b_cnt,
> - edesc->pset[i].ccnt,
> - edesc->pset[i].src_dst_bidx,
> - edesc->pset[i].src_dst_cidx,
> - edesc->pset[i].link_bcntrld);
> + j, echan->ch_num, echan->slot[i],
> + edesc->pset[j].opt,
> + edesc->pset[j].src,
> + edesc->pset[j].dst,
> + edesc->pset[j].a_b_cnt,
> + edesc->pset[j].ccnt,
> + edesc->pset[j].src_dst_bidx,
> + edesc->pset[j].src_dst_cidx,
> + edesc->pset[j].link_bcntrld);
>   /* Link to the previous slot if not the last set */
> - if (i != (edesc->pset_nr - 1))
> + if (i != (nslots - 1))
>   edma_link(echan->slot[i], echan->slot[i+1]);
>   /* Final pset links to the dummy pset */
>   else
>   edma_link(echan->slot[i], echan->ecc->dummy_slot);
>   }
>  
> - edma_start(echan->ch_num);
> + edesc->processed += nslots;
> +
> + edma_resume(echan->ch_num);
> +
> + if (edesc->processed <= MAX_NR_SG) {
> + dev_dbg(dev, "first transfer starting %d\n", echan->ch_num);
> + edma_start(echan->ch_num);
> + }
>  }
>  
>  static int edma_terminate_all(struct edma_chan *echan)
> @@ -368,19 +388,26 @@ static void edma_callback(unsigned ch_num, u16 
> ch_status, void *data)
>   struct edma_desc *edesc;
>   unsigned long flags;
>  
> - /* Stop the channel */
> -   
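
The batching bookkeeping in the quoted edma_execute() can be sketched without
any hardware. This is a simplified model, not the driver: the sketch_* names
are hypothetical, and the batch size of 16 is an assumed value for
illustration. Only the "left / nslots / processed" arithmetic from the diff
is reproduced.

```c
#define SKETCH_MAX_NR_SG 16	/* assumed batch size, for illustration only */

struct sketch_desc {
	int pset_nr;	/* total parameter sets in the descriptor */
	int processed;	/* how many have been written out so far */
};

/* Write out the next batch; returns the number of slots used (0 when done). */
static int sketch_execute(struct sketch_desc *d)
{
	int left = d->pset_nr - d->processed;
	int nslots = left < SKETCH_MAX_NR_SG ? left : SKETCH_MAX_NR_SG;

	d->processed += nslots;
	return nslots;
}

/* Count how many batches a descriptor needs until all sets are processed. */
static int sketch_count_batches(int pset_nr)
{
	struct sketch_desc d = { .pset_nr = pset_nr, .processed = 0 };
	int batches = 0;

	while (d.processed < d.pset_nr) {
		sketch_execute(&d);
		batches++;
	}
	return batches;
}
```

A descriptor with 40 sets is written out as batches of 16, 16 and 8, so no
more than SKETCH_MAX_NR_SG slots are ever in use at once.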

[PATCH 1/2] audit: flush_hold_queue(): don't drop queued SKBs

2013-09-02 Thread Luiz Capitulino
From: Luiz capitulino 

flush_hold_queue() first dequeues an SKB and then checks if
auditd exists. If auditd doesn't exist, the SKB is silently
dropped.

Avoid this by not dequeuing an SKB once we have detected that
auditd disappeared.

Signed-off-by: Luiz capitulino 
---
 kernel/audit.c | 19 +--
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 91e53d0..475c1d1 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -380,7 +380,7 @@ static void audit_printk_skb(struct sk_buff *skb)
audit_hold_skb(skb);
 }
 
-static void kauditd_send_skb(struct sk_buff *skb)
+static int kauditd_send_skb(struct sk_buff *skb)
 {
int err;
/* take a reference in case we can't send it and we want to hold it */
@@ -393,9 +393,12 @@ static void kauditd_send_skb(struct sk_buff *skb)
audit_pid = 0;
/* we might get lucky and get this in the next auditd */
audit_hold_skb(skb);
+   return err;
} else
/* drop the extra reference if sent ok */
consume_skb(skb);
+
+   return 0;
 }
 
 /*
@@ -416,6 +419,7 @@ static void kauditd_send_skb(struct sk_buff *skb)
 static void flush_hold_queue(void)
 {
struct sk_buff *skb;
+   int err;
 
if (!audit_default || !audit_pid)
return;
@@ -424,17 +428,12 @@ static void flush_hold_queue(void)
if (likely(!skb))
return;
 
-   while (skb && audit_pid) {
-   kauditd_send_skb(skb);
+   while (skb) {
+   err = kauditd_send_skb(skb);
+   if (err)
+   break;
skb = skb_dequeue(_skb_hold_queue);
}
-
-   /*
-* if auditd just disappeared but we
-* dequeued an skb we need to drop ref
-*/
-   if (skb)
-   consume_skb(skb);
 }
 
 static int kauditd_thread(void *dummy)
-- 
1.8.1.4
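
The control flow after this patch can be modelled with a toy queue. This is a
sketch under stated assumptions: struct toy_queue, toy_send() and toy_flush()
are hypothetical, records are plain ints rather than SKBs, and "budget"
stands in for how many records the receiver still accepts. The point shown is
that a failed send puts the record back on the hold queue and stops the
flush, so nothing is dropped.

```c
#include <stddef.h>

#define QCAP 8

/* Fixed-size ring buffer standing in for the hold queue. */
struct toy_queue {
	int items[QCAP];
	size_t head, count;
};

/* Append a record at the tail; -1 if the queue is full. */
static int q_push(struct toy_queue *q, int v)
{
	if (q->count == QCAP)
		return -1;
	q->items[(q->head + q->count) % QCAP] = v;
	q->count++;
	return 0;
}

/* Remove the record at the head; -1 if the queue is empty. */
static int q_pop(struct toy_queue *q, int *v)
{
	if (q->count == 0)
		return -1;
	*v = q->items[q->head];
	q->head = (q->head + 1) % QCAP;
	q->count--;
	return 0;
}

/* Try to deliver one record; on failure it goes back on the hold queue. */
static int toy_send(struct toy_queue *hold, int v, int *budget)
{
	if (*budget <= 0) {
		q_push(hold, v);
		return -1;
	}
	(*budget)--;
	return 0;
}

/* Flush the hold queue, stopping at the first failed send. */
static int toy_flush(struct toy_queue *hold, int budget)
{
	int v, sent = 0;

	while (q_pop(hold, &v) == 0) {
		if (toy_send(hold, v, &budget) < 0)
			break;
		sent++;
	}
	return sent;
}
```

With three queued records and a receiver that accepts two, two are delivered
and the third is still queued afterwards.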



[PATCH 0/2] audit: fix soft lockup

2013-09-02 Thread Luiz Capitulino
The second patch fixes a softlockup which is fully described and now is
100% reproducible with simple steps. The first patch fixes a bug I found
while working on the second patch.

Chuck Anderson just posted a different solution for the same problem.
I was about to post this solution when he posted his version, so I'm
posting it anyway.

Luiz capitulino (2):
  audit: flush_hold_queue(): don't drop queued SKBs
  audit: kaudit_send_skb(): make non-blocking call to netlink_unicast()

 kernel/audit.c | 27 +++
 1 file changed, 15 insertions(+), 12 deletions(-)

-- 
1.8.1.4


[PATCH 2/2] audit: kaudit_send_skb(): make non-blocking call to netlink_unicast()

2013-09-02 Thread Luiz Capitulino
From: Luiz capitulino 

Try this:

 1. Download the readahead-collector program and build it
 2. Run it with:
   # readahead-collector -f
 3. From another terminal do:
   # pkill -SIGSTOP readahead-collector
 4. Keep using the system, run top -d1, vmstat -S 1, etc
 5. Eventually, you'll get something like this:

[  124.046016] BUG: soft lockup - CPU#0 stuck for 22s! [login:2196]
[  124.046016] Modules linked in:
[  124.046016] CPU: 0 PID: 2196 Comm: login Not tainted 
3.11.0-rc7-00030-g41615e8 #13
[  124.046016] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  124.046016] task: 88003d92c970 ti: 88003cd5 task.ti: 
88003cd5
[  124.046016] RIP: 0010:[]  [] 
audit_log_start+0x99/0x349
[  124.046016] RSP: 0018:88003cd51db0  EFLAGS: 0202
[  124.046016] RAX: 0100 RBX: 8107115e RCX: ea60
[  124.046016] RDX: 95f3 RSI: 0101 RDI: ea60
[  124.046016] RBP: 88003cd51e30 R08: 0100 R09: 
[  124.046016] R10: 000399b3 R11: 88003fc0d4a0 R12: 0046
[  124.046016] R13: 88003cd51d28 R14: 0046 R15: 810501ac
[  124.046016] FS:  7f80d3efa800() GS:88003fc0() 
knlGS:
[  124.046016] CS:  0010 DS:  ES:  CR0: 80050033
[  124.046016] CR2: 7f3f04f8c000 CR3: 3cd41000 CR4: 06b0
[  124.046016] Stack:
[  124.046016]  95f3 88003d747800 fffbfc40 
05160010
[  124.046016]  88003cd51e30   
88003d92c970
[  124.046016]  8105b3a6 dead00100100 dead00200200 
88003d747860
[  124.046016] Call Trace:
[  124.046016]  [] ? wake_up_state+0x12/0x12
[  124.046016]  [] audit_log_name+0x34/0x1a2
[  124.046016]  [] ? _raw_spin_unlock_irqrestore+0x34/0x48
[  124.046016]  [] audit_log_exit+0xa44/0xa8f
[  124.046016]  [] ? rcu_read_unlock+0x1c/0x2d
[  124.046016]  [] ? audit_filter_inodes+0xf5/0x10e
[  124.046016]  [] ? audit_filter_syscall+0xb2/0xd9
[  124.046016]  [] __audit_syscall_exit+0x4d/0x108
[  124.046016]  [] sysret_audit+0x17/0x21
[  124.046016] Code: e7 8b 05 1c ed 59 00 8b 0d 12 ed 59 00 8b 35 1c 24 e1 00 
46 8d 04 30 48 63 f9 85 c0 0f 84 29 01 00 00 44 39 c6 0f 86 20 01 00 00 <83> 7c 
24 18 00 0f 84 a4 00 00 00 85 c9 0f 84 9c 00 00 00 48 8b

This is what happens:

 1. The readahead-collector daemon gets stuck and stops reading
    from the netlink socket
 2. The kernel keeps logging stuff to the audit subsystem at
    a high rate
 3. Because kauditd's call to netlink_unicast() is blocking and
    the netlink socket has a backlog, the kaudit thread will
    eventually get blocked when trying to send an SKB to user-space
 4. As the kaudit thread is blocked, SKBs start to accumulate.
    This will cause a thread calling audit_log_start() to
    be put to sleep when a threshold of queued SKBs is reached
 5. The kaudit thread never wakes up, but the kernel thread
    put to sleep in step 4 does. Sooner or later sleep_time will be
    negative, causing it to busy-wait in the while() loop

This commit fixes that problem by making the call to netlink_unicast()
non-blocking in kauditd_send_skb(). This way the kaudit thread
never gets blocked, completely avoiding the scenario described above.

Signed-off-by: Luiz capitulino 
---
 kernel/audit.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 475c1d1..2b34bd6 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -385,8 +385,12 @@ static int kauditd_send_skb(struct sk_buff *skb)
int err;
/* take a reference in case we can't send it and we want to hold it */
skb_get(skb);
-   err = netlink_unicast(audit_sock, skb, audit_nlk_portid, 0);
-   if (err < 0) {
+   err = netlink_unicast(audit_sock, skb, audit_nlk_portid, 1);
+   if (err == -EAGAIN) {
+   pr_warn_ratelimited("auditd (pid=%d) is not responding\n", 
audit_pid);
+   audit_hold_skb(skb);
+   return err;
+   } else if (err < 0) {
BUG_ON(err != -ECONNREFUSED); /* Shouldn't happen */
printk(KERN_ERR "audit: *NO* daemon at audit_pid=%d\n", 
audit_pid);
audit_log_lost("auditd disappeared\n");
-- 
1.8.1.4
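
Step 5 of the analysis in this patch hinges on sleep_time going negative. A
minimal sketch of that arithmetic, with hypothetical sketch_* names and plain
tick counts in place of jiffies, assuming the usual two's-complement
conversion from unsigned long to long:

```c
/* Remaining sleep in ticks; zero or negative once the deadline has passed. */
static long sketch_sleep_time(unsigned long timeout_start,
			      unsigned long wait_time,
			      unsigned long now)
{
	return (long)(timeout_start + wait_time - now);
}

/*
 * Models the decision in the wait loop: 1 means the caller sleeps,
 * 0 means it goes around the loop again without sleeping.
 */
static int sketch_would_sleep(unsigned long timeout_start,
			      unsigned long wait_time,
			      unsigned long now)
{
	return sketch_sleep_time(timeout_start, wait_time, now) > 0;
}
```

Once "now" has moved past timeout_start + wait_time, every pass through the
loop takes the no-sleep branch. If the backlog never drains, that is a busy
loop, which matches the soft lockup report above.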



Re: [PATCH 0/2] audit: fix soft lockups and udevd errors when audit is overrun

2013-09-02 Thread Luiz Capitulino
On Mon, 02 Sep 2013 20:45:14 -0700
Chuck Anderson  wrote:

> The two patches that follow in separate emails resolve soft lockups and
> udevd reported errors that prevented a large memory 3.8 system from booting.
> 
> The patches are based on 3.11-rc7.
> 
> I believe it is the same issue recently posted as:
> 
>[RFC] audit: avoid soft lockup in audit_log_start()
>https://lkml.org/lkml/2013/8/28/626

Nice to see someone else looking into this! And thanks for CC'ing me.

I have some news for you.

First, I've tried to apply your series but got this:

[lcapitulino@volcano linux-2.6]$ git am ~/audit-fix.mbox
Applying: audit: fix soft lockups due to loop in audit_log_start() wh,en 
audit_backlog_limit exceeded
fatal: corrupt patch at line 23
Patch failed at 0001 audit: fix soft lockups due to loop in audit_log_start() 
wh,en audit_backlog_limit exceeded
The copy of the patch that failed is found in:
   /home/lcapitulino/work/src/upstream/linux-2.6/.git/rebase-apply/patch
When you have resolved this problem, run "git am --resolved".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
[lcapitulino@volcano linux-2.6]$

Now, I was a few minutes away from sending a different fix I cooked up
this evening when I got your series in my inbox. So I really wanted to give
this a try and applied the first patch manually (resulting version is
attached). The softlockup is gone, but I still get a hang for several
seconds just like I did with my first RFC.

I found a very easy way to reproduce the problem and our analysis is
similar, but our solutions differ.

I'm going to send my solution right now. Sorry for any mistakes, it's
almost 1 AM here, but I really wanted to give your version a try before
sending my version (and before going to bed). If you send a v2 I'll try
it again and we can discuss our approaches.

diff --git a/kernel/audit.c b/kernel/audit.c
index 91e53d0..8255d9b 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -103,9 +103,11 @@ static int	audit_rate_limit;
 
 /* Number of outstanding audit_buffers allowed. */
 static int	audit_backlog_limit = 64;
-static int	audit_backlog_wait_time = 60 * HZ;
 static int	audit_backlog_wait_overflow = 0;
 
+#define AUDIT_BACKLOG_WAIT_TIME (60 * HZ)
+static int audit_backlog_wait_time = AUDIT_BACKLOG_WAIT_TIME;
+
 /* The identity of the user shutting down the audit system. */
 kuid_t		audit_sig_uid = INVALID_UID;
 pid_t		audit_sig_pid = -1;
@@ -1053,14 +1055,14 @@ static inline void audit_get_stamp(struct audit_context *ctx,
 /*
  * Wait for auditd to drain the queue a little
  */
-static void wait_for_auditd(unsigned long sleep_time)
+static void wait_for_auditd(unsigned long sleep_time, int limit)
 {
 	DECLARE_WAITQUEUE(wait, current);
 	set_current_state(TASK_UNINTERRUPTIBLE);
 	add_wait_queue(_backlog_wait, );
 
 	if (audit_backlog_limit &&
-	skb_queue_len(_skb_queue) > audit_backlog_limit)
+	skb_queue_len(_skb_queue) > limit)
 		schedule_timeout(sleep_time);
 
 	__set_current_state(TASK_RUNNING);
@@ -1095,8 +1097,8 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
 	struct audit_buffer	*ab	= NULL;
 	struct timespec		t;
 	unsigned int		uninitialized_var(serial);
-	int reserve;
 	unsigned long timeout_start = jiffies;
+	int limit;
 
 	if (audit_initialized != AUDIT_INITIALIZED)
 		return NULL;
@@ -1104,22 +1106,21 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
 	if (unlikely(audit_filter_type(type)))
 		return NULL;
 
-	if (gfp_mask & __GFP_WAIT)
-		reserve = 0;
-	else
-		reserve = 5; /* Allow atomic callers to go up to five
-entries over the normal backlog limit */
+	limit = audit_backlog_limit;
+	 if (!(gfp_mask & __GFP_WAIT))
+	 	limit += 5;
 
 	while (audit_backlog_limit
-	   && skb_queue_len(_skb_queue) > audit_backlog_limit + reserve) {
+	   && skb_queue_len(_skb_queue) > limit) {
 		if (gfp_mask & __GFP_WAIT && audit_backlog_wait_time) {
 			unsigned long sleep_time;
 
 			sleep_time = timeout_start + audit_backlog_wait_time -
 	jiffies;
-			if ((long)sleep_time > 0)
-wait_for_auditd(sleep_time);
-			continue;
+			if ((long)sleep_time > 0) {
+wait_for_auditd(sleep_time, limit);
+continue;
+			}
 		}
 		if (audit_rate_check() && printk_ratelimit())
 			printk(KERN_WARNING
@@ -1133,6 +1134,8 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
 		return NULL;
 	}
 
+	audit_backlog_wait_time = AUDIT_BACKLOG_WAIT_TIME;
+
 	ab = audit_buffer_alloc(ctx, gfp_mask, type);
 	if (!ab) {
 		audit_log_lost("out of memory in audit_log_start");


Re: [PATCH] rbtree: Add some necessary condition checks

2013-09-02 Thread Zhi Yong Wu
On Mon, Sep 2, 2013 at 4:57 PM, Michel Lespinasse  wrote:
> On Sun, Sep 1, 2013 at 11:30 PM, Zhi Yong Wu  wrote:
>> In Tue, Aug 27, 2013 at 6:01 AM, Michel Lespinasse  wrote:
>>> On Fri, Aug 23, 2013 at 7:45 AM,   wrote:
>>>> From: Zhi Yong Wu 
>>>>
>>>> Signed-off-by: Zhi Yong Wu 
>>>> ---
>>>>  include/linux/rbtree_augmented.h | 3 ++-
>>>>  lib/rbtree.c | 5 +++--
>>>>  2 files changed, 5 insertions(+), 3 deletions(-)
>>>
>>> So, you are saying that the checks are necessary, but you are not saying 
>>> why.
>>>
>>> The way I see it, the checks are *not* necessary, because the rbtree
>>> invariants guarantee them to be true. The only way for the checks to
>>> fail would be if people directly manipulate the rbtrees without going
>>> through the proper APIs, and if they do that then I think they're on
>>> their own. So to me, I think it's the same situation as dereferencing
>>> a pointer without checking if it's NULL, because you know it should
>>> never be NULL - which in my eyes is perfectly acceptable.
>> In my patchset, some rbtree APIs are invoked, and I think that those
>> rbtree APIs are used correctly. Below is a pointer to the code:
>> https://github.com/wuzhy/kernel/compare/torvalds:master...hot_tracking
>> But I hit some issues when using compilebench to do perf benchmark.
>> compile dir kernel-7 691MB in 8.92 seconds (77.53 MB/s)
>
> Thanks for the link - I now better understand where you are coming
> from with these fixes.
>
> Going back to the original message:
>
>> diff --git a/include/linux/rbtree_augmented.h 
>> b/include/linux/rbtree_augmented.h
>> index fea49b5..7d19770 100644
>> --- a/include/linux/rbtree_augmented.h
>> +++ b/include/linux/rbtree_augmented.h
>> @@ -199,7 +199,8 @@ __rb_erase_augmented(struct rb_node *node, struct rb_root *root,
>> }
>>
>> successor->rb_left = tmp = node->rb_left;
>> -   rb_set_parent(tmp, successor);
>> +   if (tmp)
>> +   rb_set_parent(tmp, successor);
>>
>> pc = node->__rb_parent_color;
>> tmp = __rb_parent(pc);
>
> Note that node->rb_left was already fetched at the top of
> __rb_erase_augmented(), and was checked to be non-NULL at the time -
> otherwise we would have executed 'Case 1' in that function. So, you
If 'Case 1' is executed, this line of code also runs; what is the result then?
'Case 1' does *not* seem to change node->rb_left at all.

> are not expected to find tmp == NULL here.
>
>> diff --git a/lib/rbtree.c b/lib/rbtree.c
>> index c0e31fe..2cb01ba 100644
>> --- a/lib/rbtree.c
>> +++ b/lib/rbtree.c
>> @@ -214,7 +214,7 @@ rb_erase_color(struct rb_node *parent, struct rb_root *root,
>>  */
>> sibling = parent->rb_right;
>> if (node != sibling) {  /* node == parent->rb_left */
>> -   if (rb_is_red(sibling)) {
>> +   if (sibling && rb_is_red(sibling)) {
>> /*
>>  * Case 1 - left rotate at parent
>>  *
>
> Note the loop invariants quoted just above:
>
> /*
>  * Loop invariants:
>  * - node is black (or NULL on first iteration)
>  * - node is not the root (parent is not NULL)
>  * - All leaf paths going through parent and node have a
>  *   black node count that is 1 lower than other leaf paths.
>  */
>
> Because of these, each path from sibling to a leaf must include at
> least one black node, which implies that sibling can't be NULL - or to
> put it another way, if sibling is null then the expected invariants
> were violated before we even got there.
In theory, I understand what you mean, but I don't know why or
where it got violated.
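
One way to narrow down where the invariants break is to validate the
whole tree after every insert and erase while debugging. The following
is only a user-space sketch of such a checker: struct node, enum color,
black_height() and assert_tree_valid() are hypothetical names invented
here, not the kernel's struct rb_node or any helper from lib/rbtree.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space node; NOT the kernel's struct rb_node. */
enum color { RED, BLACK };

struct node {
	enum color color;
	struct node *left, *right;
};

/*
 * Return the black height of the subtree rooted at n (NULL leaves
 * count as 0), or -1 if a red node has a red child or if the left
 * and right sides disagree on the black count.
 */
static int black_height(const struct node *n)
{
	int lh, rh;

	if (!n)
		return 0;

	if (n->color == RED &&
	    ((n->left && n->left->color == RED) ||
	     (n->right && n->right->color == RED)))
		return -1;

	lh = black_height(n->left);
	rh = black_height(n->right);
	if (lh < 0 || rh < 0 || lh != rh)
		return -1;

	return lh + (n->color == BLACK ? 1 : 0);
}

/* Call after each insert/erase; aborts at the first inconsistent tree. */
static void assert_tree_valid(const struct node *root)
{
	assert(!root || root->color == BLACK);
	assert(black_height(root) >= 0);
}
```

A black node whose sibling is NULL fails this check, which is the
situation the NULL tests in the patch would otherwise hide.
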
>
>> @@ -226,7 +226,8 @@ rb_erase_color(struct rb_node *parent, struct rb_root *root,
>>  */
>> parent->rb_right = tmp1 = sibling->rb_left;
>> sibling->rb_left = parent;
>> -   rb_set_parent_color(tmp1, parent, RB_BLACK);
>> +   if (tmp1)
>> +   rb_set_parent_color(tmp1, parent, RB_BLACK);
>> __rb_rotate_set_parents(parent, sibling, root,
>> RB_RED);
>> augment_rotate(parent, sibling);
>
> This is actually the same invariant here - each path from sibling to a
> leaf must include at least one black node, and sibling is now known to
> be red, so it must have two black children.
Ditto.
>
>
> Now I had a quick look at your code and I couldn't tell at which point
> the invariants are violated. However I did notice a couple suspicious
> things in the very first patch
> 

RE: [PATCHv2 1/4] pwm: Add Freescale FTM PWM driver support

2013-09-02 Thread Xiubo Li-B47053
> Subject: Re: [PATCHv2 1/4] pwm: Add Freescale FTM PWM driver support
> 
> On Mon, Sep 02, 2013 at 03:33:37AM +, Xiubo Li-B47053 wrote:
> >
> > > > +static void fsl_pwm_free(struct pwm_chip *chip, struct pwm_device
> > > > +*pwm) {
> > > > +   struct fsl_pwm_chip *fpc;
> > > > +   struct fsl_pwm_data *pwm_data;
> > > > +
> > > > +   fpc = to_fsl_chip(chip);
> > > > +
> > > > +   pwm_data = pwm_get_chip_data(pwm);
> > > > +   if (!pwm_data)
> > > > +   return;
> > >
> > > This check seems unnecessary.
> > >
> >
> > But if I do not check it here, I must check it in the following code.
> >
> > > > +
> > > > +   if (pwm_data->available != FSL_AVAILABLE)
> > > > +   return;
> > > > +
> >
> > So the ' struct fsl_pwm_data' may be removed in the future.
> >
> > >
> > > > +
> > > > +
> > > > +   pwm_data->period_cycles = period_cycles;
> > > > +   pwm_data->duty_cycles = duty_cycles;
> > >
> > > These fields are set but never read. Please drop them.
> > >
> > > If you drop the 'available' field as well, then you can drop chip_data
> > > completely.
> > >
> >
> > I think I may move the 'available' field to the PWM driver data struct.
> 
> You simply don't need the available field. You don't need to track
> whether they are available. If a user enables a pwm which is not routed
> out of the SoC (disabled in the iomux) simply nothing will happen except
> for a slightly increased power consumption.
> 
If there is no need to explicitly specify whether the channels are available,
then the 'available' field will certainly be dropped.
I added it because the pins of the 4th and 5th channels are used as UART TX
and RX, as I mentioned before; if you configure these two pinctrls for PWM,
the UART TX and RX will be disrupted. There may be other cases like this.
So, if the PWM driver does not need to worry about this, the customer should
be aware of it and take responsibility for the potential risk.
I will think it over and optimize it then.



Thanks very much.
--
Best Regards.
Xiubo





--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v8 01/10] tracing: Add support for SOFT_DISABLE to syscall events

2013-09-02 Thread Tom Zanussi
The original SOFT_DISABLE patches didn't add support for soft disable
of syscall events; this adds it and paves the way for future patches
allowing triggers to be added to syscall events, since triggers are
built on top of SOFT_DISABLE.

Add an array of ftrace_event_file pointers indexed by syscall number
to the trace array and remove the existing enabled bitmaps, which as a
result are now redundant.  The ftrace_event_file structs in turn
contain the soft disable flags we need for per-syscall soft disable
accounting; later patches add additional 'trigger' flags and
per-syscall triggers and filters.
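
As a rough illustration of the data-structure change, here is a
user-space sketch of the lookup the handlers perform. All names in it
(NR_SYSCALLS, FL_SOFT_DISABLED, struct event_file, struct
trace_array_sketch, should_trace(), reg_syscall()) are stand-ins made up
for the example; the real code uses rcu_dereference() and
rcu_assign_pointer() on the per-syscall array, which plain loads and
stores only approximate here.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SYSCALLS 8			/* hypothetical, tiny for the sketch */
#define FL_SOFT_DISABLED (1UL << 0)	/* stand-in for the SOFT_DISABLED flag */

struct event_file {			/* stand-in for struct ftrace_event_file */
	unsigned long flags;
};

struct trace_array_sketch {
	struct event_file *enter_files[NR_SYSCALLS];
};

/* Returns 1 if the handler would record the event, 0 if it bails out. */
static int should_trace(const struct trace_array_sketch *tr, int nr)
{
	const struct event_file *file;

	if (nr < 0 || nr >= NR_SYSCALLS)
		return 0;

	file = tr->enter_files[nr];	/* the patch uses rcu_dereference() here */
	if (!file)			/* not registered: the old bitmap bit was clear */
		return 0;

	return !(file->flags & FL_SOFT_DISABLED);
}

/* Registering stores the file pointer; unregistering stores NULL. */
static void reg_syscall(struct trace_array_sketch *tr, int nr,
			struct event_file *file)
{
	assert(nr >= 0 && nr < NR_SYSCALLS);
	tr->enter_files[nr] = file;	/* rcu_assign_pointer() in the patch */
}
```

A NULL slot plays the role of a cleared bit in the old bitmap, and a
non-NULL slot additionally carries the per-event flags.
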

Signed-off-by: Tom Zanussi 
---
 kernel/trace/trace.h  |  4 ++--
 kernel/trace/trace_syscalls.c | 36 ++--
 2 files changed, 32 insertions(+), 8 deletions(-)

diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index fe39acd..b1227b9 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -192,8 +192,8 @@ struct trace_array {
 #ifdef CONFIG_FTRACE_SYSCALLS
int sys_refcount_enter;
int sys_refcount_exit;
-   DECLARE_BITMAP(enabled_enter_syscalls, NR_syscalls);
-   DECLARE_BITMAP(enabled_exit_syscalls, NR_syscalls);
+   struct ftrace_event_file *enter_syscall_files[NR_syscalls];
+   struct ftrace_event_file *exit_syscall_files[NR_syscalls];
 #endif
int stop_count;
int clock_id;
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 559329d..af4b71c 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -302,6 +302,7 @@ static int __init syscall_exit_define_fields(struct ftrace_event_call *call)
 static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
 {
struct trace_array *tr = data;
+   struct ftrace_event_file *ftrace_file;
struct syscall_trace_enter *entry;
struct syscall_metadata *sys_data;
struct ring_buffer_event *event;
@@ -314,7 +315,13 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
syscall_nr = trace_get_syscall_nr(current, regs);
if (syscall_nr < 0)
return;
-   if (!test_bit(syscall_nr, tr->enabled_enter_syscalls))
+
+   /* Here we're inside the tp handler's rcu_read_lock (__DO_TRACE()) */
+   ftrace_file = rcu_dereference(tr->enter_syscall_files[syscall_nr]);
+   if (!ftrace_file)
+   return;
+
+   if (test_bit(FTRACE_EVENT_FL_SOFT_DISABLED_BIT, &ftrace_file->flags))
return;
 
sys_data = syscall_nr_to_meta(syscall_nr);
@@ -345,6 +352,7 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
 static void ftrace_syscall_exit(void *data, struct pt_regs *regs, long ret)
 {
struct trace_array *tr = data;
+   struct ftrace_event_file *ftrace_file;
struct syscall_trace_exit *entry;
struct syscall_metadata *sys_data;
struct ring_buffer_event *event;
@@ -356,7 +364,13 @@ static void ftrace_syscall_exit(void *data, struct pt_regs *regs, long ret)
syscall_nr = trace_get_syscall_nr(current, regs);
if (syscall_nr < 0)
return;
-   if (!test_bit(syscall_nr, tr->enabled_exit_syscalls))
+
+   /* Here we're inside the tp handler's rcu_read_lock (__DO_TRACE()) */
+   ftrace_file = rcu_dereference(tr->exit_syscall_files[syscall_nr]);
+   if (!ftrace_file)
+   return;
+
+   if (test_bit(FTRACE_EVENT_FL_SOFT_DISABLED_BIT, &ftrace_file->flags))
return;
 
sys_data = syscall_nr_to_meta(syscall_nr);
@@ -397,7 +411,7 @@ static int reg_event_syscall_enter(struct ftrace_event_file *file,
if (!tr->sys_refcount_enter)
ret = register_trace_sys_enter(ftrace_syscall_enter, tr);
if (!ret) {
-   set_bit(num, tr->enabled_enter_syscalls);
+   rcu_assign_pointer(tr->enter_syscall_files[num], file);
tr->sys_refcount_enter++;
}
mutex_unlock(&syscall_trace_lock);
@@ -415,10 +429,15 @@ static void unreg_event_syscall_enter(struct ftrace_event_file *file,
return;
mutex_lock(&syscall_trace_lock);
tr->sys_refcount_enter--;
-   clear_bit(num, tr->enabled_enter_syscalls);
+   rcu_assign_pointer(tr->enter_syscall_files[num], NULL);
if (!tr->sys_refcount_enter)
unregister_trace_sys_enter(ftrace_syscall_enter, tr);
mutex_unlock(&syscall_trace_lock);
+   /*
+* Callers expect the event to be completely disabled on
+* return, so wait for current handlers to finish.
+*/
+   synchronize_sched();
 }
 
 static int reg_event_syscall_exit(struct ftrace_event_file *file,
@@ -435,7 +454,7 @@ static int reg_event_syscall_exit(struct ftrace_event_file *file,
if (!tr->sys_refcount_exit)
ret = register_trace_sys_exit(ftrace_syscall_exit, 

[PATCH v8 07/10] tracing: Add and use generic set_trigger_filter() implementation

2013-09-02 Thread Tom Zanussi
Add a generic event_command.set_trigger_filter() op implementation and
have the current set of trigger commands use it - this essentially
gives them all support for filters.

Syntactically, filters are supported by adding 'if <filter>' just
after the command, in which case only events matching the filter will
invoke the trigger.  For example, to add a filter to an
enable/disable_event command:

echo 'enable_event:system:event if common_pid == 999' > \
  .../othersys/otherevent/trigger

The above command will only enable the system:event event if the
common_pid field in the othersys:otherevent event is 999.

As another example, to add a filter to a stacktrace command:

echo 'stacktrace if common_pid == 999' > \
   .../somesys/someevent/trigger

The above command will only trigger a stacktrace if the common_pid
field in the event is 999.

The filter syntax is the same as that described in the 'Event
filtering' section of Documentation/trace/events.txt.

Because triggers can now use filters, the trigger-invoking logic needs
to be moved in those cases - e.g. for ftrace_raw_event_calls, if a
trigger has a filter associated with it, the trigger invocation now
needs to happen after the { assign; } part of the call, in order for
the trigger condition to be tested.

There's still a SOFT_DISABLED-only check at the top of e.g. the
ftrace_raw_events function, so when an event is soft disabled but not
because of the presence of a trigger, the original SOFT_DISABLED
behavior remains unchanged.

There's also a bit of trickiness in that some triggers need to avoid
being invoked while an event is currently in the process of being
logged, since the trigger may itself log data into the trace buffer.
Thus we make sure the current event is committed before invoking those
triggers.  To do that, we split the trigger invocation in two - the
first part (event_triggers_call()) checks the filter using the current
trace record; if a command has the post_trigger flag set, it sets a
bit for itself in the return value, otherwise it directly invokes the
trigger.  Once all commands have been either invoked or set their
return flag, event_triggers_call() returns.  The current record is
then either committed or discarded; if any commands have deferred
their triggers, those commands are finally invoked following the close
of the current event by event_triggers_post_call().
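
The split described above can be sketched in plain user-space C as
follows. This is only an approximation for illustration: struct trigger,
triggers_call(), triggers_post_call() and pid_is_999() are hypothetical
names, the "record" is reduced to a single pid, and a counter stands in
for actually running a trigger.

```c
#include <assert.h>
#include <stddef.h>

struct trigger {
	unsigned int type_bit;		/* one bit per command type */
	int post_trigger;		/* must run after the record is closed */
	int (*filter)(int pid);		/* NULL means "always match" */
	int fired;			/* invocation count, for the sketch */
};

/* Phase 1: filter on the current record; run or defer each trigger. */
static unsigned int triggers_call(struct trigger *t, size_t n, int pid)
{
	unsigned int deferred = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (t[i].filter && !t[i].filter(pid))
			continue;		/* filter rejected the record */
		if (t[i].post_trigger)
			deferred |= t[i].type_bit;
		else
			t[i].fired++;		/* safe to run immediately */
	}
	return deferred;
}

/* Phase 2: after commit/discard, run the triggers deferred in phase 1. */
static void triggers_post_call(struct trigger *t, size_t n,
			       unsigned int deferred)
{
	size_t i;

	for (i = 0; i < n; i++) {
		assert(t[i].type_bit != 0);
		if (t[i].post_trigger && (t[i].type_bit & deferred))
			t[i].fired++;
	}
}

/* Example filter, mirroring "if common_pid == 999". */
static int pid_is_999(int pid)
{
	return pid == 999;
}
```

The caller would commit or discard the current record between the two
calls, so a deferred trigger never logs into a record that is still open.
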

To simplify the above and make it more efficient, the TRIGGER_COND bit
is introduced, which is set only if a soft-disabled trigger needs to
use the log record for filter testing or needs to wait until the
current log record is closed.

The syscall event invocation code is also changed in analogous ways.

Because event triggers need to be able to create and free filters,
this also adds a couple external wrappers for the existing
create_filter and free_filter functions, which are too generic to be
made extern functions themselves.

Signed-off-by: Tom Zanussi 
---
 include/linux/ftrace_event.h|  10 ++-
 include/trace/ftrace.h  |  53 ++
 kernel/trace/trace.h|   5 ++
 kernel/trace/trace_events_filter.c  |  13 
 kernel/trace/trace_events_trigger.c | 142 +++-
 kernel/trace/trace_syscalls.c   |  50 ++---
 6 files changed, 243 insertions(+), 30 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 8e87302..8365a4c 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -1,3 +1,4 @@
+
 #ifndef _LINUX_FTRACE_EVENT_H
 #define _LINUX_FTRACE_EVENT_H
 
@@ -256,6 +257,7 @@ enum {
FTRACE_EVENT_FL_SOFT_MODE_BIT,
FTRACE_EVENT_FL_SOFT_DISABLED_BIT,
FTRACE_EVENT_FL_TRIGGER_MODE_BIT,
+   FTRACE_EVENT_FL_TRIGGER_COND_BIT,
 };
 
 /*
@@ -266,6 +268,7 @@ enum {
  *  SOFT_DISABLED - When set, do not trace the event (even though its
  *   tracepoint may be enabled)
  *  TRIGGER_MODE  - When set, invoke the triggers associated with the event
+ *  TRIGGER_COND  - When set, one or more triggers has an associated filter
  */
 enum {
FTRACE_EVENT_FL_ENABLED = (1 << FTRACE_EVENT_FL_ENABLED_BIT),
@@ -273,6 +276,7 @@ enum {
FTRACE_EVENT_FL_SOFT_MODE   = (1 << FTRACE_EVENT_FL_SOFT_MODE_BIT),
FTRACE_EVENT_FL_SOFT_DISABLED   = (1 << FTRACE_EVENT_FL_SOFT_DISABLED_BIT),
FTRACE_EVENT_FL_TRIGGER_MODE= (1 << FTRACE_EVENT_FL_TRIGGER_MODE_BIT),
+   FTRACE_EVENT_FL_TRIGGER_COND= (1 << FTRACE_EVENT_FL_TRIGGER_COND_BIT),
 };
 
 struct ftrace_event_file {
@@ -326,11 +330,15 @@ enum event_trigger_type {
 
 extern void destroy_preds(struct ftrace_event_call *call);
 extern int filter_match_preds(struct event_filter *filter, void *rec);
+
 extern int filter_current_check_discard(struct ring_buffer *buffer,
struct ftrace_event_call *call,
void *rec,
 

[PATCH v8 08/10] tracing: Update event filters for multibuffer

2013-09-02 Thread Tom Zanussi
The trace event filters are still tied to event calls rather than
event files, which means you don't get what you'd expect when using
filters in the multibuffer case:

Before:

  # echo 'count > 65536' > /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/filter
  # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/filter
  count > 65536
  # mkdir /sys/kernel/debug/tracing/instances/test1
  # echo 'count > 4096' > /sys/kernel/debug/tracing/instances/test1/events/syscalls/sys_enter_read/filter
  # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/filter
  count > 4096

Setting the filter in tracing/instances/test1/events shouldn't affect
the same event in tracing/events as it does above.

After:

  # echo 'count > 65536' > /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/filter
  # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/filter
  count > 65536
  # mkdir /sys/kernel/debug/tracing/instances/test1
  # echo 'count > 4096' > /sys/kernel/debug/tracing/instances/test1/events/syscalls/sys_enter_read/filter
  # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/filter
  count > 65536

We'd like to just move the filter directly from ftrace_event_call to
ftrace_event_file, but there are a couple cases that don't yet have
multibuffer support and therefore have to continue using the current
event_call-based filters.  For those cases, a new USE_CALL_FILTER bit
is added to the event_call flags, whose main purpose is to keep the
old behavior for those cases until they can be updated with
multibuffer support; at that point, the USE_CALL_FILTER flag (and the
new associated call_filter_check_discard() function) can go away.

The multibuffer support also made filter_current_check_discard()
redundant, so this change removes that function as well and replaces
it with filter_check_discard() (or call_filter_check_discard() as
appropriate).
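
To make the selection rule concrete, here is a small user-space sketch.
It assumes a filter of the form "count > N" and uses invented names
throughout (CALL_FL_USE_CALL_FILTER, struct filter, struct event_call,
struct event_file, active_filter(), check_discard()); it is not the
code in trace_events_filter.c.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for TRACE_EVENT_FL_USE_CALL_FILTER. */
#define CALL_FL_USE_CALL_FILTER (1 << 0)

struct filter {			/* hypothetical: matches "count > min_count" */
	long min_count;
};

struct event_call {		/* shared by every trace instance */
	int flags;
	struct filter *filter;
};

struct event_file {		/* one per (instance, event) pair */
	struct event_call *call;
	struct filter *filter;
};

/* Pick the filter that applies to this instance's view of the event. */
static const struct filter *active_filter(const struct event_file *file)
{
	assert(file && file->call);

	if (file->call->flags & CALL_FL_USE_CALL_FILTER)
		return file->call->filter;	/* legacy: shared by all instances */
	return file->filter;			/* per-instance filter */
}

/* Returns 1 if the record should be discarded, 0 if it should be kept. */
static int check_discard(const struct event_file *file, long count)
{
	const struct filter *f = active_filter(file);

	return f && !(count > f->min_count);
}
```

With two event_file instances pointing at the same event_call, each one
keeps its own threshold unless the call is flagged to use the shared
filter.
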

Signed-off-by: Tom Zanussi 
---
 include/linux/ftrace_event.h |  31 +++--
 include/trace/ftrace.h   |   6 +-
 kernel/trace/trace.c |  40 +--
 kernel/trace/trace.h |  18 +--
 kernel/trace/trace_branch.c  |   2 +-
 kernel/trace/trace_events.c  |  23 ++--
 kernel/trace/trace_events_filter.c   | 218 ---
 kernel/trace/trace_export.c  |   2 +-
 kernel/trace/trace_functions_graph.c |   4 +-
 kernel/trace/trace_kprobe.c  |   4 +-
 kernel/trace/trace_mmiotrace.c   |   4 +-
 kernel/trace/trace_sched_switch.c|   4 +-
 kernel/trace/trace_syscalls.c|   7 +-
 kernel/trace/trace_uprobe.c  |   3 +-
 14 files changed, 263 insertions(+), 103 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 8365a4c..c96009c 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -203,6 +203,7 @@ enum {
TRACE_EVENT_FL_NO_SET_FILTER_BIT,
TRACE_EVENT_FL_IGNORE_ENABLE_BIT,
TRACE_EVENT_FL_WAS_ENABLED_BIT,
+   TRACE_EVENT_FL_USE_CALL_FILTER_BIT,
 };
 
 /*
@@ -214,6 +215,7 @@ enum {
  *  WAS_ENABLED   - Set and stays set when an event was ever enabled
  *(used for module unloading, if a module event is enabled,
  * it is best to clear the buffers that used it).
+ *  USE_CALL_FILTER - For ftrace internal events, don't use file filter
  */
 enum {
TRACE_EVENT_FL_FILTERED = (1 << TRACE_EVENT_FL_FILTERED_BIT),
@@ -221,6 +223,7 @@ enum {
TRACE_EVENT_FL_NO_SET_FILTER= (1 << TRACE_EVENT_FL_NO_SET_FILTER_BIT),
TRACE_EVENT_FL_IGNORE_ENABLE= (1 << TRACE_EVENT_FL_IGNORE_ENABLE_BIT),
TRACE_EVENT_FL_WAS_ENABLED  = (1 << TRACE_EVENT_FL_WAS_ENABLED_BIT),
+   TRACE_EVENT_FL_USE_CALL_FILTER  = (1 << TRACE_EVENT_FL_USE_CALL_FILTER_BIT),
 };
 
 struct ftrace_event_call {
@@ -239,6 +242,7 @@ struct ftrace_event_call {
 *   bit 2: failed to apply filter
 *   bit 3: ftrace internal event (do not enable)
 *   bit 4: Event was enabled by module
+*   bit 5: use call filter rather than file filter
 */
int flags; /* static flags of different events */
 
@@ -254,6 +258,8 @@ struct ftrace_subsystem_dir;
 enum {
FTRACE_EVENT_FL_ENABLED_BIT,
FTRACE_EVENT_FL_RECORDED_CMD_BIT,
+   FTRACE_EVENT_FL_FILTERED_BIT,
+   FTRACE_EVENT_FL_NO_SET_FILTER_BIT,
FTRACE_EVENT_FL_SOFT_MODE_BIT,
FTRACE_EVENT_FL_SOFT_DISABLED_BIT,
FTRACE_EVENT_FL_TRIGGER_MODE_BIT,
@@ -264,6 +270,8 @@ enum {
  * Ftrace event file flags:
  *  ENABLED  - The event is enabled
  *  RECORDED_CMD  - The comms should be recorded at sched_switch
+ *  FILTERED - The event has a filter attached
+ *  NO_SET_FILTER - Set when filter has error and is to be ignored
  *  SOFT_MODE - The event is enabled/disabled by SOFT_DISABLED
  *  

[PATCH v8 00/10] tracing: trace event triggers

2013-09-02 Thread Tom Zanussi
Hi,

This is v8 of the trace event triggers patchset.  This version
addresses the comments and feedback from Steve Rostedt on v7.

v8:
 - changed rcu_dereference_raw() to rcu_dereference() and moved
   synchronize_sched() out from under the syscall_trace_lock mutex.
 - got rid of the various void ** usages in the basic framework and
   individual trigger patches.  Since triggers always expect an
   event_trigger_data instance, there's not even any reason to make it
   a void *, so those along with the void * usages were changed to use
   event_trigger_data * directly.  To allow for trigger-specific data,
   a new void * field named private_data was added to
   event_trigger_data; this is made use of by the enable/disable_event
   triggers.
 - fixed various style nitpicks.
 - added a new TRIGGER_COND flag to ftrace_file - this flag basically
   tracks whether or not an event has any triggers that have a
   condition associated with them that requires looking at the data
   being logged (or that would be in the case of soft-disable) for the
   current event.  If TRIGGER_COND is not set, then the triggers can
   be invoked immediately without forcing the inefficiency of actually
   generating the log event when not necessary.
 - patch 8 removed the obsolete filter_current_check_discard() and
   replaced it with filter_check_discard() but accidentally made the
   new function static inline, which is obviously not what was
   intended.  That and the new call_filter_check_discard() functions
   are now normal functions as filter_current_check_discard() was.
 - isolated all the ugly 'if (USE_CALL_FILTER) else' usages in patch 8
   which significantly cleaned up that patch as a result.

v7:
 - moved find_event_file() extern declaration to patch 06.
 - moved helper functions from patch 02 to 03, where they're first
   used.
 - removed copies of cmd_ops fields from trigger_data and changed to
   use cmd_ops directly instead.
 - renamed trigger_mode to trigger_type to avoid confusion with the
   FTRACE_EVENT_FL_TRIGGER_MODE_BIT bitflag, and fixed up
   usage/documentation, etc.

v6:
 - fixed up the conflicts in trace_events.c related to the actual
   creation of the per-event 'trigger' files.

v5:
 - got rid of the trigger_iterator, a vestige of the first patchset,
   which attempted to abstract the ftrace_iterator for triggers, and
   cleaned up and simplified the related code as a result.
 - replaced the void *cmd_data everywhere with ftrace_event_file *,
   another vestige of the initial patchset.
 - updated the patchset to use event_file_data() to grab the i_private
   ftrace_event_files where appropriate (this was a separate patch in
   the previous patchset, but was merged into the basic framework
   patch as suggested by Masami.  The only interesting part about this
   is that it moved event_file_data() from kernel/trace/trace_events.c
   to kernel/trace/trace.h so it can be used in
   e.g. trace_events_trigger.c as well.)
 - added missing grab of event_mutex in event_trigger_regex_write().
 - realized when making the above changes that the trigger filters
   weren't being freed when the trigger was freed, so added a
   trigger_data_free() to do that.  It also ensures that trigger_data
   won't be freed until nothing is using it.
 - added clear_event_triggers(), which clears all triggers in a trace
   array (and soft-disable associated with event_enable/disable
   events).
 - added a comment to ftrace_syscall_enter/exit to document the use of
   rcu_dereference_raw() there.

v4:
 - made some changes to the soft-disable for syscall patch, according
   to Masami's suggestions.  Actually, since there's now an array of
   ftrace_files for syscalls that can serve the same purpose, the
   enabled_enter/exit_syscalls bit arrays became redundant and were
   removed.
 - moved all the remaining common functions out of the
   traceon/traceoff patch and into the basic trigger framework patch
   and added comments to all the common functions.
 - extensively commented the event_trigger_ops and event_command ops.
 - made the register/unregister_command functions __init.  Since that
   code was originally inspired by similar ftrace code, a new patch
   was added to do the same thing for the register/unregister of the
   ftrace commands (patch 10/11).
 - fixed the event_trigger_regex_open i_private problem noted by
   Masami that's currently being addressed by Oleg Nesterov's fixes
   for this.  Note that that patchset also affects patch 8/11 (update
   filters for multi-buffer, since it touches event filters as well).
   Patch 11/11 depends on that patchset and also moves
   event_file_data() to trace.h.

v3:
 - added a new patch to the series (patch 8/9 - update event filters
   for multibuffer) to bring the event filters up-to-date wrt the
   multibuffer changes - without this patch, the same filter is
   applied to all buffers regardless of which instance sets it; this
   patch allows you to set per-instance filters as you'd expect. 

[PATCH 2/2] audit: Two efficiency fixes for audit mechanism

2013-09-02 Thread Chuck Anderson

audit: Two efficiency fixes for audit mechanism

author: Dan Duval 

These and similar errors were seen on a patched 3.8 kernel when the
audit subsystem was overrun during boot:

  udevd[876]: worker [887] unexpectedly returned with status 0x0100
  udevd[876]: worker [887] failed while handling '/devices/pci:00/:00:03.0/:40:00.0'

  udevd[876]: worker [880] unexpectedly returned with status 0x0100
  udevd[876]: worker [880] failed while handling '/devices/LNXSYSTM:00/LNXPWRBN:00/input/input1/event1'

  udevadm settle - timeout of 180 seconds reached, the event queue contains:
    /sys/devices/LNXSYSTM:00/LNXPWRBN:00/input/input1/event1 (3995)
    /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/INT3F0D:00 (4034)

  audit: audit_backlog=258 > audit_backlog_limit=256
  audit: audit_lost=1 audit_rate_limit=0 audit_backlog_limit=256

The changes below increase the efficiency of the audit code and
prevent it from being overrun:

1. Only issue a wake_up in kauditd if the length of the skb queue
   is less than the backlog limit.  Otherwise, threads waiting in
   wait_for_auditd() will simply wake up, discover that the
   queue is still too long for them to proceed, and go back
   to sleep.  This results in wasted context switches and
   machine cycles.  kauditd_thread() is the only function that
   removes buffers from audit_skb_queue so we can't race.  If we
   did, the timeout in wait_for_auditd() would expire and the
   waiting thread would continue.

2. Use add_wait_queue_exclusive() in wait_for_auditd() to put the
   thread on the wait queue.  When kauditd dequeues an skb, all
   of the waiting threads are waiting for the same resource, but
   only one is going to get it, so there's no need to wake up
   more than one waiter.
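
The effect of the two changes can be estimated with a very simplified
model. The sketch below is not kernel code and makes several
assumptions that are not in the patch: struct wake_stats and drain()
are invented names, every waiter that finds room immediately queues one
new buffer and leaves, and the unpatched wake_up() is modelled as waking
every waiter.

```c
#include <assert.h>

struct wake_stats {
	unsigned int wakeups;	/* waiters made runnable */
	unsigned int useful;	/* wakeups that found room in the queue */
};

/*
 * Dequeue up to `dequeues` buffers from a queue that starts at `qlen`,
 * with `waiters` threads blocked on the backlog limit.  With `patched`
 * set, wake a single waiter and only when the queue is within `limit`;
 * otherwise wake all waiters on every dequeue.
 */
static struct wake_stats drain(unsigned int qlen, unsigned int limit,
			       unsigned int waiters, unsigned int dequeues,
			       int patched)
{
	struct wake_stats s = { 0, 0 };
	unsigned int d, i, woken;

	for (d = 0; d < dequeues && qlen > 0; d++) {
		qlen--;				/* kauditd removes one buffer */

		if (patched)
			woken = (qlen <= limit && waiters > 0) ? 1 : 0;
		else
			woken = waiters;	/* everyone is woken */

		s.wakeups += woken;
		for (i = 0; i < woken; i++) {
			if (qlen <= limit) {	/* room: queue a record, leave */
				qlen++;
				waiters--;
				s.useful++;
			}
			/* otherwise the waiter just goes back to sleep */
		}
	}

	assert(s.useful <= s.wakeups);
	return s;
}
```

In this model the patched variant never wakes a waiter that cannot make
progress, while the unpatched variant wakes every waiter on every
dequeue even though at most one of them can proceed.
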

Signed-off-by: Dan Duval 
Signed-off-by: Chuck Anderson 
---
 kernel/audit.c |7 +--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 9a78dde..d87b4dd 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -449,8 +449,11 @@ static int kauditd_thread(void *dummy)
flush_hold_queue();

skb = skb_dequeue(&audit_skb_queue);
-   wake_up(&audit_backlog_wait);
+
if (skb) {
+   if(skb_queue_len(&audit_skb_queue) <= audit_backlog_limit)
+   wake_up(&audit_backlog_wait);
+
if (audit_pid)
kauditd_send_skb(skb);
else
@@ -1059,7 +1062,7 @@ static void wait_for_auditd(unsigned long sleep_time, int limit)
 {
DECLARE_WAITQUEUE(wait, current);
set_current_state(TASK_UNINTERRUPTIBLE);
-   add_wait_queue(&audit_backlog_wait, &wait);
+   add_wait_queue_exclusive(&audit_backlog_wait, &wait);

if (audit_backlog_limit &&
skb_queue_len(&audit_skb_queue) > limit)
--
1.7.1


[PATCH v8 10/10] tracing: Make register/unregister_ftrace_command __init

2013-09-02 Thread Tom Zanussi
register/unregister_ftrace_command() are only ever called from __init
functions, so can themselves be made __init.

Also make register_snapshot_cmd() __init for the same reason.

Signed-off-by: Tom Zanussi 
---
 include/linux/ftrace.h |  4 ++--
 kernel/trace/ftrace.c  | 12 ++--
 kernel/trace/trace.c   |  4 ++--
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 9f15c00..6062491 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -533,11 +533,11 @@ static inline int ftrace_force_update(void) { return 0; }
 static inline void ftrace_disable_daemon(void) { }
 static inline void ftrace_enable_daemon(void) { }
 static inline void ftrace_release_mod(struct module *mod) {}
-static inline int register_ftrace_command(struct ftrace_func_command *cmd)
+static inline __init int register_ftrace_command(struct ftrace_func_command *cmd)
 {
return -EINVAL;
 }
-static inline int unregister_ftrace_command(char *cmd_name)
+static inline __init int unregister_ftrace_command(char *cmd_name)
 {
return -EINVAL;
 }
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index a6d098c..64f7f39 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -3292,7 +3292,11 @@ void unregister_ftrace_function_probe_all(char *glob)
 static LIST_HEAD(ftrace_commands);
 static DEFINE_MUTEX(ftrace_cmd_mutex);
 
-int register_ftrace_command(struct ftrace_func_command *cmd)
+/*
+ * Currently we only register ftrace commands from __init, so mark this
+ * __init too.
+ */
+__init int register_ftrace_command(struct ftrace_func_command *cmd)
 {
struct ftrace_func_command *p;
int ret = 0;
@@ -3311,7 +3315,11 @@ int register_ftrace_command(struct ftrace_func_command *cmd)
return ret;
 }
 
-int unregister_ftrace_command(struct ftrace_func_command *cmd)
+/*
+ * Currently we only unregister ftrace commands from __init, so mark
+ * this __init too.
+ */
+__init int unregister_ftrace_command(struct ftrace_func_command *cmd)
 {
struct ftrace_func_command *p, *n;
int ret = -ENODEV;
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 69f5796..76e2ecf 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -5486,12 +5486,12 @@ static struct ftrace_func_command ftrace_snapshot_cmd = {
.func   = ftrace_trace_snapshot_callback,
 };
 
-static int register_snapshot_cmd(void)
+static __init int register_snapshot_cmd(void)
 {
return register_ftrace_command(&ftrace_snapshot_cmd);
 }
 #else
-static inline int register_snapshot_cmd(void) { return 0; }
+static inline __init int register_snapshot_cmd(void) { return 0; }
 #endif /* defined(CONFIG_TRACER_SNAPSHOT) && defined(CONFIG_DYNAMIC_FTRACE) */
 
 struct dentry *tracing_init_dentry_tr(struct trace_array *tr)
-- 
1.7.11.4



[PATCH v8 04/10] tracing: Add 'snapshot' event trigger command

2013-09-02 Thread Tom Zanussi
Add 'snapshot' event_command.  snapshot event triggers are added by
the user via this command in a similar way and using practically the
same syntax as the analogous 'snapshot' ftrace function command, but
instead of writing to the set_ftrace_filter file, the snapshot event
trigger is written to the per-event 'trigger' files:

echo 'snapshot' > .../somesys/someevent/trigger

The above command will turn on snapshots for someevent i.e. whenever
someevent is hit, a snapshot will be done.

This also adds a 'count' version that limits the number of times the
command will be invoked:

echo 'snapshot:N' > .../somesys/someevent/trigger

Where N is the number of times the command will be invoked.

The above command will take a snapshot on each of the first N hits of
someevent; after that, the trigger no longer fires.
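
The counting rule can be shown in isolation with a user-space sketch
that mirrors the shape of snapshot_count_trigger() in this patch. The
names struct trigger_data_sketch and snapshot_count_trigger_sketch()
are made up for the example, and a counter stands in for the call to
tracing_snapshot().

```c
#include <assert.h>

struct trigger_data_sketch {
	long count;		/* remaining invocations; -1 means "no limit" */
	unsigned int snapshots;	/* stands in for calls to tracing_snapshot() */
};

/* Called once per event hit. */
static void snapshot_count_trigger_sketch(struct trigger_data_sketch *data)
{
	assert(data->count >= -1);

	if (!data->count)		/* budget used up: do nothing */
		return;

	if (data->count != -1)		/* only limited triggers count down */
		data->count--;

	data->snapshots++;
}
```

With count set to 2, five hits produce two snapshots; with count set to
-1, every hit produces one.
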

Also adds a new ftrace_alloc_snapshot() function: the ftrace snapshot
command already contains code that allocates a snapshot, and this
function makes that code reusable here.

Signed-off-by: Tom Zanussi 
---
 include/linux/ftrace_event.h|  1 +
 kernel/trace/trace.c|  9 +
 kernel/trace/trace.h|  1 +
 kernel/trace/trace_events_trigger.c | 76 +
 4 files changed, 87 insertions(+)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index a14650b..40b517b 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -319,6 +319,7 @@ struct ftrace_event_file {
 enum event_trigger_type {
ETT_NONE= (0),
ETT_TRACE_ONOFF = (1 << 0),
+   ETT_SNAPSHOT= (1 << 1),
 };
 
 extern void destroy_preds(struct ftrace_event_call *call);
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 496f94d..5a61dbe 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -5358,6 +5358,15 @@ static const struct file_operations tracing_dyn_info_fops = {
 };
 #endif /* CONFIG_DYNAMIC_FTRACE */
 
+#if defined(CONFIG_TRACER_SNAPSHOT)
+int ftrace_alloc_snapshot(void)
+{
+   return alloc_snapshot(&global_trace);
+}
+#else
+int ftrace_alloc_snapshot(void) { return -ENOSYS; }
+#endif
+
 #if defined(CONFIG_TRACER_SNAPSHOT) && defined(CONFIG_DYNAMIC_FTRACE)
 static void
 ftrace_snapshot(unsigned long ip, unsigned long parent_ip, void **data)
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 37b8ecf..f032dd8 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1200,6 +1200,7 @@ struct event_command {
 
 extern int trace_event_enable_disable(struct ftrace_event_file *file,
  int enable, int soft_disable);
+extern int ftrace_alloc_snapshot(void);
 
 extern const char *__start___trace_bprintk_fmt[];
 extern const char *__stop___trace_bprintk_fmt[];
diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c
index 5388d55..9bdcc38 100644
--- a/kernel/trace/trace_events_trigger.c
+++ b/kernel/trace/trace_events_trigger.c
@@ -664,6 +664,74 @@ static struct event_command trigger_traceoff_cmd = {
.get_trigger_ops= onoff_get_trigger_ops,
 };
 
+static void
+snapshot_trigger(struct event_trigger_data *data)
+{
+   tracing_snapshot();
+}
+
+static void
+snapshot_count_trigger(struct event_trigger_data *data)
+{
+   if (!data->count)
+   return;
+
+   if (data->count != -1)
+   (data->count)--;
+
+   snapshot_trigger(data);
+}
+
+static int
+register_snapshot_trigger(char *glob, struct event_trigger_ops *ops,
+ struct event_trigger_data *data,
+ struct ftrace_event_file *file)
+{
+   int ret = register_trigger(glob, ops, data, file);
+
+   if (ret > 0)
+   ftrace_alloc_snapshot();
+
+   return ret;
+}
+
+static int
+snapshot_trigger_print(struct seq_file *m, struct event_trigger_ops *ops,
+  struct event_trigger_data *data)
+{
+   return event_trigger_print("snapshot", m, (void *)data->count,
+  data->filter_str);
+}
+
+static struct event_trigger_ops snapshot_trigger_ops = {
+   .func   = snapshot_trigger,
+   .print  = snapshot_trigger_print,
+   .init   = event_trigger_init,
+   .free   = event_trigger_free,
+};
+
+static struct event_trigger_ops snapshot_count_trigger_ops = {
+   .func   = snapshot_count_trigger,
+   .print  = snapshot_trigger_print,
+   .init   = event_trigger_init,
+   .free   = event_trigger_free,
+};
+
+static struct event_trigger_ops *
+snapshot_get_trigger_ops(char *cmd, char *param)
+{
+   return param ? &snapshot_count_trigger_ops : &snapshot_trigger_ops;
+}
+
+static struct event_command trigger_snapshot_cmd = {
+   .name   = "snapshot",
+   .trigger_type   = ETT_SNAPSHOT,
+   .func 

[PATCH v8 06/10] tracing: Add 'enable_event' and 'disable_event' event trigger commands

2013-09-02 Thread Tom Zanussi
Add 'enable_event' and 'disable_event' event_command commands.

enable_event and disable_event event triggers are added by the user
via these commands in a similar way and using practically the same
syntax as the analogous 'enable_event' and 'disable_event' ftrace
function commands, but instead of writing to the set_ftrace_filter
file, the enable_event and disable_event triggers are written to the
per-event 'trigger' files:

echo 'enable_event:system:event' > .../othersys/otherevent/trigger
echo 'disable_event:system:event' > .../othersys/otherevent/trigger

The above commands will enable or disable the 'system:event' trace
events whenever the othersys:otherevent events are hit.

This also adds a 'count' version that limits the number of times the
command will be invoked:

echo 'enable_event:system:event:N' > .../othersys/otherevent/trigger
echo 'disable_event:system:event:N' > .../othersys/otherevent/trigger

Where N is the number of times the command will be invoked.

The above commands will enable or disable the 'system:event' trace
events whenever the othersys:otherevent events are hit, but only
N times.
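
A user-space sketch of the counted enable/disable behaviour follows. It is a model under stated assumptions: SOFT_DISABLED, struct target_event and struct enable_trigger are hypothetical names, and plain bit operations stand in for the atomic set_bit()/clear_bit() calls used in the patch. The point it illustrates is that a hit which finds the target already in the requested state does not consume a count.

```c
#include <stdbool.h>

#define SOFT_DISABLED (1UL << 0)	/* hypothetical flag bit, not the kernel's value */

struct target_event {
	unsigned long flags;
};

struct enable_trigger {
	struct target_event *target;
	bool enable;	/* true for enable_event, false for disable_event */
	long count;	/* remaining invocations; -1 means "no limit" */
};

/* Uncounted variant: always force the target into the requested state. */
static void enable_trigger_fire(struct enable_trigger *t)
{
	if (t->enable)
		t->target->flags &= ~SOFT_DISABLED;
	else
		t->target->flags |= SOFT_DISABLED;
}

/* Counted variant: spend a count only when the state really changes. */
static void enable_count_trigger_fire(struct enable_trigger *t)
{
	bool currently_enabled = !(t->target->flags & SOFT_DISABLED);

	if (!t->count)
		return;		/* budget exhausted */

	if (t->enable == currently_enabled)
		return;		/* already in the requested state */

	if (t->count != -1)
		t->count--;

	enable_trigger_fire(t);
}
```

In this model an 'enable_event:...:1' trigger enables a soft-disabled target once; if the target is later disabled again, further hits leave it disabled because the budget is spent.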

This also makes the find_event_file() helper function extern, since
it is useful from other places, such as the event triggers code.

Signed-off-by: Tom Zanussi 
---
 include/linux/ftrace_event.h|   1 +
 kernel/trace/trace.h|   4 +
 kernel/trace/trace_events.c |   2 +-
 kernel/trace/trace_events_trigger.c | 379 
 4 files changed, 385 insertions(+), 1 deletion(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 31750df..8e87302 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -321,6 +321,7 @@ enum event_trigger_type {
ETT_TRACE_ONOFF = (1 << 0),
ETT_SNAPSHOT= (1 << 1),
ETT_STACKTRACE  = (1 << 2),
+   ETT_EVENT_ENABLE= (1 << 3),
 };
 
 extern void destroy_preds(struct ftrace_event_call *call);
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index f032dd8..3941499 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1016,6 +1016,10 @@ extern void trace_event_enable_cmd_record(bool enable);
 extern int event_trace_add_tracer(struct dentry *parent, struct trace_array *tr);
 extern int event_trace_del_tracer(struct trace_array *tr);
 
+extern struct ftrace_event_file *find_event_file(struct trace_array *tr,
+const char *system,
+const char *event);
+
 static inline void *event_file_data(struct file *filp)
 {
return ACCESS_ONCE(file_inode(filp)->i_private);
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 7d8eb8a..25b2c86 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -1860,7 +1860,7 @@ struct event_probe_data {
bool enable;
 };
 
-static struct ftrace_event_file *
+struct ftrace_event_file *
 find_event_file(struct trace_array *tr, const char *system,  const char *event)
 {
struct ftrace_event_file *file;
diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c
index 36b6601..cab187b 100644
--- a/kernel/trace/trace_events_trigger.c
+++ b/kernel/trace/trace_events_trigger.c
@@ -807,6 +807,373 @@ static __init void unregister_trigger_traceon_traceoff_cmds(void)
 &trigger_cmd_mutex);
 }
 
+/* Avoid typos */
+#define ENABLE_EVENT_STR   "enable_event"
+#define DISABLE_EVENT_STR  "disable_event"
+
+struct enable_trigger_data {
+   struct ftrace_event_file *file;
+   bool enable;
+};
+
+static void
+event_enable_trigger(struct event_trigger_data *data)
+{
+   struct enable_trigger_data *enable_data = data->private_data;
+
+   if (enable_data->enable)
+       clear_bit(FTRACE_EVENT_FL_SOFT_DISABLED_BIT, &enable_data->file->flags);
+   else
+       set_bit(FTRACE_EVENT_FL_SOFT_DISABLED_BIT, &enable_data->file->flags);
+}
+
+static void
+event_enable_count_trigger(struct event_trigger_data *data)
+{
+   struct enable_trigger_data *enable_data = data->private_data;
+
+   if (!data->count)
+   return;
+
+   /* Skip if the event is in a state we want to switch to */
+   if (enable_data->enable == !(enable_data->file->flags & FTRACE_EVENT_FL_SOFT_DISABLED))
+   return;
+
+   if (data->count != -1)
+   (data->count)--;
+
+   event_enable_trigger(data);
+}
+
+static int
+event_enable_trigger_print(struct seq_file *m, struct event_trigger_ops *ops,
+  struct event_trigger_data *data)
+{
+   struct enable_trigger_data *enable_data = data->private_data;
+
+   seq_printf(m, "%s:%s:%s",
+  enable_data->enable ? ENABLE_EVENT_STR : 

[PATCH v8 05/10] tracing: Add 'stacktrace' event trigger command

2013-09-02 Thread Tom Zanussi
Add 'stacktrace' event_command.  stacktrace event triggers are added
by the user via this command in a similar way and using practically
the same syntax as the analogous 'stacktrace' ftrace function command,
but instead of writing to the set_ftrace_filter file, the stacktrace
event trigger is written to the per-event 'trigger' files:

echo 'stacktrace' > .../tracing/events/somesys/someevent/trigger

The above command will turn on stacktraces for someevent i.e. whenever
someevent is hit, a stacktrace will be logged.

This also adds a 'count' version that limits the number of times the
command will be invoked:

echo 'stacktrace:N' > .../tracing/events/somesys/someevent/trigger

Where N is the number of times the command will be invoked.

The above command will log a stacktrace each time someevent is hit,
but only for the first N hits; later hits are ignored.

Signed-off-by: Tom Zanussi 
---
 include/linux/ftrace_event.h|  1 +
 kernel/trace/trace_events_trigger.c | 76 +
 2 files changed, 77 insertions(+)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 40b517b..31750df 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -320,6 +320,7 @@ enum event_trigger_type {
ETT_NONE= (0),
ETT_TRACE_ONOFF = (1 << 0),
ETT_SNAPSHOT= (1 << 1),
+   ETT_STACKTRACE  = (1 << 2),
 };
 
 extern void destroy_preds(struct ftrace_event_call *call);
diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c
index 9bdcc38..36b6601 100644
--- a/kernel/trace/trace_events_trigger.c
+++ b/kernel/trace/trace_events_trigger.c
@@ -732,6 +732,71 @@ static struct event_command trigger_snapshot_cmd = {
.get_trigger_ops= snapshot_get_trigger_ops,
 };
 
+/*
+ * Skip 4:
+ *   ftrace_stacktrace()
+ *   function_trace_probe_call()
+ *   ftrace_ops_list_func()
+ *   ftrace_call()
+ */
+#define STACK_SKIP 4
+
+static void
+stacktrace_trigger(struct event_trigger_data *data)
+{
+   trace_dump_stack(STACK_SKIP);
+}
+
+static void
+stacktrace_count_trigger(struct event_trigger_data *data)
+{
+   if (!data->count)
+   return;
+
+   if (data->count != -1)
+   (data->count)--;
+
+   stacktrace_trigger(data);
+}
+
+static int
+stacktrace_trigger_print(struct seq_file *m, struct event_trigger_ops *ops,
+struct event_trigger_data *data)
+{
+   return event_trigger_print("stacktrace", m, (void *)data->count,
+  data->filter_str);
+}
+
+static struct event_trigger_ops stacktrace_trigger_ops = {
+   .func   = stacktrace_trigger,
+   .print  = stacktrace_trigger_print,
+   .init   = event_trigger_init,
+   .free   = event_trigger_free,
+};
+
+static struct event_trigger_ops stacktrace_count_trigger_ops = {
+   .func   = stacktrace_count_trigger,
+   .print  = stacktrace_trigger_print,
+   .init   = event_trigger_init,
+   .free   = event_trigger_free,
+};
+
+static struct event_trigger_ops *
+stacktrace_get_trigger_ops(char *cmd, char *param)
+{
+   return param ? &stacktrace_count_trigger_ops : &stacktrace_trigger_ops;
+}
+
+static struct event_command trigger_stacktrace_cmd = {
+   .name   = "stacktrace",
+   .trigger_type   = ETT_STACKTRACE,
+   .post_trigger   = true,
+   .func   = event_trigger_callback,
+   .reg= register_trigger,
+   .unreg  = unregister_trigger,
+   .get_trigger_ops= stacktrace_get_trigger_ops,
+};
+
 static __init void unregister_trigger_traceon_traceoff_cmds(void)
 {
unregister_event_command(&trigger_traceon_cmd,
@@ -776,5 +841,16 @@ __init int register_trigger_cmds(void)
return ret;
}
 
+   ret = register_event_command(&trigger_stacktrace_cmd,
+                                &trigger_commands,
+                                &trigger_cmd_mutex);
+   if (WARN_ON(ret < 0)) {
+   unregister_trigger_traceon_traceoff_cmds();
+   unregister_event_command(&trigger_snapshot_cmd,
+                            &trigger_commands,
+                            &trigger_cmd_mutex);
+   return ret;
+   }
+
return 0;
 }
-- 
1.7.11.4



Re: [PATCH v2 0/8] Drop support for Renesas H8/300 architecture

2013-09-02 Thread Chen Gang F T
On 09/03/2013 11:26 AM, Guenter Roeck wrote:
> On 09/02/2013 07:53 PM, Chen Gang F T wrote:
>> Hello Guenter Roeck:
>>
>>
>> I don't care about whether I am in cc mailing list, but at least,
>> please help confirm 2 things:
>>
>>   Is what I have done for h8300 just waste and noise on the kernel and
>>   related sub-system mailing lists?
>>
>>   And is the discussion about h8300 between us also waste and noise on
>>   the kernel mailing list?
>>
> 
> It raised my awareness of the status of h8300 maintenance,
> so I would not see it as noise or waste. I might have suggested
> a different target for your efforts, but that is your choice to make,
> not mine.
> 

OK, thank you for confirming. I plan to go through all architectures
one by one with allmodconfig.

Hmm... if it suits you, I will also cc you when I next focus on an
architecture. If that architecture is going to be removed, please let me
know in time, so that I can avoid sending wasted mail to the mailing list.

I plan to try one of arc, hexagon, and metag next. I will begin on
2013-09-20 (or later); if some (or all) of them are going to be removed,
please let me know, thanks.


> On the code review side, I had suggested that you should not add new
> ifdefs into code, much less unnecessary ones. Your counter-argument
> was that you wanted to follow the existing coding style in the file
> in question. To me, that argument is along the line of "the coding
> style in this file is bad, let's do more of it".

Hmm... in fact, I am not saying whether the coding style is good or bad.
My main concern is to avoid mixing multiple coding styles within one file.

  An extreme example: putting 'kernel code style' and 'gcc code style' in
  one file would make the code very ugly.

> That doesn't make much sense to me, so I did not bother to respond.
> Setting that aside, it is not up to me to approve or reject your patches.
> Whoever does that would be the one you have to convince.
> 

OK, I understand. It now seems this can be dropped anyway, since h8300
has been removed.

> Guenter
> 


Thanks.
-- 
Chen Gang


[PATCH v8 09/10] tracing: Add documentation for trace event triggers

2013-09-02 Thread Tom Zanussi
Provide a basic overview of trace event triggers and document the
available trigger commands, along with a few simple examples.

Signed-off-by: Tom Zanussi 
---
 Documentation/trace/events.txt | 207 +
 1 file changed, 207 insertions(+)

diff --git a/Documentation/trace/events.txt b/Documentation/trace/events.txt
index 37732a2..c94435d 100644
--- a/Documentation/trace/events.txt
+++ b/Documentation/trace/events.txt
@@ -287,3 +287,210 @@ their old filters):
 prev_pid == 0
 # cat sched_wakeup/filter
 common_pid == 0
+
+6. Event triggers
+=================
+
+Trace events can be made to conditionally invoke trigger 'commands'
+which can take various forms and are described in detail below;
+examples would be enabling or disabling other trace events or invoking
+a stack trace whenever the trace event is hit.  Whenever a trace event
+with attached triggers is invoked, the set of trigger commands
+associated with that event is invoked.  Any given trigger can
+additionally have an event filter of the same form as described in
+section 5 (Event filtering) associated with it - the command will only
+be invoked if the event being invoked passes the associated filter.
+If no filter is associated with the trigger, it always passes.
+
+Triggers are added to and removed from a particular event by writing
+trigger expressions to the 'trigger' file for the given event.
+
+A given event can have any number of triggers associated with it,
+subject to any restrictions that individual commands may have in that
+regard.
+
+Event triggers are implemented on top of "soft" mode, which means that
+whenever a trace event has one or more triggers associated with it,
+the event is activated even if it isn't actually enabled, but is
+disabled in a "soft" mode.  That is, the tracepoint will be called,
+but just will not be traced, unless of course it's actually enabled.
+This scheme allows triggers to be invoked even for events that aren't
+enabled, and also allows the current event filter implementation to be
+used for conditionally invoking triggers.
+
+The syntax for event triggers is roughly based on the syntax for
+set_ftrace_filter 'ftrace filter commands' (see the 'Filter commands'
+section of Documentation/trace/ftrace.txt), but there are major
+differences and the implementation isn't currently tied to it in any
+way, so beware about making generalizations between the two.
+
+6.1 Expression syntax
+---------------------
+
+Triggers are added by echoing the command to the 'trigger' file:
+
+  # echo 'command[:count] [if filter]' > trigger
+
+Triggers are removed by echoing the same command but starting with '!'
+to the 'trigger' file:
+
+  # echo '!command[:count] [if filter]' > trigger
+
+The [if filter] part isn't used in matching commands when removing, so
+leaving that off in a '!' command will accomplish the same thing as
+having it in.
+
+The filter syntax is the same as that described in the 'Event
+filtering' section above.
+
+For ease of use, writing to the trigger file using '>' currently just
+adds or removes a single trigger and there's no explicit '>>' support
+('>' actually behaves like '>>') or truncation support to remove all
+triggers (you have to use '!' for each one added.)
+
+6.2 Supported trigger commands
+------------------------------
+
+The following commands are supported:
+
+- enable_event/disable_event
+
+  These commands can enable or disable another trace event whenever
+  the triggering event is hit.  When these commands are registered,
+  the other trace event is activated, but disabled in a "soft" mode.
+  That is, the tracepoint will be called, but just will not be traced.
+  The event tracepoint stays in this mode as long as there's a trigger
+  in effect that can trigger it.
+
+  For example, the following trigger causes kmalloc events to be
+  traced when a read system call is entered, and the :1 at the end
+  specifies that this enablement happens only once:
+
+  # echo 'enable_event:kmem:kmalloc:1' > \
+  /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger
+
+  The following trigger causes kmalloc events to stop being traced
+  when a read system call exits.  This disablement happens on every
+  read system call exit:
+
+  # echo 'disable_event:kmem:kmalloc' > \
+  /sys/kernel/debug/tracing/events/syscalls/sys_exit_read/trigger
+
+  The format is:
+
+  enable_event:<system>:<event>[:count]
+  disable_event:<system>:<event>[:count]
+
+  To remove the above commands:
+
+  # echo '!enable_event:kmem:kmalloc:1' > \
+  /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger
+
+  # echo '!disable_event:kmem:kmalloc' > \
+  /sys/kernel/debug/tracing/events/syscalls/sys_exit_read/trigger
+
+  Note that there can be any number of enable/disable_event triggers
+  per triggering event, but there can only be one trigger per
+  triggered event. e.g. sys_enter_read can have triggers enabling both
+  kmem:kmalloc and sched:sched_switch, but can't 

[PATCH v8 02/10] tracing: Add basic event trigger framework

2013-09-02 Thread Tom Zanussi
Add a 'trigger' file for each trace event, enabling 'trace event
triggers' to be set for trace events.

'trace event triggers' are patterned after the existing 'ftrace
function triggers' implementation except that triggers are written to
per-event 'trigger' files instead of to a single file such as the
'set_ftrace_filter' used for ftrace function triggers.

The implementation is meant to be entirely separate from ftrace
function triggers, in order to keep the respective implementations
relatively simple and to allow them to diverge.

The event trigger functionality is built on top of SOFT_DISABLE
functionality.  It adds a TRIGGER_MODE bit to the ftrace_event_file
flags which is checked when any trace event fires.  Triggers set for a
particular event need to be checked regardless of whether that event
is actually enabled or not - getting an event to fire even if it's not
enabled is what's already implemented by SOFT_DISABLE mode, so trigger
mode directly reuses that.  Event triggers essentially inherit the soft
disable logic in __ftrace_event_enable_disable() while adding a bit of
logic and trigger reference counting via tm_ref on top of that in a
new trace_event_trigger_enable_disable() function.  Because the base
__ftrace_event_enable_disable() code now needs to be invoked from
outside trace_events.c, a wrapper is also added for those usages.

The triggers for an event are actually invoked via a new function,
event_triggers_call(), and code is also added to invoke them for
ftrace_raw_event calls as well as syscall events.

The main part of the patch creates a new trace_events_trigger.c file
to contain the trace event triggers implementation.

The standard open, read, and release file operations are implemented
here.

The open() implementation sets up for the various open modes of the
'trigger' file.  It creates and attaches the trigger iterator and sets
up the command parser.  If the file is opened for reading, it also sets
up the trigger seq_ops.

The write() implementation parses the event trigger written to the
'trigger' file, looks up the trigger command, and passes it along to
that event_command's func() implementation for command-specific
processing.

The release() implementation does whatever cleanup is needed to
release the 'trigger' file, like releasing the parser and trigger
iterator, etc.

A couple of functions for event command registration and
unregistration are added, along with a list to add them to and a mutex
to protect them, as well as an (initially empty) registration function
to add the set of commands that will be added by future commits, and
call to it from the trace event initialization code.
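
The registration scheme described above can be sketched as a name-keyed list. This is a simplified single-threaded model: struct command and both functions are hypothetical, a hand-rolled singly linked list replaces the kernel's list_head, and the protecting mutex is omitted. The error codes follow the ones visible in the later 'traceon'/'traceoff' patch of this series (-EBUSY for a duplicate name, -ENODEV for a missing one).

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, minimal stand-in for struct event_command. */
struct command {
	const char *name;
	struct command *next;
};

/* Returns 0 on success, -EBUSY if a command with the same name exists. */
static int register_command(struct command **head, struct command *cmd)
{
	for (struct command *p = *head; p; p = p->next)
		if (strcmp(p->name, cmd->name) == 0)
			return -EBUSY;

	cmd->next = *head;	/* new commands go to the front of the list */
	*head = cmd;
	return 0;
}

/* Returns 0 on success, -ENODEV if no command with that name is registered. */
static int unregister_command(struct command **head, const struct command *cmd)
{
	for (struct command **pp = head; *pp; pp = &(*pp)->next) {
		if (strcmp((*pp)->name, cmd->name) == 0) {
			*pp = (*pp)->next;	/* unlink the matching entry */
			return 0;
		}
	}
	return -ENODEV;
}
```

Keying on the command name rather than on the pointer means that a second command object with the same name is rejected, which is what keeps trigger command names unique.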

Also added are a couple of trigger-specific data structures needed for
these implementations such as a trigger iterator and a struct for
trigger-specific data.

A couple of structs consisting mostly of function pointers meant to be implemented
in command-specific ways, event_command and event_trigger_ops, are
used by the generic event trigger command implementations.  They're
being put into trace.h alongside the other trace_event data structures
and functions, in the expectation that they'll be needed in several
trace_event-related files such as trace_events_trigger.c and
trace_events.c.

The event_command.func() function is meant to be called by the trigger
parsing code in order to add a trigger instance to the corresponding
event.  It essentially coordinates adding a live trigger instance to
the event, and arming the triggering event.

Every event_command func() implementation essentially does the
same thing for any command:

   - choose ops - use the value of param to choose either the plain or
     the count version of event_trigger_ops specific to the command
   - do the register or unregister of those ops
   - associate a filter, if specified, with the triggering event

The reg() and unreg() ops allow command-specific implementations for
event_trigger_op registration and unregistration, and the
get_trigger_ops() op allows command-specific event_trigger_ops
selection to be parameterized.  When a trigger instance is added, the
reg() op essentially adds that trigger to the triggering event and
arms it, while unreg() does the opposite.  The set_filter() function
is used to associate a filter with the trigger - if the command
doesn't specify a set_filter() implementation, the command will ignore
filters.

Each command has an associated trigger_type, which serves double duty,
both as a unique identifier for the command as well as a value that
can be used for setting a trigger mode bit during trigger invocation.
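
Because each trigger_type is a distinct bit, several types can be combined into one mask and tested individually. The sketch below assumes the enum values added by the other patches in this series; collect_types() and mask_has() are hypothetical helpers written for the example, not functions from the patch.

```c
/* Values as they appear across the patches in this series. */
enum event_trigger_type {
	ETT_NONE		= 0,
	ETT_TRACE_ONOFF		= 1 << 0,
	ETT_SNAPSHOT		= 1 << 1,
	ETT_STACKTRACE		= 1 << 2,
	ETT_EVENT_ENABLE	= 1 << 3,
};

/* Fold the types of several commands into one bit mask. */
static unsigned int collect_types(const enum event_trigger_type *types, int n)
{
	unsigned int mask = ETT_NONE;

	for (int i = 0; i < n; i++)
		mask |= (unsigned int)types[i];
	return mask;
}

/* Test whether a given command type is present in the mask. */
static int mask_has(unsigned int mask, enum event_trigger_type type)
{
	return (mask & (unsigned int)type) != 0;
}
```

A mask built from ETT_SNAPSHOT and ETT_EVENT_ENABLE has bits 1 and 3 set, so it reports the snapshot type as present and the stacktrace type as absent.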

The signature of func() adds a pointer to the event_command struct,
used to invoke those functions, along with a command_data param that
can be passed to the reg/unreg functions.  This allows func()
implementations to use command-specific blobs and supports code
re-use.

The event_trigger_ops.func() command corresponds to the trigger 'probe'
function that gets called when the triggering event is actually
invoked.  The other 

[PATCH v8 03/10] tracing: Add 'traceon' and 'traceoff' event trigger commands

2013-09-02 Thread Tom Zanussi
Add 'traceon' and 'traceoff' event_command commands.  traceon and
traceoff event triggers are added by the user via these commands in a
similar way and using practically the same syntax as the analogous
'traceon' and 'traceoff' ftrace function commands, but instead of
writing to the set_ftrace_filter file, the traceon and traceoff
triggers are written to the per-event 'trigger' files:

echo 'traceon' > .../tracing/events/somesys/someevent/trigger
echo 'traceoff' > .../tracing/events/somesys/someevent/trigger

The above commands will turn tracing on or off whenever someevent is
hit.

This also adds a 'count' version that limits the number of times the
command will be invoked:

echo 'traceon:N' > .../tracing/events/somesys/someevent/trigger
echo 'traceoff:N' > .../tracing/events/somesys/someevent/trigger

Where N is the number of times the command will be invoked.

The above commands will turn tracing on or off whenever someevent
is hit, but only N times.

Some common register/unregister_trigger() implementations of the
event_command reg()/unreg() callbacks are also provided, which add
trigger instances to, and remove them from, the per-event list of
triggers and arm/disarm them as appropriate.  event_trigger_callback() is a
general-purpose event_command func() implementation that orchestrates
command parsing and registration for most normal commands.

Most event commands will use these, but some will override and
possibly reuse them.

The event_trigger_init(), event_trigger_free(), and
event_trigger_print() functions are meant to be common implementations
of the event_trigger_ops init(), free(), and print() ops,
respectively.

Most trigger_ops implementations will use these, but some will
override and possibly reuse them.

Signed-off-by: Tom Zanussi 
---
 include/linux/ftrace_event.h|   1 +
 kernel/trace/trace_events_trigger.c | 436 
 2 files changed, 437 insertions(+)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 34ae1d4..a14650b 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -318,6 +318,7 @@ struct ftrace_event_file {
 
 enum event_trigger_type {
ETT_NONE= (0),
+   ETT_TRACE_ONOFF = (1 << 0),
 };
 
 extern void destroy_preds(struct ftrace_event_call *call);
diff --git a/kernel/trace/trace_events_trigger.c 
b/kernel/trace/trace_events_trigger.c
index 85319cf..5388d55 100644
--- a/kernel/trace/trace_events_trigger.c
+++ b/kernel/trace/trace_events_trigger.c
@@ -28,6 +28,13 @@
 static LIST_HEAD(trigger_commands);
 static DEFINE_MUTEX(trigger_cmd_mutex);
 
+static void
+trigger_data_free(struct event_trigger_data *data)
+{
+   synchronize_sched(); /* make sure current triggers exit before free */
+   kfree(data);
+}
+
 void event_triggers_call(struct ftrace_event_file *file)
 {
struct event_trigger_data *data;
@@ -215,6 +222,121 @@ const struct file_operations event_trigger_fops = {
.release = event_trigger_release,
 };
 
+/*
+ * Currently we only register event commands from __init, so mark this
+ * __init too.
+ */
+static __init int register_event_command(struct event_command *cmd,
+struct list_head *cmd_list,
+struct mutex *cmd_list_mutex)
+{
+   struct event_command *p;
+   int ret = 0;
+
+   mutex_lock(cmd_list_mutex);
+   list_for_each_entry(p, cmd_list, list) {
+   if (strcmp(cmd->name, p->name) == 0) {
+   ret = -EBUSY;
+   goto out_unlock;
+   }
+   }
+   list_add(&cmd->list, cmd_list);
+ out_unlock:
+   mutex_unlock(cmd_list_mutex);
+
+   return ret;
+}
+
+/*
+ * Currently we only unregister event commands from __init, so mark
+ * this __init too.
+ */
+static __init int unregister_event_command(struct event_command *cmd,
+  struct list_head *cmd_list,
+  struct mutex *cmd_list_mutex)
+{
+   struct event_command *p, *n;
+   int ret = -ENODEV;
+
+   mutex_lock(cmd_list_mutex);
+   list_for_each_entry_safe(p, n, cmd_list, list) {
+   if (strcmp(cmd->name, p->name) == 0) {
+   ret = 0;
+   list_del_init(&p->list);
+   goto out_unlock;
+   }
+   }
+ out_unlock:
+   mutex_unlock(cmd_list_mutex);
+
+   return ret;
+}
+
+/**
+ * event_trigger_print - generic event_trigger_ops @print implementation
+ *
+ * Common implementation for event triggers to print themselves.
+ *
+ * Usually wrapped by a function that simply sets the @name of the
+ * trigger command and then invokes this.
+ */
+static int
+event_trigger_print(const char *name, struct seq_file *m,
+   void *data, char *filter_str)
+{
+   long count = (long)data;
+
+   seq_printf(m, "%s", name);

[PATCH 1/2] audit: fix soft lockups due to loop in audit_log_start() when audit_backlog_limit exceeded

2013-09-02 Thread Chuck Anderson
audit: fix softlockups due to loop in audit_log_start() when 
audit_backlog_limit exceeded


author: Dan Duval 

This patch fixes a bug in kernel/audit that can cause many soft lockups
and prevent the boot of a large memory 3.8 system:

  BUG: soft lockup - CPU#66 stuck for 22s! [udevd:9559]
  RIP: 0010:[]  [] 
audit_log_start+0xe6/0x350

  Call Trace:
   [] ? try_to_wake_up+0x2d0/0x2d0
   [] audit_log_exit+0x3f/0x590
   [] __audit_syscall_exit+0x28d/0x2c0
   [] sysret_audit+0x17/0x21

audit_log_start() will call wait_for_auditd() to delay returning an
audit_buffer if there are too many SKBs on audit_skb_queue.
wait_for_auditd() puts itself on the audit_backlog_wait queue and
sleeps for sleep_time jiffies or until it is (normally) woken when
kauditd takes an SKB off of audit_skb_queue.  wait_for_auditd() returns
to audit_log_start() which checks to see if audit_skb_queue still has
too many SKBs.  If there are still too many, audit_log_start() will
continue to call wait_for_auditd() in a loop until
audit_backlog_wait_time has passed.  audit_log_start() will then
complain ("backlog limit exceeded"); set audit_backlog_wait_time
to zero so other waiters will fall out of the loop when woken up; wake
up any waiters in wait_for_auditd(); return NULL which tells the caller
that an audit_buffer could not be allocated.

A bug in audit_log_start() prevents it from breaking out of the
wait_for_auditd() loop when audit_backlog_wait_time has passed.
Instead, it will loop in the audit_skb_queue-is-too-long while-loop
eventually causing a soft lockup.  There can (and most likely will)
be multiple threads looping.  The fix is to continue in the while-loop
only if sleep_time was greater than 0 (audit_backlog_wait_time has not
passed).
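
The loop condition described above can be modelled in user space. This is a sketch under stated assumptions, not the patched kernel function: struct sim, sim_wait() and sim_log_start() are hypothetical, the clock is simulated instead of read from jiffies, and the queue never drains so that the deadline path is exercised.

```c
/* All names are hypothetical; time is simulated rather than read from jiffies. */
struct sim {
	long now;	/* simulated clock, in ticks */
	int queue_len;	/* backlog length; never drains in this model */
};

/* Stand-in for wait_for_auditd(): sleeping just advances the clock. */
static void sim_wait(struct sim *s, long sleep_time)
{
	s->now += sleep_time;
}

/* Returns 1 if a buffer may be allocated, 0 if the caller gave up. */
static int sim_log_start(struct sim *s, int limit, long wait_time)
{
	long start = s->now;

	while (limit && s->queue_len > limit) {
		long sleep_time = start + wait_time - s->now;

		if (sleep_time > 0) {
			sim_wait(s, sleep_time);
			continue;	/* loop again only after a real sleep */
		}
		return 0;		/* deadline passed: stop looping, give up */
	}
	return 1;
}
```

The key line is the `continue` guarded by `sleep_time > 0`: once the deadline has passed, the loop is left instead of spinning on a queue that is still too long.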

Another bug in audit_log_start() prevents audit_backlog_wait_time from
working as expected.  audit_backlog_wait_time is normally the time
period that audit_log_start() will wait for the number of SKBs on
audit_skb_queue to fall below the too-many threshold.  If
audit_backlog_wait_time passes, audit_log_start() will set it to
audit_backlog_wait_overflow, which is zero, and wake up any waiters in
wait_for_auditd().  audit_backlog_wait_time is now zero so the waiters
will fall out of the loop when they return to audit_log_start().  That
is expected behavior.  But audit_backlog_wait_time is not reset to its
initial value when audit_skb_queue's length is no longer too long.
Subsequent calls to audit_log_start() when audit_skb_queue is too long
will not wait in wait_for_auditd(), instead returning NULL.  The fix
is to set audit_backlog_wait_time to its initial value when
audit_skb_queue passes the size test, potentially resetting it.
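
The reset described above amounts to a small state machine, sketched here with hypothetical names (BACKLOG_WAIT_TIME, on_backlog_overflow(), on_backlog_ok(), would_wait()) and a simulated tick count in place of 60 * HZ.

```c
#define BACKLOG_WAIT_TIME 60	/* hypothetical initial value, in simulated ticks */

static long backlog_wait_time = BACKLOG_WAIT_TIME;

/* The deadline passed with the queue still over the limit. */
static void on_backlog_overflow(void)
{
	backlog_wait_time = 0;	/* make other waiters fall out of their loops */
}

/* The queue passed the size test: restore the initial wait time. */
static void on_backlog_ok(void)
{
	backlog_wait_time = BACKLOG_WAIT_TIME;
}

/* Would a caller facing a too-long queue wait, or fail immediately? */
static int would_wait(void)
{
	return backlog_wait_time > 0;
}
```

Without the call to on_backlog_ok(), the wait time would stay at zero after the first overflow, and every later caller that met a long queue would fail immediately instead of waiting.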

A third change makes both audit_log_start() and wait_for_auditd()
use the same limit value for the length of audit_skb_queue.  It isn't
necessary today, but (1) assumptions may change in the future and (2)
it is one less oddity for a reader to have to verify.

Signed-off-by: Dan Duval 
Signed-off-by: Chuck Anderson 
---
 kernel/audit.c |   28 
 1 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 91e53d0..9a78dde 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -103,9 +103,11 @@ static int audit_rate_limit;

 /* Number of outstanding audit_buffers allowed. */
 static int audit_backlog_limit = 64;
-static int audit_backlog_wait_time = 60 * HZ;
 static int audit_backlog_wait_overflow = 0;

+#define AUDIT_BACKLOG_WAIT_TIME (60 * HZ)
+static int audit_backlog_wait_time = AUDIT_BACKLOG_WAIT_TIME;
+
 /* The identity of the user shutting down the audit system. */
 kuid_t audit_sig_uid = INVALID_UID;
 pid_t  audit_sig_pid = -1;
@@ -1053,14 +1055,14 @@ static inline void audit_get_stamp(struct audit_context *ctx,
 /*
  * Wait for auditd to drain the queue a little
  */
-static void wait_for_auditd(unsigned long sleep_time)
+static void wait_for_auditd(unsigned long sleep_time, int limit)
 {
 	DECLARE_WAITQUEUE(wait, current);
 	set_current_state(TASK_UNINTERRUPTIBLE);
 	add_wait_queue(&audit_backlog_wait, &wait);

 	if (audit_backlog_limit &&
-	    skb_queue_len(&audit_skb_queue) > audit_backlog_limit)
+	    skb_queue_len(&audit_skb_queue) > limit)
 		schedule_timeout(sleep_time);

 	__set_current_state(TASK_RUNNING);
@@ -1095,8 +1097,8 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
 	struct audit_buffer *ab = NULL;
 	struct timespec t;
 	unsigned int uninitialized_var(serial);
-	int reserve;
 	unsigned long timeout_start = jiffies;
+	int limit;

 	if (audit_initialized != AUDIT_INITIALIZED)
 		return NULL;
@@ -1104,22 +1106,22 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
 	if (unlikely(audit_filter_type(type)))
 		return NULL;

-   if (gfp_mask & 

[PATCH 0/2] audit: fix soft lockups and udevd errors when audit is overrun

2013-09-02 Thread Chuck Anderson

The two patches that follow in separate emails resolve soft lockups and
udevd reported errors that prevented a large memory 3.8 system from booting.

The patches are based on 3.11-rc7.

I believe it is the same issue recently posted as:

  [RFC] audit: avoid soft lockup in audit_log_start()
  https://lkml.org/lkml/2013/8/28/626

The first patch:

  audit: fix soft lockups due to loop in audit_log_start() when audit_backlog_limit exceeded


fixes a bug in kernel/audit that caused many soft lockups during boot:

  BUG: soft lockup - CPU#66 stuck for 22s! [udevd:9559]
  RIP: 0010:[]  [] audit_log_start+0xe6/0x350

  Call Trace:
   [] ? try_to_wake_up+0x2d0/0x2d0
   [] audit_log_exit+0x3f/0x590
   [] __audit_syscall_exit+0x28d/0x2c0
   [] sysret_audit+0x17/0x21

The second patch:

  audit: Two efficiency fixes for audit mechanism

prevents these and similar error messages repeated often during boot:

  udevd[876]: worker [887] unexpectedly returned with status 0x0100
  udevd[876]: worker [887] failed while handling '/devices/pci:00/:00:03.0/:40:00.0'

  udevd[876]: worker [880] unexpectedly returned with status 0x0100
  udevd[876]: worker [880] failed while handling '/devices/LNXSYSTM:00/LNXPWRBN:00/input/input1/event1'

  udevadm settle - timeout of 180 seconds reached, the event queue contains:
    /sys/devices/LNXSYSTM:00/LNXPWRBN:00/input/input1/event1 (3995)
    /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/INT3F0D:00 (4034)
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/INT3F0D:00 (4034)

  audit: audit_backlog=258 > audit_backlog_limit=256
  audit: audit_lost=1 audit_rate_limit=0 audit_backlog_limit=256
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v4:No Change] xHCI:Fixing xhci_readl definition and function call

2013-09-02 Thread Kumar Gaurav

I tried applying this patch on linux-next and it applies cleanly.
I used:
  git apply --apply



On Saturday 31 August 2013 11:02 PM, Kumar Gaurav wrote:

This patch redefines the function xhci_readl. xhci_readl doesn't use its
xhci_hcd argument, hence there is no need to keep it in the function
arguments.

Redefining this function breaks the other functions which call it.
This patch also corrects those calls in the xhci driver.

Signed-off-by: Kumar Gaurav 
---
  drivers/usb/host/xhci-dbg.c  |   36 -
  drivers/usb/host/xhci-hub.c  |   72 -
  drivers/usb/host/xhci-mem.c  |   20 -
  drivers/usb/host/xhci-ring.c |   12 +++---
  drivers/usb/host/xhci.c  |   92 +-
  drivers/usb/host/xhci.h  |3 +-
  6 files changed, 117 insertions(+), 118 deletions(-)

diff --git a/drivers/usb/host/xhci-dbg.c b/drivers/usb/host/xhci-dbg.c
index 73503a8..229e312 100644
--- a/drivers/usb/host/xhci-dbg.c
+++ b/drivers/usb/host/xhci-dbg.c
@@ -32,7 +32,7 @@ void xhci_dbg_regs(struct xhci_hcd *xhci)

 	xhci_dbg(xhci, "// xHCI capability registers at %p:\n",
 			xhci->cap_regs);
-	temp = xhci_readl(xhci, &xhci->cap_regs->hc_capbase);
+	temp = xhci_readl(&xhci->cap_regs->hc_capbase);
 	xhci_dbg(xhci, "// @%p = 0x%x (CAPLENGTH AND HCIVERSION)\n",
 			&xhci->cap_regs->hc_capbase, temp);
 	xhci_dbg(xhci, "//   CAPLENGTH: 0x%x\n",
@@ -44,13 +44,13 @@ void xhci_dbg_regs(struct xhci_hcd *xhci)

 	xhci_dbg(xhci, "// xHCI operational registers at %p:\n", xhci->op_regs);

-	temp = xhci_readl(xhci, &xhci->cap_regs->run_regs_off);
+	temp = xhci_readl(&xhci->cap_regs->run_regs_off);
 	xhci_dbg(xhci, "// @%p = 0x%x RTSOFF\n",
 			&xhci->cap_regs->run_regs_off,
 			(unsigned int) temp & RTSOFF_MASK);
 	xhci_dbg(xhci, "// xHCI runtime registers at %p:\n", xhci->run_regs);

-	temp = xhci_readl(xhci, &xhci->cap_regs->db_off);
+	temp = xhci_readl(&xhci->cap_regs->db_off);
 	xhci_dbg(xhci, "// @%p = 0x%x DBOFF\n", &xhci->cap_regs->db_off, temp);
 	xhci_dbg(xhci, "// Doorbell array at %p:\n", xhci->dba);
 }
@@ -61,7 +61,7 @@ static void xhci_print_cap_regs(struct xhci_hcd *xhci)

 	xhci_dbg(xhci, "xHCI capability registers at %p:\n", xhci->cap_regs);

-	temp = xhci_readl(xhci, &xhci->cap_regs->hc_capbase);
+	temp = xhci_readl(&xhci->cap_regs->hc_capbase);
 	xhci_dbg(xhci, "CAPLENGTH AND HCIVERSION 0x%x:\n",
 			(unsigned int) temp);
 	xhci_dbg(xhci, "CAPLENGTH: 0x%x\n",
@@ -69,7 +69,7 @@ static void xhci_print_cap_regs(struct xhci_hcd *xhci)
 	xhci_dbg(xhci, "HCIVERSION: 0x%x\n",
 			(unsigned int) HC_VERSION(temp));

-	temp = xhci_readl(xhci, &xhci->cap_regs->hcs_params1);
+	temp = xhci_readl(&xhci->cap_regs->hcs_params1);
 	xhci_dbg(xhci, "HCSPARAMS 1: 0x%x\n",
 			(unsigned int) temp);
 	xhci_dbg(xhci, "  Max device slots: %u\n",
@@ -79,7 +79,7 @@ static void xhci_print_cap_regs(struct xhci_hcd *xhci)
 	xhci_dbg(xhci, "  Max ports: %u\n",
 			(unsigned int) HCS_MAX_PORTS(temp));

-	temp = xhci_readl(xhci, &xhci->cap_regs->hcs_params2);
+	temp = xhci_readl(&xhci->cap_regs->hcs_params2);
 	xhci_dbg(xhci, "HCSPARAMS 2: 0x%x\n",
 			(unsigned int) temp);
 	xhci_dbg(xhci, "  Isoc scheduling threshold: %u\n",
@@ -87,7 +87,7 @@ static void xhci_print_cap_regs(struct xhci_hcd *xhci)
 	xhci_dbg(xhci, "  Maximum allowed segments in event ring: %u\n",
 			(unsigned int) HCS_ERST_MAX(temp));

-	temp = xhci_readl(xhci, &xhci->cap_regs->hcs_params3);
+	temp = xhci_readl(&xhci->cap_regs->hcs_params3);
 	xhci_dbg(xhci, "HCSPARAMS 3 0x%x:\n",
 			(unsigned int) temp);
 	xhci_dbg(xhci, "  Worst case U1 device exit latency: %u\n",
@@ -95,14 +95,14 @@ static void xhci_print_cap_regs(struct xhci_hcd *xhci)
 	xhci_dbg(xhci, "  Worst case U2 device exit latency: %u\n",
 			(unsigned int) HCS_U2_LATENCY(temp));

-	temp = xhci_readl(xhci, &xhci->cap_regs->hcc_params);
+	temp = xhci_readl(&xhci->cap_regs->hcc_params);
 	xhci_dbg(xhci, "HCC PARAMS 0x%x:\n", (unsigned int) temp);
 	xhci_dbg(xhci, "  HC generates %s bit addresses\n",
 			HCC_64BIT_ADDR(temp) ? "64" : "32");
 	/* FIXME */
 	xhci_dbg(xhci, "  FIXME: more HCCPARAMS debugging\n");

-	temp = xhci_readl(xhci, &xhci->cap_regs->run_regs_off);
+	temp = xhci_readl(&xhci->cap_regs->run_regs_off);
 	xhci_dbg(xhci, "RTSOFF 0x%x:\n", temp & RTSOFF_MASK);
 }

@@ -110,7 +110,7 @@ static void xhci_print_command_reg(struct xhci_hcd *xhci)
 {
 	u32 temp;

-	temp = xhci_readl(xhci, &xhci->op_regs->command);
+	temp = xhci_readl(&xhci->op_regs->command);
 	xhci_dbg(xhci, "USBCMD 0x%x:\n", temp);
 	xhci_dbg(xhci, "  HC is %s\n",
   

Re: [guv v2 04/31] net: Replace __get_cpu_var uses

2013-09-02 Thread David Miller
From: David Howells 
Date: Mon, 02 Sep 2013 22:35:06 +0100

> Would it be possible to use __thread annotations for per-CPU
> variables, I wonder?

Paul Mackerras tried it on powerpc and you can't do it.

The problem is that there is no way to tell the compiler that sched()
and similar (potentially) change the thread pointer base.

It really will cache pre-computed __thread pointer calculations across
sched().
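
A rough userspace model of the hazard, for illustration only. Everything here is hypothetical: `fake_sched()` merely changes an index and `counters[]` stands in for per-CPU data. A real compiler would do the caching on its own for __thread variables, whereas this sketch caches the pointer by hand to show the effect.

```c
#include <assert.h>
#include <string.h>

#define NR_MODEL_CPUS 2

static int counters[NR_MODEL_CPUS];	/* stands in for a per-CPU variable */
static int current_cpu;			/* stands in for the thread pointer base */

static void reset_model(void)
{
	memset(counters, 0, sizeof(counters));
	current_cpu = 0;
}

/* Address of "this CPU's" counter, derived from the current base. */
static int *this_cpu_counter(void)
{
	return &counters[current_cpu];
}

/* Models a reschedule that moves the task to another CPU. */
static void fake_sched(int new_cpu)
{
	assert(new_cpu >= 0 && new_cpu < NR_MODEL_CPUS);
	current_cpu = new_cpu;
}

/* The address is computed once and reused across the reschedule. */
static void bump_twice_cached(int new_cpu)
{
	int *p = this_cpu_counter();

	(*p)++;
	fake_sched(new_cpu);
	(*p)++;		/* stale: still points at the old CPU's slot */
}

/* The address is recomputed after the reschedule. */
static void bump_twice_recomputed(int new_cpu)
{
	(*this_cpu_counter())++;
	fake_sched(new_cpu);
	(*this_cpu_counter())++;
}
```

In the cached variant both increments land in the slot of the CPU the task started on, which is the kind of stale access described above.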


Re: [PATCH v2 2/4] mm/hwpoison: fix miss catch transparent huge page

2013-09-02 Thread Chen Gong
On Tue, Sep 03, 2013 at 07:36:44AM +0800, Wanpeng Li wrote:
> Date: Tue,  3 Sep 2013 07:36:44 +0800
> From: Wanpeng Li 
> To: Andrew Morton 
> Cc: Andi Kleen , Fengguang Wu
>  , Naoya Horiguchi ,
>  Tony Luck , gong.c...@linux.intel.com,
>  linux...@kvack.org, linux-kernel@vger.kernel.org, Wanpeng Li
>  
> Subject: [PATCH v2 2/4] mm/hwpoison: fix miss catch transparent huge page 
> X-Mailer: git-send-email 1.7.5.4
> 
> Changelog:
>  *v1 -> v2: reverse PageTransHuge(page) && !PageHuge(page) check 
> 
> PageTransHuge() can't guarantee the page is transparent huge page since it 
> return true for both transparent huge and hugetlbfs pages. This patch fix 
> it by check the page is also !hugetlbfs page.
> 
> Before patch:
> 
> [  121.571128] Injecting memory failure at pfn 23a200
> [  121.571141] MCE 0x23a200: huge page recovery: Delayed
> [  140.355100] MCE: Memory failure is now running on 0x23a200
> 
> After patch:
> 
> [   94.290793] Injecting memory failure at pfn 23a000
> [   94.290800] MCE 0x23a000: huge page recovery: Delayed
> [  105.722303] MCE: Software-unpoisoned page 0x23a000
> 
> Signed-off-by: Wanpeng Li 
> ---
>  mm/memory-failure.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index e28ee77..b114570 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1349,7 +1349,7 @@ int unpoison_memory(unsigned long pfn)
>* worked by memory_failure() and the page lock is not held yet.
>* In such case, we yield to memory_failure() and make unpoison fail.
>*/
> - if (PageTransHuge(page)) {
> + if (!PageHuge(page) && PageTransHuge(page)) {
>   pr_info("MCE: Memory failure is now running on %#lx\n", pfn);
>   return 0;
>   }

Not sure which git tree should be used to apply this patch series? I assume
this patch series follows this link: https://lkml.org/lkml/2013/8/26/76.

In unpoison_memory we already have

	if (PageHuge(page)) {
		...
		return 0;
	}

so it looks like this patch is redundant.
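
Whether the extra !PageHuge() test is redundant depends on whether the existing PageHuge() return runs before the PageTransHuge() test. A minimal sketch that models both orderings, assuming (per the quoted patch description) that PageTransHuge() is true for hugetlbfs pages as well; the struct, enum and function names are hypothetical and none of this is the real mm code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page model: at most one of the two flags is set. */
struct page_model {
	bool hugetlbfs;
	bool thp;
};

static bool page_huge(const struct page_model *p)
{
	return p->hugetlbfs;
}

/* Assumption from the patch description: true for THP and hugetlbfs. */
static bool page_trans_huge(const struct page_model *p)
{
	return p->thp || p->hugetlbfs;
}

enum unpoison_path { PATH_HUGETLBFS_RETURN, PATH_THP_RETURN, PATH_CONTINUE };

/*
 * huge_checked_first: an unconditional PageHuge() return precedes the test.
 * patched: use the "!PageHuge(page) && PageTransHuge(page)" condition.
 */
static enum unpoison_path unpoison_model(const struct page_model *p,
					 bool huge_checked_first, bool patched)
{
	bool thp_hit;

	assert(p != NULL);
	assert(!(p->hugetlbfs && p->thp));

	if (huge_checked_first && page_huge(p))
		return PATH_HUGETLBFS_RETURN;

	thp_hit = patched ? (!page_huge(p) && page_trans_huge(p))
			  : page_trans_huge(p);
	return thp_hit ? PATH_THP_RETURN : PATH_CONTINUE;
}
```

With the early return in place the old and new conditions agree for every page kind; without it they differ only for hugetlbfs pages.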




Re: [PATCH v2 0/8] Drop support for Renesas H8/300 architecture

2013-09-02 Thread Guenter Roeck
On 09/02/2013 07:53 PM, Chen Gang F T wrote:
> Hello Guenter Roeck:
> 
> 
> I don't mind whether I am on the cc list, but please help confirm
> two things:
> 
>    Was the work I did for h8300 just waste and noise on the kernel and
>    related subsystem mailing lists?
> 
>    And was the discussion about h8300 between us also waste and noise on
>    the kernel mailing list?
> 

It raised my awareness of the status of h8300 maintenance,
so I would not see it as noise or waste. I might have suggested
a different target for your efforts, but that is your choice to make,
not mine.

On the code review side, I had suggested that you should not add new
ifdefs into code, much less unnecessary ones. Your counter-argument
was that you wanted to follow the existing coding style in the file
in question. To me, that argument is along the line of "the coding
style in this file is bad, let's do more of it".
That doesn't make much sense to me, so I did not bother to respond.
Setting that aside, it is not up to me to approve or reject your patches.
Whoever does that would be the one you have to convince.

Guenter



Re: [Ksummit-2013-discuss] [PATCH] checkpatch: Add comment about updating Documentation/CodingStyle

2013-09-02 Thread Fengguang Wu
On Mon, Sep 02, 2013 at 08:16:45PM -0700, Josh Triplett wrote:
> On Tue, Sep 03, 2013 at 10:46:40AM +0800, Fengguang Wu wrote:
> > On Mon, Sep 02, 2013 at 06:52:45PM -0700, Joe Perches wrote:
> > > On Mon, 2013-09-02 at 18:34 -0700, Josh Triplett wrote:
> > > > CONFIG_EXPERIMENTAL
> > > > CVS_KEYWORD
> > > 
> > > OK, but 
> [...]
> > Thanks for both of your suggestions! I'll add the commonly agreed ones:
> > 
> > +INVALID_UTF8
> > +LINUX_VERSION_CODE
> > +MISSING_EOF_NEWLINE
> > +HEXADECIMAL_BOOLEAN_TEST
> > +ALLOC_ARRAY_ARGS
> > +CONST_STRUCT
> > +CONSIDER_KSTRTO
> > 
> > And remove the duplicate one (good catch, Josh!)
> > 
> > -KREALLOC_ARG_REUSE
> 
> You missed CONFIG_EXPERIMENTAL and CVS_KEYWORD; see above.

OK, added, thanks!

Cheers,
Fengguang


Re: OCFS2: ocfs2_read_blocks:285 ERROR: block 532737 had the JBD bit set while I was in lock_buffer!

2013-09-02 Thread Jeff Liu
Hello,

It seems like Sunil has fixed a similar issue against ocfs2-1.4
several years ago:
https://oss.oracle.com/git/?p=ocfs2-1.4.git;a=commitdiff_plain;h=2fd250839d0f5073af8d42e97f1db74beb621674;hp=e882faf84930431524f84598caea7d4e9a9529c5
https://oss.oracle.com/git/?p=ocfs2-1.4.git;a=commitdiff_plain;h=eccff85213d4c2762f787d9e7cb1503042ba75b9;hp=edc147473ffd9c03790dc4502b893823f44a9ec4

The old bug ticket for the discussion:
https://oss.oracle.com/bugzilla/show_bug.cgi?id=1235

This fix is specifically for ocfs2-1.4, but Mark once mentioned that
the BUG() there can be removed if we have a good explanation for this
sort of behavior. Is it time to have it in mainline?

Thanks,
-Jeff
On 09/03/2013 04:32 AM, richard -rw- weinberger wrote:

> Hi!
> 
> Today one of my computers crashed with the following panic.
> The machine is heavily using reflinks.
> Looks like it managed to hit a CATCH_BH_JBD_RACES error check.
> 
> <3>[37628.934461] (reflink,512,0):ocfs2_reflink_ioctl:4459 ERROR: status = -17
> <3>[37628.943160] (kworker/u:2,809,1):ocfs2_read_blocks:285 ERROR:
> block 532737 had the JBD bit set while I was in lock_buffer!
> <4>[37628.943169] [ cut here ]
> <2>[37628.944464] kernel BUG at
> /home/rw/work/ssworkstation/maker/_source/kernel/fs/ocfs2/buffer_head_io.c:286!
> <4>[37628.945134] invalid opcode:  [#1] PREEMPT SMP
> <4>[37628.945809] CPU 1
> <4>[37628.945817] Pid: 809, comm: kworker/u:2 Not tainted 3.8.4+ #46
>/
> <4>[37628.947167] RIP: 0010:[]  []
> ocfs2_read_blocks+0x410/0x610
> <4>[37628.947880] RSP: 0018:880234631908  EFLAGS: 00010292
> <4>[37628.948593] RAX: 006d RBX: 0001 RCX:
> 0067
> <4>[37628.949317] RDX: 0048 RSI: 0046 RDI:
> 8214c0dc
> Oops#1 Part3
> <4>[37628.950037] RBP: 880234631988 R08: 000a R09:
> d490
> <4>[37628.950758] R10:  R11: 0004 R12:
> 00082101
> <4>[37628.951477] R13: 880233147980 R14:  R15:
> 880216ca2208
> <4>[37628.952201] FS:  ()
> GS:88023e28() knlGS:
> <4>[37628.952936] CS:  0010 DS:  ES:  CR0: 80050033
> <4>[37628.953669] CR2: 7fe7ea29fc62 CR3: 06c0b000 CR4:
> 000407e0
> <4>[37628.954421] DR0:  DR1:  DR2:
> 
> <4>[37628.955176] DR3:  DR6: 0ff0 DR7:
> 0400
> <4>[37628.955925] Process kworker/u:2 (pid: 809, threadinfo
> 88023463, task 880234ba86e0)
> <4>[37628.956689] Stack:
> <4>[37628.957461]  00082101 ea0008900880 880234631948
> 1000
> <4>[37628.958250]  00082102 00082102 81295eb0
> 
> Oops#1 Part2
> <4>[37628.959044]  88023428c000 0001 
> 8802346319f0
> <4>[37628.959844] Call Trace:
> <4>[37628.960639]  [] ? ocfs2_read_refcount_block+0x50/0x50
> <4>[37628.961453]  [] ocfs2_read_refcount_block+0x2b/0x50
> <4>[37628.962249]  [] ocfs2_get_refcount_tree+0xa7/0x350
> <4>[37628.963042]  [] ? __find_get_block+0xa1/0x1e0
> <4>[37628.963835]  [] ocfs2_lock_refcount_tree+0x48/0x4f0
> <4>[37628.964645]  [] ocfs2_remove_btree_range+0xab/0xb30
> <4>[37628.965452]  [] ocfs2_commit_truncate+0x139/0x550
> <4>[37628.966247]  [] ? ocfs2_extend_trans+0x1c0/0x1c0
> <4>[37628.967049]  [] ocfs2_evict_inode+0x89e/0x2530
> <4>[37628.967851]  [] ? __inode_wait_for_writeback+0x68/0xc0
> <4>[37628.968645]  [] evict+0xaf/0x1b0
> <4>[37628.969432]  [] iput+0x105/0x1a0
> Oops#1 Part1
> <4>[37628.970213]  [] 
> __ocfs2_drop_dl_inodes.isra.14+0x47/0x80
> <4>[37628.971002]  [] ocfs2_drop_dl_inodes+0x25/0xa0
> <4>[37628.971788]  [] process_one_work+0x147/0x470
> <4>[37628.972580]  [] worker_thread+0x14d/0x3f0
> <4>[37628.973381]  [] ? rescuer_thread+0x240/0x240
> <4>[37628.974175]  [] kthread+0xbb/0xc0
> <4>[37628.974960]  [] ? __kthread_parkme+0x80/0x80
> <4>[37628.975747]  [] ret_from_fork+0x7c/0xb0
> <4>[37628.976529]  [] ? __kthread_parkme+0x80/0x80
> <4>[37628.977307] Code: 0f 0b 4c 89 ff e8 11 0b f0 ff e9 f2 fc ff ff
> 48 b8 00 00 00 00 00 00 00 10 48 85 05 2b 58 9d 00 74 09 48 85 05 c2
> 79 f4 00 74 02 <0f> 0b 65 48 8b 14 25 70 b8 00 00 48 8d 82 28 e0 ff ff
> 4d 8b 67
> <1>[37628.979053] RIP  [] ocfs2_read_blocks+0x410/0x610
> <4>[37628.979893]  RSP 
> <4>[37628.983420] ---[ end trace c03a48f44cf30d5e ]---
> 




Re: [Ksummit-2013-discuss] [PATCH] checkpatch: Add comment about updating Documentation/CodingStyle

2013-09-02 Thread Josh Triplett
On Tue, Sep 03, 2013 at 10:46:40AM +0800, Fengguang Wu wrote:
> On Mon, Sep 02, 2013 at 06:52:45PM -0700, Joe Perches wrote:
> > On Mon, 2013-09-02 at 18:34 -0700, Josh Triplett wrote:
> > > CONFIG_EXPERIMENTAL
> > > CVS_KEYWORD
> > 
> > OK, but 
[...]
> Thanks for both of your suggestions! I'll add the commonly agreed ones:
> 
> +INVALID_UTF8
> +LINUX_VERSION_CODE
> +MISSING_EOF_NEWLINE
> +HEXADECIMAL_BOOLEAN_TEST
> +ALLOC_ARRAY_ARGS
> +CONST_STRUCT
> +CONSIDER_KSTRTO
> 
> And remove the duplicate one (good catch, Josh!)
> 
> -KREALLOC_ARG_REUSE

You missed CONFIG_EXPERIMENTAL and CVS_KEYWORD; see above.

- Josh Triplett


Re: [PATCH] iommu: WARN_ON when removing a device with no iommu_group associated

2013-09-02 Thread Wei Yang
Any more comments? Or is this patch not appropriate?

On Thu, Aug 22, 2013 at 09:33:27PM -0600, Alex Williamson wrote:
>[+cc iommu]
>
>On Fri, 2013-08-23 at 09:55 +0800, Wei Yang wrote:
>> When removing a device from the system, the iommu_group driver will try to
>> disconnect it from its group. In some cases, however, a device may not be
>> associated with any iommu_group, for example when there is not enough DMA
>> address space.
>> 
>> The generic bus notifier checks dev->iommu_group before calling
>> iommu_group_remove_device(). But developers may call
>> iommu_group_remove_device() from a different code path without that check.
>> For devices with dev->iommu_group set to NULL, the kernel will crash.
>> 
>> This patch gives a warning and returns when trying to remove a device from
>> an iommu_group with dev->iommu_group set to NULL. This helps to flag the
>> bad behavior and also guards the kernel.
>> 
>> Signed-off-by: Wei Yang 
>
>Acked-by: Alex Williamson 
>
>> ---
>>  drivers/iommu/iommu.c |3 +++
>>  1 files changed, 3 insertions(+), 0 deletions(-)
>> 
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index fbe9ca7..43396f0 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -379,6 +379,9 @@ void iommu_group_remove_device(struct device *dev)
>>  struct iommu_group *group = dev->iommu_group;
>>  struct iommu_device *tmp_device, *device = NULL;
>>  
>> +if (WARN_ON(!group))
>> +return;
>> +
>>  /* Pre-notify listeners that a device is being removed. */
>>  blocking_notifier_call_chain(&group->notifier,
>>   IOMMU_GROUP_NOTIFY_DEL_DEVICE, dev);
>
>
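
The guard in the quoted patch can be sketched in userspace as follows. This is a minimal model, assuming a `warn_on()` helper that reports and returns its condition the way the kernel's WARN_ON() does; the struct and function names are hypothetical, not the iommu API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for struct iommu_group and struct device. */
struct group_model {
	int nr_devices;
};

struct device_model {
	struct group_model *group;
};

static int warn_count;

/* Reports the condition and returns it, so it can sit inside an if (). */
static bool warn_on(bool cond, const char *what)
{
	if (cond) {
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

static void group_remove_device(struct device_model *dev)
{
	struct group_model *group;

	assert(dev != NULL);
	group = dev->group;

	/* Without this guard, the dereference below would crash on NULL. */
	if (warn_on(group == NULL, "device has no group"))
		return;

	group->nr_devices--;
	dev->group = NULL;
}
```

A caller that reaches the function with no group attached gets a warning instead of a NULL dereference, while the normal path is unchanged.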

-- 
Richard Yang
Help you, Help me



[PATCH v2 3/3] mm/vmalloc: move VM_UNINITIALIZED just before show_numa_info

2013-09-02 Thread Wanpeng Li
The VM_UNINITIALIZED/VM_UNLIST flag introduced by commit f5252e00 (mm: avoid
null pointer access in vm_struct via /proc/vmallocinfo) is used to avoid
accessing the pages field while its pages are still unallocated when
show_numa_info() is called. This patch moves the check to just before
show_numa_info() so that the other fields can still be dumped via
/proc/vmallocinfo.

Signed-off-by: Wanpeng Li 
---
 mm/vmalloc.c |   10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e3ec8b4..c4720cd 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2590,11 +2590,6 @@ static int s_show(struct seq_file *m, void *p)

 	v = va->vm;

-	/* Pair with smp_wmb() in clear_vm_uninitialized_flag() */
-	smp_rmb();
-	if (v->flags & VM_UNINITIALIZED)
-		return 0;
-
 	seq_printf(m, "0x%pK-0x%pK %7ld",
 		v->addr, v->addr + v->size, v->size);

@@ -2622,6 +2617,11 @@ static int s_show(struct seq_file *m, void *p)
 	if (v->flags & VM_VPAGES)
 		seq_printf(m, " vpages");

+	/* Pair with smp_wmb() in clear_vm_uninitialized_flag() */
+	smp_rmb();
+	if (v->flags & VM_UNINITIALIZED)
+		return 0;
+
 	show_numa_info(m, v);
 	seq_putc(m, '\n');
 	return 0;
-- 
1.7.5.4
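
The effect of moving the check can be illustrated with a small userspace model. It is a sketch only: `struct area_model`, the two `show_*()` functions and the `pages=` field are hypothetical, snprintf() stands in for seq_printf(), and memory barriers and the trailing newline are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the vm_struct fields that s_show() prints. */
struct area_model {
	unsigned long addr;
	unsigned long size;
	bool uninitialized;	/* models VM_UNINITIALIZED */
	int nr_pages;		/* only valid once initialized */
};

/* Old placement: bail out before printing anything. */
static int show_check_first(char *buf, size_t len, const struct area_model *v)
{
	assert(buf != NULL && len > 0 && v != NULL);
	buf[0] = '\0';
	if (v->uninitialized)
		return 0;
	return snprintf(buf, len, "0x%lx-0x%lx %7lu pages=%d",
			v->addr, v->addr + v->size, v->size, v->nr_pages);
}

/* New placement: print the basic fields, skip only the page info. */
static int show_check_late(char *buf, size_t len, const struct area_model *v)
{
	int n, m;

	assert(buf != NULL && len > 0 && v != NULL);
	n = snprintf(buf, len, "0x%lx-0x%lx %7lu",
		     v->addr, v->addr + v->size, v->size);
	if (n < 0 || (size_t)n >= len || v->uninitialized)
		return n;

	m = snprintf(buf + n, len - (size_t)n, " pages=%d", v->nr_pages);
	return m < 0 ? m : n + m;
}
```

For an uninitialized area the first variant prints nothing, while the second still reports the address range and size.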



[PATCH v2 2/3] mm/vmalloc: don't warning vmalloc allocation failure twice

2013-09-02 Thread Wanpeng Li
Don't warn twice, once in __vmalloc_area_node and again in
__vmalloc_node_range, when the allocation in __vmalloc_area_node fails.

Signed-off-by: Wanpeng Li 
---
 mm/vmalloc.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index d78d117..e3ec8b4 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1635,7 +1635,7 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,

 	addr = __vmalloc_area_node(area, gfp_mask, prot, node, caller);
 	if (!addr)
-		goto fail;
+		return NULL;

 	/*
 	 * In this function, newly allocated vm_struct has VM_UNINITIALIZED
-- 
1.7.5.4
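
The double warning can be reproduced in a small userspace model. This is a sketch under assumptions: the function names are hypothetical stand-ins for __vmalloc_area_node() and __vmalloc_node_range(), malloc() replaces the real allocator, and `force_fail` exists only to trigger the failure path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

static int warn_count;

/* Models the allocation-failure warning. */
static void warn_alloc_failed_model(const char *who)
{
	warn_count++;
	fprintf(stderr, "%s: allocation failure\n", who);
}

/* Inner allocator: already warns on its own failure path. */
static void *area_node_model(size_t size, int force_fail)
{
	void *p;

	assert(size > 0);
	p = force_fail ? NULL : malloc(size);
	if (!p)
		warn_alloc_failed_model("area_node_model");
	return p;
}

/* Before the patch: the outer failure label warns a second time. */
static void *node_range_old(size_t size, int force_fail)
{
	void *addr = area_node_model(size, force_fail);

	if (!addr)
		goto fail;
	return addr;
fail:
	warn_alloc_failed_model("node_range_old");
	return NULL;
}

/* After the patch: return directly, the inner warning is enough. */
static void *node_range_new(size_t size, int force_fail)
{
	void *addr = area_node_model(size, force_fail);

	if (!addr)
		return NULL;
	return addr;
}
```

On a forced failure the old variant warns twice and the new one once; the success path is identical in both.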



[PATCH v2 1/3] mm/vmalloc: don't set area->caller twice

2013-09-02 Thread Wanpeng Li
Changelog:
 * rebase against mmotm tree

The caller address has already been set in setup_vmalloc_vm(), so there's no
need to set it again in __vmalloc_area_node.

Signed-off-by: Wanpeng Li 
---
 mm/vmalloc.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 1074543..d78d117 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1566,7 +1566,6 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 		pages = kmalloc_node(array_size, nested_gfp, node);
 	}
 	area->pages = pages;
-	area->caller = caller;
 	if (!area->pages) {
 		remove_vm_area(area->addr);
 		kfree(area);
-- 
1.7.5.4



Re: [PATCH] xtensa: Fix broken allmodconfig build

2013-09-02 Thread Guenter Roeck

On 08/27/2013 09:06 PM, Guenter Roeck wrote:

The xtensa allmodconfig build fails with:

arch/xtensa/kernel/xtensa_ksyms.c:129:1: error: '_mcount' undeclared here (not in a function)
make[2]: *** [arch/xtensa/kernel/xtensa_ksyms.o] Error 1
make[1]: *** [arch/xtensa/kernel] Error 2

The breakage is due to commit 478ba61af (xtensa: add static function tracer
support) which exports _mcount without declaring it.

Cc: Max Filippov 
Signed-off-by: Guenter Roeck 
---


Ping ... 3.11 now ships with xtensa allmodconfig broken, which is completely
unnecessary.

Guenter


  arch/xtensa/kernel/xtensa_ksyms.c |1 +
  1 file changed, 1 insertion(+)

diff --git a/arch/xtensa/kernel/xtensa_ksyms.c b/arch/xtensa/kernel/xtensa_ksyms.c
index d8507f8..74a60c7 100644
--- a/arch/xtensa/kernel/xtensa_ksyms.c
+++ b/arch/xtensa/kernel/xtensa_ksyms.c
@@ -25,6 +25,7 @@
  #include 
  #include 
  #include 
+#include 
  #ifdef CONFIG_BLK_DEV_FD
  #include 
  #endif





Re: [PATCH v2 0/8] Drop support for Renesas H8/300 architecture

2013-09-02 Thread Chen Gang F T
Hello Guenter Roeck:


I don't mind whether I am on the cc list, but please help confirm
two things:

  Was the work I did for h8300 just waste and noise on the kernel and
  related subsystem mailing lists?

  And was the discussion about h8300 between us also waste and noise on
  the kernel mailing list?


I also have to apologize to the kernel and other related subsystem
mailing lists:

  Some of the h8300 patches I sent on 2013-09-02 were really a waste
  (and I wasted my own time on them, too).

  My excuse (not a reason) is that I did not know Guenter Roeck had sent
  this patch series (I am not on its cc list, so I found it a day late).


BTW: I have also added some other related people to the cc list to let
them know that some of the pending h8300 threads (which were waiting
for allmodconfig to finish) can be canceled.


Thanks.

On 08/31/2013 07:51 AM, Guenter Roeck wrote:
> H8/300 has been dead for several years, the kernel for it has
> not compiled for ages, and recent versions of gcc for it are broken.
> It is time to drop support for it.
> 
> Yes, I understand it is not that simple to drop an architecture,
> and it may need some discussion, but someone has to put a stake
> into the ground. Keeping a virtually dead architecture on life support
> takes resources which are better spent elsewhere.
> 
> v2:
> - s/Renesys/Renesas/g
> - Found and removed more architecture specific code in fs/minix
>   and in smc9194 driver
> - Added explicit Cc: for h8300 maintainer
> - Added subsystem maintainer Acks
> 
> 
> Guenter Roeck (8):
>   Drop support for Renesas H8/300 (h8300) architecture
>   ide: Drop H8/300 driver
>   net/ethernet: smsc9194: Drop conditional code for H8/300
>   net/ethernet: Drop H8/300 Ethernet driver
>   watchdog: Drop references to H8300 architecture
>   Drop MAINTAINERS entry for H8/300
>   Drop remaining references to H8/300 architecture
>   fs/minix: Drop dependency on H8300
> 
>  Documentation/scheduler/sched-arch.txt   |5 -
>  MAINTAINERS  |8 -
>  arch/h8300/Kconfig   |  109 
>  arch/h8300/Kconfig.cpu   |  171 --
>  arch/h8300/Kconfig.debug |   68 ---
>  arch/h8300/Kconfig.ide   |   44 --
>  arch/h8300/Makefile  |   71 ---
>  arch/h8300/README|   38 --
>  arch/h8300/boot/Makefile |   22 -
>  arch/h8300/boot/compressed/Makefile  |   37 --
>  arch/h8300/boot/compressed/head.S|   47 --
>  arch/h8300/boot/compressed/misc.c|  180 --
>  arch/h8300/boot/compressed/vmlinux.lds   |   32 -
>  arch/h8300/boot/compressed/vmlinux.scr   |9 -
>  arch/h8300/defconfig |   42 --
>  arch/h8300/include/asm/Kbuild|8 -
>  arch/h8300/include/asm/asm-offsets.h |1 -
>  arch/h8300/include/asm/atomic.h  |  146 -
>  arch/h8300/include/asm/barrier.h |   29 -
>  arch/h8300/include/asm/bitops.h  |  211 ---
>  arch/h8300/include/asm/bootinfo.h|2 -
>  arch/h8300/include/asm/bug.h |   12 -
>  arch/h8300/include/asm/bugs.h|   16 -
>  arch/h8300/include/asm/cache.h   |   13 -
>  arch/h8300/include/asm/cachectl.h|   14 -
>  arch/h8300/include/asm/cacheflush.h  |   40 --
>  arch/h8300/include/asm/checksum.h|  102 
>  arch/h8300/include/asm/cmpxchg.h |   60 --
>  arch/h8300/include/asm/cputime.h |6 -
>  arch/h8300/include/asm/current.h |   25 -
>  arch/h8300/include/asm/dbg.h |2 -
>  arch/h8300/include/asm/delay.h   |   38 --
>  arch/h8300/include/asm/device.h  |7 -
>  arch/h8300/include/asm/div64.h   |1 -
>  arch/h8300/include/asm/dma.h |   15 -
>  arch/h8300/include/asm/elf.h |  101 
>  arch/h8300/include/asm/emergency-restart.h   |6 -
>  arch/h8300/include/asm/fb.h  |   12 -
>  arch/h8300/include/asm/flat.h|   26 -
>  arch/h8300/include/asm/fpu.h |1 -
>  arch/h8300/include/asm/ftrace.h  |1 -
>  arch/h8300/include/asm/futex.h   |6 -
>  arch/h8300/include/asm/gpio-internal.h   |   52 --
>  arch/h8300/include/asm/hardirq.h |   19 -
>  arch/h8300/include/asm/hw_irq.h  |1 -
>  arch/h8300/include/asm/io.h  |  358 ---
>  arch/h8300/include/asm/irq.h |   49 --
>  

Re: Linux 3.11

2013-09-02 Thread Nicholas A. Bellinger
Hi Ted,

On Mon, 2013-09-02 at 22:17 -0400, Theodore Ts'o wrote:
> On Mon, Sep 02, 2013 at 04:46:18PM -0700, Guenter Roeck wrote:
> > I don't think it has anything to do with linux-iscsi.org.
> > Possibly Nicholas' e-mail provider is not hosted in the US, meaning e-mail
> > sent through it can not be logged and examined by a certain US government 
> > agency.
> 
> Hardly.  mail.linux-iscsi.org is hosted by Rackspace, which is most
> certainly in the US.  There may be spammers using some of Rackspace's
> subnets, which is much more likely to have something to do with the issue.
> 
> I had a similar issue with thunk.org, which is hosted by Linode.  In
> my case, part of the problem was that I moved my host to a
> different Linode datacenter (from Dallas to Atlanta), and I forgot to
> update my SPF record, so e-mails with an SMTP envelope address of
> ty...@thunk.org were getting a soft-fail.  (And e-mails with an SMTP
> return address of ty...@mit.edu but sent from imap.thunk.org were
> always getting a soft-fail, which would tend to increase the
> likelihood that, if the e-mail tripped other heuristics, it would
> be considered spam.)
> 
> Fixing my SPF record, and enabling DKIM (with a DKIM key published for
> thunk.org in DNS, and making sure that I always used an SMTP envelope
> return address of ty...@thunk.org, even if the RFC 822 from address
> stated ty...@mit.edu) fixed the spam false positive issues for me.
> 
> (Hint: installing and configuring OpenDKIM really isn't all that hard.
> I did it in less than an hour.)

Thanks for the additional information.

Enabling DKIM now, and just waiting for the TXT records to update to
verify.

--nab



Re: [PATCH RESEND 2/3] x86, mm: Update min_pfn_mapped in add_pfn_range_mapped().

2013-09-02 Thread Yinghai Lu
On Mon, Sep 2, 2013 at 6:06 PM, Tang Chen  wrote:
> Hi Yinghai,
>
> On 09/03/2013 02:41 AM, Yinghai Lu wrote:

> How about change the "for (from low to high)" in init_range_memory_mapping()
> to
> "for_rev(from high to low)" ?
> Then we can update min_pfn_mapped in add_pfn_range_mapped().
>
> And also, the outer loop is from high to low, we can change the inner loop
> to be from high
> to low too.

No. There is another reason for doing the local loop from low to high:

kernel_physical_mapping_init() could clear some mappings near the end
of PUD/PMD entries, but not at the head.

>
> I think updating min_pfn_mapped in init_mem_mapping() is less readable. And
> min_pfn_mapped
> and max_pfn_mapped should be updated together.

min_pfn_mapped is an early local variable that controls allocation in
alloc_low_pages(). Putting it in init_mem_mapping() is more readable.

Yinghai


Re: [Ksummit-2013-discuss] [PATCH] checkpatch: Add comment about updating Documentation/CodingStyle

2013-09-02 Thread Fengguang Wu
On Mon, Sep 02, 2013 at 06:52:45PM -0700, Joe Perches wrote:
> On Mon, 2013-09-02 at 18:34 -0700, Josh Triplett wrote:
> > I'd suggest a couple more, which
> > *should* always make sense, and to the best of my knowledge don't tend
> > to generate false positives:
> > 
> > C99_COMMENTS
> 
> I don't have a problem with c99 comments.
> As far as I know, Linus doesn't either.
> 
> https://lkml.org/lkml/2012/4/16/473
> 
> > CONFIG_EXPERIMENTAL
> > CVS_KEYWORD
> 
> OK, but 
> 
> > ELSE_AFTER_BRACE
> 
> I wouldn't do this one.  I think
> there are some false positives here.
> 
> > GLOBAL_INITIALIZERS
> > INITIALISED_STATIC
> 
> Nor these.
> 
> > INVALID_UTF8
> > LINUX_VERSION_CODE
> > MISSING_EOF_NEWLINE
> 
> OK I suppose.
> 
> > PREFER_SEQ_PUTS
> > PRINTK_WITHOUT_KERN_LEVEL
> 
> There are a lot of these.
> I suggest no here.
> 
> > RETURN_PARENTHESES
> > SIZEOF_PARENTHESIS
> 
> It's in coding style, but some newish patches
> do avoid them.  It's a question about how noisy
> you want your robot to be.

I'd prefer the robot to show up only when necessary. The coding style
warnings are good for the developers who actively run checkpatch.pl to
make their patch better. However most are probably not suitable for a
robot to send people unsolicited warnings.

> > SPACE_BEFORE_TAB
> > TRAILING_SEMICOLON
> > TRAILING_WHITESPACE
> > USE_DEVICE_INITCALL
> 
> > USE_RELATIVE_PATH
> 
> Having checkpatch tell people how to write changelogs
> I think not a great idea.
> 
> > These *ought* to make sense, but I don't know their false positive rates:
> > 
> > HEXADECIMAL_BOOLEAN_TEST
> 
> That's a good one.  0 false positives.
> 
> > ALLOC_ARRAY_ARGS
> 
> Yes, this would be reasonable too.
> 
> > CONSIDER_KSTRTO
> 
> > I think probably not.  This would be a cleanup thing.

Perhaps we can run it for a while, so that people at least become
aware that there is a kstrto() to use. :)

> > CONST_STRUCT
> 
> OK
> 
> > SPLIT_STRING
> 
> I suggest no but 

Thanks for both of your suggestions! I'll add the commonly agreed ones:

+INVALID_UTF8
+LINUX_VERSION_CODE
+MISSING_EOF_NEWLINE
+HEXADECIMAL_BOOLEAN_TEST
+ALLOC_ARRAY_ARGS
+CONST_STRUCT
+CONSIDER_KSTRTO

And remove the duplicate one (good catch, Josh!)

-KREALLOC_ARG_REUSE

Thanks,
Fengguang


Re: [Ksummit-2013-discuss] [PATCH] checkpatch: Add comment about updating Documentation/CodingStyle

2013-09-02 Thread Joe Perches
On Mon, 2013-09-02 at 19:12 -0700, Josh Triplett wrote:
> On Mon, Sep 02, 2013 at 06:52:45PM -0700, Joe Perches wrote:
> > On Mon, 2013-09-02 at 18:34 -0700, Josh Triplett wrote:
> > > I'd suggest a couple more, which
> > > *should* always make sense, and to the best of my knowledge don't tend
> > > to generate false positives:

Hey Josh.

I don't want to enable too many types of messages
because the "barrier to entry" to submit patches
shouldn't be so high that it discourages people.

I feel mostly that many types of checkpatch messages
are OK to emit, but aren't necessary to fix and
people should feel that checkpatch isn't a necessary
thing to silence before patches are accepted.

cheers, Joe



Re: Linux 3.11

2013-09-02 Thread Theodore Ts'o
On Mon, Sep 02, 2013 at 04:46:18PM -0700, Guenter Roeck wrote:
> I don't think it has anything to do with linux-iscsi.org.
> Possibly Nicholas' e-mail provider is not hosted in the US, meaning e-mail
> sent through it can not be logged and examined by a certain US government 
> agency.

Hardly.  mail.linux-iscsi.org is hosted by Rackspace, which is most
certainly in the US.  There may be spammers using some of Rackspace's
subnets, which is much more likely to be the issue.

I had a similar issue with thunk.org, which is hosted by Linode.  In
my case, part of the problem was that I moved my host to a
different Linode datacenter (from Dallas to Atlanta), and I forgot to
update my SPF record, so e-mails with an SMTP envelope address of
ty...@thunk.org were getting a soft-fail.  (And e-mails with an SMTP
return address of ty...@mit.edu but sent from imap.thunk.org were
always getting a soft-fail, which would tend to increase the
likelihood that, if the e-mail tripped other heuristics, it would
be considered spam.)

Fixing my SPF record, and enabling DKIM (with a DKIM key published for
thunk.org in DNS, and making sure that I always used an SMTP envelope
return address of ty...@thunk.org, even if the RFC 822 from address
stated ty...@mit.edu) fixed the spam false positive issues for me.

(Hint: installing and configuring OpenDKIM really isn't all that hard.
I did it in less than an hour.)
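
As a rough illustration of the soft-fail described above, the following C
sketch evaluates a sender address against an SPF record.  It handles only
the ip4: and all mechanisms, and the record and addresses used with it are
made-up documentation values, not the real thunk.org records;
spf_check_ip4() and parse_ipv4() are hypothetical helper names.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

enum spf_result { SPF_PASS, SPF_SOFTFAIL, SPF_FAIL, SPF_NEUTRAL };

/* Parse "a.b.c.d" or "a.b.c.d/len" into a host-order address and prefix. */
static int parse_ipv4(const char *s, uint32_t *addr, unsigned *len)
{
    unsigned a, b, c, d, l = 32;   /* prefix defaults to /32 */
    int n = sscanf(s, "%u.%u.%u.%u/%u", &a, &b, &c, &d, &l);

    if (n < 4 || a > 255 || b > 255 || c > 255 || d > 255 || l > 32)
        return -1;
    *addr = ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | d;
    *len = l;
    return 0;
}

/*
 * Walk the record left to right; the first matching term decides.
 * Only "ip4:" and the "all" variants are understood (an assumption made
 * to keep the sketch short); anything else is skipped.
 */
enum spf_result spf_check_ip4(const char *record, const char *sender_ip)
{
    char buf[256];
    uint32_t ip, net;
    unsigned len;

    if (strlen(record) >= sizeof(buf) || parse_ipv4(sender_ip, &ip, &len) != 0)
        return SPF_NEUTRAL;
    strcpy(buf, record);

    for (char *term = strtok(buf, " "); term != NULL; term = strtok(NULL, " ")) {
        if (strncmp(term, "ip4:", 4) == 0) {
            if (parse_ipv4(term + 4, &net, &len) != 0)
                continue;
            uint32_t mask = len ? 0xFFFFFFFFu << (32 - len) : 0;
            if ((ip & mask) == (net & mask))
                return SPF_PASS;
        } else if (strcmp(term, "~all") == 0) {
            return SPF_SOFTFAIL;
        } else if (strcmp(term, "-all") == 0) {
            return SPF_FAIL;
        } else if (strcmp(term, "?all") == 0) {
            return SPF_NEUTRAL;
        } else if (strcmp(term, "+all") == 0 || strcmp(term, "all") == 0) {
            return SPF_PASS;
        }
    }
    return SPF_NEUTRAL;   /* nothing matched */
}
```

With a record of "v=spf1 ip4:192.0.2.0/24 ~all", a sender at 192.0.2.10
matches the ip4 term and passes, while a host that has moved to
198.51.100.7 falls through to ~all and gets a soft-fail until the record
is updated.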

> I had the same experience; Google blocks all e-mail from my private provider
> (located in Singapore). When asked by the provider, they claimed to know
> nothing about it. No, my provider doesn't forward more spam than other 
> providers,
> and definitely less than, say, Yahoo.

One of the things that might be happening is that your private
provider may be hosting mailing lists used by companies to send
marketing "newsletters".  Unfortunately, sometimes it's a pain to
unsubscribe from such newsletters, and some users will just simply hit
the "it's spam" button to make such newsletters go away.  For a small
provider, it's easier for the percentage of e-mails from a mailer that
are considered spam to exceed some magic threshold, thus
increasing the "spam score" for e-mails originating from that
provider.

I'll also note that Yahoo uses DKIM (heck, it invented DKIM) and using
DKIM is useful because if someone tries to fake spam using your
domain, if your e-mails are getting signed using DKIM, and the spam is
getting sent without being DKIM signed, many of the anti-spam
filtering services definitely do take this into account.  Some may
even automatically decrease your spam score slightly just because you
are using DKIM, just because spammers tend not to do this, and using
DKIM to sign your e-mail headers makes it easier for spam filtering
systems to hold senders accountable for spam that they send.

  - Ted

P.S.  Although I work for Google, I don't know anything about the
low-level details of how Google's anti-SPAM systems work.  However,
for almost a decade, I was a member of MIT Network Operations, and was
one of the postmasters for mit.edu, back when aol.com was in its prime
(and we had a larger number of SMTP deliveries per day than AOL did).
So I know a thing or two about e-mail and I'd be really surprised
if anyone, particularly a major mail provider such as Google, Yahoo,
Hotmail, etc, was filtering e-mail just because it came from a non-US
mail server.

The reality is that e-mail is international, and it's only the admins
of smaller mail services (perhaps desperate to filter out vast
quantities of Russian or Chinese Spam, and figuring that they weren't
expecting any valid e-mails from those countries), that would do
something as silly as filtering based on geographic source locations.


Re: [Ksummit-2013-discuss] [PATCH] checkpatch: Add comment about updating Documentation/CodingStyle

2013-09-02 Thread Josh Triplett
On Mon, Sep 02, 2013 at 06:52:45PM -0700, Joe Perches wrote:
> On Mon, 2013-09-02 at 18:34 -0700, Josh Triplett wrote:
> > I'd suggest a couple more, which
> > *should* always make sense, and to the best of my knowledge don't tend
> > to generate false positives:
> > 
> > C99_COMMENTS
> 
> I don't have a problem with c99 comments.
> As far as I know, Linus doesn't either.
> 
> https://lkml.org/lkml/2012/4/16/473

That doesn't look like an endorsement so much as a statement that C99
comments are less awful than the net/ special-case comment style.

Documentation/CodingStyle chapter 8 says:
> Linux style for comments is the C89 "/* ... */" style.
> Don't use C99-style "// ..." comments.

If that no longer holds true, we should remove it from CodingStyle.  As
far as I know, though, it still holds.  In any case, it rarely comes up;
most kernel code doesn't use such comments.

> > CONFIG_EXPERIMENTAL
> > CVS_KEYWORD
> 
> OK, but 

Sure, I don't expect them to come up often.

> > ELSE_AFTER_BRACE
> 
> I wouldn't do this one.  I think
> there are some false positives here.

Oh?  What kinds of false positives have you seen?

In any case, fair enough.

> > GLOBAL_INITIALIZERS
> > INITIALISED_STATIC
> 
> Nor these.

I don't see an obvious way for those to have false positives.  What have
you seen?

> > INVALID_UTF8
> > LINUX_VERSION_CODE
> > MISSING_EOF_NEWLINE
> 
> OK I suppose.

Not particularly critical, but uncontroversial and no false positives.

> > PREFER_SEQ_PUTS
> > PRINTK_WITHOUT_KERN_LEVEL
> 
> There are a lot of these.
> I suggest no here.

I assume the bot only applies this to new patches, not to existing code,
in which case these seem completely reasonable.  New code should follow
these, even if we don't mass-fix existing code.

> > RETURN_PARENTHESES
> > SIZEOF_PARENTHESIS
> 
> It's in coding style, but some newish patches
> do avoid them.  It's a question about how noisy
> you want your robot to be.

These two seem reasonable to enforce on new code.  I agree that they
shouldn't trigger mass cleanups of existing code.

> > SPACE_BEFORE_TAB
> > TRAILING_SEMICOLON
> > TRAILING_WHITESPACE
> > USE_DEVICE_INITCALL

I didn't see any comment from you on these four.  Thoughts?

> > USE_RELATIVE_PATH
> 
> Having checkpatch tell people how to write changelogs
> I think not a great idea.

In general, sure, but that particular one seems OK.  In any case, not
particularly critical.

> > These *ought* to make sense, but I don't know their false positive rates:
> > 
> > HEXADECIMAL_BOOLEAN_TEST
> 
> That's a good one.  0 false positives.

Ah, good.

> > ALLOC_ARRAY_ARGS
> 
> Yes, this would be reasonable too.

Excellent.

> > CONSIDER_KSTRTO
> 
> I think probably not.  This would be a cleanup thing.

Even if applied to new code only?  New code should use the right
functions to start with.

> > CONST_STRUCT
> 
> OK

Good to know; glad to hear it doesn't have false positives.

> > SPLIT_STRING
> 
> I suggest no but 

I can easily believe that it has too many false positives.  Let's leave
that one alone for now.

- Josh Triplett


Re: [Ksummit-2013-discuss] [PATCH] checkpatch: Add comment about updating Documentation/CodingStyle

2013-09-02 Thread Joe Perches
On Mon, 2013-09-02 at 18:34 -0700, Josh Triplett wrote:
> I'd suggest a couple more, which
> *should* always make sense, and to the best of my knowledge don't tend
> to generate false positives:
> 
> C99_COMMENTS

I don't have a problem with c99 comments.
As far as I know, Linus doesn't either.

https://lkml.org/lkml/2012/4/16/473

> CONFIG_EXPERIMENTAL
> CVS_KEYWORD

OK, but 

> ELSE_AFTER_BRACE

I wouldn't do this one.  I think
there are some false positives here.

> GLOBAL_INITIALIZERS
> INITIALISED_STATIC

Nor these.

> INVALID_UTF8
> LINUX_VERSION_CODE
> MISSING_EOF_NEWLINE

OK I suppose.

> PREFER_SEQ_PUTS
> PRINTK_WITHOUT_KERN_LEVEL

There are a lot of these.
I suggest no here.

> RETURN_PARENTHESES
> SIZEOF_PARENTHESIS

It's in coding style, but some newish patches
do avoid them.  It's a question about how noisy
you want your robot to be.

> SPACE_BEFORE_TAB
> TRAILING_SEMICOLON
> TRAILING_WHITESPACE
> USE_DEVICE_INITCALL

> USE_RELATIVE_PATH

Having checkpatch tell people how to write changelogs
I think not a great idea.

> These *ought* to make sense, but I don't know their false positive rates:
> 
> HEXADECIMAL_BOOLEAN_TEST

That's a good one.  0 false positives.

> ALLOC_ARRAY_ARGS

Yes, this would be reasonable too.

> CONSIDER_KSTRTO

I think probably not.  This would be a cleanup thing.

> CONST_STRUCT

OK

> SPLIT_STRING

I suggest no but 



Re: RESEND: Generating interrupts from a USB device driver?

2013-09-02 Thread Daniel Santos

On 09/02/2013 06:07 PM, Greg KH wrote:
> On Mon, Sep 02, 2013 at 05:46:58PM -0500, Daniel Santos wrote:
> > Hello guys.  I didn't get a response the last time so hopefully with
> > 3.11 out I'll get one this time.
> >
> > I need to be able to generate interrupts from a USB device driver while
> > servicing the complete() function of an interrupt URB.
>
> No you don't :)
>
> > While I realize that this may seem strange, the purpose is for a USB
> > to SPI/GPIO bridge chip (the MCP2210). When something happens on the
> > remote device where a chip is expecting its interrupt out pin to
> > trigger an interrupt on some local (to the board) microcontroller, the
> > MCP2210 instead receives that signal and communicates it to the host
> > the next time it's queried. This is the interrupt that I need to, in
> > effect propagate locally. Since my spi_master and gpio_chip are all
> > functioning now, this is the last step to get one of my spi protocol
> > drivers working correctly.
>
> Just pass the data up the spi stack in your interrupt endpoint handler.
> No need to try to create a "real" interrupt.  There are other USB SPI
> drivers that should give you the idea of how to do it.

Thanks for your response! I haven't been able to find these drivers, can
you please point me to one? I guess I don't know the "spi stack" well
enough to know how to propagate that notification up to the driver for
the spi device (let alone that it was called a "stack" :)


Thanks!!
Daniel


Re: [Ksummit-2013-discuss] [PATCH] checkpatch: Add comment about updating Documentation/CodingStyle

2013-09-02 Thread Fengguang Wu
On Mon, Sep 02, 2013 at 05:47:54PM -0700, Joe Perches wrote:
> On Tue, 2013-09-03 at 08:39 +0800, Fengguang Wu wrote:
> > On Mon, Sep 02, 2013 at 02:11:36PM -0700, Joe Perches wrote:
> []
> > > Fengguang Wu's very useful build robot
> > > sends out emails on build failures.
> > > I think that's great.
> > 
> > Thanks! Yes I'm now running checkpatch these days because some people
> > suggested to me that some of the checkpatch warnings do help catch
> > real bugs.
> 
> Hi Fengguang.
> 
> I see, I don't recall receiving one of these so
> it must be working just fine.

Hi Joe!

The log shows that one of your patches was checked earlier today:

[4 days ago, Joe Perches] perf: Convert kmalloc_node(...GFP_ZERO...) to kzalloc_node()

If you have more patches in some git tree that missed the check,
please let me know.

> > However I do try to avoid upsetting people with maybe-subjective
> > warnings. A checkpatch report will only be sent when a small fraction
> > of error types are detected. Comments are very welcome on how to
> > improve this list:
> 
> Your list seems reasonable.
> 
> I might add:
> 
> DOS_LINE_ENDINGS
> MODIFIED_INCLUDE_ASM
> JIFFIES_COMPARISON
> ONE_SEMICOLON

Yeah they all look good to have. Thanks for the suggestions again!

Thanks,
Fengguang


Re: [Ksummit-2013-discuss] [PATCH] checkpatch: Add comment about updating Documentation/CodingStyle

2013-09-02 Thread Josh Triplett
On Tue, Sep 03, 2013 at 08:39:58AM +0800, Fengguang Wu wrote:
> On Mon, Sep 02, 2013 at 02:11:36PM -0700, Joe Perches wrote:
> > On Mon, 2013-09-02 at 21:50 +0100, David Howells wrote:
> > > Josh Triplett  wrote:
> > > 
> > > > > There are many checkpatch rules (like semicolons) that
> > > > > are not in CodingStyle.
> > > > 
> > > > It's a rule of thumb, not a mandate.  In *general*, checkpatch.pl should
> > > > not be enforcing style rules that aren't documented in CodingStyle.
> > > 
> > > Except that it becomes a mandate when someone runs it automatically 
> > > against
> > > every one of your patches and then sends you an email for each patch it 
> > > finds
> > > a checkpatch niggle against...
> > 
> > I think that any robot sending such checkpatch-only
> > emails should be disabled.
> > 
> > I know of 2 email robots.
> > 
> > Fengguang Wu's very useful build robot
> > sends out emails on build failures.
> > I think that's great.
> 
> Thanks! Yes I'm now running checkpatch these days because some people
> suggested to me that some of the checkpatch warnings do help catch
> real bugs.
> 
> However I do try to avoid upsetting people with maybe-subjective
> warnings. A checkpatch report will only be sent when a small fraction
> of error types are detected. Comments are very welcome on how to
> improve this list:
> 
> MEMSET
> IN_ATOMIC
> UAPI_INCLUDE
> MALFORMED_INCLUDE   
> SIZEOF_ADDRESS  
> KREALLOC_ARG_REUSE  
> EXECUTE_PERMISSIONS 
> ERROR:BAD_SIGN_OFF  
> LO_MACRO
> HI_MACRO
> CSYNC
> SSYNC
> HOTPLUG_SECTION
> INDENTED_LABEL
> INLINE_LOCATION
> STORAGE_CLASS
> USLEEP_RANGE
> UNNECESSARY_CASTS
> ALLOC_SIZEOF_STRUCT
> KREALLOC_ARG_REUSE
> USE_FUNC
> LOCKDEP
> EXPORTED_WORLD_WRITABLE
> WHITESPACE_AFTER_LINE_CONTINUATION
> MISSING_VMLINUX_SYMBOL
> NEEDLESS_IF
> PRINTF_L

Looks like you have KREALLOC_ARG_REUSE in that list twice.

Other than that, those look sensible.  I'd suggest a couple more, which
*should* always make sense, and to the best of my knowledge don't tend
to generate false positives:

C99_COMMENTS
CONFIG_EXPERIMENTAL
CVS_KEYWORD
ELSE_AFTER_BRACE
GLOBAL_INITIALIZERS
INITIALISED_STATIC
INVALID_UTF8
LINUX_VERSION_CODE
MISSING_EOF_NEWLINE
PREFER_SEQ_PUTS
PRINTK_WITHOUT_KERN_LEVEL
REDUNDANT_CODE
RETURN_PARENTHESES
SIZEOF_PARENTHESIS
SPACE_BEFORE_TAB
TRAILING_SEMICOLON
TRAILING_WHITESPACE
USE_DEVICE_INITCALL
USE_RELATIVE_PATH

These *ought* to make sense, but I don't know their false positive rates:

HEXADECIMAL_BOOLEAN_TEST
ALLOC_ARRAY_ARGS
CONSIDER_KSTRTO
CONST_STRUCT
SPLIT_STRING

The following almost always make sense, but only on patches not
yet applied to a tree:

PATCH_PREFIX
MODIFIED_INCLUDE_ASM
CORRUPTED_PATCH
NOT_UNIFIED_DIFF
MISSING_SIGN_OFF

- Josh Triplett


Re: [Ksummit-2013-discuss] [PATCH] checkpatch: Add comment about updating Documentation/CodingStyle

2013-09-02 Thread Fengguang Wu
On Mon, Sep 02, 2013 at 05:36:03PM -0700, Josh Triplett wrote:
> On Tue, Sep 03, 2013 at 08:26:21AM +0800, Shilong Wang wrote:
> > 2013/9/3 Joe Perches :
> > > Wang Shilong 
> > > sent me an automated checkpatch email I
> > > thought was not useful.
> > 
> > I am sorry if I gave you any trouble; I have disabled it (in fact, it
> > only ran for a day!)
> 
> I would suggest that you leave it running, but rather than sending mails
> directly, have it prep the mails for you to send after manual review.
> Do some careful scrutiny for false positives and cases where the change
> would not improve the code, and use checkpatch's options to turn off
> the more contentious warnings (like the 80-column warning).  Over time,
> you'll develop a set of options that produce warnings people mostly
> *want* to get notified about.
 
Good suggestions! That's exactly what I'm trying to do. And Joe kindly
showed me the initial list of checkpatch error types suitable for auto
notification.

Coverage is good: the checkpatch robot iterates over every new commit
in the 300+ git trees I collected over time. Some maintainer trees are
skipped because they should already run the check.

Here is the list of reports sent in the last two weeks. They are private
emails directly sent to the commit author and committer.  So far I've not
received complaints on these unsolicited checkpatch reports.

 Aug 23  [netdev-next:master 200/301] WARNING: usb_free_urb(NULL) is safe this check is probably n
 Aug 23  [netdev-next:master 202/301] WARNING: usb_free_urb(NULL) is safe this check is probably n
 Aug 23  [linuxtv-media:master 321/499] ERROR: Unrecognized email address: 'Kyungmin Park http://c-faq.com/ma
 Aug 28  [mmotm:master 473/483] WARNING: __func__ should be used instead of gcc specific __FUNCTIO
 Aug 28  [kvm:queue 13/14] ERROR: Unrecognized email address: 'Gleb Natapov @g...@redhat.com>'
 Aug 29  [dhowells-fs:keys-devel 9/12] WARNING: labels should not be indented
 Aug 29  [jolsa-perf:perf/plugins2 14/20] WARNING: storage class should be at the beginning of the
 Aug 30  [nfs:testing 47/61] ERROR: Unrecognized email address: 'Trond Myklebust


Re: [PATCH RESEND 2/3] x86, mm: Update min_pfn_mapped in add_pfn_range_mapped().

2013-09-02 Thread Tang Chen

Hi Yinghai,

On 09/03/2013 02:41 AM, Yinghai Lu wrote:
..


> Nak, you can not move that.
>
> min_pfn_mapped should not be updated before init_range_memory_mapping
> has returned, as it needs to refer to the old min_pfn_mapped.
> And init_range_memory_mapping still inits the mapping from low to high
> locally. min_pfn_mapped can not be updated too early.


The current code is like this:

init_mem_mapping()
{
    while (from high to low) {
        init_range_memory_mapping()
        {
            /* Here is from low to high */
            for (from low to high) {
                init_memory_mapping()
                {
                    for () {
                        /* Need to refer min_pfn_mapped here */
                        kernel_physical_mapping_init();
                    }
                    /* So if updating min_pfn_mapped here, it is too low */
                    add_pfn_range_mapped();
                }
            }
        }
    }
}

How about changing the "for (from low to high)" in
init_range_memory_mapping() to "for_rev(from high to low)"?
Then we can update min_pfn_mapped in add_pfn_range_mapped().

And also, since the outer loop is from high to low, we can change the
inner loop to be from high to low too.

I think updating min_pfn_mapped in init_mem_mapping() is less readable.
And min_pfn_mapped and max_pfn_mapped should be updated together.
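
One possible reading of the objection quoted at the top of this mail can be
sketched as a toy model in C.  The assumption (not stated in this thread) is
that page-table pages are allocated from the window
[min_pfn_mapped, max_pfn_mapped), so that window should contain only mapped
pages whenever a sub-range is about to be mapped.  The page counts, the
mapped[] array and count_unmapped_window_hits() are all hypothetical; this
is not kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 16   /* toy address space, in pages */
#define CHUNK   8   /* size of one outer (high to low) step */
#define STEP    4   /* size of one inner (low to high) step */

/*
 * Walk memory like the loop nest above: outer loop from high to low,
 * inner loop from low to high.  Before each inner step, check whether the
 * window [min_mapped, max_mapped) contains a page that is not mapped yet.
 * Returns how many inner steps saw such a page.
 */
int count_unmapped_window_hits(bool update_min_in_inner_loop)
{
    bool mapped[NPAGES] = { false };
    size_t min_mapped = NPAGES, max_mapped = NPAGES;  /* empty window */
    int hits = 0;

    for (size_t chunk_end = NPAGES; chunk_end > 0; chunk_end -= CHUNK) {
        size_t chunk_start = chunk_end - CHUNK;

        for (size_t start = chunk_start; start < chunk_end; start += STEP) {
            /* An allocation for this step would come from the window. */
            for (size_t p = min_mapped; p < max_mapped; p++) {
                if (!mapped[p]) {
                    hits++;
                    break;
                }
            }
            /* Map the sub-range itself. */
            for (size_t p = start; p < start + STEP; p++)
                mapped[p] = true;
            /* "Early" variant: lower the minimum right away. */
            if (update_min_in_inner_loop && start < min_mapped)
                min_mapped = start;
        }
        /* "Late" variant: lower the minimum once the chunk is done. */
        if (chunk_start < min_mapped)
            min_mapped = chunk_start;
    }
    return hits;
}
```

In this model, lowering the minimum inside the inner loop exposes the
not-yet-mapped upper half of the current chunk to the window on the second
step of each chunk (2 hits), while lowering it only after the whole chunk
is mapped gives 0 hits.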

Thanks.



Re: [Ksummit-2013-discuss] [PATCH] checkpatch: Add comment about updating Documentation/CodingStyle

2013-09-02 Thread Joe Perches
On Tue, 2013-09-03 at 08:39 +0800, Fengguang Wu wrote:
> On Mon, Sep 02, 2013 at 02:11:36PM -0700, Joe Perches wrote:
[]
> > Fengguang Wu's very useful build robot
> > sends out emails on build failures.
> > I think that's great.
> 
> Thanks! Yes I'm now running checkpatch these days because some people
> suggested to me that some of the checkpatch warnings do help catch
> real bugs.

Hi Fengguang.

I see, I don't recall receiving one of these so
it must be working just fine.

> However I do try to avoid upsetting people with maybe-subjective
> warnings. A checkpatch report will only be sent when a small fraction
> of error types are detected. Comments are very welcome on how to
> improve this list:

Your list seems reasonable.

I might add:

DOS_LINE_ENDINGS
MODIFIED_INCLUDE_ASM
JIFFIES_COMPARISON
ONE_SEMICOLON




Re: [PATCH] f2fs: optimize gc for better performance

2013-09-02 Thread Jaegeuk Kim
Hi Jin,

> [...]
> >
> > It seems that we can obtain the performance gain just by setting the
> > MAX_VICTIM_SEARCH to 4096, for example.
> > So, how about just adding an ending criteria like below?
> >
> 
> I agree that we could get the performance improvement by simply
> enlarging MAX_VICTIM_SEARCH to 4096, but I am a little concerned about
> scalability, because it might always search the whole bitmap in some
> cases, for example, when there are 4000 dirty segments and 409600 total
> segments.
> > [snip]
> [...]
> >
> > if (p->max_search > MAX_VICTIM_SEARCH)
> > p->max_search = MAX_VICTIM_SEARCH;
> >
> 
> The optimization does not apply to SSR mode, and there is a reason for that.
> As noticed in the test, when SSR selects the segments that have the most
> garbage blocks, then when gc is needed, all the dirty segments might
> have very few garbage blocks, so the gc overhead is high. This might
> lead to performance degradation. So the patch does not change the
> victim selection policy for SSR.

I think it doesn't matter.
GC is only triggered during the direct node block allocation.
What that means is that we need to consider the number of GC triggers, where
GC is triggered more frequently during normal data allocation than during
node block allocation.
So, I think it would not degrade performance significantly.

BTW, could you show some numbers for this?
Or could you test what I suggested?
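
To help compare the two approaches, here is a standalone C sketch of the
search-limit arithmetic.  patch_max_search() follows the computation in the
quoted patch below; capped_max_search() is only a guess at the simpler
variant, assuming the limit starts at the number of dirty segments and is
then capped (the snipped part of the suggestion is not shown in this mail).
Both function names are hypothetical.

```c
#include <stdint.h>

#define MAX_VICTIM_SEARCH          20
#define MIN_VICTIM_SEARCH_GREEDY   20
#define FULL_VICTIM_SEARCH_THRESH  4096

/* Search limit as computed by the quoted patch. */
unsigned int patch_max_search(unsigned int nr_dirty, unsigned int total_segs,
                              int greedy_non_ssr)
{
    unsigned int max_search = MAX_VICTIM_SEARCH;

    if (greedy_non_ssr) {
        if (total_segs <= FULL_VICTIM_SEARCH_THRESH) {
            max_search = nr_dirty;          /* search all dirty segments */
        } else {
            /* 64-bit multiply, as div_u64() does in the patch */
            max_search = (unsigned int)(((uint64_t)nr_dirty *
                          FULL_VICTIM_SEARCH_THRESH) / total_segs);
            if (max_search < MIN_VICTIM_SEARCH_GREEDY)
                max_search = MIN_VICTIM_SEARCH_GREEDY;
        }
    }
    /* no more than the total dirty segments */
    if (max_search > nr_dirty)
        max_search = nr_dirty;
    return max_search;
}

/*
 * ASSUMPTION: the simpler variant starts from nr_dirty and caps it.
 * The cap is passed in so that 4096 can be tried without redefining
 * MAX_VICTIM_SEARCH above.
 */
unsigned int capped_max_search(unsigned int nr_dirty, unsigned int cap)
{
    return nr_dirty > cap ? cap : nr_dirty;
}
```

For the case mentioned above (4000 dirty segments out of 409600), the
patch's formula gives 4000 * 4096 / 409600 = 40 segments to search, while a
plain cap of 4096 would allow all 4000 dirty segments to be searched.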

Thanks,

> 
> What do you think now?
> 
> > #define MAX_VICTIM_SEARCH 4096 /* covers 8GB */
> >
> >>p->offset = sbi->last_victim[p->gc_mode];
> >> @@ -243,6 +245,8 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
> >>struct victim_sel_policy p;
> >>unsigned int secno, max_cost;
> >>int nsearched = 0;
> >> +  unsigned int max_search = MAX_VICTIM_SEARCH;
> >> +  unsigned int nr_dirty;
> >>
> >>p.alloc_mode = alloc_mode;
> >>select_policy(sbi, gc_type, type, &p);
> >> @@ -258,6 +262,27 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
> >>goto got_it;
> >>}
> >>
> >> +  nr_dirty = dirty_i->nr_dirty[p.dirty_type];
> >> +  if (p.gc_mode == GC_GREEDY && p.alloc_mode != SSR) {
> >> +  if (TOTAL_SEGS(sbi) <= FULL_VICTIM_SEARCH_THRESH)
> >> +  max_search = nr_dirty; /* search all the dirty segs */
> >> +  else {
> >> +  /*
> >> +   * With more dirty segments, garbage blocks are likely
> >> +   * more scattered, thus search harder for better
> >> +   * victim.
> >> +   */
> >> +  max_search = div_u64 ((nr_dirty *
> >> +  FULL_VICTIM_SEARCH_THRESH), TOTAL_SEGS(sbi));
> >> +  if (max_search < MIN_VICTIM_SEARCH_GREEDY)
> >> +  max_search = MIN_VICTIM_SEARCH_GREEDY;
> >> +  }
> >> +  }
> >> +
> >> +  /* no more than the total dirty segments */
> >> +  if (max_search > nr_dirty)
> >> +  max_search = nr_dirty;
> >> +
> >>while (1) {
> >>unsigned long cost;
> >>unsigned int segno;
> >> @@ -290,7 +315,7 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
> >>if (cost == max_cost)
> >>continue;
> >>
> >> -  if (nsearched++ >= MAX_VICTIM_SEARCH) {
> >> +  if (nsearched++ >= max_search) {
> >
> > if (nsearched++ >= p.max_search) {
> >
> >>sbi->last_victim[p.gc_mode] = segno;
> >>break;
> >>}
> >> diff --git a/fs/f2fs/gc.h b/fs/f2fs/gc.h
> >> index 2c6a6bd..2f525aa 100644
> >> --- a/fs/f2fs/gc.h
> >> +++ b/fs/f2fs/gc.h
> >> @@ -20,7 +20,9 @@
> >>   #define LIMIT_FREE_BLOCK 40 /* percentage over invalid + free space */
> >>
> >>   /* Search max. number of dirty segments to select a victim segment */
> >> -#define MAX_VICTIM_SEARCH 20
> >> +#define MAX_VICTIM_SEARCH 20
> >> +#define MIN_VICTIM_SEARCH_GREEDY  20
> >> +#define FULL_VICTIM_SEARCH_THRESH 4096
> >>
> >>   struct f2fs_gc_kthread {
> >>struct task_struct *f2fs_gc_task;
> >> diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
> >> index 062424a..cd33f96 100644
> >> --- a/fs/f2fs/segment.h
> >> +++ b/fs/f2fs/segment.h
> >> @@ -142,6 +142,7 @@ struct victim_sel_policy {
> >>int alloc_mode; /* LFS or SSR */
> >>int gc_mode;/* GC_CB or GC_GREEDY */
> >>unsigned long *dirty_segmap;/* dirty segment bitmap */
> >> +  int dirty_type;
> >
> > int max_search; /* maximum # of segments to search */
> >
> >>unsigned int offset;/* last scanned bitmap offset */
> >>unsigned int ofs_unit;  /* bitmap search unit */
> >>unsigned int min_cost;  /* minimum cost */
> >
> 

-- 
Jaegeuk Kim
Samsung


Re: Linux 3.11

2013-09-02 Thread Nicholas A. Bellinger
On Mon, 2013-09-02 at 15:50 -0700, Linus Torvalds wrote:
> On Mon, Sep 2, 2013 at 3:30 PM, Nicholas A. Bellinger
>  wrote:
> >
> > Unfortunately, this doesn't include the remaining target fixes for
> > v3.11:
> >
> > Re: [GIT PULL -v2] target fixes for v3.11
> > http://marc.info/?l=linux-kernel&m=137799048226191&w=2
> >
> > Is there a reason why these did not get PULLed..?
> 
> Very simple: I have no such email in my mailbox. I see the "target
> updates for v3.11-rc1" email (and I pulled that), and there is nothing
> since.
> 
> I don't even have that mail in my lkml archives, much less as a private email.
> 
> I see neither your "-v2 PULL request" nor the "One more late v3.11
> specific regression" one. In fact, I see no emails from you at all
> from Aug 31.
> 
> It may be that gmail hates you for some reason...
> 
> [ time passes ]
> 
> Yup. It's in my spam-box, with gmail helpfully telling me:
> 
>Why is this message in Spam? We've found that lots of messages from
> linux-iscsi.org are spam.  Learn more
> 
> so something is rotten in the state of linux-iscsi.org.
> 
> Recent messages from you were similarly tagged:
> 
>  "[GIT PULL -v2] target fixes for 3.11"
>  "Re: LIO FC Target"
>  "Re: [GIT PULL] target fixes for v3.11-rc7"
> 
> there might have been more. You might want to try to figure out why
> gmail thinks that linux-iscsi.org is spammy.

Mmmm, that is what I was afraid of, looking into that now..

So if you would, please go ahead and pull target-pending/master ASAP for
the fixes, and I'll send the parts that need to go to Greg-KH separately.

Thanks,

--nab



Re: [Ksummit-2013-discuss] [PATCH] checkpatch: Add comment about updating Documentation/CodingStyle

2013-09-02 Thread Fengguang Wu
On Mon, Sep 02, 2013 at 02:11:36PM -0700, Joe Perches wrote:
> On Mon, 2013-09-02 at 21:50 +0100, David Howells wrote:
> > Josh Triplett  wrote:
> > 
> > > > There are many checkpatch rules (like semicolons) that
> > > > are not in CodingStyle.
> > > 
> > > It's a rule of thumb, not a mandate.  In *general*, checkpatch.pl should
> > > not be enforcing style rules that aren't documented in CodingStyle.
> > 
> > Except that it becomes a mandate when someone runs it automatically against
> > > every one of your patches and then sends you an email for each patch it finds
> > > a checkpatch niggle against...
> 
> I think that any robot sending such checkpatch-only
> emails should be disabled.
> 
> I know of 2 email robots.
> 
> Fengguang Wu's very useful build robot
> sends out emails on build failures.
> I think that's great.

Thanks! Yes, I've been running checkpatch these days because some people
suggested to me that some of the checkpatch warnings do help catch
real bugs.

However I do try to avoid upsetting people with maybe-subjective
warnings. A checkpatch report will only be sent when one of a small
subset of error types is detected. Comments are very welcome on how to
improve this list:

MEMSET
IN_ATOMIC
UAPI_INCLUDE
MALFORMED_INCLUDE   
SIZEOF_ADDRESS  
KREALLOC_ARG_REUSE  
EXECUTE_PERMISSIONS 
ERROR:BAD_SIGN_OFF  
LO_MACRO
HI_MACRO
CSYNC
SSYNC
HOTPLUG_SECTION
INDENTED_LABEL
INLINE_LOCATION
STORAGE_CLASS
USLEEP_RANGE
UNNECESSARY_CASTS
ALLOC_SIZEOF_STRUCT
KREALLOC_ARG_REUSE
USE_FUNC
LOCKDEP
EXPORTED_WORLD_WRITABLE
WHITESPACE_AFTER_LINE_CONTINUATION
MISSING_VMLINUX_SYMBOL
NEEDLESS_IF
PRINTF_L

Once the decision is made to send a checkpatch error/warning, the
report email will use the triggering error (the one that matters) as
the email subject, with the complete output of checkpatch.pl included
in the email body.

Thanks,
Fengguang
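
The filtering described above can be sketched in C as follows. This is only an illustration under stated assumptions, not the robot's actual implementation: the function names are hypothetical, the allowlist holds just a few of the type names from the list, and picking the first allowlisted type as the "triggering" one is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical allowlist: a few of the type names from the list above. */
static const char *const report_types[] = {
	"MEMSET", "IN_ATOMIC", "UAPI_INCLUDE", "SIZEOF_ADDRESS",
	"KREALLOC_ARG_REUSE", "USLEEP_RANGE",
};

/* Return 1 if a checkpatch type name is on the allowlist, else 0. */
int should_report(const char *type)
{
	size_t i;

	for (i = 0; i < sizeof(report_types) / sizeof(report_types[0]); i++)
		if (strcmp(type, report_types[i]) == 0)
			return 1;
	return 0;
}

/*
 * Return the first allowlisted type among those checkpatch emitted
 * (assumed here to be the one used as the email subject), or NULL if
 * none matched and no report should be sent.
 */
const char *triggering_type(const char *const *types, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (should_report(types[i]))
			return types[i];
	return NULL;
}
```

With this shape, a patch that only produces types outside the allowlist yields NULL and generates no mail at all.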


Re: [Ksummit-2013-discuss] [PATCH] checkpatch: Add comment about updating Documentation/CodingStyle

2013-09-02 Thread Josh Triplett
On Tue, Sep 03, 2013 at 08:26:21AM +0800, Shilong Wang wrote:
> 2013/9/3 Joe Perches :
> > Wang Shilong 
> > sent me an automated checkpatch email I
> > thought was not useful.
> 
> I am sorry if I caused you any trouble; I have disabled it (in fact, it
> had only run for a day!)

I would suggest that you leave it running, but rather than sending mails
directly, have it prep the mails for you to send after manual review.
Do some careful scrutiny for false positives and cases where the change
would not improve the code, and use checkpatch's options to turn off
the more contentious warnings (like the 80-column warning).  Over time,
you'll develop a set of options that produce warnings people mostly
*want* to get notified about.

- Josh Triplett


Re: [PATCH] checkpatch: Report missing spaces around trigraphs with --strict

2013-09-02 Thread Josh Triplett
On Mon, Sep 02, 2013 at 04:54:25PM -0700, Joe Perches wrote:
> > would you mind looking at why
> > it gives a false positive for spaces around '*' on my recent patch at
> > http://mid.gmane.org/20130901234251.GB25057@leaf ?  It appears to
> > mistake the '*' of a pointer for a multiply.
> 
> Looks like checkpatch thinks this should be a multiplication.
> 
> Try this:
> ---
>  scripts/checkpatch.pl | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 9bb056c..e421b5e 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -3005,7 +3005,7 @@ sub process {
>$op eq '*' or $op eq '/' or
>$op eq '%')
>   {
> - if ($ctx =~ /Wx[^WCE]|[^WCE]xW/) {
> + if ($ctx =~ /Wx[^WCEB]|[^WCE]xW/) {
>   if (ERROR("SPACING",
> "need consistent spacing around '$op' $at\n" . $hereptr)) {
>   $good = rtrim($fix_elements[$n]) . " " . trim($fix_elements[$n + 1]) . " ";
> 
> 

That patch does indeed fix the problem, thanks!

Tested-by: Josh Triplett 
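
For readers unfamiliar with the ambiguity discussed above, here is a small C illustration of the two roles '*' can play. It is a generic example written for this note, not the code from the patch that triggered the false positive; the function names are hypothetical.

```c
/* Binary '*': a multiplication, where spacing on both sides is expected. */
int area(int width, int height)
{
	return width * height;
}

/*
 * Declarator '*' and unary '*': both attach to the name that follows
 * and are not multiplications, so the binary-operator spacing rule
 * should not apply to them.
 */
int first_element(const int *values)
{
	return *values;
}
```

A checker that works on text rather than a full parse has to infer from context which of the two cases it is looking at, which is where a misclassification like the one reported can arise.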


RE: linux 3.11

2013-09-02 Thread Juan Barry Manuel Canham
I noticed that linux-iscsi.org isn't doing much to protect itself from being
used as a spam source. If you set up the following, you should be less likely
to be marked as spam:

* SPF record (set up both an SPF-type record and a TXT SPF record for compatibility)
* DMARC record to enforce SPF and allow servers to contact you when
  linux-iscsi.org is used as a spam source
* DKIM - more work and probably not needed, but I suspect having valid DKIM
  signatures will help with some mail servers' spam rankings

Apologies, as this isn't really Linux kernel related, but it might help
other developers avoid being marked as spam by Gmail too.


Re: [Ksummit-2013-discuss] [PATCH] checkpatch: Add comment about updating Documentation/CodingStyle

2013-09-02 Thread Shilong Wang
2013/9/3 Joe Perches :
> On Mon, 2013-09-02 at 21:50 +0100, David Howells wrote:
>> Josh Triplett  wrote:
>>
>> > > There are many checkpatch rules (like semicolons) that
>> > > are not in CodingStyle.
>> >
>> > It's a rule of thumb, not a mandate.  In *general*, checkpatch.pl should
>> > not be enforcing style rules that aren't documented in CodingStyle.
>>
>> Except that it becomes a mandate when someone runs it automatically against
>> every one of your patches and then sends you an email for each patch it finds
>> a checkpatch niggle against...

I agree with this.
But using checkpatch.pl, I found there are *so many* patches that have
warnings or errors.

As far as I know, patches with checkpatch.pl errors should be avoided,
at least unless there is a *bug* in checkpatch.pl!

>
> I think that any robot sending such checkpatch-only
> emails should be disabled.
>
> I know of 2 email robots.
>
> Fengguang Wu's very useful build robot
> sends out emails on build failures.
> I think that's great.
>
> Wang Shilong 
> sent me an automated checkpatch email I
> thought was not useful.

I am sorry if I caused you any trouble; I have disabled it (in fact, it
had only run for a day!)

Thanks,
wang
>
> Does anyone know of other checkpatch robots?
>
>


Re: [PATCH v2 2/4] mm/hwpoison: fix miss catch transparent huge page

2013-09-02 Thread Naoya Horiguchi
On Tue, Sep 03, 2013 at 07:36:44AM +0800, Wanpeng Li wrote:
> Changelog:
>  *v1 -> v2: reverse PageTransHuge(page) && !PageHuge(page) check 
> 
> PageTransHuge() can't guarantee that the page is a transparent huge page, since
> it returns true for both transparent huge pages and hugetlbfs pages. This patch
> fixes that by also checking that the page is not a hugetlbfs page.
> 
> Before patch:
> 
> [  121.571128] Injecting memory failure at pfn 23a200
> [  121.571141] MCE 0x23a200: huge page recovery: Delayed
> [  140.355100] MCE: Memory failure is now running on 0x23a200
> 
> After patch:
> 
> [   94.290793] Injecting memory failure at pfn 23a000
> [   94.290800] MCE 0x23a000: huge page recovery: Delayed
> [  105.722303] MCE: Software-unpoisoned page 0x23a000
> 
> Signed-off-by: Wanpeng Li 

Thanks!

Reviewed-by: Naoya Horiguchi 

> ---
>  mm/memory-failure.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index e28ee77..b114570 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1349,7 +1349,7 @@ int unpoison_memory(unsigned long pfn)
>* worked by memory_failure() and the page lock is not held yet.
>* In such case, we yield to memory_failure() and make unpoison fail.
>*/
> - if (PageTransHuge(page)) {
> + if (!PageHuge(page) && PageTransHuge(page)) {
>   pr_info("MCE: Memory failure is now running on %#lx\n", pfn);
>   return 0;
>   }
> -- 
> 1.8.1.2
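
The logic of the changed condition can be modelled in user space as below. This is a minimal sketch, not kernel code: struct mock_page and the mock_* helpers are hypothetical stand-ins that only reproduce the behaviour described in the changelog (PageTransHuge() returning true for both kinds of huge page).

```c
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct page; not the kernel's layout. */
struct mock_page {
	bool compound_head;	/* head page of any huge (compound) page */
	bool hugetlbfs;		/* page belongs to hugetlbfs */
};

/* Models the behaviour described above: true for THP and hugetlbfs alike. */
bool mock_page_trans_huge(const struct mock_page *p)
{
	return p->compound_head;
}

/* Models PageHuge(): true only for hugetlbfs pages. */
bool mock_page_huge(const struct mock_page *p)
{
	return p->compound_head && p->hugetlbfs;
}

/* The combined check the patch switches to: transparent huge pages only. */
bool mock_is_thp(const struct mock_page *p)
{
	return !mock_page_huge(p) && mock_page_trans_huge(p);
}
```

In this model a hugetlbfs page passes mock_page_trans_huge() but fails mock_is_thp(), which corresponds to unpoison_memory() no longer bailing out early for hugetlbfs pages.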
> 


Re: [PATCH] [BUGFIX] crash/ioapic: Prevent crash_kexec() from deadlocking of ioapic_lock

2013-09-02 Thread Eric W. Biederman
Yoshihiro YUNOMAE  writes:

> Hi Eric and Don,
>
> Sorry for the late reply.
>
> (2013/08/31 9:58), Eric W. Biederman wrote:
>> Don Zickus  writes:
>> 
>>> On Tue, Aug 27, 2013 at 12:41:51PM +0900, Yoshihiro YUNOMAE wrote:
>>>> Hi Don,
>>>>
>>>> Sorry for the late reply.
>>>>
>>>> (2013/08/22 22:11), Don Zickus wrote:
>>>>> On Thu, Aug 22, 2013 at 05:38:07PM +0900, Yoshihiro YUNOMAE wrote:
>>>>>>> So, I agree with Eric, let's remove the disable_IO_APIC() stuff and keep
>>>>>>> the code simpler.
>>>>>>
>>>>>> Thank you for commenting about my patch.
>>>>>> I didn't know you already have submitted the patches for this deadlock
>>>>>> problem.
>>>>>>
>>>>>> I can't answer definitively right now that no problems are induced by
>>>>>> removing disable_IO_APIC(). However, my patch should work well (and
>>>>>> has already been merged to -tip tree). So how about taking my patch at
>>>>>> first, and then discussing the removal of disable_IO_APIC()?
>>>>>
>>>>> It doesn't matter to me.  My original patch last year was similar to yours
>>>>> until it was suggested that we were working around a problem which was we
>>>>> shouldn't touch the IO_APIC code on panic.  Then I wrote the removal of
>>>>> disable_IO_APIC patch and did lots of testing on it.  I don't think I have
>>>>> seen any issues with it (just the removal of disabling the lapic stuff).
>>>>
>>>> Yes, you really did a lot of testing about this problem according to
>>>> your patch(https://lkml.org/lkml/2012/1/31/391). Although you
>>>> said jiffies calibration code does not need the PIT in
>>>> http://lists.infradead.org/pipermail/kexec/2012-February/006017.html,
>>>> I don't understand yet why we can remove disable_IO_APIC.
>>>> Would you please explain about the calibration codes?
>>>
>>> I forgot a lot of this, Eric B. might remember more (as he was the one that
>>> pointed this out initially).  I believe initially the io_apic had to be in
>>> a pre-configured state in order to do some early calibration of the timing
>>> code.  Later on, it was my understanding, that the calibration of various
>>> time keeping stuff did not need the io_apic in a correct state.  The code
>>> might have switched to tsc instead of PIT, I forget.
>> 
>> Yes.  Alan Cox's initial SMP port had a few cases where it still
>> expected the system to be in PIT mode during boot and it took us a
>> decade or so before those assumptions were finally expunged.
>
> Could you please tell me the commit ID, or a hint such as the files,
> functions, or time frame?

The short version is that the last time we tilted at this windmill, the only
problem we could find was NMIs caused by the NMI watchdog.

So as a bug work-around all we need to retain is disabling the NMI
watchdog in crash_kexec.

>>> Then again looking at the output of the latest dmesg, it seems the IO APIC
>>> is initialized way before the tsc is calibrated.  So I am not sure what
>>> needed to get done or what interrupts are needed before the IO APIC gets
>>> initialized.
>> 
>> The practical issue is that jiffies was calibrated off of the PIT timer
>> if I recall.  But that is all old news.
>
> Is the jiffies calibration code calibrate_delay()?
> It seems that jiffies calibration has not used the PIT as of 2005,
> according to 8a9e1b0.

Exactly.  That was the original reason why we put in the code to
disable the IOAPIC and the local apic.  There might have been other
reasons but that was the primary.

>>>> By the way, can we remove disable_IO_APIC even if an old dump capture
>>>> kernel is used?
>>>
>>> Good question.  I did a bunch of testing with RHEL-6 too, which is 2.6.32
>>> based.  But I think we added some IRR fixes (commit 1e75b31d638), which
>>> may or may not have helped in this case.  So I don't know when a kernel
>>> started working correctly during init (with the right changes).  I believe
>>> 2.6.32 had everything.
>> 
>> A sufficiently old and buggy dump capture kernel will fail because of bugs
>> in its startup path, but I don't think anyone cares.
>
> OK, if the jiffies calibration problem was fixed long ago,
> we don't need to care about old kernels.

Exactly.  There may have been one or two other silly assumptions and to
the best of our knowledge all of those have been purged except the
assumption that an NMI watchdog won't happen between kernels and while
booting the kernel.

>> The kernel startup path has been fixed for years, and disable_IO_APIC in
>> crash_kexec has always been a bug work-around for deficiencies in the
>> kernel's start up path (not part of the guaranteed interface).
>> Furthermore every real system configuration I have encountered used the
>> same kernel version for the crashdump kernel and the production kernel.
>> So we should be good.
>
> We will also use the kdump (crashdump) kernel as the production
> kernel. Should I care only about the current kernel?

For this particular issue yes.

In general it is important for there to be a stable interface between
the two kernels just so you are 

[PATCH v4 0/1 resend] ARM: shmobile: r8a7790: add I2C support

2013-09-02 Thread Nguyen Viet Dung
Hi Wolfram
CC Morimoto

Please consider the following patch for the r8a7790 SoC.
This patch modifies the rcar-H1 I2C driver so that it is usable on both rcar-H1 and rcar-H2.
It was developed based on the renesas-devel-20130722 branch and
has been tested on the Lager board.

Thanks,
Nguyen viet Dung

Nguyen Viet Dung (1):
  i2c: rcar: modify I2C driver

 drivers/i2c/busses/i2c-rcar.c |   35 +--
 1 file changed, 33 insertions(+), 2 deletions(-)

-- 
1.7.9.5



[PATCH v4 1/1 resend] i2c: rcar: modify I2C driver

2013-09-02 Thread Nguyen Viet Dung
This patch modifies the rcar-H1 I2C driver so that it is usable on both rcar-H1 and rcar-H2.

Signed-off-by: Nguyen Viet Dung 
---
 drivers/i2c/busses/i2c-rcar.c |   35 +--
 1 file changed, 33 insertions(+), 2 deletions(-)

diff --git a/drivers/i2c/busses/i2c-rcar.c b/drivers/i2c/busses/i2c-rcar.c
index 0fc5858..32ec693 100644
--- a/drivers/i2c/busses/i2c-rcar.c
+++ b/drivers/i2c/busses/i2c-rcar.c
@@ -101,6 +101,11 @@ enum {
 #define ID_ARBLOST (1 << 3)
 #define ID_NACK(1 << 4)
 
+enum rcar_i2c_type {
+   I2C_RCAR_H1,
+   I2C_RCAR_H2,
+};
+
 struct rcar_i2c_priv {
void __iomem *io;
struct i2c_adapter adap;
@@ -113,6 +118,7 @@ struct rcar_i2c_priv {
int irq;
u32 icccr;
u32 flags;
+   enum rcar_i2c_type  devtype;
 };
 
 #define rcar_i2c_priv_to_dev(p)((p)->adap.dev.parent)
@@ -224,12 +230,25 @@ static int rcar_i2c_clock_calculate(struct rcar_i2c_priv *priv,
u32 scgd, cdf;
u32 round, ick;
u32 scl;
+   u32 cdf_width;
 
if (!clkp) {
dev_err(dev, "there is no peripheral_clk\n");
return -EIO;
}
 
+   switch (priv->devtype) {
+   case I2C_RCAR_H1:
+   cdf_width = 2;
+   break;
+   case I2C_RCAR_H2:
+   cdf_width = 3;
+   break;
+   default:
+   dev_err(dev, "device type error\n");
+   return -EIO;
+   }
+
/*
 * calculate SCL clock
 * see
@@ -245,7 +264,7 @@ static int rcar_i2c_clock_calculate(struct rcar_i2c_priv *priv,
 * clkp : peripheral_clk
 * F[]  : integer up-valuation
 */
-   for (cdf = 0; cdf < 4; cdf++) {
+   for (cdf = 0; cdf < (1 << cdf_width); cdf++) {
ick = clk_get_rate(clkp) / (1 + cdf);
if (ick < 2000)
goto ick_find;
@@ -287,7 +306,7 @@ scgd_find:
/*
 * keep icccr value
 */
-   priv->icccr = (scgd << 2 | cdf);
+   priv->icccr = (scgd << (cdf_width) | cdf);
 
return 0;
 }
@@ -632,6 +651,9 @@ static int rcar_i2c_probe(struct platform_device *pdev)
bus_speed = 100000; /* default 100 kHz */
if (pdata && pdata->bus_speed)
bus_speed = pdata->bus_speed;
+
+   priv->devtype = platform_get_device_id(pdev)->driver_data;
+
ret = rcar_i2c_clock_calculate(priv, bus_speed, dev);
if (ret < 0)
return ret;
@@ -686,6 +708,14 @@ static int rcar_i2c_remove(struct platform_device *pdev)
return 0;
 }
 
+static struct platform_device_id rcar_i2c_id_table[] = {
+   { "i2c-rcar",   I2C_RCAR_H1 },
+   { "i2c-rcar_h1",I2C_RCAR_H1 },
+   { "i2c-rcar_h2",I2C_RCAR_H2 },
+   {},
+};
+MODULE_DEVICE_TABLE(platform, rcar_i2c_id_table);
+
 static struct platform_driver rcar_i2c_driver = {
.driver = {
.name   = "i2c-rcar",
@@ -693,6 +723,7 @@ static struct platform_driver rcar_i2c_driver = {
},
.probe  = rcar_i2c_probe,
.remove = rcar_i2c_remove,
+   .id_table   = rcar_i2c_id_table,
 };
 
 module_platform_driver(rcar_i2c_driver);
-- 
1.7.9.5
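
The effect of the per-device CDF width in the patch above can be shown with a small stand-alone sketch. The enum mirrors the one the patch adds; the helper functions (rcar_cdf_width, rcar_cdf_count, rcar_pack_icccr) are hypothetical names introduced only for this illustration and do not exist in the driver.

```c
#include <stdint.h>

/* Mirrors the enum added by the patch. */
enum rcar_i2c_type { I2C_RCAR_H1, I2C_RCAR_H2 };

/* CDF field width in bits, per the patch: 2 on H1, 3 on H2; -1 if unknown. */
int rcar_cdf_width(enum rcar_i2c_type type)
{
	switch (type) {
	case I2C_RCAR_H1:
		return 2;
	case I2C_RCAR_H2:
		return 3;
	default:
		return -1;
	}
}

/*
 * Number of CDF values the search loop tries: 1 << cdf_width.
 * Callers must pass a known device type.
 */
uint32_t rcar_cdf_count(enum rcar_i2c_type type)
{
	return UINT32_C(1) << rcar_cdf_width(type);
}

/*
 * Pack SCGD above the CDF field, as in
 * priv->icccr = (scgd << (cdf_width) | cdf).
 * Callers must pass a known device type.
 */
uint32_t rcar_pack_icccr(enum rcar_i2c_type type, uint32_t scgd, uint32_t cdf)
{
	return (scgd << rcar_cdf_width(type)) | cdf;
}
```

For the same scgd and cdf, the packed value differs between the two device types because SCGD starts one bit higher on H2, and the loop bound grows from 4 to 8 candidate divisors.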



[GIT PATCH] TTY/Serial patches for 3.12-rc1

2013-09-02 Thread Greg KH
The following changes since commit c095ba7224d8edc71dcef0d655911399a8bd4a3f:

  Linux 3.11-rc4 (2013-08-04 13:46:46 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git/ tags/tty-3.12-rc1

for you to fetch changes up to 2d1d3f3ae985ec5676fb56ff2c7acad2e1c4e6eb:

  hvc_xen: Remove unnecessary __GFP_ZERO from kzalloc (2013-08-30 14:11:28 -0700)


TTY/Serial driver patches for 3.12-rc1

Here's the big tty/serial driver pull request for 3.12-rc1.

Lots of n_tty reworks to resolve some very long-standing issues, removing the
3-4 different locks that were taken for every character.  This code has been
beaten on for a long time in linux-next with no reported regressions.

Other than that, a range of serial and tty driver updates and revisions.  Full
details in the shortlog.

Signed-off-by: Greg Kroah-Hartman 


Aldo Iljazi (1):
  Drivers: tty: n_gsm.c: fixed 7 errors & 6 warnings that checkpatch complained

Alexander Shiyan (8):
  serial: max310x: Driver rework
  serial: max310x: Add MAX3109 support
  serial: max310x: Add MAX14830 support
  serial: max310x: Fix dev_pm_ops
  serial: sccnxp: Disable regulator on error
  serial: sccnxp: Using CLK API for getting UART clock
  serial: sccnxp: Using structure for each supported IC instead of switch in probe
  serial: sccnxp: Add DT support

Alexandru Juncu (1):
  TTY: synclink: replace bitmasks add operation with OR operation.

Andreas Bießmann (1):
  register_console: prevent adding the same console twice

Andreas Platschek (1):
  tty: Remove dead code

Axel Lin (2):
  serial: fsl_lpuart: Return proper error on lpuart_serial_init error path
  serial: bfin_uart: Remove redundant testing for ifdef CONFIG_SERIAL_BFIN_MODULE

Barry Song (2):
  serial: sirf: add support for Marco chip
  serial: sirf: drop redundant pinctrl_get_select_default as pinctrl core does it

Christophe Leroy (1):
  tty: serial: cpm_uart: Adding proper request of GPIO used by cpm_uart driver

Clemens Ladisch (1):
  vt: make the default color configurable

Dan Carpenter (1):
  serial: icom: move array overflow checks earlier

Daniel Mack (1):
  tty: serial: pxa: remove old cruft

Darren Hart (3):
  pch_uart: Use DMI interface for board detection
  serial: pch_uart: Remove __initdata annotation from dmi_table
  serial: pch_uart: Fix signed-ness and casting of uartclk related fields

Dmitry Fink (1):
  OMAP: UART: Keep the TX fifo full when possible

Elen Song (8):
  serial: at91: correct definition from DMA to PDC
  serial: at91: use function pointer to choose pdc or pio
  serial: at91: add tx dma support
  serial: at91: add rx dma support
  serial: at91: support run time switch transfer mode
  serial: at91: distinguish usart and uart
  serial: at91: make UART support dma and pdc transfers
  serial: at91: add dma support in usart binding descriptions

Fabio Estevam (1):
  serial: amba-pl011: Use __releases/__acquires annotations

Gabor Juhos (6):
  tty: ar933x_uart: convert to use devm_* functions
  tty: ar933x_uart: use the clk API to get the uart clock
  tty: ar933x_uart: remove superfluous assignment of ar933x_uart_driver.nr
  tty: ar933x_uart: use config_enabled() macro to clean up ifdefs
  tty: ar933x_uart: allow to build the driver as a module
  tty: ar933x_uart: add device tree support and binding documentation

Govindraj.R (1):
  OMAP2+: UART: enable tx wakeup bit for wer reg

Greg Kroah-Hartman (5):
  Merge 3.11-rc3 into tty-next
  Revert "serial: sccnxp: Add DT support"
  Merge 3.11-rc4 into tty-next
  Revert "serial: omap: Fix IRQ handling return value"
  Revert "OMAP: UART: Keep the TX fifo full when possible"

Grygorii Strashko (1):
  serial: omap: enable PM runtime only when its fully configured

Hendrik Brueckner (2):
  tty/hvc_console: Add DTR/RTS callback to handle HUPCL control
  tty/hvc_iucv: Disconnect IUCV connection when lowering DTR

Huang Shijie (7):
  serial: imx: remove the uart_console() check
  serial: imx: distinguish the imx6q uart from the others
  serial: imx: add DMA support for imx6q
  serial: mxs: enable the DMA only when the RTS/CTS is valid
  serial: mxs: remove the MXS_AUART_DMA_CONFIG
  ARM: dts: imx28-evk: add the RTS/CTS property for auart0
  serial: imx: initialize the local variable

Hubert Feurstein (1):
  serial/imx: fix custom-baudrate handling

Ian Abbott (4):
  pci_ids.h: move PCI_VENDOR_ID_AMCC here
  serial: 8250_pci: replace PCI_VENDOR_ID_ADDIDATA_OLD
  serial: 8250_pci: use local device ID for ADDI-DATA APCI-7800
  pci_ids.h: remove PCI_VENDOR_ID_ADDIDATA_OLD and PCI_DEVICE_ID_ADDIDATA_APCI7800

Jingoo Han (15):
  

[GIT PATCH] Driver core patches for 3.12-rc1

2013-09-02 Thread Greg KH
The following changes since commit 5ae90d8e467e625e447000cb4335c4db973b1095:

  Linux 3.11-rc3 (2013-07-28 20:53:33 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git/ tags/driver-core-3.12-rc1

for you to fetch changes up to 1f153c02f5856ec109fa532eb5f31c39f85c:

  firmware loader: fix pending_fw_head list corruption (2013-08-30 12:04:27 -0700)


Driver core patches for 3.12-rc1

Here's the big driver core pull request for 3.12-rc1.

Lots of tiny changes here fixing up the way sysfs attributes are
created, to try to make drivers simpler, and fix a whole class of race
conditions with creation of device attributes after the device was
announced to userspace.

All the various pieces are acked by the different subsystem maintainers.

Signed-off-by: Greg Kroah-Hartman 


David Graham White (1):
  drivers:base:core: Moved sym export macros to respective functions

Geert Uytterhoeven (1):
  Kconfig: Remove hotplug enable hints in CONFIG_KEXEC help texts

Greg KH (2):
  ACPI: bgrt: take advantage of binary sysfs groups
  firmware: dcdbas: use binary attribute groups

Greg Kroah-Hartman (92):
  misc: c2port: use dev_bin_attrs instead of hand-coding it
  bsr: convert bsr_class to use dev_groups
  tile: srom: convert srom_class to use dev_groups
  c2port: convert class code to use dev_groups
  enclosure: convert class code to use dev_groups
  UIO: convert class code to use dev_groups
  staging: comedi: convert class code to use dev_groups
  c2port: convert class code to use bin_attrs in groups
  char: tile-srom: fix build error
  dma: convert dma_devclass to use dev_groups
  extcon: convert extcon_class to use dev_groups
  SCSI: OSD: convert class code to use dev_groups
  video: backlight: convert class code to use dev_groups
  video: backlight: lcd: convert class code to use dev_groups
  net: wireless: convert class code to use dev_groups
  net: rfkill: convert class code to use dev_groups
  ISDN: convert class code to use dev_groups
  leds: convert class code to use dev_groups
  PTP: convert class code to use dev_groups
  cuse: convert class code to use dev_groups
  net: core: convert class code to use dev_groups
  net: ieee802154: convert class code to use dev_groups
  Merge 3.11-rc3 into driver-core-next
  rtc: convert class code to use dev_groups
  driver core: bus_type: add dev_groups
  driver core: bus_type: add drv_groups
  driver core: bus_type: add bus_groups
  mips: convert vpe_class to use dev_groups
  devfreq: convert devfreq_class to use dev_groups
  HID: roccat: convert class code to use dev_groups
  v4l2: convert class code to use dev_groups
  x86: wmi: convert class code to use dev_groups
  PPS: convert class code to use dev_groups
  backing-dev: convert class code to use dev_groups
  hid: roccat-arvo: convert class code to use bin_attrs in groups
  hid: roccat-isku: convert class code to use bin_attrs in groups
  hid: roccat-kone: convert class code to use bin_attrs in groups
  hid: roccat-savu: convert class code to use bin_attrs in groups
  hid: roccat-koneplus: convert class code to use bin_attrs in groups
  hid: roccat-konepure: convert class code to use bin_attrs in groups
  hid: roccat-kovaplus: convert class code to use bin_attrs in groups
  hid: roccat-kone: fix off-by-one bug in attributes
  sysfs.h: fix __BIN_ATTR_RW()
  hid: roccat-pyra: convert class code to use bin_attrs in groups
  sysfs: add sysfs_create/remove_groups()
  sysfs: group.c: move EXPORT_SYMBOL_GPL() to the proper location
  sysfs: group.c: fix trailing whitespace
  sysfs: group.c: fix up some * coding style issues
  sysfs: group.c: fix up broken string coding style
  sysfs: group.c: add kerneldoc for sysfs_remove_group
  sysfs: group: update copyright to add myself and the LF
  sysfs: fix placement of EXPORT_SYMBOL()
  sysfs: remove trailing whitespace
  sysfs: fix up space coding style issues
  sysfs: fix up 80 column coding style issues
  sysfs: fix up uaccess.h coding style warnings
  sysfs: dir.c: fix up odd do/while indentation
  sysfs: file.c: fix up broken string warnings
  sysfs: sysfs.h: fix coding style issues
  sysfs: fix up minor coding style issues in sysfs.h
  acpi: bgrt: fix build error due to attribute change
  sysfs: group.c: fix up kerneldoc
  sysfs.h: remove attr_name() macro
  w1: remove race with sysfs file creation
  w1: use default attribute groups for w1 slave devices
  w1: add attribute groups to struct w1_family_ops
  w1: slaves: w1_therm: convert to use w1_family_ops.groups
  w1: slaves: w1_ds2408: convert to use 

[GIT PATCH] USB patches for 3.12-rc1

2013-09-02 Thread Greg KH
The following changes since commit b36f4be3de1b123d8601de062e7dbfc904f305fb:

  Linux 3.11-rc6 (2013-08-18 14:36:53 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/ tags/usb-3.12-rc1

for you to fetch changes up to b9a1048137f4ae43ee90f61a3f34f0efe863cfeb:

  usbcore: fix incorrect type in assignment in descriptors_changed() (2013-08-30 18:50:43 -0700)


USB patches for 3.12-rc1

Here's the big USB driver pull request for 3.12-rc1

Lots of USB driver fixes and updates.  Nothing major, just the normal
xhci, gadget, and other driver changes.  Full details in the shortlog.

Signed-off-by: Greg Kroah-Hartman 


Al Cooper (1):
  usb: Add Device Tree support to XHCI Platform driver

Alan Stern (9):
  USB: remove redundant "#if"
  USB: simplify the interface of usb_get_status()
  USB: refactor code for enabling/disabling remote wakeup
  USB: handle LPM errors during device suspend correctly
  USB: EHCI: keep better track of resuming ports
  USB: EHCI: don't depend on hardware for tracking port resets and resumes
  USB: handle LPM errors during device suspend correctly
  USB: OHCI: Allow runtime PM without system sleep
  USB: fix build error when CONFIG_PM_SLEEP isn't enabled

Alexey Khoroshilov (1):
  usb: gadget: amd5536udc: unconditionally use GFP_ATOMIC in udc_queue()

Anatolij Gustschin (1):
  usb: phy: fix build breakage

Andrzej Pietrasiewicz (1):
  usb: gadget: configfs: keep a function if it is not successfully added

Andy Shevchenko (4):
  usbtmc: remove trailing spaces
  usbtmc: call pr_err instead of plain printk
  usbtmc: remove redundant braces
  usbtmc: convert to devm_kzalloc

Boris BREZILLON (4):
  usb: gadget: atmel_usba: prepare clk before calling enable
  USB: ohci-at91: add usb_clk for transition to common clk framework
  usb: gadget: at91_udc: add missing clk_put on fclk and iclk
  usb: gadget: at91_udc: add usb_clk for transition to common clk framework

Chen Wang (1):
  USB: usb-skeleton.c: add retry for nonblocking read

Dan Carpenter (6):
  USB: mos7720: use GFP_ATOMIC under spinlock
  usb: gadget: gadgetfs: use after free in dev_release()
  usb: gadget: gadgetfs: potential use after free in unbind()
  usb: phy: signedness bugs in suspend/resume functions
  usb: gadget: double unlocks on error in atmel_usba_start()
  dma: cppi41: off by one in desc_to_chan()

Daniel Mack (1):
  usb: ehci-mxc: check for pdata before dereferencing

David Daney (1):
  usb: Move definition of USB_EHCI_BIG_ENDIAN_MMIO et al. out side of the ifs.

Dmitry Kasatkin (2):
  xhci:prevent "callbacks suppressed" when debug is not enabled
  dev-core: fix build break when DEBUG is enabled

Fabio Estevam (4):
  usb: phy: phy-mxs-usb: Check the return value from stmp_reset_block()
  usb: chipidea: ci_hdrc_imx: remove unused variable 'res'
  usb: chipidea: move hw_phymode_configure() into probe
  usb: chipidea: remove previous MODULE_ALIAS

Felipe Balbi (32):
  usb: class: cdc-acm: be careful with bInterval
  usb: atm: speedtch: be careful with bInterval
  usb: clamp bInterval to allowed range
  usb: dwc3: make glue layers selectable
  usb: gadget: remove imx_udc
  usb: dwc3: gadget: don't request IRQs in atomic
  usb: dwc3: switch to GPL v2 only
  usb: phy: protect against NULL phy pointers
  usb: common: introduce of_usb_get_maximum_speed()
  usb: dwc3: let non-DT platforms pass tx-fifo-resize flag;
  usb: dwc3: make maximum-speed a per-instance attribute
  usb: dwc3: core: switch to snps,dwc3
  usb: dwc3: gadget: drop dwc3 manual phy control
  usb: dwc3: omap: switch over to devm_ioremap_resource()
  usb: dwc3: core: switch over to devm_ioremap_resource()
  usb: dwc3: gadget: move debugging print around
  usb: dwc3: gadget: move direction setting up
  usb: dwc3: gadget: add a debugging print when initializing endpoints
  usb: dwc3: core: don't redefine DWC3_DCFG_LPM_CAP
  usb: dwc3: gadget: don't enable LPM early
  usb: dwc3: core: introduce and use macros for Event Size register
  usb: dwc3: gadget: get rid of IRQF_ONESHOT
  usb: dwc3: gadget: rename dwc3_process_event_buf
  usb: dwc3: gadget: introduce dwc3_process_event_buf
  usb: gadget: udc-core: move sysfs_notify() to a workqueue
  usb: dwc3: ep0: only change to ADDRESS if set_config() succeeds
  usb: dwc3: ep0: don't change to configured state too early
  usb: of: fix build breakage caused by recent patches
  usb: dwc3: use dev_get_platdata()
  Merge branch 'nop-phy-rename' into next
  usb: musb: dsps: make it depend on OF_IRQ
  usb: dwc3: core: cope with NULL pdata

Feng-Hsin Chiang (1):
  usb: 

[GIT PATCH] char/misc patches for 3.12-rc1

2013-09-02 Thread Greg KH
The following changes since commit b36f4be3de1b123d8601de062e7dbfc904f305fb:

  Linux 3.11-rc6 (2013-08-18 14:36:53 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git/ tags/char-misc-3.12-rc1

for you to fetch changes up to 3cc1f95283a125cf54ccf1e25065321d4385133b:

  drivers: uio: Kconfig: add MMU dependancy for UIO (2013-08-30 14:19:46 -0700)


Char/Misc patches for 3.12-rc1

Here is the big char/misc driver pull request for 3.12-rc1

Lots of driver updates all over the char/misc tree, full details in the
shortlog below.

Signed-off-by: Greg Kroah-Hartman 


Alessandro Rubini (1):
  FMC: fix locking in sample chardev driver

Alexandru Juncu (1):
  pcmcia: synclink_cs: replace sum of bitmasks with OR operation.

Andy King (2):
  VMCI: Remove non-blocking/pinned queuepair support
  VMCI: Add support for virtual IOMMU

Chen Gang (2):
  drivers: parport: Kconfig: exclude h8300 for PARPORT_PC
  drivers: uio: Kconfig: add MMU dependancy for UIO

Greg Kroah-Hartman (5):
  misc: c2port: use dev_bin_attrs instead of hand-coding it
  Revert "misc: c2port: use dev_bin_attrs instead of hand-coding it"
  Merge 3.11-rc3 into char-misc-next.
  Merge tag 'extcon-next-for-3.12' of git://git.kernel.org/.../chanwoo/extcon into char-misc-next
  Merge 3.11-rc6 into char-misc-next

Heiko Stübner (1):
  misc: sram: fix error path in sram_probe

Jan-Simon Möller (1):
  misc: vmw_balloon: Remove braces to fix build for clang.

Jingoo Han (9):
  FMC: Staticize local symbols
  vme: vme_tsi148.c: add missing __iomem annotation
  vme: vme_ca91cx42.c: add missing __iomem annotation
  vme: vme_vmivme7805.c: add missing __iomem annotation
  parport: amiga: remove unnecessary platform_set_drvdata()
  uio: uio_pruss: remove unnecessary platform_set_drvdata()
  drivers: uio_dmem_genirq: use dev_get_platdata()
  drivers: uio_pruss: use dev_get_platdata()
  drivers: uio_pdrv_genirq: use dev_get_platdata()

K. Y. Srinivasan (4):
  Drivers: hv: util: Fix a bug in version negotiation code for util services
  Drivers: hv: balloon: Initialize the transaction ID just before sending the packet
  Drivers: hv: vmbus: Fix a bug in the handling of channel offers
  Drivers: hv: vmbus: Do not attempt to negoatiate a new version prematurely

Kees Cook (4):
  lkdtm: fix stack protector trigger
  lkdtm: add "WARNING" trigger
  lkdtm: add "SPINLOCKUP" trigger
  lkdtm: add "EXEC_*" triggers

Kishon Vijay Abraham I (3):
  extcon: Add an API to get extcon device from dt node
  usb: dwc3: use extcon fwrk to receive connect/disconnect
  extcon: palmas: remove assigning "edev.name" to palmas

Laxman Dewangan (6):
  extcon: palmas: rename device tree binding matching with file name
  extcon: palmas: devicetree: remove non-require property details
  extcon: palmas: remove unused member from palams_usb structure
  extcon: palmas: enable ID_GND and ID_FLOAT detection always
  extcon: palams: add support for suspend/resume
  extcon: palmas: Option to disable ID/VBUS detection based on platform

Mark Brown (3):
  extcon: arizona: Use power efficient workqueue
  extcon: gpio: Use power efficient workqueue for debounce
  extcon: adc-jack: Use power efficient workqueue

Mark Rusk (1):
  drivers/misc/hpilo: Correct panic when an AUX iLO is detected

Michal Simek (1):
  uio: Remove uio_pdrv and use uio_pdrv_genirq instead

Olaf Hering (7):
  Drivers: hv: remove HV_DRV_VERSION
  Tools: hv: fix send/recv buffer allocation
  Tools: hv: check return value of daemon to fix compiler warning.
  Tools: hv: in kvp_set_ip_info free mac_addr right after usage
  Tools: hv: check return value of system in hv_kvp_daemon
  Tools: hv: correct payload size in netlink_send
  Tools: hv: use full nlmsghdr in netlink_send

Oleksandr Kozaruk (1):
  drivers: misc: ti-st: fix potential race if st_kim_start fails

Rostislav Lisovy (1):
  drivers: uio: Add driver for Humusoft MF624 DAQ PCI card

Tomas Hozza (3):
  tools: hv: Improve error logging in VSS daemon.
  tools: hv: Check return value of poll call
  tools: hv: Check return value of setsockopt call

Tomas Winkler (4):
  mei: wake also writers on reset
  mei: bus: do not overflow the device name buffer
  mei: don't get stuck in select during reset
  mei: me: fix hardware reset flow

Uwe Kleine-König (3):
  mm: make generic_access_phys available for modules
  uio: provide vm access to UIO_MEM_PHYS maps
  uio: drop unused vma_count member in uio_device struct

Wei Yongjun (1):
  vme: vme_ca91cx42.c: fix to pass correct device identity to free_irq()

Wolfram Sang (1):
  drivers/misc: don't use 

Re: [PATCH] checkpatch: Report missing spaces around trigraphs with --strict

2013-09-02 Thread Joe Perches
> would you mind looking at why
> it gives a false positive for spaces around '*' on my recent patch at
> http://mid.gmane.org/20130901234251.GB25057@leaf ?  It appears to
> mistake the '*' of a pointer for a multiply.

Looks like checkpatch thinks this should be a multiplication.

Try this:
---
 scripts/checkpatch.pl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 9bb056c..e421b5e 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3005,7 +3005,7 @@ sub process {
 $op eq '*' or $op eq '/' or
 $op eq '%')
{
-   if ($ctx =~ /Wx[^WCE]|[^WCE]xW/) {
+   if ($ctx =~ /Wx[^WCEB]|[^WCE]xW/) {
if (ERROR("SPACING",
  "need consistent spacing around '$op' $at\n" . $hereptr)) {
$good = rtrim($fix_elements[$n]) . " " . trim($fix_elements[$n + 1]) . " ";
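
For readers who want to see what the one-character change does, here is a minimal Python sketch that applies the old and new patterns from the hunk above to a few context strings. It assumes checkpatch encodes the text on each side of an operator as a single letter around an `x`, with `W` for whitespace, `V` for ordinary text, and `B` for a bracket-like boundary such as a closing parenthesis. Those letter meanings are my reading of the script, not something stated in this message, and `flags_spacing` is a hypothetical helper name.

```python
import re

# The two patterns from the hunk, copied verbatim.
OLD = re.compile(r"Wx[^WCE]|[^WCE]xW")
NEW = re.compile(r"Wx[^WCEB]|[^WCE]xW")


def flags_spacing(pattern, ctx):
    """Return True if the pattern would report inconsistent spacing for ctx.

    ctx is a three-character context string such as "WxB": the letter before
    the operator, a literal 'x' for the operator, and the letter after it.
    (Hypothetical helper; the letter meanings are assumptions, see above.)
    """
    return pattern.search(ctx) is not None


if __name__ == "__main__":
    # "WxB" is the assumed shape of a pointer cast like "(struct foo *)":
    # whitespace before the '*', a closing bracket right after it.
    for ctx in ["WxW", "VxV", "WxV", "VxW", "WxB"]:
        print(ctx, "old:", flags_spacing(OLD, ctx), "new:", flags_spacing(NEW, ctx))
```

Under those assumptions, only the `WxB` case changes: the old pattern matches it and the new one does not, while the lopsided `WxV` and `VxW` cases are still reported.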



