Re: [PATCH -tip v3 02/23] kprobes: Introduce NOKPROBE_SYMBOL() macro for blacklist

2013-11-27 Thread Masami Hiramatsu
(2013/11/27 22:32), Ingo Molnar wrote:
> 
> * Masami Hiramatsu  wrote:
> 
>> +#ifdef CONFIG_KPROBES
>> +/*
>> + * Blacklist generating macro. Specify functions which are not to be probed
>> + * by using this macro.
>> + */
>> +#define __NOKPROBE_SYMBOL(fname)\
>> +static struct kprobe_blackpoint __used  \
>> +_kprobe_bp_##fname = {  \
>> +.name = #fname, \
>> +.start_addr = (unsigned long)fname, \
>> +};  \
>> +static struct kprobe_blackpoint __used  \
>> +__attribute__((section("_kprobe_blacklist")))   \
>> +*_p_kprobe_bp_##fname = &_kprobe_bp_##fname;
> 
> 'kprobe_blackpoint' sounds a bit weird - how about 
> 'kprobe_blacklist_entry' ?

OK, I just tried to reuse the existing kprobe_blacklist.

> also, _kprobe_blacklist probably wants to be _kprobes_blacklist, 
> right?

I see. I'll update it. :)

Thank you,

-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com
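
For readers following the thread, the section-based blacklist idea in the quoted macro can be sketched in user space as below. This is only an illustration under stated assumptions, not the kernel code: it assumes GCC or Clang with an ELF linker (GNU ld or lld), which provides `__start_<section>` and `__stop_<section>` symbols for sections whose names are valid C identifiers. The names `NOKPROBE_SYMBOL_SKETCH`, `is_blacklisted()`, `fragile_handler()` and `ordinary_function()` are hypothetical, the struct uses the `kprobe_blacklist_entry` name suggested in the review, and it stores a function pointer where the patch stores an `unsigned long` address.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*probe_target_fn)(void);

/* One blacklist record; named after the rename suggested in the review. */
struct kprobe_blacklist_entry {
	const char *name;
	probe_target_fn start_addr;	/* the patch uses unsigned long here */
};

/*
 * Hypothetical user-space analogue of the macro: define a record for
 * fname and drop a pointer to it into the "_kprobes_blacklist" section.
 */
#define NOKPROBE_SYMBOL_SKETCH(fname)					\
static struct kprobe_blacklist_entry _kbl_##fname = {			\
	.name = #fname,							\
	.start_addr = fname,						\
};									\
static struct kprobe_blacklist_entry *_kbl_ptr_##fname			\
__attribute__((used, section("_kprobes_blacklist"))) = &_kbl_##fname

/* Section bounds synthesized by the ELF linker (assumption, see above). */
extern struct kprobe_blacklist_entry *__start__kprobes_blacklist[];
extern struct kprobe_blacklist_entry *__stop__kprobes_blacklist[];

int fragile_handler(void) { return 1; }
int ordinary_function(void) { return 2; }

NOKPROBE_SYMBOL_SKETCH(fragile_handler);

/* Walk the collected pointers and report whether fn was blacklisted. */
int is_blacklisted(probe_target_fn fn)
{
	struct kprobe_blacklist_entry **p;

	for (p = __start__kprobes_blacklist; p < __stop__kprobes_blacklist; p++)
		if ((*p)->start_addr == fn)
			return 1;
	return 0;
}
```

The point of the design is that each annotation sits next to the function it protects, and the full list is assembled by the linker rather than maintained by hand in one central table.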


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH/RFC 17/17] tracing/uprobes: Add @+file_offset fetch method

2013-11-27 Thread Namhyung Kim
Hi Oleg,

On Wed, 27 Nov 2013 19:55:46 +0100, Oleg Nesterov wrote:
> Hi Namhyung,
>
> I'll certainly try to read (and even apply ;) this series carefully.

Thanks in advance. :)

>
> But let me make a couple of nits right now, even if I do not understand
> this code yet.

Okay.

>
> On 11/27, Namhyung Kim wrote:
>>
>> +} else if (arg[1] == '+') {
>> +struct file_offset_fetch_param *foprm;
>> +
>> +/* kprobes don't support file offsets */
>> +if (is_kprobe)
>> +return -EINVAL;
>> +
>> +ret = kstrtol(arg + 2, 0, &offset);
>> +if (ret)
>> +break;
>> +
>> +foprm = kzalloc(sizeof(*foprm), GFP_KERNEL);
>> +if (!foprm)
>> +return -ENOMEM;
>> +
>> +foprm->tu = priv;
>> +foprm->offset = offset;
>
> Hmm. I am not sure, but can't we simplify this?
>
> Why do we need this foprm at all? To pass tu/offset obviously. But
> why we need to store this info in fetch_param?
>
> translate_user_vaddr() needs to access utask->vaddr anyway. It seems
> to me it would be more clean to do the following:
>
>   1. Add
>   struct xxx {
>   struct trace_uprobe *tu;
>   unsigned long bp_addr;
>   };
>
>  in trace_uprobe.c.
>
>   2. Add
>
>   struct xxx info = {
>   .tu = tu,
>   .bp_addr = instruction_pointer(regs),
>   };
>
>   current->utask->vaddr = (long)&info;
>
>  into uprobe_dispatcher() and uretprobe_dispatcher() (the latter
>  should obviously use func instead of instruction_pointer).
>
>3. FETCH_FUNC_NAME(file_offset, type) can do
>
>   struct xxx *info = (void*)current->utask->vaddr;
>   void *addr = data + info->bp_addr - info->tu->offset;
>
>   return FETCH_FUNC_NAME(memory, type)(regs, addr, dest);
>
>4. Now, the only change we need in parse_probe_arg("@") is that
>   it should use either FETCH_MTD_memory or FETCH_MTD_file_offset
>   depending on arg[0] == '+'.
>
>   And we do not need to pass "void *priv" to parse_probe_arg().
>
> What do you think? Once again, I can easily be wrong, I didn't read the
> code yet.

You are absolutely right.

I thought we would need a fetch_param anyway if we add support for
cross-fetch later.  But I won't insist on it; I can defer that to
later work and make the current code simpler if you want. :)
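
The address arithmetic in the quoted step 3 can be sketched in user space as follows. This is a minimal illustration, assuming (as the quoted formula does) that the probed file is mapped linearly, so the difference between the breakpoint's virtual address and the probe's file offset gives the load base. The names `trace_uprobe_sketch`, `dispatch_info_sketch` and `translate_file_offset()` are hypothetical stand-ins, not the kernel's types.

```c
#include <assert.h>

/* Hypothetical stand-in for struct trace_uprobe: only the probe's file offset. */
struct trace_uprobe_sketch {
	unsigned long offset;
};

/* Hypothetical per-hit info, modelled on the "struct xxx" suggested above. */
struct dispatch_info_sketch {
	const struct trace_uprobe_sketch *tu;
	unsigned long bp_addr;	/* virtual address where the probe fired */
};

/*
 * Translate a file offset into a user virtual address:
 * base = bp_addr - tu->offset, then add the requested file offset.
 */
unsigned long translate_file_offset(const struct dispatch_info_sketch *info,
				    unsigned long file_offset)
{
	unsigned long base = info->bp_addr - info->tu->offset;

	return base + file_offset;
}
```

With a probe at file offset 0x4f0 that fires at 0x400004f0, the base is 0x40000000, so a fetch of `@+0x2010` would read from 0x40002010.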

>
>>  static int uprobe_dispatcher(struct uprobe_consumer *con, struct pt_regs 
>> *regs)
>>  {
>>  struct trace_uprobe *tu;
>> +struct uprobe_task *utask;
>>  int ret = 0;
>>  
>>  tu = container_of(con, struct trace_uprobe, consumer);
>>  tu->nhit++;
>>  
>> +utask = current->utask;
>> +if (utask == NULL)
>> +return UPROBE_HANDLER_REMOVE;
>
> Hmm, why? The previous change ensures ->utask is not NULL? If we hit
> NULL we have a bug, we should not remove this uprobe.

Yes, I just want to be defensive. :)

So do you suggest adding BUG_ON()?  And can I convert or remove a
similar check in uprobes.c:pre_ssout() too?

Thanks,
Namhyung


Re: [PATCH] i2c: bcm-kona: fix module device table symbol

2013-11-27 Thread Wolfram Sang
On Wed, Nov 27, 2013 at 11:38:40PM +0100, Vincent Stehlé wrote:
> This fixes the following compilation error when compiling in module:
> 
>   drivers/i2c/busses/i2c-bcm-kona.c:894:1: error: ‘__mod_of_device_table’ 
> aliased to undefined symbol ‘kona_i2c_of_match’
> 
> Signed-off-by: Vincent Stehlé 
> Cc: Wolfram Sang 
> Cc: triv...@kernel.org

A fix is in my pull request for Linus today.



Re: [PATCH 0/3] makedumpfile: hugepage filtering for vmcore dump

2013-11-27 Thread HATAYAMA Daisuke
(2013/11/28 16:08), Atsushi Kumagai wrote:
> On 2013/11/22 16:18:20, kexec  wrote:
>> (2013/11/07 9:54), HATAYAMA Daisuke wrote:
>>> (2013/11/06 11:21), Atsushi Kumagai wrote:
>>>> (2013/11/06 5:27), Vivek Goyal wrote:
>>>>> On Tue, Nov 05, 2013 at 09:45:32PM +0800, Jingbai Ma wrote:
>>>>>> This patch set intend to exclude unnecessary hugepages from vmcore dump file.
>>>>>>
>>>>>> This patch requires the kernel patch to export necessary data structures into
>>>>>> vmcore: "kexec: export hugepage data structure into vmcoreinfo"
>>>>>> http://lists.infradead.org/pipermail/kexec/2013-November/009997.html
>>>>>>
>>>>>> This patch introduce two new dump levels 32 and 64 to exclude all unused and
>>>>>> active hugepages. The level to exclude all unnecessary pages will be 127 now.
>>>>>
>>>>> Interesting. Why hugepages should be treated any differently than normal
>>>>> pages?
>>>>>
>>>>> If user asked to filter out free page, then it should be filtered and
>>>>> it should not matter whether it is a huge page or not?
>>>>
>>>> I'm making a RFC patch of hugepages filtering based on such policy.
>>>>
>>>> I attach the prototype version.
>>>> It's able to filter out also THPs, and suitable for cyclic processing
>>>> because it depends on mem_map and looking up it can be divided into
>>>> cycles. This is the same idea as page_is_buddy().
>>>>
>>>> So I think it's better.
>>>>
>>>
>>>> @@ -4506,14 +4583,49 @@ __exclude_unnecessary_pages(unsigned long mem_map,
>>>> && !isAnon(mapping)) {
>>>> if (clear_bit_on_2nd_bitmap_for_kernel(pfn))
>>>> pfn_cache_private++;
>>>> +/*
>>>> + * NOTE: If THP for cache is introduced, the check for
>>>> + *   compound pages is needed here.
>>>> + */
>>>> }
>>>> /*
>>>>  * Exclude the data page of the user process.
>>>>  */
>>>> -else if ((info->dump_level & DL_EXCLUDE_USER_DATA)
>>>> -&& isAnon(mapping)) {
>>>> -if (clear_bit_on_2nd_bitmap_for_kernel(pfn))
>>>> -pfn_user++;
>>>> +else if (info->dump_level & DL_EXCLUDE_USER_DATA) {
>>>> +/*
>>>> + * Exclude the anonymous pages as user pages.
>>>> + */
>>>> +if (isAnon(mapping)) {
>>>> +if (clear_bit_on_2nd_bitmap_for_kernel(pfn))
>>>> +pfn_user++;
>>>> +
>>>> +/*
>>>> + * Check the compound page
>>>> + */
>>>> +if (page_is_hugepage(flags) && compound_order > 0) {
>>>> +int i, nr_pages = 1 << compound_order;
>>>> +
>>>> +for (i = 1; i < nr_pages; ++i) {
>>>> +if (clear_bit_on_2nd_bitmap_for_kernel(pfn + i))
>>>> +pfn_user++;
>>>> +}
>>>> +pfn += nr_pages - 2;
>>>> +mem_map += (nr_pages - 1) * SIZE(page);
>>>> +}
>>>> +}
>>>> +/*
>>>> + * Exclude the hugetlbfs pages as user pages.
>>>> + */
>>>> +else if (hugetlb_dtor == SYMBOL(free_huge_page)) {
>>>> +int i, nr_pages = 1 << compound_order;
>>>> +
>>>> +for (i = 0; i < nr_pages; ++i) {
>>>> +if (clear_bit_on_2nd_bitmap_for_kernel(pfn + i))
>>>> +pfn_user++;
>>>> +}
>>>> +pfn += nr_pages - 1;
>>>> +mem_map += (nr_pages - 1) * SIZE(page);
>>>> +}
>>>> }
>>>> /*
>>>>  * Exclude the hwpoison page.
>>>
>>> I'm concerned about the case where filtering is not performed on the
>>> mem_map entries that do not belong to the current cyclic range.
>>>
>>> If the maximum value of compound_order is larger than the maximum value of
>>> CONFIG_FORCE_MAX_ZONEORDER, which makedumpfile obtains by
>>> ARRAY_LENGTH(zone.free_area), it's necessary to align info->bufsize_cyclic
>>> with the larger one in check_cyclic_buffer_overrun().
>>>
>>
>> ping, in case you overlooked this...
> 
> Sorry for the delayed response; I'm prioritizing the release of v1.5.5 now.
> 
> Thanks for your advice, check_cyclic_buffer_overrun() should be fixed
> as you said. In addition, I'm considering another way to address such cases:
> carry the number of "overflowed pages" over to the next cycle and
> exclude them at the top of __exclude_unnecessary_pages(), like below:
> 
> /*
>  * The pages which should be excluded still remain.
>  */
> if (remainder >= 1) {
> int i;
> unsigned long tmp;
> for (i = 0; i < remainder; ++i) {
> if (clear_bit_on_2nd_bitmap_for_kernel(pfn + 
> i)) {
>   
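
The quoted snippet is cut off in this archive, so the following is only a rough user-space sketch of the carry-over idea described in the text above, not the actual makedumpfile change. It assumes one byte per page instead of the real bit-per-page 2nd bitmap, and `exclude_in_cycle()` is a hypothetical helper: it clears as many pages as fit in the current cycle and returns the remainder for the caller to pass to the next cycle.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Clear up to nr_pages entries starting at pfn, but only while the pfn
 * stays below cycle_end (the caller is assumed to pass a pfn inside the
 * current cycle). Returns how many pages did not fit, i.e. the
 * "overflowed pages" to exclude at the top of the next cycle.
 */
unsigned long exclude_in_cycle(unsigned char *bitmap,
			       unsigned long cycle_end,
			       unsigned long pfn, unsigned long nr_pages)
{
	unsigned long i;

	for (i = 0; i < nr_pages; i++) {
		if (pfn + i >= cycle_end)
			return nr_pages - i;	/* carry over the rest */
		bitmap[pfn + i] = 0;		/* page is excluded */
	}
	return 0;
}
```

For example, a 4-page compound page starting at pfn 6 in a cycle that ends at pfn 8 clears pfns 6 and 7 and returns 2; calling the helper again at the start of the next cycle with that remainder clears pfns 8 and 9.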

[PATCH 3/9] mm/rmap: factor lock function out of rmap_walk_anon()

2013-11-27 Thread Joonsoo Kim
When we traverse an anon_vma, we need to take a read-side anon_lock.
However, the situations differ subtly, so we can't use the same method
to take the lock in every case. Therefore, rmap_walk_anon() needs to be
able to use different lock functions.

This patch is the first step: it factors the lock function for anon_lock
out of rmap_walk_anon(). It will be used when removing migration entries
and as the default in rmap_walk_anon().

Signed-off-by: Joonsoo Kim 

diff --git a/mm/rmap.c b/mm/rmap.c
index e6d532c..916f2ed 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1683,6 +1683,24 @@ void __put_anon_vma(struct anon_vma *anon_vma)
 }
 
 #ifdef CONFIG_MIGRATION
+static struct anon_vma *rmap_walk_anon_lock(struct page *page)
+{
+   struct anon_vma *anon_vma;
+
+   /*
+* Note: remove_migration_ptes() cannot use page_lock_anon_vma_read()
+* because that depends on page_mapped(); but not all its usages
+* are holding mmap_sem. Users without mmap_sem are required to
+* take a reference count to prevent the anon_vma disappearing
+*/
+   anon_vma = page_anon_vma(page);
+   if (!anon_vma)
+   return NULL;
+
+   anon_vma_lock_read(anon_vma);
+   return anon_vma;
+}
+
 /*
  * rmap_walk() and its helpers rmap_walk_anon() and rmap_walk_file():
  * Called by migrate.c to remove migration ptes, but might be used more later.
@@ -1695,16 +1713,10 @@ static int rmap_walk_anon(struct page *page, int 
(*rmap_one)(struct page *,
struct anon_vma_chain *avc;
int ret = SWAP_AGAIN;
 
-   /*
-* Note: remove_migration_ptes() cannot use page_lock_anon_vma_read()
-* because that depends on page_mapped(); but not all its usages
-* are holding mmap_sem. Users without mmap_sem are required to
-* take a reference count to prevent the anon_vma disappearing
-*/
-   anon_vma = page_anon_vma(page);
+   anon_vma = rmap_walk_anon_lock(page);
if (!anon_vma)
return ret;
-   anon_vma_lock_read(anon_vma);
+
anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root, pgoff, pgoff) {
struct vm_area_struct *vma = avc->vma;
unsigned long address = vma_address(page, vma);
-- 
1.7.9.5



[PATCH 2/9] mm/rmap: factor nonlinear handling out of try_to_unmap_file()

2013-11-27 Thread Joonsoo Kim
To merge all kinds of rmap traversal functions, try_to_unmap(),
try_to_munlock(), page_referenced() and page_mkclean(), we need to
extract the common parts and separate out the non-common parts.

Nonlinear handling is done only in try_to_unmap_file(); the other
rmap traversal functions don't care about it. Therefore it is better
to factor nonlinear handling out of try_to_unmap_file() in order to
merge all kinds of rmap traversal functions easily.

Signed-off-by: Joonsoo Kim 

diff --git a/mm/rmap.c b/mm/rmap.c
index 1214703..e6d532c 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1422,6 +1422,79 @@ static int try_to_unmap_cluster(unsigned long cursor, 
unsigned int *mapcount,
return ret;
 }
 
+static int try_to_unmap_nonlinear(struct page *page,
+   struct address_space *mapping, struct vm_area_struct *vma)
+{
+   int ret = SWAP_AGAIN;
+   unsigned long cursor;
+   unsigned long max_nl_cursor = 0;
+   unsigned long max_nl_size = 0;
+   unsigned int mapcount;
+
+   list_for_each_entry(vma,
+   &mapping->i_mmap_nonlinear, shared.nonlinear) {
+
+   cursor = (unsigned long) vma->vm_private_data;
+   if (cursor > max_nl_cursor)
+   max_nl_cursor = cursor;
+   cursor = vma->vm_end - vma->vm_start;
+   if (cursor > max_nl_size)
+   max_nl_size = cursor;
+   }
+
+   if (max_nl_size == 0) { /* all nonlinears locked or reserved ? */
+   return SWAP_FAIL;
+   }
+
+   /*
+* We don't try to search for this page in the nonlinear vmas,
+* and page_referenced wouldn't have found it anyway.  Instead
+* just walk the nonlinear vmas trying to age and unmap some.
+* The mapcount of the page we came in with is irrelevant,
+* but even so use it as a guide to how hard we should try?
+*/
+   mapcount = page_mapcount(page);
+   if (!mapcount)
+   return ret;
+
+   cond_resched();
+
+   max_nl_size = (max_nl_size + CLUSTER_SIZE - 1) & CLUSTER_MASK;
+   if (max_nl_cursor == 0)
+   max_nl_cursor = CLUSTER_SIZE;
+
+   do {
+   list_for_each_entry(vma,
+   &mapping->i_mmap_nonlinear, shared.nonlinear) {
+
+   cursor = (unsigned long) vma->vm_private_data;
+   while (cursor < max_nl_cursor &&
+   cursor < vma->vm_end - vma->vm_start) {
+   if (try_to_unmap_cluster(cursor, &mapcount,
+   vma, page) == SWAP_MLOCK)
+   ret = SWAP_MLOCK;
+   cursor += CLUSTER_SIZE;
+   vma->vm_private_data = (void *) cursor;
+   if ((int)mapcount <= 0)
+   return ret;
+   }
+   vma->vm_private_data = (void *) max_nl_cursor;
+   }
+   cond_resched();
+   max_nl_cursor += CLUSTER_SIZE;
+   } while (max_nl_cursor <= max_nl_size);
+
+   /*
+* Don't loop forever (perhaps all the remaining pages are
+* in locked vmas).  Reset cursor on all unreserved nonlinear
+* vmas, now forgetting on which ones it had fallen behind.
+*/
+   list_for_each_entry(vma, &mapping->i_mmap_nonlinear, shared.nonlinear)
+   vma->vm_private_data = NULL;
+
+   return ret;
+}
+
 bool is_vma_temporary_stack(struct vm_area_struct *vma)
 {
int maybe_stack = vma->vm_flags & (VM_GROWSDOWN | VM_GROWSUP);
@@ -1511,10 +1584,6 @@ static int try_to_unmap_file(struct page *page, enum 
ttu_flags flags)
pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
struct vm_area_struct *vma;
int ret = SWAP_AGAIN;
-   unsigned long cursor;
-   unsigned long max_nl_cursor = 0;
-   unsigned long max_nl_size = 0;
-   unsigned int mapcount;
 
if (PageHuge(page))
pgoff = page->index << compound_order(page);
@@ -1538,64 +1607,7 @@ static int try_to_unmap_file(struct page *page, enum 
ttu_flags flags)
if (TTU_ACTION(flags) == TTU_MUNLOCK)
goto out;
 
-   list_for_each_entry(vma, &mapping->i_mmap_nonlinear,
-   shared.nonlinear) {
-   cursor = (unsigned long) vma->vm_private_data;
-   if (cursor > max_nl_cursor)
-   max_nl_cursor = cursor;
-   cursor = vma->vm_end - vma->vm_start;
-   if (cursor > max_nl_size)
-   max_nl_size = cursor;
-   }
-
-   if (max_nl_size == 0) { /* all nonlinears locked or reserved ? */
-   ret = SWAP_FAIL;
-   goto out;
-   }
-
-   /*
-* We don't try to search for this page in the nonlinear vmas,
-* and page_referenced wouldn't have 

[PATCH 5/9] mm/rmap: extend rmap_walk_xxx() to cope with different cases

2013-11-27 Thread Joonsoo Kim
There are a lot of common parts in the traversal functions, but there are
also a few uncommon parts. By assigning the proper function
pointers in each rmap_walk_control, we can handle these differences
correctly.

The following are the differences we should handle.

1. difference of lock function in anon mapping case
2. nonlinear handling in file mapping case
3. prechecked condition:
checking memcg in page_referenced(),
checking VM_SHARED in page_mkclean(),
checking temporary vma in try_to_unmap()
4. exit condition:
checking page_mapped() in try_to_unmap()

So, in this patch, I introduce 4 function pointers to
handle the above differences.
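
The control-struct pattern described above can be sketched outside the kernel as below. This is a simplified illustration, not the patch itself: it walks a plain array instead of VMAs, the names `walk_control_sketch`, `walk_sketch()` and the three example callbacks are hypothetical, and only the skip, main and exit-condition hooks are modelled (the lock and nonlinear hooks are omitted).

```c
#include <assert.h>
#include <stddef.h>

enum { WALK_AGAIN = 0, WALK_STOP = 1 };

/* Hypothetical analogue of struct rmap_walk_control. */
struct walk_control_sketch {
	int (*main)(int item, void *arg);	/* required: per-item work */
	void *arg;				/* argument to main/main_done */
	int (*main_done)(void *arg);		/* optional: exit condition */
	int (*skip)(int item, void *skip_arg);	/* optional: precheck */
	void *skip_arg;
};

/* One generic walker; optional hooks are simply left NULL by callers. */
int walk_sketch(const int *items, size_t n, struct walk_control_sketch *wc)
{
	int ret = WALK_AGAIN;
	size_t i;

	for (i = 0; i < n; i++) {
		if (wc->skip && wc->skip(items[i], wc->skip_arg))
			continue;
		ret = wc->main(items[i], wc->arg);
		if (ret != WALK_AGAIN)
			break;
		if (wc->main_done && wc->main_done(wc->arg))
			break;
	}
	return ret;
}

/* Example callbacks: sum even items and stop once the sum reaches 10. */
int add_to_sum(int item, void *arg)
{
	*(int *)arg += item;
	return WALK_AGAIN;
}

int sum_reached_ten(void *arg)
{
	return *(int *)arg >= 10;
}

int skip_odd(int item, void *skip_arg)
{
	(void)skip_arg;
	return item % 2 != 0;
}
```

A caller that needs none of the special cases sets only `main` and `arg`, which is what lets one walker serve callers with different prechecks and exit conditions.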

Signed-off-by: Joonsoo Kim 

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 0f65686..58624b4 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -239,6 +239,12 @@ struct rmap_walk_control {
int (*main)(struct page *, struct vm_area_struct *,
unsigned long, void *);
void *arg;  /* argument to main function */
+   int (*main_done)(struct page *page);/* check exit condition */
+   int (*file_nonlinear)(struct page *, struct address_space *,
+   struct vm_area_struct *vma);
+   struct anon_vma *(*anon_lock)(struct page *);
+   int (*vma_skip)(struct vm_area_struct *, void *);
+   void *skip_arg; /* argument to vma_skip function */
 };
 
 /*
diff --git a/mm/ksm.c b/mm/ksm.c
index c42ec30..0aa6e09 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2032,12 +2032,19 @@ again:
if ((rmap_item->mm == vma->vm_mm) == search_new_forks)
continue;
 
+   if (rwc->vma_skip && rwc->vma_skip(vma, rwc->skip_arg))
+   continue;
+
ret = rwc->main(page, vma,
rmap_item->address, rwc->arg);
if (ret != SWAP_AGAIN) {
anon_vma_unlock_read(anon_vma);
goto out;
}
+   if (rwc->main_done && rwc->main_done(page)) {
+   anon_vma_unlock_read(anon_vma);
+   goto out;
+   }
}
anon_vma_unlock_read(anon_vma);
}
diff --git a/mm/rmap.c b/mm/rmap.c
index 5933488..5dad5dd 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1683,10 +1683,14 @@ void __put_anon_vma(struct anon_vma *anon_vma)
 }
 
 #ifdef CONFIG_MIGRATION
-static struct anon_vma *rmap_walk_anon_lock(struct page *page)
+static struct anon_vma *rmap_walk_anon_lock(struct page *page,
+   struct rmap_walk_control *rwc)
 {
struct anon_vma *anon_vma;
 
+   if (rwc->anon_lock)
+   return rwc->anon_lock(page);
+
/*
 * Note: remove_migration_ptes() cannot use page_lock_anon_vma_read()
 * because that depends on page_mapped(); but not all its usages
@@ -1712,16 +1716,22 @@ static int rmap_walk_anon(struct page *page, struct 
rmap_walk_control *rwc)
struct anon_vma_chain *avc;
int ret = SWAP_AGAIN;
 
-   anon_vma = rmap_walk_anon_lock(page);
+   anon_vma = rmap_walk_anon_lock(page, rwc);
if (!anon_vma)
return ret;
 
anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root, pgoff, pgoff) {
struct vm_area_struct *vma = avc->vma;
unsigned long address = vma_address(page, vma);
+
+   if (rwc->vma_skip && rwc->vma_skip(vma, rwc->skip_arg))
+   continue;
+
ret = rwc->main(page, vma, address, rwc->arg);
if (ret != SWAP_AGAIN)
break;
+   if (rwc->main_done && rwc->main_done(page))
+   break;
}
anon_vma_unlock_read(anon_vma);
return ret;
@@ -1743,15 +1753,26 @@ static int rmap_walk_file(struct page *page, struct 
rmap_walk_control *rwc)
mutex_lock(&mapping->i_mmap_mutex);
vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
unsigned long address = vma_address(page, vma);
+
+   if (rwc->vma_skip && rwc->vma_skip(vma, rwc->skip_arg))
+   continue;
+
ret = rwc->main(page, vma, address, rwc->arg);
if (ret != SWAP_AGAIN)
-   break;
+   goto done;
+   if (rwc->main_done && rwc->main_done(page))
+   goto done;
}
-   /*
-* No nonlinear handling: being always shared, nonlinear vmas
-* never contain migration ptes.  Decide what to do about this
-* limitation to linear when we need rmap_walk() on nonlinear.
-*/
+
+   if (!rwc->file_nonlinear)
+   goto done;
+
+   if (list_empty(&mapping->i_mmap_nonlinear))

[PATCH 0/9] mm/rmap: unify rmap traversing functions through rmap_walk

2013-11-27 Thread Joonsoo Kim
Rmap traversal is used in five different cases: try_to_unmap(),
try_to_munlock(), page_referenced(), page_mkclean() and
remove_migration_ptes(). Each one implements its own traversal functions
for the anon, file and ksm cases, respectively. This causes a lot of
duplication and maintenance overhead. It also makes the code hard to
understand and error-prone. One example is hugepage handling. There is code
to compute the hugepage offset correctly in try_to_unmap_file(), but there
is no code to compute the hugepage offset in rmap_walk_file(). These are
used as a pair in the migration context, but we failed to modify them as a pair.

To overcome these drawbacks, we should unify them through one
function. I chose rmap_walk() as the main function since it has nothing
unnecessary. And to control the behavior of rmap_walk(), I introduce
struct rmap_walk_control, which has some function pointers. These let
rmap_walk() work for each caller's specific needs.

This patchset removes a lot of duplicated code, as you can see in the
short-stat below, and the kernel text size also decreases slightly.

   text    data     bss     dec     hex filename
  10640       1      16   10657    29a1 mm/rmap.o
  10047       1      16   10064    2750 mm/rmap.o

  13823     705    8288   22816    5920 mm/ksm.o
  13199     705    8288   22192    56b0 mm/ksm.o

Thanks.

Joonsoo Kim (9):
  mm/rmap: recompute pgoff for huge page
  mm/rmap: factor nonlinear handling out of try_to_unmap_file()
  mm/rmap: factor lock function out of rmap_walk_anon()
  mm/rmap: make rmap_walk to get the rmap_walk_control argument
  mm/rmap: extend rmap_walk_xxx() to cope with different cases
  mm/rmap: use rmap_walk() in try_to_unmap()
  mm/rmap: use rmap_walk() in try_to_munlock()
  mm/rmap: use rmap_walk() in page_referenced()
  mm/rmap: use rmap_walk() in page_mkclean()

 include/linux/ksm.h  |   15 +-
 include/linux/rmap.h |   19 +-
 mm/ksm.c             |  116 +-
 mm/migrate.c         |    7 +-
 mm/rmap.c            |  570 ++
 5 files changed, 286 insertions(+), 441 deletions(-)

-- 
1.7.9.5



[PATCH 8/9] mm/rmap: use rmap_walk() in page_referenced()

2013-11-27 Thread Joonsoo Kim
Now, we have an infrastructure in rmap_walk() to handle the differences
among the variants of rmap traversal functions.

So, just use it in page_referenced().

In this patch, I change the following things.

1. remove some variants of rmap traversal functions.
cf> page_referenced_ksm, page_referenced_anon,
page_referenced_file
2. introduce a new struct page_referenced_arg and pass it to
page_referenced_one(), the main function of rmap_walk, in order to
count references, to store vm_flags and to check the finish condition.
3. mechanical change to use rmap_walk() in page_referenced().
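
Item 2 above, packing several in/out values behind the single `void *arg` of the main function, can be sketched as follows. This is a user-space illustration only: the struct mirrors the fields added by the patch, but `referenced_one_sketch()`, its `pte_young` and `vma_flags` parameters, and the `SKETCH_*` return codes are hypothetical simplifications of page_referenced_one().

```c
#include <assert.h>

enum { SKETCH_AGAIN = 0, SKETCH_DONE = 1 };

/* Same three fields as struct page_referenced_arg in the patch. */
struct page_referenced_arg_sketch {
	int mapcount;		/* mappings still to visit */
	int referenced;		/* how many mappings were recently used */
	unsigned long vm_flags;	/* OR of flags of referencing vmas */
};

/*
 * Hypothetical per-mapping callback: everything it reads or updates
 * travels through one opaque pointer, so the walker needs no knowledge
 * of the caller's bookkeeping.
 */
int referenced_one_sketch(int pte_young, unsigned long vma_flags, void *arg)
{
	struct page_referenced_arg_sketch *pra = arg;

	if (pte_young) {
		pra->referenced++;
		pra->vm_flags |= vma_flags;
	}

	pra->mapcount--;
	/* Finish condition: stop once every mapping has been seen. */
	return pra->mapcount ? SKETCH_AGAIN : SKETCH_DONE;
}
```

The caller reads `referenced` and `vm_flags` back out of the struct after the walk, which replaces the separate `mapcount` and `vm_flags` pointer parameters of the old signature.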

Signed-off-by: Joonsoo Kim 

diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 91b9719..3be6bb1 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -73,8 +73,6 @@ static inline void set_page_stable_node(struct page *page,
 struct page *ksm_might_need_to_copy(struct page *page,
struct vm_area_struct *vma, unsigned long address);
 
-int page_referenced_ksm(struct page *page,
-   struct mem_cgroup *memcg, unsigned long *vm_flags);
 int rmap_walk_ksm(struct page *page, struct rmap_walk_control *rwc);
 void ksm_migrate_page(struct page *newpage, struct page *oldpage);
 
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index d641f6d..e529ba3 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -184,7 +184,7 @@ static inline void page_dup_rmap(struct page *page)
 int page_referenced(struct page *, int is_locked,
struct mem_cgroup *memcg, unsigned long *vm_flags);
 int page_referenced_one(struct page *, struct vm_area_struct *,
-   unsigned long address, unsigned int *mapcount, unsigned long *vm_flags);
+   unsigned long address, void *arg);
 
 #define TTU_ACTION(x) ((x) & TTU_ACTION_MASK)
 
diff --git a/mm/ksm.c b/mm/ksm.c
index 4f25cf7..4c4541b 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1891,61 +1891,6 @@ struct page *ksm_might_need_to_copy(struct page *page,
return new_page;
 }
 
-int page_referenced_ksm(struct page *page, struct mem_cgroup *memcg,
-   unsigned long *vm_flags)
-{
-   struct stable_node *stable_node;
-   struct rmap_item *rmap_item;
-   unsigned int mapcount = page_mapcount(page);
-   int referenced = 0;
-   int search_new_forks = 0;
-
-   VM_BUG_ON(!PageKsm(page));
-   VM_BUG_ON(!PageLocked(page));
-
-   stable_node = page_stable_node(page);
-   if (!stable_node)
-   return 0;
-again:
-   hlist_for_each_entry(rmap_item, &stable_node->hlist, hlist) {
-   struct anon_vma *anon_vma = rmap_item->anon_vma;
-   struct anon_vma_chain *vmac;
-   struct vm_area_struct *vma;
-
-   anon_vma_lock_read(anon_vma);
-   anon_vma_interval_tree_foreach(vmac, &anon_vma->rb_root,
-  0, ULONG_MAX) {
-   vma = vmac->vma;
-   if (rmap_item->address < vma->vm_start ||
-   rmap_item->address >= vma->vm_end)
-   continue;
-   /*
-* Initially we examine only the vma which covers this
-* rmap_item; but later, if there is still work to do,
-* we examine covering vmas in other mms: in case they
-* were forked from the original since ksmd passed.
-*/
-   if ((rmap_item->mm == vma->vm_mm) == search_new_forks)
-   continue;
-
-   if (memcg && !mm_match_cgroup(vma->vm_mm, memcg))
-   continue;
-
-   referenced += page_referenced_one(page, vma,
-   rmap_item->address, &mapcount, vm_flags);
-   if (!search_new_forks || !mapcount)
-   break;
-   }
-   anon_vma_unlock_read(anon_vma);
-   if (!mapcount)
-   goto out;
-   }
-   if (!search_new_forks++)
-   goto again;
-out:
-   return referenced;
-}
-
 int rmap_walk_ksm(struct page *page, struct rmap_walk_control *rwc)
 {
struct stable_node *stable_node;
diff --git a/mm/rmap.c b/mm/rmap.c
index 860a393..5e78d5c 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -656,17 +656,23 @@ int page_mapped_in_vma(struct page *page, struct 
vm_area_struct *vma)
return 1;
 }
 
+struct page_referenced_arg {
+   int mapcount;
+   int referenced;
+   unsigned long vm_flags;
+};
+
 /*
  * Subfunctions of page_referenced: page_referenced_one called
  * repeatedly from either page_referenced_anon or page_referenced_file.
  */
 int page_referenced_one(struct page *page, struct vm_area_struct *vma,
-   unsigned long address, unsigned int *mapcount,
-   unsigned long *vm_flags)
+   

[PATCH 1/9] mm/rmap: recompute pgoff for huge page

2013-11-27 Thread Joonsoo Kim
We have to recompute pgoff if the given page is huge, since a result based
on HPAGE_SIZE is not appropriate for scanning the vma interval tree, as
shown by commit 36e4f20af833 ("hugetlb: do not use vma_hugecache_offset()
for vma_prio_tree_foreach") and commit 369a713e ("rmap: recompute pgoff
for unmapping huge page").
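
As a small worked example of the recomputation: for a hugetlbfs page, `page->index` counts huge pages, while the vma interval tree is keyed in base pages, so the index is shifted by the compound order. The helper name `huge_page_pgoff()` is hypothetical, and the order of 9 in the note below assumes 2 MiB huge pages on 4 KiB base pages.

```c
#include <assert.h>

/*
 * Convert an index counted in huge pages into an offset counted in
 * base pages, as the patch does with page->index << compound_order(page).
 */
unsigned long huge_page_pgoff(unsigned long index, unsigned int order)
{
	return index << order;
}
```

With order 9, the huge page at index 3 starts at base-page offset 3 * 512 = 1536, which is the value the interval tree lookup needs.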

Signed-off-by: Joonsoo Kim 

diff --git a/mm/rmap.c b/mm/rmap.c
index 55c8b8d..1214703 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1714,6 +1714,10 @@ static int rmap_walk_file(struct page *page, int 
(*rmap_one)(struct page *,
 
if (!mapping)
return ret;
+
+   if (PageHuge(page))
+   pgoff = page->index << compound_order(page);
+
mutex_lock(&mapping->i_mmap_mutex);
vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
unsigned long address = vma_address(page, vma);
-- 
1.7.9.5



[PATCH 6/9] mm/rmap: use rmap_walk() in try_to_unmap()

2013-11-27 Thread Joonsoo Kim
Now, we have an infrastructure in rmap_walk() to handle the differences
among the variants of rmap traversal functions.

So, just use it in try_to_unmap().

In this patch, I change the following things.

1. enable rmap_walk() if !CONFIG_MIGRATION.
2. mechanical change to use rmap_walk() in try_to_unmap().
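
The mechanical change relies on carrying the flags value through the generic `void *arg` slot and casting it back in the callback. A minimal user-space sketch of that round trip is shown below. The enum values and helper names are hypothetical, and the sketch goes through `uintptr_t`, which is the portable way to do this outside the kernel (the patch casts directly).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
enum ttu_flags_sketch {
	SKETCH_TTU_UNMAP	= 0,
	SKETCH_TTU_MIGRATION	= 1,
	SKETCH_TTU_IGNORE_MLOCK	= 1 << 8,
};

/* Pack a small integer value into the opaque callback argument. */
void *flags_to_arg(enum ttu_flags_sketch flags)
{
	return (void *)(uintptr_t)flags;
}

/* Recover the value inside the callback. */
enum ttu_flags_sketch arg_to_flags(void *arg)
{
	return (enum ttu_flags_sketch)(uintptr_t)arg;
}
```

This avoids allocating a struct when a callback needs only one small value; once more state is required, a pointer to a struct is the usual next step.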

Signed-off-by: Joonsoo Kim 

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 58624b4..d641f6d 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -190,7 +190,7 @@ int page_referenced_one(struct page *, struct 
vm_area_struct *,
 
 int try_to_unmap(struct page *, enum ttu_flags flags);
 int try_to_unmap_one(struct page *, struct vm_area_struct *,
-   unsigned long address, enum ttu_flags flags);
+   unsigned long address, void *arg);
 
 /*
  * Called from mm/filemap_xip.c to unmap empty zero page
diff --git a/mm/ksm.c b/mm/ksm.c
index 0aa6e09..e1b0198 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1982,7 +1982,7 @@ again:
continue;
 
ret = try_to_unmap_one(page, vma,
-   rmap_item->address, flags);
+   rmap_item->address, (void *)flags);
if (ret != SWAP_AGAIN || !page_mapped(page)) {
anon_vma_unlock_read(anon_vma);
goto out;
@@ -1996,7 +1996,6 @@ out:
return ret;
 }
 
-#ifdef CONFIG_MIGRATION
 int rmap_walk_ksm(struct page *page, struct rmap_walk_control *rwc)
 {
struct stable_node *stable_node;
@@ -2054,6 +2053,7 @@ out:
return ret;
 }
 
+#ifdef CONFIG_MIGRATION
 void ksm_migrate_page(struct page *newpage, struct page *oldpage)
 {
struct stable_node *stable_node;
diff --git a/mm/rmap.c b/mm/rmap.c
index 5dad5dd..7407710 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1177,13 +1177,14 @@ out:
  * repeatedly from try_to_unmap_ksm, try_to_unmap_anon or try_to_unmap_file.
  */
 int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
-unsigned long address, enum ttu_flags flags)
+unsigned long address, void *arg)
 {
struct mm_struct *mm = vma->vm_mm;
pte_t *pte;
pte_t pteval;
spinlock_t *ptl;
int ret = SWAP_AGAIN;
+   enum ttu_flags flags = (enum ttu_flags)arg;
 
pte = page_check_address(page, mm, address, &ptl, 0);
if (!pte)
@@ -1509,6 +1510,11 @@ bool is_vma_temporary_stack(struct vm_area_struct *vma)
return false;
 }
 
+static int skip_vma_temporary_stack(struct vm_area_struct *vma, void *arg)
+{
+   return (int)is_vma_temporary_stack(vma);
+}
+
 /**
  * try_to_unmap_anon - unmap or unlock anonymous page using the object-based
  * rmap method
@@ -1554,7 +1560,7 @@ static int try_to_unmap_anon(struct page *page, enum 
ttu_flags flags)
continue;
 
address = vma_address(page, vma);
-   ret = try_to_unmap_one(page, vma, address, flags);
+   ret = try_to_unmap_one(page, vma, address, (void *)flags);
if (ret != SWAP_AGAIN || !page_mapped(page))
break;
}
@@ -1591,7 +1597,7 @@ static int try_to_unmap_file(struct page *page, enum 
ttu_flags flags)
mutex_lock(&mapping->i_mmap_mutex);
vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
unsigned long address = vma_address(page, vma);
-   ret = try_to_unmap_one(page, vma, address, flags);
+   ret = try_to_unmap_one(page, vma, address, (void *)flags);
if (ret != SWAP_AGAIN || !page_mapped(page))
goto out;
}
@@ -1613,6 +1619,11 @@ out:
return ret;
 }
 
+static int page_not_mapped(struct page *page)
+{
+   return !page_mapped(page);
+};
+
 /**
  * try_to_unmap - try to remove all page table mappings to a page
  * @page: the page to get unmapped
@@ -1630,16 +1641,30 @@ out:
 int try_to_unmap(struct page *page, enum ttu_flags flags)
 {
int ret;
+   struct rmap_walk_control rwc;
 
-   BUG_ON(!PageLocked(page));
VM_BUG_ON(!PageHuge(page) && PageTransHuge(page));
 
-   if (unlikely(PageKsm(page)))
-   ret = try_to_unmap_ksm(page, flags);
-   else if (PageAnon(page))
-   ret = try_to_unmap_anon(page, flags);
-   else
-   ret = try_to_unmap_file(page, flags);
+   memset(&rwc, 0, sizeof(rwc));
+   rwc.main = try_to_unmap_one;
+   rwc.arg = (void *)flags;
+   rwc.main_done = page_not_mapped;
+   rwc.file_nonlinear = try_to_unmap_nonlinear;
+   rwc.anon_lock = page_lock_anon_vma_read;
+
+   /*
+* During exec, a temporary VMA is setup and later moved.
+* The VMA is moved under the anon_vma lock but not the
+* page tables leading to a race where migration cannot
+* find the migration ptes. Rather than increasing the
+* locking 

[PATCH 7/9] mm/rmap: use rmap_walk() in try_to_munlock()

2013-11-27 Thread Joonsoo Kim
Now, we have an infrastructure in rmap_walk() to handle the differences
among the variants of rmap traversal functions.

So, just use it in try_to_munlock().

In this patch, I change the following things.

1. remove some variants of rmap traversal functions.
cf> try_to_unmap_ksm, try_to_unmap_anon, try_to_unmap_file
2. mechanical change to use rmap_walk() in try_to_munlock().
3. copy and paste comments.

Signed-off-by: Joonsoo Kim 

diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 0eef8cb..91b9719 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -75,7 +75,6 @@ struct page *ksm_might_need_to_copy(struct page *page,
 
 int page_referenced_ksm(struct page *page,
struct mem_cgroup *memcg, unsigned long *vm_flags);
-int try_to_unmap_ksm(struct page *page, enum ttu_flags flags);
 int rmap_walk_ksm(struct page *page, struct rmap_walk_control *rwc);
 void ksm_migrate_page(struct page *newpage, struct page *oldpage);
 
@@ -114,11 +113,6 @@ static inline int page_referenced_ksm(struct page *page,
return 0;
 }
 
-static inline int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
-{
-   return 0;
-}
-
 static inline int rmap_walk_ksm(struct page *page,
struct rmap_walk_control *rwc)
 {
diff --git a/mm/ksm.c b/mm/ksm.c
index e1b0198..4f25cf7 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1946,56 +1946,6 @@ out:
return referenced;
 }
 
-int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
-{
-   struct stable_node *stable_node;
-   struct rmap_item *rmap_item;
-   int ret = SWAP_AGAIN;
-   int search_new_forks = 0;
-
-   VM_BUG_ON(!PageKsm(page));
-   VM_BUG_ON(!PageLocked(page));
-
-   stable_node = page_stable_node(page);
-   if (!stable_node)
-   return SWAP_FAIL;
-again:
-   hlist_for_each_entry(rmap_item, &stable_node->hlist, hlist) {
-   struct anon_vma *anon_vma = rmap_item->anon_vma;
-   struct anon_vma_chain *vmac;
-   struct vm_area_struct *vma;
-
-   anon_vma_lock_read(anon_vma);
-   anon_vma_interval_tree_foreach(vmac, &anon_vma->rb_root,
-  0, ULONG_MAX) {
-   vma = vmac->vma;
-   if (rmap_item->address < vma->vm_start ||
-   rmap_item->address >= vma->vm_end)
-   continue;
-   /*
-* Initially we examine only the vma which covers this
-* rmap_item; but later, if there is still work to do,
-* we examine covering vmas in other mms: in case they
-* were forked from the original since ksmd passed.
-*/
-   if ((rmap_item->mm == vma->vm_mm) == search_new_forks)
-   continue;
-
-   ret = try_to_unmap_one(page, vma,
-   rmap_item->address, (void *)flags);
-   if (ret != SWAP_AGAIN || !page_mapped(page)) {
-   anon_vma_unlock_read(anon_vma);
-   goto out;
-   }
-   }
-   anon_vma_unlock_read(anon_vma);
-   }
-   if (!search_new_forks++)
-   goto again;
-out:
-   return ret;
-}
-
 int rmap_walk_ksm(struct page *page, struct rmap_walk_control *rwc)
 {
struct stable_node *stable_node;
diff --git a/mm/rmap.c b/mm/rmap.c
index 7407710..860a393 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1515,110 +1515,6 @@ static int skip_vma_temporary_stack(struct vm_area_struct *vma, void *arg)
return (int)is_vma_temporary_stack(vma);
 }
 
-/**
- * try_to_unmap_anon - unmap or unlock anonymous page using the object-based
- * rmap method
- * @page: the page to unmap/unlock
- * @flags: action and flags
- *
- * Find all the mappings of a page using the mapping pointer and the vma chains
- * contained in the anon_vma struct it points to.
- *
- * This function is only called from try_to_unmap/try_to_munlock for
- * anonymous pages.
- * When called from try_to_munlock(), the mmap_sem of the mm containing the vma
- * where the page was found will be held for write.  So, we won't recheck
- * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
- * 'LOCKED.
- */
-static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
-{
-   struct anon_vma *anon_vma;
-   pgoff_t pgoff;
-   struct anon_vma_chain *avc;
-   int ret = SWAP_AGAIN;
-
-   anon_vma = page_lock_anon_vma_read(page);
-   if (!anon_vma)
-   return ret;
-
-   pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
-   anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root, pgoff, pgoff) {
-   struct vm_area_struct *vma = avc->vma;
-   unsigned long address;
-
-   

[PATCH 4/9] mm/rmap: make rmap_walk to get the rmap_walk_control argument

2013-11-27 Thread Joonsoo Kim
Each rmap traversal case differs slightly, so we need function pointers,
and arguments for them, to handle these differences properly.

For this purpose, struct rmap_walk_control is introduced in this patch
and will be extended in the following patches. Introducing and extending
it are done separately because that makes the changes clearer.
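
As an illustration only (not part of this patch), the control-structure
idea can be sketched in plain user-space C. The names walk_control,
walk(), sum_one() and WALK_AGAIN/WALK_STOP are hypothetical stand-ins for
rmap_walk_control, rmap_walk(), a per-vma callback and SWAP_AGAIN; an
array of items stands in for the vmas being traversed.

```c
#include <assert.h>
#include <stddef.h>

#define WALK_AGAIN 0	/* hypothetical stand-in for SWAP_AGAIN */
#define WALK_STOP  1	/* any other value ends the walk early */

struct item {		/* stand-in for one vma */
	int value;
};

struct walk_control {
	int (*main)(struct item *it, void *arg);
	void *arg;	/* argument to main function */
};

/* Generic traversal: calls wc->main on each item until it asks to stop. */
static int walk(struct item *items, size_t n, struct walk_control *wc)
{
	int ret = WALK_AGAIN;
	size_t i;

	assert(wc && wc->main);
	for (i = 0; i < n; i++) {
		ret = wc->main(&items[i], wc->arg);
		if (ret != WALK_AGAIN)
			break;
	}
	return ret;
}

/* One caller-specific callback: accumulate values, stop on a negative one. */
static int sum_one(struct item *it, void *arg)
{
	int *sum = arg;

	if (it->value < 0)
		return WALK_STOP;
	*sum += it->value;
	return WALK_AGAIN;
}

static int sum_all(struct item *items, size_t n)
{
	struct walk_control wc = { .main = sum_one };
	int sum = 0;

	wc.arg = &sum;
	walk(items, n, &wc);
	return sum;
}
```

The point of the sketch is that a new caller only fills in a control
structure; the traversal function itself keeps a single signature even
when more hooks are added to the structure later.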

Signed-off-by: Joonsoo Kim 

diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 45c9b6a..0eef8cb 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -76,8 +76,7 @@ struct page *ksm_might_need_to_copy(struct page *page,
 int page_referenced_ksm(struct page *page,
struct mem_cgroup *memcg, unsigned long *vm_flags);
 int try_to_unmap_ksm(struct page *page, enum ttu_flags flags);
-int rmap_walk_ksm(struct page *page, int (*rmap_one)(struct page *,
- struct vm_area_struct *, unsigned long, void *), void *arg);
+int rmap_walk_ksm(struct page *page, struct rmap_walk_control *rwc);
 void ksm_migrate_page(struct page *newpage, struct page *oldpage);
 
 #else  /* !CONFIG_KSM */
@@ -120,8 +119,8 @@ static inline int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
return 0;
 }
 
-static inline int rmap_walk_ksm(struct page *page, int (*rmap_one)(struct page*,
-   struct vm_area_struct *, unsigned long, void *), void *arg)
+static inline int rmap_walk_ksm(struct page *page,
+   struct rmap_walk_control *rwc)
 {
return 0;
 }
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 6dacb93..0f65686 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -235,11 +235,16 @@ struct anon_vma *page_lock_anon_vma_read(struct page *page);
 void page_unlock_anon_vma_read(struct anon_vma *anon_vma);
 int page_mapped_in_vma(struct page *page, struct vm_area_struct *vma);
 
+struct rmap_walk_control {
+   int (*main)(struct page *, struct vm_area_struct *,
+   unsigned long, void *);
+   void *arg;  /* argument to main function */
+};
+
 /*
  * Called by migrate.c to remove migration ptes, but might be used more later.
  */
-int rmap_walk(struct page *page, int (*rmap_one)(struct page *,
-   struct vm_area_struct *, unsigned long, void *), void *arg);
+int rmap_walk(struct page *page, struct rmap_walk_control *rwc);
 
 #else  /* !CONFIG_MMU */
 
diff --git a/mm/ksm.c b/mm/ksm.c
index 175fff7..c42ec30 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1997,8 +1997,7 @@ out:
 }
 
 #ifdef CONFIG_MIGRATION
-int rmap_walk_ksm(struct page *page, int (*rmap_one)(struct page *,
- struct vm_area_struct *, unsigned long, void *), void *arg)
+int rmap_walk_ksm(struct page *page, struct rmap_walk_control *rwc)
 {
struct stable_node *stable_node;
struct rmap_item *rmap_item;
@@ -2033,7 +2032,8 @@ again:
if ((rmap_item->mm == vma->vm_mm) == search_new_forks)
continue;
 
-   ret = rmap_one(page, vma, rmap_item->address, arg);
+   ret = rwc->main(page, vma,
+   rmap_item->address, rwc->arg);
if (ret != SWAP_AGAIN) {
anon_vma_unlock_read(anon_vma);
goto out;
diff --git a/mm/migrate.c b/mm/migrate.c
index bb94004..c47425b 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -198,7 +198,12 @@ out:
  */
 static void remove_migration_ptes(struct page *old, struct page *new)
 {
-   rmap_walk(new, remove_migration_pte, old);
+   struct rmap_walk_control rwc;
+
+   memset(&rwc, 0, sizeof(rwc));
+   rwc.main = remove_migration_pte;
+   rwc.arg = old;
+   rmap_walk(new, &rwc);
 }
 
 /*
diff --git a/mm/rmap.c b/mm/rmap.c
index 916f2ed..5933488 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1705,8 +1705,7 @@ static struct anon_vma *rmap_walk_anon_lock(struct page *page)
  * rmap_walk() and its helpers rmap_walk_anon() and rmap_walk_file():
  * Called by migrate.c to remove migration ptes, but might be used more later.
  */
-static int rmap_walk_anon(struct page *page, int (*rmap_one)(struct page *,
-   struct vm_area_struct *, unsigned long, void *), void *arg)
+static int rmap_walk_anon(struct page *page, struct rmap_walk_control *rwc)
 {
struct anon_vma *anon_vma;
pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
@@ -1720,7 +1719,7 @@ static int rmap_walk_anon(struct page *page, int (*rmap_one)(struct page *,
anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root, pgoff, pgoff) {
struct vm_area_struct *vma = avc->vma;
unsigned long address = vma_address(page, vma);
-   ret = rmap_one(page, vma, address, arg);
+   ret = rwc->main(page, vma, address, rwc->arg);
if (ret != SWAP_AGAIN)
break;
}
@@ -1728,8 +1727,7 @@ static int 

[PATCH 9/9] mm/rmap: use rmap_walk() in page_mkclean()

2013-11-27 Thread Joonsoo Kim
Now we have infrastructure in rmap_walk() to handle the differences
among the variants of the rmap traversing functions.

So, just use it in page_mkclean().

This patch makes the following changes.

1. remove some variants of rmap traversing functions.
cf> page_mkclean_file
2. mechanical change to use rmap_walk() in page_mkclean().
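
For readers who want the shape of the conversion without the mm details,
here is a user-space sketch under stated assumptions: region, walk(),
clean_one() and skip_non_shared() are hypothetical stand-ins for a vma,
rmap_walk(), page_mkclean_one() and skip_vma_non_shared(). It shows the
two ideas the diff below relies on: the per-mapping callback always asks
the walk to continue and reports its result through the arg pointer, and
an optional skip hook filters out mappings that are not shared.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define WALK_AGAIN 0	/* hypothetical stand-in for SWAP_AGAIN */

struct region {		/* stand-in for one vma mapping the page */
	int shared;
	int dirty;
};

struct walk_control {
	int (*main)(struct region *r, void *arg);
	void *arg;
	int (*skip)(struct region *r, void *arg);	/* optional filter */
};

static int walk(struct region *rs, size_t n, struct walk_control *wc)
{
	int ret = WALK_AGAIN;
	size_t i;

	assert(wc && wc->main);
	for (i = 0; i < n; i++) {
		/* A hook left NULL by memset() is simply not used. */
		if (wc->skip && wc->skip(&rs[i], wc->arg))
			continue;
		ret = wc->main(&rs[i], wc->arg);
		if (ret != WALK_AGAIN)
			break;
	}
	return ret;
}

/* Clean one mapping; the count travels through arg, not the return value. */
static int clean_one(struct region *r, void *arg)
{
	int *cleaned = arg;

	if (r->dirty) {
		r->dirty = 0;
		(*cleaned)++;
	}
	return WALK_AGAIN;	/* always keep walking */
}

static int skip_non_shared(struct region *r, void *arg)
{
	(void)arg;
	return !r->shared;
}

static int clean_all(struct region *rs, size_t n)
{
	struct walk_control wc;
	int cleaned = 0;

	memset(&wc, 0, sizeof(wc));
	wc.main = clean_one;
	wc.arg = &cleaned;
	wc.skip = skip_non_shared;
	walk(rs, n, &wc);
	return cleaned;
}
```

Returning the count through arg keeps the callback's return value free
for walk control, which is why the converted callback can return the
"continue" value unconditionally.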

Signed-off-by: Joonsoo Kim 

diff --git a/mm/rmap.c b/mm/rmap.c
index 5e78d5c..bbbc705 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -809,12 +809,13 @@ int page_referenced(struct page *page,
 }
 
 static int page_mkclean_one(struct page *page, struct vm_area_struct *vma,
-   unsigned long address)
+   unsigned long address, void *arg)
 {
struct mm_struct *mm = vma->vm_mm;
pte_t *pte;
spinlock_t *ptl;
int ret = 0;
+   int *cleaned = arg;
 
pte = page_check_address(page, mm, address, &ptl, 1);
if (!pte)
@@ -833,44 +834,46 @@ static int page_mkclean_one(struct page *page, struct vm_area_struct *vma,
 
pte_unmap_unlock(pte, ptl);
 
-   if (ret)
+   if (ret) {
mmu_notifier_invalidate_page(mm, address);
+   (*cleaned)++;
+   }
 out:
-   return ret;
+   return SWAP_AGAIN;
 }
 
-static int page_mkclean_file(struct address_space *mapping, struct page *page)
+static int skip_vma_non_shared(struct vm_area_struct *vma, void *arg)
 {
-   pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
-   struct vm_area_struct *vma;
-   int ret = 0;
-
-   BUG_ON(PageAnon(page));
+   if (vma->vm_flags & VM_SHARED)
+   return 0;
 
-   mutex_lock(&mapping->i_mmap_mutex);
-   vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
-   if (vma->vm_flags & VM_SHARED) {
-   unsigned long address = vma_address(page, vma);
-   ret += page_mkclean_one(page, vma, address);
-   }
-   }
-   mutex_unlock(&mapping->i_mmap_mutex);
-   return ret;
+   return 1;
 }
 
 int page_mkclean(struct page *page)
 {
-   int ret = 0;
+   struct address_space *mapping;
+   struct rmap_walk_control rwc;
+   int cleaned;
 
BUG_ON(!PageLocked(page));
 
-   if (page_mapped(page)) {
-   struct address_space *mapping = page_mapping(page);
-   if (mapping)
-   ret = page_mkclean_file(mapping, page);
-   }
+   if (!page_mapped(page))
+   return 0;
 
-   return ret;
+   mapping = page_mapping(page);
+   if (!mapping)
+   return 0;
+
+   memset(&rwc, 0, sizeof(rwc));
+   cleaned = 0;
+   rwc.main = page_mkclean_one;
+   rwc.arg = (void *)&cleaned;
+   rwc.vma_skip = skip_vma_non_shared;
+
+   rmap_walk(page, &rwc);
+
+   return cleaned;
 }
 EXPORT_SYMBOL_GPL(page_mkclean);
 
-- 
1.7.9.5



[f2fs-dev] [PATCH 4/5] f2fs: check return value of f2fs_readpage in find_data_page

2013-11-27 Thread Chao Yu
We should return an error from find_data_page() when f2fs_readpage()
fails, because in that case we do not get an up-to-date page.

Signed-off-by: Chao Yu 
---
 fs/f2fs/data.c |3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 2d02cf3..85071d6 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -240,6 +240,9 @@ struct page *find_data_page(struct inode *inode, pgoff_t index, bool sync)
 
err = f2fs_readpage(sbi, page, dn.data_blkaddr,
sync ? READ_SYNC : READA);
+   if (err)
+   return ERR_PTR(err);
+
if (sync) {
wait_on_page_locked(page);
if (!PageUptodate(page)) {
-- 
1.7.9.5



Re: [RFC patch 0/5] futex: Allow lockless empty check of hashbucket plist in futex_wake()

2013-11-27 Thread Davidlohr Bueso
On Wed, 2013-11-27 at 00:56 +0100, Thomas Gleixner wrote:
> On Tue, 26 Nov 2013, Davidlohr Bueso wrote:
> > On Tue, 2013-11-26 at 11:25 -0800, Davidlohr Bueso wrote:
> > > *sigh* I just realized I had some extra debugging options in the .config
> > > I ran for the patched kernel. This probably explains why the huge
> > > overhead. I'll rerun and report shortly.
> 
> So you pulled a FUTEX, i.e. F*d Up That EXperiment :)

hehe

> 
> > I'm very sorry about the false alarm -- after midnight my brain starts
> > to melt. After re-running everything on my laptop (yes, with the
> > correct .config file), I can see that the differences are rather minimal
> > and variation also goes down, as expected. I've also included the
> > results for the original atomic ops approach, which mostly measures the
> > atomic_dec when we dequeue the woken task. Results are in the noise
> > range and virtually the same for both approaches (at least on a smaller
> > x86_64 system).
> >
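> > +---------+-----------------------------+----------------------------+------------------------------+
> > | threads | baseline time (ms) [stddev] | barrier time (ms) [stddev] | atomicops time (ms) [stddev] |
> > +---------+-----------------------------+----------------------------+------------------------------+
> > |     512 | 2.8360 [0.5168]             | 4.4100 [1.1150]            | 3.8150 [1.3293]              |
> > |     256 | 2.5080 [0.6375]             | 2.3070 [0.5112]            | 2.5980 [0.9079]              |
> > |     128 | 1.0200 [0.4264]             | 1.3980 [0.3391]            | 1.5180 [0.4902]              |
> > |      64 | 0.7890 [0.2667]             | 0.6970 [0.3374]            | 0.4020 [0.2447]              |
> > |      32 | 0.1150 [0.0184]             | 0.1870 [0.1428]            | 0.1490 [0.1156]              |
> > +---------+-----------------------------+----------------------------+------------------------------+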
> 
> That probably wants more than 10 repeated runs to converge into stable
> numbers. Thanks for providing the atomicops comparison! That's very
> helpful.
> 

Sorry about the delay, I've been airborne all day.

Here are the results for 1000 runs. The numbers stabilize nicely as you
add more samples. I think we can conclude that there really isn't much
of an impact in either case.

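+---------+-----------------------------+----------------------------+------------------------------+
| threads | baseline time (ms) [stddev] | barrier time (ms) [stddev] | atomicops time (ms) [stddev] |
+---------+-----------------------------+----------------------------+------------------------------+
|     512 | 3.0959 [0.5293]             | 3.8402 [0.4282]            | 3.4274 [0.4418]              |
|     256 | 1.0973 [0.4023]             | 1.1162 [0.4167]            | 1.0768 [0.4130]              |
|     128 | 0.5062 [0.2110]             | 0.5221 [0.1867]            | 0.4249 [0.1922]              |
|      64 | 0.3146 [0.1312]             | 0.2580 [0.1302]            | 0.2555 [0.1266]              |
|      32 | 0.1448 [0.1022]             | 0.1568 [0.0838]            | 0.1478 [0.0788]              |
+---------+-----------------------------+----------------------------+------------------------------+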

> It would be interesting to measure the overhead on the waiter side as
> well for both approaches (mb and atomic_inc), but I'm sure that at
> least for x86 it's going to be in the same ballpark.

Yeah, I don't expect much difference either, but will do the experiment.

> So I discovered earlier today, that your atomic ops variant is working
> because the atomic_inc() in get_futex_key_refs() is accidentally
> providing the required memory barrier on the waker side (on x86 and
> all other architectures which have an implicit mb in atomic_inc()).
> 
> For !fshared ones it's even a full mb on all architectures today, see
> ihold().

Interesting.

> Aside of that get_user_pages_fast() is using atomic ops as well, not
> sure if it's a full mb on all architectures today, but it is on x86
> and others.
> 
> Now I'm tempted to turn this accidental into a documented behaviour.
> The basic requirement for the fast lockless check of the plist is:
> 
> record_waiter(hb)   |  *uaddr = newval
> mb  |  mb
> *uaddr == oldval ?  |  nr_waiters(hb) != 0?
> 
> So on the waker side we can conditionally (dependent on the arch
> implementation) rely on the mb in get_futex_key_refs(). See below.
> 
> Now it does not matter much in terms of barrier related overhead
> whether the record_waiter() is implemented via atomic_inc() or via the
> early enqueue into the plist + smp_mb. In both cases we have a full
> smp_mb(), whether implicit or explicit.
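
For reference, the ordering requirement quoted above can be sketched in
user-space C11. This is only a model under stated assumptions, not the
kernel implementation: futex_word and nr_waiters are hypothetical
stand-ins for *uaddr and a per-bucket waiter count, and the seq_cst
fences play the role of the full barriers (smp_mb()) in the diagram. In
the real code the store to *uaddr is done by user space before it calls
futex_wake(); the sketch folds it into the waker function.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int futex_word;	/* stand-in for *uaddr */
static atomic_int nr_waiters;	/* stand-in for the per-bucket waiter count */

/* Undo record_waiter() once the task no longer waits. */
static void unrecord_waiter(void)
{
	int prev = atomic_fetch_sub_explicit(&nr_waiters, 1,
					     memory_order_relaxed);

	assert(prev > 0);
}

/* Waiter side: record_waiter(hb); mb; *uaddr == oldval ? */
static bool waiter_may_sleep(int oldval)
{
	atomic_fetch_add_explicit(&nr_waiters, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	if (atomic_load_explicit(&futex_word, memory_order_relaxed) == oldval)
		return true;
	unrecord_waiter();	/* value already changed: do not sleep */
	return false;
}

/* Waker side: *uaddr = newval; mb; nr_waiters(hb) != 0 ? */
static bool waker_must_scan(int newval)
{
	atomic_store_explicit(&futex_word, newval, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&nr_waiters, memory_order_relaxed) != 0;
}
```

With a full fence on both sides, either the waiter sees the new value
and does not sleep, or the waker sees a non-zero waiter count and takes
the slow path; the case where both miss each other is excluded.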
> 
> And versus memory/cache footprint it's probably not relevant either
> whether we add the counter or not. Assumed we have no debug options
> enabled then the resulting size of futex_hash_bucket will be:
> 
>  16 bytes on 32bit (12 bytes 

[f2fs-dev] [PATCH 5/5] f2fs: convert recover_orphan_inodes to void

2013-11-27 Thread Chao Yu
Signed-off-by: Chao Yu 
---
 fs/f2fs/checkpoint.c |6 +++---
 fs/f2fs/f2fs.h   |2 +-
 fs/f2fs/super.c  |8 ++++----
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 7fe69ff..b28e61b 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -270,12 +270,12 @@ static void recover_orphan_inode(struct f2fs_sb_info *sbi, nid_t ino)
iput(inode);
 }
 
-int recover_orphan_inodes(struct f2fs_sb_info *sbi)
+void recover_orphan_inodes(struct f2fs_sb_info *sbi)
 {
block_t start_blk, orphan_blkaddr, i, j;
 
if (!is_set_ckpt_flags(F2FS_CKPT(sbi), CP_ORPHAN_PRESENT_FLAG))
-   return 0;
+   return;
 
sbi->por_doing = true;
start_blk = __start_cp_addr(sbi) + 1;
@@ -295,7 +295,7 @@ int recover_orphan_inodes(struct f2fs_sb_info *sbi)
/* clear Orphan Flag */
clear_ckpt_flags(F2FS_CKPT(sbi), CP_ORPHAN_PRESENT_FLAG);
sbi->por_doing = false;
-   return 0;
+   return;
 }
 
 static void write_orphan_inodes(struct f2fs_sb_info *sbi, block_t start_blk)
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index c8eb37e..bb96b64 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1114,7 +1114,7 @@ int acquire_orphan_inode(struct f2fs_sb_info *);
 void release_orphan_inode(struct f2fs_sb_info *);
 void add_orphan_inode(struct f2fs_sb_info *, nid_t);
 void remove_orphan_inode(struct f2fs_sb_info *, nid_t);
-int recover_orphan_inodes(struct f2fs_sb_info *);
+void recover_orphan_inodes(struct f2fs_sb_info *);
 int get_valid_checkpoint(struct f2fs_sb_info *);
 void set_dirty_dir_page(struct inode *, struct page *);
 void add_dirty_dir_inode(struct inode *);
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 9981b28..09a2b07 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -946,9 +946,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
}
 
/* if there are nt orphan nodes free them */
-   err = -EINVAL;
-   if (recover_orphan_inodes(sbi))
-   goto free_node_inode;
+   recover_orphan_inodes(sbi);
 
/* read root inode and dentry */
root = f2fs_iget(sb, F2FS_ROOT_INO(sbi));
@@ -957,8 +955,10 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
err = PTR_ERR(root);
goto free_node_inode;
}
-   if (!S_ISDIR(root->i_mode) || !root->i_blocks || !root->i_size)
+   if (!S_ISDIR(root->i_mode) || !root->i_blocks || !root->i_size) {
+   err = -EINVAL;
goto free_root_inode;
+   }
 
sb->s_root = d_make_root(root); /* allocate root dentry */
if (!sb->s_root) {
-- 
1.7.9.5



[f2fs-dev] [PATCH 3/5] f2fs: use true and false for boolean variable

2013-11-27 Thread Chao Yu
Signed-off-by: Chao Yu 
---
 fs/f2fs/node.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 954155b..daf8ee8 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -791,7 +791,7 @@ int truncate_xattr_node(struct inode *inode, struct page *page)
set_new_dnode(&dn, inode, page, npage, nid);
 
if (page)
-   dn.inode_page_locked = 1;
+   dn.inode_page_locked = true;
truncate_node(&dn);
return 0;
 }
-- 
1.7.9.5



[f2fs-dev] [PATCH 2/5] f2fs: add unlikely macro for compiler optimization

2013-11-27 Thread Chao Yu
Signed-off-by: Chao Yu 
---
 fs/f2fs/node.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 0fe9a97..954155b 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1160,7 +1160,7 @@ int wait_on_node_pages_writeback(struct f2fs_sb_info *sbi, nid_t ino)
struct page *page = pvec.pages[i];
 
/* until radix tree lookup accepts end_index */
-   if (page->index > end)
+   if (unlikely(page->index > end))
continue;
 
if (ino && ino_of_node(page) == ino) {
-- 
1.7.9.5



[f2fs-dev] [PATCH 1/5] f2fs: correct type of wait in struct bio_private

2013-11-27 Thread Chao Yu
Signed-off-by: Chao Yu 
---
 fs/f2fs/segment.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index b84dd23..5f733ec 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -96,7 +96,7 @@
 struct bio_private {
struct f2fs_sb_info *sbi;
bool is_sync;
-   void *wait;
+   struct completion *wait;
 };
 
 /*
-- 
1.7.9.5



[PATCH v2] ASoC: SGTL5000: Fix kernel failed while trying to get optional VDDD supply.

2013-11-27 Thread Xiubo Li
The SGTL5000 requires two external power supplies: VDDA and VDDIO. An
optional third external power supply, VDDD, may be provided to achieve
lower power. If an external supply is not used for VDDD, the SGTL5000
driver registers its own regulator device, which then provides the VDDD
supply. In that case two regulator devices exist: the local regulator for
VDDD and the platform regulator for VDDIO and VDDA.

*******************
*                 *---|3.3V VDDIO
*                 *
*  SGTL5000 codec *---x   VDDD
*                 *
*                 *---|3.3V VDDA
*******************

If an external supply is not used for VDDD, only the "VDDA-supply" and
"VDDIO-supply" properties are present in the DT or architecture-specific
file. This causes the following kernel failure when the driver tries to
get the external VDDD supply before registering its own regulator device:

sgtl5000 0-000a: Failed to get supply 'VDDD': -19

Use regulator_get_optional() here to check whether the external VDDD
supply is used or not.

Signed-off-by: Xiubo Li 
---
 sound/soc/codecs/sgtl5000.c | 25 ++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/sound/soc/codecs/sgtl5000.c b/sound/soc/codecs/sgtl5000.c
index 1f4093f..90d6d1e 100644
--- a/sound/soc/codecs/sgtl5000.c
+++ b/sound/soc/codecs/sgtl5000.c
@@ -1298,6 +1298,21 @@ static int sgtl5000_replace_vddd_with_ldo(struct snd_soc_codec *codec)
return 0;
 }
 
+static int sgtl5000_external_vddd_used(struct snd_soc_codec *codec)
+{
+   struct regulator *consumer;
+   struct sgtl5000_priv *sgtl5000 = snd_soc_codec_get_drvdata(codec);
+
+   consumer = regulator_get_optional(codec->dev,
+   sgtl5000->supplies[VDDD].supply);
+   if (IS_ERR(consumer))
+   return 0;
+
+   regulator_put(consumer);
+
+   return 1;
+}
+
 static int sgtl5000_enable_regulators(struct snd_soc_codec *codec)
 {
int reg;
@@ -1310,11 +1325,15 @@ static int sgtl5000_enable_regulators(struct snd_soc_codec *codec)
for (i = 0; i < ARRAY_SIZE(sgtl5000->supplies); i++)
sgtl5000->supplies[i].supply = supply_names[i];
 
-   ret = regulator_bulk_get(codec->dev, ARRAY_SIZE(sgtl5000->supplies),
+   if (sgtl5000_external_vddd_used(codec)) {
+   ret = regulator_bulk_get(codec->dev,
+   ARRAY_SIZE(sgtl5000->supplies),
sgtl5000->supplies);
-   if (!ret)
+   if (ret)
+   return ret;
+
external_vddd = 1;
-   else {
+   } else {
ret = sgtl5000_replace_vddd_with_ldo(codec);
if (ret)
return ret;
-- 
1.8.4




Re: ARM: nommu: Unable to allocate RAM for process text/data, errno 12

2013-11-27 Thread Axel Lin
2013/11/27 Andrew Morton :
> On Tue, 26 Nov 2013 17:29:29 +0800 Axel Lin  wrote:
>
>> Hi,
>> I got below error messages while starting mdev (busybox).
>>
>> ...
>>
>> [  108.537109] chmod: page allocation failure: order:8, mode:0xd0
>
> It wants to allocate 2^8 physically contiguous pages!
>
>> [  108.543945] CPU: 0 PID: 47 Comm: chmod Not tainted 3.13.0-rc1-00170-g1bab531-dirty #1940
>> [  108.580078] [] (unwind_backtrace+0x0/0xe0) from [] (show_stack+0x10/0x14)
>> [  108.592773] [] (show_stack+0x10/0x14) from [<00050010>] (warn_alloc_failed+0xf8/0x128)
>> [  108.605468] [<00050010>] (warn_alloc_failed+0xf8/0x128) from [<00052030>] (__alloc_pages_nodemask+0x64c/0x6c4)
>> [  108.620117] [<00052030>] (__alloc_pages_nodemask+0x64c/0x6c4) from [<0005f028>] (do_mmap_pgoff+0x5d0/0x9b0)
>> [  108.633789] [<0005f028>] (do_mmap_pgoff+0x5d0/0x9b0) from [<0005ac04>] (vm_mmap_pgoff+0x64/0x7c)
>> [  108.647460] [<0005ac04>] (vm_mmap_pgoff+0x64/0x7c) from [<0009e6e8>] (load_flat_binary+0x38c/0xa0c)
>> [  108.660156] [<0009e6e8>] (load_flat_binary+0x38c/0xa0c) from [<0006bc40>] (search_binary_handler+0x4c/0xa4)
>> [  108.676757] [<0006bc40>] (search_binary_handler+0x4c/0xa4) from [<0006bfc8>] (do_execve+0x330/0x4e8)
>> [  108.689453] [<0006bfc8>] (do_execve+0x330/0x4e8) from [<0006c3c4>] (SyS_execve+0x30/0x44)
>> [  108.701171] [<0006c3c4>] (SyS_execve+0x30/0x44) from [<8f40>] (ret_fast_syscall+0x0/0x44)
>
> So the binfmt_flat driver is allocating memory into which to load
> mdev's text (I assume it's the text).
>
>> Why it got page allocation failure?
>
> Because 256 physically contiguous free pages were not available.
>
>> Does that mean it run into OOM?
>
> Nope.
>
>> Seem the system still has enough memory available.
>
> Sure, but it is too fragmented.  Get an MMU ;)
>
>
> otoh, memory reclaim *should* have at least reclaimed non-mmapped
> pagecache.  Shooting down lots of pagecache is preferable to failing
> exec().  But I expect the PAGE_ALLOC_COSTLY_ORDER logic prevents the kernel
> from trying to do this.
>
> If it's repeatable then something like this:
>
> --- a/mm/nommu.c~a
> +++ a/mm/nommu.c
> @@ -1173,7 +1173,7 @@ static int do_mmap_private(struct vm_are
> order = get_order(len);
> kdebug("alloc order %d for %lx", order, len);
>
> -   pages = alloc_pages(GFP_KERNEL, order);
> +   pages = alloc_pages(GFP_KERNEL|__GFP_REPEAT, order);
> if (!pages)
> goto enomem;
>
>
> *might* help.

Hi Andrew,
Thanks for your reply.

I tried booting a couple of times with your patch.
Sometimes I still see the same messages (above) with your patch applied.

I'm trying to remove unnecessary features to reduce memory usage.
(This seems to help: I have more free memory, so there is less chance of
hitting the memory allocation failure.)
Is it possible to find out the current memory consumption (slab) on a
running system?

BTW, I'm wondering what the guideline is for choosing the slab allocator
(especially for embedded platforms without an MMU).

I googled for slab/slub/slob and found some material [1] that says:
"SLOB (Simple List Of Blocks) is a memory allocator optimized for
embedded systems with very little memory, on the order of megabytes."
But it also says SLOB suffers from pathological fragmentation.
So I'm wondering whether I should choose SLOB or not. (Currently, I'm using SLUB.)

[1] 
http://stackoverflow.com/questions/15470560/what-to-choose-between-slab-and-slub-allocator-in-linux-kernel

Regards,
Axel


Re: page fault deadlock

2013-11-27 Thread Xiaotian Feng
On Thu, Nov 28, 2013 at 12:11 PM, Greg KH  wrote:
> On Thu, Nov 28, 2013 at 11:25:32AM +0800, Xiaotian Feng wrote:
>> Hi,
>>
>> When I upgraded to the latest kernel, my system hung. It is
>> reproducible on my virtualbox: each time I mounted my RAID6
>> partition and tried to run vi or build the kernel, the whole
>> system locked up very soon.
>>
>> After turning on lockdep, I found following lockdep warning:
>>
>> [   27.848462]
>> [   27.848471] ==
>> [   27.848477] [ INFO: possible circular locking dependency detected ]
>> [   27.848484] 3.13.0-rc1+ #1 Tainted: GF   W
>> [   27.848490] ---
>> [   27.848496] Xorg/1268 is trying to acquire lock:
>> [   27.848501]  (>mutex){+.+.+.}, at: []
>> sysfs_bin_mmap+0x4f/0x120
>> [   27.848516]
>> [   27.848516] but task is already holding lock:
>> [   27.848521]  (>mmap_sem){++}, at: []
>> vm_mmap_pgoff+0x6f/0xc0
>> [   27.848534]
>> [   27.848534] which lock already depends on the new lock.
>> [   27.848534]
>> [   27.848541]
>> [   27.848541] the existing dependency chain (in reverse order) is:
>> [   27.848547]
>> [   27.848547] -> #2 (>mmap_sem){++}:
>> [   27.848556][] lock_acquire+0xb0/0x160
>> [   27.848564][] might_fault+0x8c/0xb0
>> [   27.848572][] md_ioctl+0xa78/0x19b0
>> [   27.848580][] blkdev_ioctl+0x234/0x840
>> [   27.848588][] block_ioctl+0x41/0x50
>> [   27.848597][] do_vfs_ioctl+0x300/0x520
>> [   27.848605][] SyS_ioctl+0x81/0xa0
>> [   27.848613][] tracesys+0xe1/0xe6
>> [   27.848622]
>> [   27.848622] -> #1 (>reconfig_mutex){+.+.+.}:
>> [   27.848630][] lock_acquire+0xb0/0x160
>> [   27.848637][]
>> mutex_lock_interruptible_nested+0x78/0x610
>> [   27.848646][] rdev_attr_show+0x40/0x90
>> [   27.848654][] sysfs_seq_show+0xda/0x170
>> [   27.848662][] seq_read+0x164/0x3e0
>> [   27.848671][] vfs_read+0x95/0x160
>> [   27.848680][] SyS_read+0x49/0xa0
>> [   27.848687][] tracesys+0xe1/0xe6
>> [   27.848695]
>> [   27.848695] -> #0 (>mutex){+.+.+.}:
>> [   27.848703][] __lock_acquire+0x1587/0x1ca0
>> [   27.848711][] lock_acquire+0xb0/0x160
>> [   27.848718][] mutex_lock_nested+0x68/0x510
>> [   27.848725][] sysfs_bin_mmap+0x4f/0x120
>> [   27.848732][] mmap_region+0x3ed/0x5d0
>> [   27.848741][] do_mmap_pgoff+0x34e/0x3d0
>> [   27.848748][] vm_mmap_pgoff+0x90/0xc0
>> [   27.848755][] SyS_mmap_pgoff+0x1d5/0x270
>> [   27.848763][] SyS_mmap+0x22/0x30
>> [   27.848771][] tracesys+0xe1/0xe6
>> [   27.848778]
>> [   27.848778] other info that might help us debug this:
>> [   27.848778]
>> [   27.848785] Chain exists of:
>> [   27.848785]   >mutex --> >reconfig_mutex --> >mmap_sem
>> [   27.848785]
>> [   27.848795]  Possible unsafe locking scenario:
>> [   27.848795]
>> [   27.848800]CPU0CPU1
>> [   27.848805]
>> [   27.848810]   lock(>mmap_sem);
>> [   27.848817]lock(>reconfig_mutex);
>> [   27.848824]lock(>mmap_sem);
>> [   27.848830]   lock(>mutex);
>> [   27.848837]
>> [   27.848837]  *** DEADLOCK ***
>> [   27.848837]
>> [   27.848844] 1 lock held by Xorg/1268:
>> [   27.848849]  #0:  (>mmap_sem){++}, at: []
>> vm_mmap_pgoff+0x6f/0xc0
>> [   27.848861]
>> [   27.848861] stack backtrace:
>> [   27.848868] CPU: 1 PID: 1268 Comm: Xorg Tainted: GF   W
>> 3.13.0-rc1+ #1
>> [   27.848873] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
>> VirtualBox 12/01/2006
>> [   27.848879]  822daa00 8800d0371bc8 817725f7
>> 822cbdc0
>> [   27.848901]  8800d0371c08 8176d9eb 8800d0371c60
>> 880115b42a78
>> [   27.848909]   880115b42a78 880115b422a0
>> 0001
>> [   27.848918] Call Trace:
>> [   27.848930]  [] dump_stack+0x4e/0x7a
>> [   27.848942]  [] print_circular_bug+0x1f9/0x208
>> [   27.848952]  [] __lock_acquire+0x1587/0x1ca0
>> [   27.848964]  [] ? print_context_stack+0x8f/0x100
>> [   27.848975]  [] lock_acquire+0xb0/0x160
>> [   27.848986]  [] ? sysfs_bin_mmap+0x4f/0x120
>> [   27.848996]  [] ? sysfs_bin_mmap+0x4f/0x120
>> [   27.849007]  [] mutex_lock_nested+0x68/0x510
>> [   27.849016]  [] ? sysfs_bin_mmap+0x4f/0x120
>> [   27.849027]  [] ? kmemleak_alloc+0x4e/0xb0
>> [   27.849038]  [] sysfs_bin_mmap+0x4f/0x120
>> [   27.849048]  [] mmap_region+0x3ed/0x5d0
>> [   27.849058]  [] do_mmap_pgoff+0x34e/0x3d0
>> [   27.849070]  [] vm_mmap_pgoff+0x90/0xc0
>> [   27.849080]  [] SyS_mmap_pgoff+0x1d5/0x270
>> [   27.849092]  [] ? syscall_trace_enter+0x145/0x270
>> [   27.849102]  [] SyS_mmap+0x22/0x30
>> [   27.849112]  [] tracesys+0xe1/0xe6
>>
>>
>> I think it is a real deadlock, and it is caused 
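
The "Chain exists of" part of the lockdep report above can be read as a small graph problem: the existing lock ordering is mutex --> reconfig_mutex --> mmap_sem, and the Xorg task holds mmap_sem while asking for mutex. A minimal userspace sketch of that check follows. It is only a model of the idea, not lockdep itself; the enum values and the functions add_dependency() and would_deadlock() are hypothetical names chosen for illustration.

```c
#include <assert.h>

/* The three locks named in the report; identifiers are illustrative only. */
enum { MUTEX, RECONFIG_MUTEX, MMAP_SEM, NLOCKS };

/* edge[a][b] != 0 means "b has been taken while a was held" (a --> b). */
static int edge[NLOCKS][NLOCKS];

static void add_dependency(int held, int wanted)
{
    assert(held >= 0 && held < NLOCKS && wanted >= 0 && wanted < NLOCKS);
    edge[held][wanted] = 1;
}

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static int reachable(int from, int to, int *seen)
{
    if (from == to)
        return 1;
    seen[from] = 1;
    for (int next = 0; next < NLOCKS; next++)
        if (edge[from][next] && !seen[next] && reachable(next, to, seen))
            return 1;
    return 0;
}

/*
 * Taking 'wanted' while holding 'held' adds the edge held --> wanted.
 * That closes a cycle if 'held' is already reachable from 'wanted'.
 */
static int would_deadlock(int held, int wanted)
{
    int seen[NLOCKS] = { 0 };

    return reachable(wanted, held, seen);
}
```

With the two edges from the report recorded, would_deadlock(MMAP_SEM, MUTEX) reports a cycle, while the opposite order, would_deadlock(MUTEX, MMAP_SEM), does not.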

Re: [RFC 9/9] of/irq: create interrupts-extended property

2013-11-27 Thread Peter Crosthwaite
On Thu, Nov 28, 2013 at 12:17 AM, Grant Likely  wrote:
> On Wed, 27 Nov 2013 19:06:35 +1000, Peter Crosthwaite 
>  wrote:
>> On Mon, Nov 25, 2013 at 7:32 AM, Grant Likely  
>> wrote:
>> > On Sun, 24 Nov 2013 17:04:52 +1000, Peter Crosthwaite 
>> >  wrote:
>> >> On Wed, Nov 13, 2013 at 4:14 PM, Grant Likely  
>> >> wrote:
>> >> > On Wed, 13 Nov 2013 09:17:01 +1000, Peter Crosthwaite 
>> >> >  wrote:
>> >> >> It's going to get a little verbose once you start making multiple
>> >> >> connections as you need one mux per wire. Perhaps it could be cleaned
>> >> >> up by making the foo_irq_mux node(s) a child of foo?
>> >> >
>> >> > It could, but then you need some way of attaching a driver to that node,
>> >> > and that would require building knowledge into the driver again.
>> >> >
>> >> > Can you boil it down to a couple of concrete examples? What is a
>> >> > specific example of how the platform should decide which interrupt line
>> >> > to use?
>> >> >
>> >>
>> >> So I've spent some time playing with this. I now have a booting kernel
>> >> with multiple root interrupt controllers and peripheral devices
>> >> multiply-connected to both root controllers. But only one of the
>> >> controllers is used by Linux (as Linux being able to use multiple
>> >> intcs is a non-trivial problem). So the scheme I am using is to have
>> >> one of these root intc's marked as disabled via
>> >
>> > Multiple intc's should be a solved problem. What issue are you seeing?
>> > Or is this a microblaze specific problem?
>> >
>>
>> It's multiple root (i.e. having no explicit parent) interrupt
>> controllers. And Linux doesn't respect status = "disabled" for
>> interrupt controllers at all, it seems.
>
> That can be fixed.  :-)
>

Patches on list.

Regards,
Peter

> g.
>


[PATCH v1 1/2] of: irq: Ignore disabled intc's when searching map

2013-11-27 Thread Peter Crosthwaite
When searching the interrupt map, if a matched parent is disabled, just
ignore it and move on with the search.

This allows for specifying connection of a single device IRQ to
multiple interrupt controllers via the interrupt map schema. This change
allows for selection of the active interrupt controller via the already
existing status = "disabled" mechanism.

Signed-off-by: Peter Crosthwaite 
Acked-by: Michal Simek 
---
 drivers/of/irq.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/of/irq.c b/drivers/of/irq.c
index 786b0b4..22e414b 100644
--- a/drivers/of/irq.c
+++ b/drivers/of/irq.c
@@ -217,6 +217,9 @@ int of_irq_parse_raw(const __be32 *addr, struct 
of_phandle_args *out_irq)
goto fail;
}
 
+   if (!of_device_is_available(newpar))
+   match = 0;
+
/* Get #interrupt-cells and #address-cells of new
 * parent
 */
-- 
1.8.4.4



[PATCH v1 2/2] of: irq: Ignore disabled interrupt controllers

2013-11-27 Thread Peter Crosthwaite
When searching the system for interrupt controllers, skip over any
that are explicitly disabled.

This makes interrupt controllers consistent with regular devices,
which can be marked as do-not-probe via the status = "disabled" dts
property.

Signed-off-by: Peter Crosthwaite 
Acked-by: Michal Simek 
---
 drivers/of/irq.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/of/irq.c b/drivers/of/irq.c
index 22e414b..bf80268 100644
--- a/drivers/of/irq.c
+++ b/drivers/of/irq.c
@@ -441,7 +441,8 @@ void __init of_irq_init(const struct of_device_id *matches)
INIT_LIST_HEAD(_parent_list);
 
for_each_matching_node(np, matches) {
-   if (!of_find_property(np, "interrupt-controller", NULL))
+   if (!of_find_property(np, "interrupt-controller", NULL) ||
+   !of_device_is_available(np))
continue;
/*
 * Here, we allocate and populate an intc_desc with the node
-- 
1.8.4.4



[PATCH v1 0/2] of: irq: Ignore disabled interrupt controllers

2013-11-27 Thread Peter Crosthwaite
Hi All,

These two patches support the case where an interrupt controller is
marked as disabled. Patch 1 fixes the dts interrupt-map search logic to
ignore disabled interrupt controllers. Patch 2 stops disabled interrupt
controllers from getting probed.

Peter Crosthwaite (2):
  of: irq: Ignore disabled intc's when searching map
  of: irq: Ignore disabled interrupt controllers

 drivers/of/irq.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

-- 
1.8.4.4
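
The selection rule that both patches rely on can be sketched in plain userspace C. This is a minimal model under stated assumptions, not the kernel code: struct node and first_active_intc() are hypothetical names, and node_is_available() only mirrors the rule applied by of_device_is_available(), where a node counts as available when its status property is absent, "okay" or "ok".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device tree node (not struct device_node). */
struct node {
    const char *name;
    const char *status;  /* NULL models an absent "status" property */
    int is_intc;         /* models the "interrupt-controller" property */
};

/* Available means: no status property, or status is "okay" or "ok". */
static int node_is_available(const struct node *np)
{
    assert(np != NULL);
    return np->status == NULL ||
           strcmp(np->status, "okay") == 0 ||
           strcmp(np->status, "ok") == 0;
}

/*
 * Walk the candidate nodes and return the first interrupt controller
 * that is not disabled, or NULL if there is none.
 */
static const struct node *first_active_intc(const struct node *nodes,
                                            size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (!nodes[i].is_intc || !node_is_available(&nodes[i]))
            continue;  /* not a controller, or explicitly disabled */
        return &nodes[i];
    }
    return NULL;
}
```

With two root controllers described for the same device, marking one of them status = "disabled" leaves the other as the only match, which is the selection mechanism the cover letter describes.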



Re: 3.13.0-rc1+ INFO: inconsistent lock state

2013-11-27 Thread Hannes Frederic Sowa
On Thu, Nov 28, 2013 at 04:17:56AM +, 허종만 wrote:
> With current linus git (HEAD 8ae516aa), I got following kernel message.
> This is slightly different from what I had reported recently (14th Nov).

I guess the problem is that the _BH variants of the STATS functions are
meant to be used only when already executing in bottom half, while people
may expect them to protect against bottom halves. We seem to have more of
these problems in udp_sendmsg (this trace) and ping_sendmsg, and maybe in
tcp, too. It is the non-_BH functions that actually disable bh.

I'll check for more problems and will post a patch.

Thanks for reporting,

  Hannes



Re: [PATCH 1/1] ASoC: soc_compress: Add set_metadata

2013-11-27 Thread Vinod Koul
On Mon, Nov 25, 2013 at 10:16:48AM +, Richard Fitzgerald wrote:
> 
> Pass the set_metadata() calls through to the codec driver.
> 
> Signed-off-by: Zhao Weijia 
> Signed-off-by: Richard Fitzgerald 
> ---
>  sound/soc/soc-compress.c |   18 +-
>  1 files changed, 17 insertions(+), 1 deletions(-)
> 
> diff --git a/sound/soc/soc-compress.c b/sound/soc/soc-compress.c
> index 53c9ecd..186802b 100644
> --- a/sound/soc/soc-compress.c
> +++ b/sound/soc/soc-compress.c
> @@ -318,6 +318,21 @@ static int soc_compr_pointer(struct snd_compr_stream 
> *cstream,
>   mutex_unlock(&rtd->pcm_mutex);
>   return 0;
>  }
> +static int soc_compr_set_metadata(struct snd_compr_stream *cstream,
> + struct snd_compr_metadata *metadata)
> +{
> + struct snd_soc_pcm_runtime *rtd = cstream->private_data;
> + struct snd_soc_platform *platform = rtd->platform;
> +
> + mutex_lock_nested(&rtd->pcm_mutex, rtd->pcm_subclass);
> +
> + if (platform->driver->compr_ops && 
> platform->driver->compr_ops->set_metadata)
> +  platform->driver->compr_ops->set_metadata(cstream, metadata);
> +
> + mutex_unlock(&rtd->pcm_mutex);
> + return 0;
> +}
> +
>  
>  static int soc_compr_copy(struct snd_compr_stream *cstream,
> char __user *buf, size_t count)
> @@ -372,7 +387,8 @@ static struct snd_compr_ops soc_compr_ops = {
>   .pointer= soc_compr_pointer,
>   .ack= soc_compr_ack,
>   .get_caps   = soc_compr_get_caps,
> - .get_codec_caps = soc_compr_get_codec_caps
> + .get_codec_caps = soc_compr_get_codec_caps,
> + .set_metadata = soc_compr_set_metadata,
Sorry, I am confused by this and by the patch description. The set_metadata
call exists, so how is this propagating to the codec? Also, which tree has
this been generated against?

--
~Vinod


Re: [REGRESSION][v3.4-rc1] PCI: add a PCI resource reallocation config option

2013-11-27 Thread Yinghai Lu
On Wed, Nov 27, 2013 at 12:22 PM, Joseph Salisbury
 wrote:
> Hi Yinghai,
>
> A kernel bug was opened against Ubuntu [0].  After a kernel bisect, it
> was found that the following commit introduced the bug:
>
> commit b07f2ebc109b607789f648dedcff4b125f9afec6
> Author: Yinghai Lu 
> Date:   Thu Feb 23 19:23:32 2012 -0800
>
> PCI: add a PCI resource reallocation config option
>
>
>
> The regression was introduced as of v3.4-rc1 and also exists in current
> mainline.
>
> The bug seems platform specific since we have not had a lot of other
> reports.  The bug is preventing ixgbe from probing Intel x520 NICs.
>
> It would be easy enough to disable the PCI_REALLOC_ENABLE_AUTO config
> option, but I wanted to get your feedback since you are the author.  It
> looks like the bug reporter may also be able to work around the problem
> with the pci=realloc=.

The BIOS has a problem: it places the ROM BARs of two functions at the
same address, and the root bus does not have enough MMIO range, which
confuses the realloc logic.

Please try the attached patches on top of Linus's tree, or on your internal tree.

Thanks

Yinghai
Subject: [PATCH] PCI: pcibus address to resource converting take bus instead of dev

For allocating resource under bus path, we do not have dev to pass along,
and we only have bus to use instead.

-v2: drop pcibios_bus_addr_to_resource().

Signed-off-by: Yinghai Lu 

---
 drivers/pci/host-bridge.c |   34 +-
 include/linux/pci.h   |5 +
 2 files changed, 26 insertions(+), 13 deletions(-)

Index: linux-2.6/drivers/pci/host-bridge.c
===
--- linux-2.6.orig/drivers/pci/host-bridge.c
+++ linux-2.6/drivers/pci/host-bridge.c
@@ -9,22 +9,19 @@
 
 #include "pci.h"
 
-static struct pci_bus *find_pci_root_bus(struct pci_dev *dev)
+static struct pci_bus *find_pci_root_bus(struct pci_bus *bus)
 {
-	struct pci_bus *bus;
-
-	bus = dev->bus;
 	while (bus->parent)
 		bus = bus->parent;
 
 	return bus;
 }
 
-static struct pci_host_bridge *find_pci_host_bridge(struct pci_dev *dev)
+static struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus)
 {
-	struct pci_bus *bus = find_pci_root_bus(dev);
+	struct pci_bus *root_bus = find_pci_root_bus(bus);
 
-	return to_pci_host_bridge(bus->bridge);
+	return to_pci_host_bridge(root_bus->bridge);
 }
 
 void pci_set_host_bridge_release(struct pci_host_bridge *bridge,
@@ -40,10 +37,11 @@ static bool resource_contains(struct res
 	return res1->start <= res2->start && res1->end >= res2->end;
 }
 
-void pcibios_resource_to_bus(struct pci_dev *dev, struct pci_bus_region *region,
-			 struct resource *res)
+void __pcibios_resource_to_bus(struct pci_bus *bus,
+  struct pci_bus_region *region,
+  struct resource *res)
 {
-	struct pci_host_bridge *bridge = find_pci_host_bridge(dev);
+	struct pci_host_bridge *bridge = find_pci_host_bridge(bus);
 	struct pci_host_bridge_window *window;
 	resource_size_t offset = 0;
 
@@ -60,6 +58,11 @@ void pcibios_resource_to_bus(struct pci_
 	region->start = res->start - offset;
 	region->end = res->end - offset;
 }
+void pcibios_resource_to_bus(struct pci_dev *dev, struct pci_bus_region *region,
+			 struct resource *res)
+{
+	__pcibios_resource_to_bus(dev->bus, region, res);
+}
 EXPORT_SYMBOL(pcibios_resource_to_bus);
 
 static bool region_contains(struct pci_bus_region *region1,
@@ -68,10 +71,10 @@ static bool region_contains(struct pci_b
 	return region1->start <= region2->start && region1->end >= region2->end;
 }
 
-void pcibios_bus_to_resource(struct pci_dev *dev, struct resource *res,
-			 struct pci_bus_region *region)
+void __pcibios_bus_to_resource(struct pci_bus *bus, struct resource *res,
+  struct pci_bus_region *region)
 {
-	struct pci_host_bridge *bridge = find_pci_host_bridge(dev);
+	struct pci_host_bridge *bridge = find_pci_host_bridge(bus);
 	struct pci_host_bridge_window *window;
 	resource_size_t offset = 0;
 
@@ -93,4 +96,9 @@ void pcibios_bus_to_resource(struct pci_
 	res->start = region->start + offset;
 	res->end = region->end + offset;
 }
+void pcibios_bus_to_resource(struct pci_dev *dev, struct resource *res,
+			 struct pci_bus_region *region)
+{
+	__pcibios_bus_to_resource(dev->bus, res, region);
+}
 EXPORT_SYMBOL(pcibios_bus_to_resource);
Index: linux-2.6/include/linux/pci.h
===
--- linux-2.6.orig/include/linux/pci.h
+++ linux-2.6/include/linux/pci.h
@@ -738,8 +738,13 @@ void pci_fixup_cardbus(struct pci_bus *)
 
 /* Generic PCI functions used internally */
 
+void __pcibios_resource_to_bus(struct pci_bus *bus,
+			   struct pci_bus_region *region,
+			   struct resource *res);
 void pcibios_resource_to_bus(struct pci_dev *dev, struct pci_bus_region *region,
 			 struct resource *res);
+void __pcibios_bus_to_resource(struct pci_bus *bus, struct resource *res,
+  struct pci_bus_region *region);
 void pcibios_bus_to_resource(struct 
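
The shape of the change in the patch above (a bus-based helper doing the work, with the old dev-based entry point reduced to a thin wrapper) can be sketched in userspace C. This is a simplified model, not the kernel implementation: all type and function names are hypothetical, the windows are stored directly on the root bus instead of on a host bridge, and the resource type check is left out.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long long addr_t;

struct range { addr_t start, end; };                 /* inclusive bounds */
struct window { struct range res; addr_t offset; };  /* CPU range, CPU-to-bus offset */
struct bus {
    struct bus *parent;             /* NULL for the root bus */
    const struct window *windows;   /* only meaningful on the root bus */
    size_t nr_windows;
};
struct dev { struct bus *bus; };

/* Walk up the parent links until the root bus is reached. */
static struct bus *find_root_bus(struct bus *bus)
{
    assert(bus != NULL);
    while (bus->parent)
        bus = bus->parent;
    return bus;
}

static int range_contains(const struct range *outer, const struct range *inner)
{
    return outer->start <= inner->start && outer->end >= inner->end;
}

/* Bus-based conversion: usable even when no device is at hand. */
static void bus_resource_to_bus(struct bus *bus, struct range *region,
                                const struct range *res)
{
    struct bus *root = find_root_bus(bus);
    addr_t offset = 0;

    /* Use the offset of the first window that fully contains the resource. */
    for (size_t i = 0; i < root->nr_windows; i++) {
        if (range_contains(&root->windows[i].res, res)) {
            offset = root->windows[i].offset;
            break;
        }
    }
    region->start = res->start - offset;
    region->end = res->end - offset;
}

/* Device-based entry point kept for existing callers; it only forwards. */
static void dev_resource_to_bus(struct dev *dev, struct range *region,
                                const struct range *res)
{
    bus_resource_to_bus(dev->bus, region, res);
}
```

Callers that allocate resources along a bus path can call the bus-based helper directly, while existing callers keep the device-based signature unchanged.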

Re: [PATCH 0/3] makedumpfile: hugepage filtering for vmcore dump

2013-11-27 Thread Atsushi Kumagai
On 2013/11/22 16:18:20, kexec  wrote:
> (2013/11/07 9:54), HATAYAMA Daisuke wrote:
> > (2013/11/06 11:21), Atsushi Kumagai wrote:
> >> (2013/11/06 5:27), Vivek Goyal wrote:
> >>> On Tue, Nov 05, 2013 at 09:45:32PM +0800, Jingbai Ma wrote:
> >>>> This patch set intend to exclude unnecessary hugepages from vmcore dump
> >>>> file.
> >>>>
> >>>> This patch requires the kernel patch to export necessary data structures
> >>>> into vmcore: "kexec: export hugepage data structure into vmcoreinfo"
> >>>> http://lists.infradead.org/pipermail/kexec/2013-November/009997.html
> >>>>
> >>>> This patch introduce two new dump levels 32 and 64 to exclude all unused
> >>>> and active hugepages. The level to exclude all unnecessary pages will be
> >>>> 127 now.
> >>>
> >>> Interesting. Why hugepages should be treated any differentely than normal
> >>> pages?
> >>>
> >>> If user asked to filter out free page, then it should be filtered and
> >>> it should not matter whether it is a huge page or not?
> >>
> >> I'm making a RFC patch of hugepages filtering based on such policy.
> >>
> >> I attach the prototype version.
> >> It's able to filter out also THPs, and suitable for cyclic processing
> >> because it depends on mem_map and looking up it can be divided into
> >> cycles. This is the same idea as page_is_buddy().
> >>
> >> So I think it's better.
> >>
> >
> >> @@ -4506,14 +4583,49 @@ __exclude_unnecessary_pages(unsigned long mem_map,
> >>&& !isAnon(mapping)) {
> >>if (clear_bit_on_2nd_bitmap_for_kernel(pfn))
> >>pfn_cache_private++;
> >> +/*
> >> + * NOTE: If THP for cache is introduced, the check for
> >> + *   compound pages is needed here.
> >> + */
> >>}
> >>/*
> >> * Exclude the data page of the user process.
> >> */
> >> -else if ((info->dump_level & DL_EXCLUDE_USER_DATA)
> >> -&& isAnon(mapping)) {
> >> -if (clear_bit_on_2nd_bitmap_for_kernel(pfn))
> >> -pfn_user++;
> >> +else if (info->dump_level & DL_EXCLUDE_USER_DATA) {
> >> +/*
> >> + * Exclude the anonnymous pages as user pages.
> >> + */
> >> +if (isAnon(mapping)) {
> >> +if (clear_bit_on_2nd_bitmap_for_kernel(pfn))
> >> +pfn_user++;
> >> +
> >> +/*
> >> + * Check the compound page
> >> + */
> >> +if (page_is_hugepage(flags) && compound_order > 0) {
> >> +int i, nr_pages = 1 << compound_order;
> >> +
> >> +for (i = 1; i < nr_pages; ++i) {
> >> +if (clear_bit_on_2nd_bitmap_for_kernel(pfn + i))
> >> +pfn_user++;
> >> +}
> >> +pfn += nr_pages - 2;
> >> +mem_map += (nr_pages - 1) * SIZE(page);
> >> +}
> >> +}
> >> +/*
> >> + * Exclude the hugetlbfs pages as user pages.
> >> + */
> >> +else if (hugetlb_dtor == SYMBOL(free_huge_page)) {
> >> +int i, nr_pages = 1 << compound_order;
> >> +
> >> +for (i = 0; i < nr_pages; ++i) {
> >> +if (clear_bit_on_2nd_bitmap_for_kernel(pfn + i))
> >> +pfn_user++;
> >> +}
> >> +pfn += nr_pages - 1;
> >> +mem_map += (nr_pages - 1) * SIZE(page);
> >> +}
> >>}
> >>/*
> >> * Exclude the hwpoison page.
> >
> > I'm concerned about the case where filtering is not performed on the
> > part of the mem_map entries that does not belong to the current cyclic
> > range.
> >
> > If the maximum value of compound_order is larger than the maximum value
> > of CONFIG_FORCE_MAX_ZONEORDER, which makedumpfile obtains via
> > ARRAY_LENGTH(zone.free_area), it's necessary to align
> > info->bufsize_cyclic with the larger one in
> > check_cyclic_buffer_overrun().
> >
> 
> ping, in case you overlooked this...

Sorry for the delayed response, I am prioritizing the release of v1.5.5 now.

Thanks for your advice, check_cyclic_buffer_overrun() should be fixed
as you said. In addition, I'm considering another way to address such a
case: carry the number of "overflowed pages" over to the next cycle and
exclude them at the top of __exclude_unnecessary_pages(), like below:

   /*
* The pages which should be excluded still remain.
*/
   if (remainder >= 1) {
   int i;
   unsigned long tmp;
   for (i = 0; i < remainder; ++i) {
   if (clear_bit_on_2nd_bitmap_for_kernel(pfn + i)) 
{
   pfn_user++;
   tmp++;
  
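
The carry-over idea described above can be sketched as a self-contained userspace model. It is an assumption-laden illustration, not makedumpfile code: mem_map, excluded, struct page_info and exclude_cycle() are hypothetical stand-ins, and a compound page is modeled only by a head flag and an order.

```c
#include <assert.h>
#include <stddef.h>

#define NR_PFNS 32

/* Hypothetical per-page record: only head pages carry a compound order. */
struct page_info {
    int is_user_head;   /* head of a user compound page to be excluded */
    unsigned order;     /* compound page spans 1 << order pages */
};

static struct page_info mem_map[NR_PFNS];
static unsigned char excluded[NR_PFNS];   /* models the 2nd bitmap */

/*
 * Process the cycle [start, end). *remainder holds the number of tail
 * pages that overflowed the previous cycle; they are excluded first.
 * Returns the number of pages excluded in this cycle.
 */
static unsigned long exclude_cycle(unsigned long start, unsigned long end,
                                   unsigned long *remainder)
{
    unsigned long pfn = start, count = 0;

    assert(start <= end && end <= NR_PFNS);

    /* Finish the compound page left over from the previous cycle. */
    while (*remainder > 0 && pfn < end) {
        excluded[pfn++] = 1;
        (*remainder)--;
        count++;
    }

    while (pfn < end) {
        unsigned long nr_pages, i;

        if (!mem_map[pfn].is_user_head) {
            pfn++;
            continue;
        }
        nr_pages = 1UL << mem_map[pfn].order;
        /* Exclude as many pages as fit into the current cycle. */
        for (i = 0; i < nr_pages && pfn + i < end; i++) {
            excluded[pfn + i] = 1;
            count++;
        }
        /* Whatever did not fit is carried into the next cycle. */
        *remainder = nr_pages - i;
        pfn += i;
    }
    return count;
}
```

For example, an order-2 compound page starting at pfn 6 with cycles of 8 pages has two pages excluded in the first cycle and the remaining two at the top of the second one.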

Re: [tip:sched/urgent] sched: Check sched_domain before computing group power

2013-11-27 Thread Yinghai Lu
On Wed, Nov 27, 2013 at 7:02 PM, David Rientjes  wrote:
> On Thu, 21 Nov 2013, Yinghai Lu wrote:
>
>> original one in linus's tree:
>>
>> [8.952728] NMI watchdog: enabled on all CPUs, permanently consumes
>> one hw-PMU counter.
>> [8.965697] BUG: unable to handle kernel NULL pointer dereference
>> at 0010
>> [8.969495] IP: [] update_group_power+0x1d3/0x250
>
> This should have been fixed by Srikar's patch, no?

Maybe not related: now on another system, with Linus's tree + Srikar's
patch, I got:

[   33.546361] divide error:  [#1]
SMP
[   33.589436] Modules linked in:
[   33.592869] CPU: 15 PID: 567 Comm: kworker/u482:0 Not tainted
3.13.0-rc1-yh-00324-gcf1be1c-dirty #10
[   33.603075] Hardware name: Oracle Corporation
[   33.609571] calling  ipc_ns_init+0x0/0x14 @ 1
[   33.609575] initcall ipc_ns_init+0x0/0x14 returned 0 after 0 usecs
[   33.609577] calling  init_mmap_min_addr+0x0/0x16 @ 1
[   33.609579] initcall init_mmap_min_addr+0x0/0x16 returned 0 after 0 usecs
[   33.609583] calling  init_cpufreq_transition_notifier_list+0x0/0x1b @ 1
[   33.609621] initcall init_cpufreq_transition_notifier_list+0x0/0x1b
returned 0 after 0 usecs
[   33.609624] calling  net_ns_init+0x0/0xfa @ 1
[   33.677194] task: 897c5ba5c8c0 ti: 897c5ba8e000 task.ti:
897c5ba8e000
[   33.685558] RIP: 0010:[]  []
find_busiest_group+0x2ac/0x880
[   33.695310] RSP: :897c5ba8f9a8  EFLAGS: 00010046
[   33.701253] RAX: 0001dfff RBX:  RCX: 0001e000
[   33.709226] RDX:  RSI: 0078 RDI: 
[   33.717198] RBP: 897c5ba8fb08 R08:  R09: 
[   33.725178] R10:  R11: 0001e000 R12: 897c5ba8fa90
[   33.733156] R13: 897c5ad61d80 R14:  R15: 897c5ba8fba0
[   33.741132] FS:  () GS:897d7c20()
knlGS:
[   33.750164] CS:  0010 DS:  ES:  CR0: 80050033
[   33.756593] CR2: 0168 CR3: 02a14000 CR4: 001407e0
[   33.764571] Stack:
[   33.766822]   0046 0048

[   33.775141]  897c5ad61d98 897c5ba8fa20 0036
03ab
[   33.783461]  03ab 0139 44e8
00010003
[   33.791789] Call Trace:
[   33.794549]  [] load_balance+0x1c8/0x8d0
[   33.800701]  [] ? __lock_acquire+0xadb/0xce0
[   33.807222]  [] idle_balance+0x101/0x1c0
[   33.813355]  [] ? idle_balance+0x44/0x1c0
[   33.819618]  [] __schedule+0x2cb/0xa10
[   33.825584]  [] ? trace_hardirqs_off_caller+0x28/0x160
[   33.833089]  [] ? trace_hardirqs_off+0xd/0x10
[   33.839731]  [] ? local_clock+0x34/0x60
[   33.845788]  [] ? worker_thread+0x2db/0x370
[   33.852241]  [] ? _raw_spin_unlock_irq+0x30/0x40
[   33.859150]  [] schedule+0x65/0x70
[   33.864700]  [] worker_thread+0x2e0/0x370
[   33.870932]  [] ? trace_hardirqs_on+0xd/0x10
[   33.877472]  [] ? manage_workers.isra.17+0x330/0x330
[   33.884789]  [] kthread+0x108/0x110
[   33.890441]  [] ? __init_kthread_worker+0x70/0x70
[   33.897465]  [] ret_from_fork+0x7c/0xb0
[   33.903504]  [] ? __init_kthread_worker+0x70/0x70
[   33.910508] Code: 89 85 b8 fe ff ff 49 8b 45 10 41 8b 75 0c 44 8b
50 08 44 8b 58 04 89 f0 48 c1 e0 0a 45 89 d1 49 8d 44 01 ff 48 89 c2
48 c1 fa 3f <49> f7 f9 31 d2 49 89 c1 89 f0 44 89 de 41 f7 f1 48 81 c6
00 02
[   33.932375] RIP  [] find_busiest_group+0x2ac/0x880
[   33.939491]  RSP 
[   33.943418] ---[ end trace 7a833c0cac54cac8 ]---


Re: [PATCH v6 3/3] dma: Add Freescale eDMA engine driver support

2013-11-27 Thread Vinod Koul
On Wed, Nov 27, 2013 at 09:38:02AM +, Jingchang Lu wrote:
> > > > > +* DMAMUX
> > > > > +Required properties:
> > > >
> > > > No compatible?
> > > >
> > > > Where are DMAMUX nodes expected to live?
> > > >
> > > > How do they relate to the eDMA controller in HW? Are they a
> > > > subcomponent, or a logically separate unit that happens to be
> > > > connected?
> > > [Lu Jingchang-b35083]
> > > DMAMUX is a multiplexer between DMA controller channels and peripheral
> > > devices. Each DMAMUX provides 16 independently selectable DMA channel
> > > routers, and each channel router can be assigned to one of the possible
> > > peripheral DMA slots. So it's not a standalone device; it's just
> > > required by the DMA controller to connect the channels and slaves, and
> > > it is obtained through the DMA controller's "fsl,dma-mux" property.
> > > Thanks!
> > 
> > Ok.
> > 
> > I'm not so sure on the way this is described, from the point of view of
> > the device, its DMA channel is wired to the MUX, not to the DMA engine
> > directly:
> > 
> >+---+
> >  /-|DEVICE0|
> >  | +---+
> > +-+   +--+
> > | DMA |===|DMAMUX|
> > +-+   +--+
> >  | +---+
> >  \-|DEVICE1|
> >+---+
> > 
> > If that's the case, I'd expect the DMAMUX to have a #dma-cells and
> > describe each device as being wired to the mux, and then the mux as
> > being wired to the DMA. If the MUXes are sub-blocks of the DMA, then I'm
> > not sure why they need to be described at all.
> > 
> > Currently, the DMA code is handling information that's specific to the
> > MUX (i.e. the channel ID that's specific to the MUX), and that feels odd
> > unless the MUX is a component of the DMA (which if true I'd expect it to
> > be described differently).
> > 
> Yes, the connection is as you imagine, except that each DMA has two MUXes.
> The DMA helper looks up the registered DMA engine for DMA channel binding,
> and the registered DMA engine is the eDMA node; if the binding pointed to
> the DMAMUX, the helper would not find the DMA engine.
> The only DMAMUX configuration is programming the slave id into its
> corresponding register, so that code is handled by the eDMA driver;
> the DMAMUX is not optional.
This is a fairly common representation of how DMA engines interact with slave
clients via a programmable mux. Now the slave id value in this case (which we
program into the mux) needs to be for the client and _not_ for the dmaengine.
The DMA engine would have a register to configure the mux value required for
the transaction.

The dmaengine API provides a way to program the slave id (i.e. the mux value),
and IMO this should be a property of the slave device (perhaps part of its DMA
resource) and used while programming the channel. Making it a DMA driver
property makes no sense here.

--
~Vinod


Re: [PATCH] dmaengine: at_hdmac: remove unused function

2013-11-27 Thread Vinod Koul
On Tue, Nov 26, 2013 at 10:43:35AM -0800, Olof Johansson wrote:
> commit 54f8d501e8428 ('dmaengine: remove DMA unmap from drivers')
> refactored some code which resulted in an unused function in the at_hdmac
> driver:
> 
> drivers/dma/at_hdmac_regs.h:350:23: warning: 'chan2parent' defined but
> not used [-Wunused-function]
> 
> Fixes: 54f8d501e8428 ('dmaengine: remove DMA unmap from drivers')
> Signed-off-by: Olof Johansson 
> Cc: Bartlomiej Zolnierkiewicz 
Acked-by: Vinod Koul 

> ---
> 
> Dan,
> 
> Looks like you fixed up the DesignWare driver for the same issue but
> this needs it too.
> 
>  drivers/dma/at_hdmac_regs.h | 4 
>  1 file changed, 4 deletions(-)
> 
> diff --git a/drivers/dma/at_hdmac_regs.h b/drivers/dma/at_hdmac_regs.h
> index f31d647acdfa..2787aba60c6b 100644
> --- a/drivers/dma/at_hdmac_regs.h
> +++ b/drivers/dma/at_hdmac_regs.h
> @@ -347,10 +347,6 @@ static struct device *chan2dev(struct dma_chan *chan)
>  {
>   return >dev->device;
>  }
> -static struct device *chan2parent(struct dma_chan *chan)
> -{
> - return chan->dev->device.parent;
> -}
>  
>  #if defined(VERBOSE_DEBUG)
>  static void vdbg_dump_regs(struct at_dma_chan *atchan)
> -- 
> 1.8.4.1.601.g02b3b1d
> 


Re: [PATCH] dmaengine: pl08x: fix conversioin for generic unmap data

2013-11-27 Thread Vinod Koul
On Tue, Nov 26, 2013 at 08:53:24PM -0800, Olof Johansson wrote:
> commit d38a8c622a1b ('dmaengine: prepare for generic 'unmap' data')
> added a generic unmap call but used the wrong argument for it. Fix it.
> 
> Fixes: d38a8c622a1b ('dmaengine: prepare for generic 'unmap' data')
> Signed-off-by: Olof Johansson 
Acked-by: Vinod Koul 

> ---
> 
> I can't actually tell what the intent of d38a8cc622a1b is or how mappings
> are expected to be managed, but it's obviously passing the wrong thing
> in here so it seems like the appropriate fix.
> 
>  drivers/dma/amba-pl08x.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/dma/amba-pl08x.c b/drivers/dma/amba-pl08x.c
> index 16a2aa28f856..ec4ee5c1fe9d 100644
> --- a/drivers/dma/amba-pl08x.c
> +++ b/drivers/dma/amba-pl08x.c
> @@ -1169,7 +1169,7 @@ static void pl08x_desc_free(struct virt_dma_desc *vd)
>   struct pl08x_txd *txd = to_pl08x_txd(>tx);
>   struct pl08x_dma_chan *plchan = to_pl08x_chan(vd->tx.chan);
>  
> - dma_descriptor_unmap(txd);
> + dma_descriptor_unmap(>tx);
>   if (!txd->done)
>   pl08x_release_mux(plchan);
>  
> -- 
> 1.8.4.1.601.g02b3b1d
> 



Re: [PATCH] dmaengine: pl08x: fix conversioin for generic unmap data

2013-11-27 Thread Vinod Koul
On Wed, Nov 27, 2013 at 11:55:01AM -0800, Dan Williams wrote:
> On Tue, Nov 26, 2013 at 8:53 PM, Olof Johansson  wrote:
> > commit d38a8c622a1b ('dmaengine: prepare for generic 'unmap' data')
> > added a generic unmap call but used the wrong argument for it. Fix it.
> >
> > Fixes: d38a8c622a1b ('dmaengine: prepare for generic 'unmap' data')
> > Signed-off-by: Olof Johansson 
> > ---
> >
> > I can't actually tell what the intent of d38a8cc622a1b is or how mappings
> > are expected to be managed, but it's obviously passing the wrong thing
> > in here so it seems like the appropriate fix.
> >
> 
> For most dma-slave usage cases the client is explicitly handling the
> lifetime of the dma mappings.  It's primarily the mem-to-mem usage
> cases that rely on the dma driver to do the unmapping when the
> transaction is complete.  dma_descriptor_unmap() is a common
> implementation rather than requiring each driver to implement it
> uniquely.  Longer term we can require all clients to handle their
> mapping lifetimes and remove the responsibility from the individual
> drivers completely.
I think that would make sense and also remove any ambiguity on who does the
mapping in different usages.

~Vinod


Re: [PATCH 02/17] tracing/probes: Fix basic print type functions

2013-11-27 Thread Namhyung Kim
Hi Masami,

On Thu, 28 Nov 2013 13:16:09 +0900, Masami Hiramatsu wrote:
> (2013/11/27 23:39), Namhyung Kim wrote:
>> Hi Masami,
>> 
>> 2013-11-27 (Wed), 20:57 +0900, Masami Hiramatsu:
>>> (2013/11/27 15:19), Namhyung Kim wrote:
  
>>>> -DEFINE_BASIC_PRINT_TYPE_FUNC(u8, "%x", unsigned int)
>>>> -DEFINE_BASIC_PRINT_TYPE_FUNC(u16, "%x", unsigned int)
>>>> -DEFINE_BASIC_PRINT_TYPE_FUNC(u32, "%lx", unsigned long)
>>>> -DEFINE_BASIC_PRINT_TYPE_FUNC(u64, "%llx", unsigned long long)
>>>> -DEFINE_BASIC_PRINT_TYPE_FUNC(s8, "%d", int)
>>>> -DEFINE_BASIC_PRINT_TYPE_FUNC(s16, "%d", int)
>>>> -DEFINE_BASIC_PRINT_TYPE_FUNC(s32, "%ld", long)
>>>> -DEFINE_BASIC_PRINT_TYPE_FUNC(s64, "%lld", long long)
>>>> +DEFINE_BASIC_PRINT_TYPE_FUNC(u8 , "%#x")
>>>> +DEFINE_BASIC_PRINT_TYPE_FUNC(u16, "%#x")
>>>> +DEFINE_BASIC_PRINT_TYPE_FUNC(u32, "%#x")
>>>> +DEFINE_BASIC_PRINT_TYPE_FUNC(u64, "%#Lx")
>>>
>>> As I said I'd like to ask you to change it in %x.
>>>
>>> I just checked in Fedora18, but %#x is not supported on this glibc-2.17.
>>> Since this format is exported via debugfs (format file), I think %x is
>>> better.
>> 
>> Hmm.. but in most cases it's used for printf() not scanf(), right?  In
>> that case adding 0x prefix will help human readers a lot.
>> 
>> How about mandating the prefix with "0x%x"?  This way it can be used
>> both for printf() and scanf() IMHO.
>
> Agreed, you can just use "0x%x" in above case instead of "%#x". :)

Okay, will change.


> For other traceevents, from the human readability point of view,
> I think all the event formats should use 0x%x instead of %x, because
> %x sometimes confuses users (e.g. 100 => 0x64; without the 0x, it is
> just "64").

Agreed.  I'll take a look at them later.

Thanks,
Namhyung
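
The difference between the formats discussed above can be checked with a few lines of standard C. This is a userspace sketch using only snprintf() and sscanf(); the helper names format_hex(), format_hex_alt() and parse_hex() are hypothetical and are not part of the tracing code.

```c
#include <assert.h>
#include <stdio.h>

/* Print with a literal "0x" prefix, as agreed on in the thread. */
static int format_hex(char *buf, size_t len, unsigned int value)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "0x%x", value);
}

/* Print with the '#' flag; the prefix is added only for nonzero values. */
static int format_hex_alt(char *buf, size_t len, unsigned int value)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%#x", value);
}

/* Read the value back using the same "0x%x" string as a scanf format. */
static int parse_hex(const char *text, unsigned int *value)
{
    return sscanf(text, "0x%x", value) == 1;
}
```

With "0x%x" the value 100 is written as "0x64" and reads back as 100 through the same format string. The two print formats differ for zero: "0x%x" gives "0x0", while "%#x" gives a plain "0".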


Re: [PATCH] dma: mv_xor: remove mv_desc_get_dest_addr()

2013-11-27 Thread Vinod Koul
+ Dan

On Mon, Nov 25, 2013 at 07:39:25PM +, Jason Cooper wrote:
> The following commit:
> 
>   54f8d501e842 dmaengine: remove DMA unmap from drivers
> 
> removed the last caller to mv_desc_get_dest_addr(), creating the
> warning:
> 
>   drivers/dma/mv_xor.c:57:12: warning: 'mv_desc_get_dest_addr' defined
>   but not used [-Wunused-function]
> 
> Remove it.
> 
> Signed-off-by: Jason Cooper 
Acked-by: Vinod Koul 

This should go thru Dan's tree

> ---
>  drivers/dma/mv_xor.c | 6 --
>  1 file changed, 6 deletions(-)
> 
> diff --git a/drivers/dma/mv_xor.c b/drivers/dma/mv_xor.c
> index 7807f0ef4e20..23bcc9158cbc 100644
> --- a/drivers/dma/mv_xor.c
> +++ b/drivers/dma/mv_xor.c
> @@ -54,12 +54,6 @@ static void mv_desc_init(struct mv_xor_desc_slot *desc, 
> unsigned long flags)
>   hw_desc->desc_command = (1 << 31);
>  }
>  
> -static u32 mv_desc_get_dest_addr(struct mv_xor_desc_slot *desc)
> -{
> - struct mv_xor_desc *hw_desc = desc->hw_desc;
> - return hw_desc->phy_dest_addr;
> -}
> -
>  static void mv_desc_set_byte_count(struct mv_xor_desc_slot *desc,
>  u32 byte_count)
>  {
> -- 
> 1.8.4.4
> 

-- 


Re: [PATCH v2 2/3] bq2415x_charger: Use power_supply notifier for automode

2013-11-27 Thread Pali Rohár
On Thursday 28 November 2013 01:25:50 Sebastian Reichel wrote:
> On Wed, Nov 27, 2013 at 10:16:47PM +0100, Pali Rohár wrote:
> > On Monday 25 November 2013 22:50:01 Sebastian Reichel wrote:
> > > > 2 seems more generic to me, but as rx51-battery is
> > > > missing the functionality to send events on temperature
> > > > change, I guess 1 will be easier to implement.
> > > 
> > > The temperature must be polled anyway, if the ADC does not
> > > support interrupts.
> > 
> > Yes, the ADC does not support interrupts, so the temperature must
> > be polled. The bq27200 chip does not support interrupts either,
> > but the bq27x00_battery driver uses delayed work that polls all
> > values every 60s (the interval can be configured via a modprobe
> > param). So similar code can be added to rx51_battery.ko too.
> 
> I think the safest implementation would be:
> 
> bq2415x polls the temperature from rx51-battery in the bq2415x
> watchdog handler. That way discontinuation of the charge
> process is guaranteed.
> 
> To avoid useless ADC conversion the rx51-battery driver caches
> the converted temperature value for a reasonable time (e.g.
> 10 seconds). This helps if multiple users want to read the
> battery temperature (e.g. userspace).
> 
> This also means, that the kernel stuff can handle charging
> autonomously and the userland daemon checks the battery
> temperature only for emergency shutdown (I guess the
> temperatures for stopping the charging and emergency shutdown
> are different).
> 
> IMHO it makes sense to move the emergency shutdown also into
> the kernel (but different driver!) in the future, but that's
> another topic :)
> 
> -- Sebastian

Just to note, here is the original Nokia table of temperature limits:

https://gitorious.org/rx51-bme-replacement/dsme-thermalobject-surface/source/master:modules/thermalobject_surface.c#L40

-- 
Pali Rohár
pali.ro...@gmail.com
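
The caching idea Sebastian describes above (reuse a converted temperature
for a short time so that several readers do not each trigger an ADC
conversion) could be sketched roughly as follows. This is user-space C
under stated assumptions: struct temp_cache, cached_temp() and fake_adc()
are hypothetical names, not taken from rx51-battery, and time is passed in
as plain seconds instead of jiffies.

```c
#include <stdbool.h>

/* Hypothetical cache state; not taken from the rx51-battery driver. */
struct temp_cache {
	bool valid;
	long stamp;	/* time of the last conversion, in seconds */
	int value;	/* last converted temperature */
};

/*
 * Return the cached temperature if it is younger than max_age seconds,
 * otherwise run read_adc() once and refresh the cache.
 */
static int cached_temp(struct temp_cache *c, long now, long max_age,
		       int (*read_adc)(void))
{
	if (!c->valid || now - c->stamp >= max_age) {
		c->value = read_adc();
		c->stamp = now;
		c->valid = true;
	}
	return c->value;
}

/* Stand-in for the ADC conversion that counts how often it runs. */
static int adc_reads;

static int fake_adc(void)
{
	adc_reads++;
	return 25;
}
```

With a max_age of 10, two reads five seconds apart cause a single
conversion; a read at or after the 10 second mark triggers a new one.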




Re: [PATCH] m68k : Kill CONFIG_MTD_PARTITIONS

2013-11-27 Thread Greg Ungerer
Hi EunBong,

On 28/11/13 11:04, Eunbong Song wrote:
>> Hi Eunbong,
> 
>> Acked-by: Greg Ungerer
> 
>> Did you want me to pick this up for the m68knommu git tree?
> Hello, Greg.
> I'd be glad if you picked up this patch.

Ok, done. Added it to the for-next branch of the m68knommu git
tree. Thanks.

Regards
Greg





Re: [PATCH] cpufreq, highbank: enable ECME thermal notifications

2013-11-27 Thread Viresh Kumar
Hi Mark,

Sorry for the month's delay. I would suggest adding the specific people
you want to review your patches to the --to field. That's always helpful.

On Sun, Oct 20, 2013 at 5:24 AM, Mark Langsdorf
 wrote:
> The ECME sends thermal messages with a maximum and minimum allowed
> frequency when the SoC status reaches certain trip points known to the
> ECME. Use a notifier function to capture those messages and pass them
> to a work-queued function that can trigger a policy re-evaluation by
> cpufreq, capping the allowable frequencies.
>
> The core of the policy adjusting code was taken from
> drivers/thermal/cpu_cooling.c.
>
> Signed-off-by: Mark Langsdorf 
> ---
>  drivers/cpufreq/highbank-cpufreq.c | 117 
> +
>  1 file changed, 117 insertions(+)
>
> diff --git a/drivers/cpufreq/highbank-cpufreq.c 
> b/drivers/cpufreq/highbank-cpufreq.c
> index b61b5a3..f0521d3 100644
> --- a/drivers/cpufreq/highbank-cpufreq.c
> +++ b/drivers/cpufreq/highbank-cpufreq.c
> @@ -17,15 +17,32 @@
>  #include 
>  #include 
>  #include 
> +#include 

How was this missing earlier :)

>  #include 
>  #include 
>  #include 
>  #include 
> +#include 

Please add this after cpu.h to keep the includes in ascending order.
Also rearrange module.h and kernel.h in the same patch so that the list
is sorted.

> +#include 
>
>  #define HB_CPUFREQ_CHANGE_NOTE 0x8001
> +#define HB_CPUFREQ_HEALTH_NOTE  0x1001
> +#define HB_CPUFREQ_TPS_REPORT  1
>  #define HB_CPUFREQ_IPC_LEN 7
>  #define HB_CPUFREQ_VOLT_RETRIES15
>
> +#define NOTIFY_INVALID NULL

Not actually required. Just use NULL everywhere you want this..

> +struct hb_notify_device {
> +   unsigned long max_freq;
> +   unsigned long min_freq;
> +   unsigned long thermal_state;
> +   struct cpumask *allowed_cpus;

Because this is always equal to cpu_possible_mask, you
don't actually need it..

> +};
> +
> +static struct hb_notify_device hb_records, *notify_device = NOTIFY_INVALID;
> +static struct work_struct hb_thermal_wq;
> +
>  static int hb_voltage_change(unsigned int freq)
>  {
> u32 msg[HB_CPUFREQ_IPC_LEN] = {HB_CPUFREQ_CHANGE_NOTE, freq / 
> 100};
> @@ -33,6 +50,89 @@ static int hb_voltage_change(unsigned int freq)
> return pl320_ipc_transmit(msg);
>  }
>
> +static int hb_thermal_cpufreq_notify(struct notifier_block *nb,
> +   unsigned long event, void *data)
> +{
> +   struct cpufreq_policy *policy = data;
> +   unsigned long max_freq = 0, min_freq = 0;
> +
> +   if (event != CPUFREQ_ADJUST || notify_device == NOTIFY_INVALID)

I believe you should check notify_device == NOTIFY_INVALID first and then
value of event.

> +   return 0;
> +
> +   if (cpumask_test_cpu(policy->cpu, notify_device->allowed_cpus)) {

What's your system configuration (just for my understanding)? How many
CPUs do you have? Do they share clock lines?

And anyway, how can policy->cpu be outside cpu_possible_mask?

> +   max_freq = notify_device->max_freq;
> +   min_freq = notify_device->min_freq;
> +   }
> +
> +   /* Never exceed user_policy.max */
> +   if (max_freq > policy->user_policy.max)
> +   max_freq = policy->user_policy.max;
> +   if (min_freq < policy->user_policy.min)
> +   min_freq = policy->user_policy.min;
> +
> +   if ((policy->max != max_freq) || (policy->min != min_freq))

Probably just call cpufreq_verify_within_limits() directly, as it already
does similar checks and isn't too heavy?

> +   cpufreq_verify_within_limits(policy, min_freq, max_freq);
> +
> +   return 0;
> +}
> +
> +static struct notifier_block hb_thermal_cpufreq_nb = {
> +   .notifier_call = hb_thermal_cpufreq_notify,
> +};
> +
> +/*
> + * We can't call cpufreq_adjust_policy from inside a notifier, so
> + * do it from inside a workqueue
> + */

Why exactly? Adding the reason to the comment would be more useful.

> +static void hb_thermal_wq_task(struct work_struct *work)
> +{
> +   unsigned int cpuid;
> +   struct cpumask *mask = hb_records.allowed_cpus;
> +
> +   notify_device = &hb_records;
> +
> +   for_each_cpu(cpuid, mask) {

That's not an optimal solution. CPUs are normally grouped
(i.e. they share clock lines), so one policy structure might be
common to several of them. We therefore need to do the below only for
policy->cpu and not for the other CPUs in policy->affected_cpus.

> +   struct cpufreq_policy policy;
> +   if (cpufreq_get_policy(&policy, cpuid) == 0) {

Do you expect any CPUs to exist without a policy? If not, then
this check is not required (it's heavy: it does a memcpy).

Otherwise, just call cpufreq_update_policy() and check the
return value for -ENODEV, which is returned if there is no
policy associated with the CPU.

> +   cpufreq_update_policy(cpuid);
> +   break;
> +   }
> +   }
> +   notify_device = NOTIFY_INVALID;
> +}

[PATCH] perf target: Move what map function to call check into function.

2013-11-27 Thread Dongsheng Yang
The check that decides whether cpu_map__dummy_new() or cpu_map__new() is
called in perf_evlist__create_maps() is more complicated than it needs to
be. This patch moves the check into target.h, combining two conditions and
making perf_evlist__create_maps() more readable.

Signed-off-by: Dongsheng Yang 
---
 tools/perf/util/evlist.c |  8 +---
 tools/perf/util/target.h | 13 +
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 76fa764..7bb6ee1 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -819,13 +819,7 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
if (evlist->threads == NULL)
return -1;
 
-   if (target->default_per_cpu)
-   evlist->cpus = target->per_thread ?
-   cpu_map__dummy_new() :
-   cpu_map__new(target->cpu_list);
-   else if (target__has_task(target))
-   evlist->cpus = cpu_map__dummy_new();
-   else if (!target__has_cpu(target) && !target->uses_mmap)
+   if (target__uses_dummy_map(target))
evlist->cpus = cpu_map__dummy_new();
else
evlist->cpus = cpu_map__new(target->cpu_list);
diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
index 31dd2e9..7381b1c 100644
--- a/tools/perf/util/target.h
+++ b/tools/perf/util/target.h
@@ -63,4 +63,17 @@ static inline bool target__none(struct target *target)
return !target__has_task(target) && !target__has_cpu(target);
 }
 
+static inline bool target__uses_dummy_map(struct target *target)
+{
+   bool use_dummy = false;
+
+   if (target->default_per_cpu)
+   use_dummy = target->per_thread ? true : false;
+   else if (target__has_task(target) ||
+(!target__has_cpu(target) && !target->uses_mmap))
+   use_dummy = true;
+
+   return use_dummy;
+}
+
 #endif /* _PERF_TARGET_H */
-- 
1.8.2.1
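
The claim that the combined condition behaves like the old if/else chain can
be checked in isolation. The following is a user-space sketch, not perf
code: struct target_sketch is a simplified stand-in for struct target, and
its has_task/has_cpu fields are assumptions that replace the
target__has_task()/target__has_cpu() helpers.

```c
#include <stdbool.h>

/* Simplified stand-in for perf's struct target (assumption). */
struct target_sketch {
	bool default_per_cpu, per_thread, has_task, has_cpu, uses_mmap;
};

/* The if/else chain removed from perf_evlist__create_maps(). */
static bool old_uses_dummy(const struct target_sketch *t)
{
	if (t->default_per_cpu)
		return t->per_thread;
	else if (t->has_task)
		return true;
	else if (!t->has_cpu && !t->uses_mmap)
		return true;
	return false;
}

/* The combined condition from target__uses_dummy_map(). */
static bool new_uses_dummy(const struct target_sketch *t)
{
	bool use_dummy = false;

	if (t->default_per_cpu)
		use_dummy = t->per_thread;
	else if (t->has_task || (!t->has_cpu && !t->uses_mmap))
		use_dummy = true;

	return use_dummy;
}

/* Count the flag combinations (out of 32) on which the two disagree. */
static int count_mismatches(void)
{
	int bits, bad = 0;

	for (bits = 0; bits < 32; bits++) {
		struct target_sketch t = {
			.default_per_cpu = (bits & 1) != 0,
			.per_thread = (bits & 2) != 0,
			.has_task = (bits & 4) != 0,
			.has_cpu = (bits & 8) != 0,
			.uses_mmap = (bits & 16) != 0,
		};

		if (old_uses_dummy(&t) != new_uses_dummy(&t))
			bad++;
	}
	return bad;
}
```

count_mismatches() walks all 32 combinations of the five flags and reports
how many of them the two versions disagree on.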



Re: [PATCH] Fix race between oom kill and task exit

2013-11-27 Thread Johannes Weiner
Cc William and azur who might have encountered this problem.

On Thu, Nov 28, 2013 at 05:09:16AM +, Ma, Xindong wrote:
> From: Leon Ma 
> Date: Thu, 28 Nov 2013 12:46:09 +0800
> Subject: [PATCH] Fix race between oom kill and task exit
> 
> There is a race between oom kill and task exit. Scenario is:
>    TASK A                              TASK B
> TASK B is selected to oom kill
> in oom_kill_process()
> check PF_EXITING of TASK B
>                                        task call do_exit()
>                                        task set PF_EXITING flag
>                                        write_lock_irq(&tasklist_lock);
>                                        remove TASK B from thread group in
>                                        __unhash_process()
>                                        write_unlock_irq(&tasklist_lock);
> read_lock(&tasklist_lock);
> traverse threads of TASK B
> read_unlock(&tasklist_lock);
> 
> After that, the following traversal of threads in TASK B will not end because 
> TASK B is not in the thread group:
> do {
> 
> } while_each_thread(p, t);
> 
> Signed-off-by: Leon Ma 
> Signed-off-by: xiaobing tu 
> ---
>  mm/oom_kill.c |   20 ++--
>  1 files changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 1e4a600..32ec88d 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -412,16 +412,6 @@ void oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
>   static DEFINE_RATELIMIT_STATE(oom_rs, DEFAULT_RATELIMIT_INTERVAL,
> DEFAULT_RATELIMIT_BURST);
>  
> - /*
> -  * If the task is already exiting, don't alarm the sysadmin or kill
> -  * its children or threads, just set TIF_MEMDIE so it can die quickly
> -  */
> - if (p->flags & PF_EXITING) {
> - set_tsk_thread_flag(p, TIF_MEMDIE);
> - put_task_struct(p);
> - return;
> - }
> -
>   if (__ratelimit(&oom_rs))
>   dump_header(p, gfp_mask, order, memcg, nodemask);
>  
> @@ -437,6 +427,16 @@ void oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
>* still freeing memory.
>*/
>   read_lock(&tasklist_lock);
> + /*
> +  * If the task is already exiting, don't alarm the sysadmin or kill
> +  * its children or threads, just set TIF_MEMDIE so it can die quickly
> +  */
> + if (p->flags & PF_EXITING) {
> + set_tsk_thread_flag(p, TIF_MEMDIE);
> + put_task_struct(p);
> + read_unlock(&tasklist_lock);
> + return;
> + }
>   do {
>   list_for_each_entry(child, >children, sibling) {
>   unsigned int child_points;
> -- 
> 1.7.4.1
> 
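
The non-terminating traversal described in the changelog can be reproduced
in user space with a plain circular list. This is only a sketch: struct
thread_node and the ring_*() helpers are made-up names, and the unlink is
assumed to behave like an RCU-style list delete that leaves the removed
node's own next pointer intact. A step limit stands in for "loops forever".

```c
/* Minimal circular list node standing in for a thread-group link. */
struct thread_node {
	struct thread_node *next, *prev;
};

/* Link n nodes of an array into one circular list. */
static void ring_init(struct thread_node *nodes, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		nodes[i].next = &nodes[(i + 1) % n];
		nodes[i].prev = &nodes[(i + n - 1) % n];
	}
}

/* Unlink a node from the ring but keep its own next pointer (assumption). */
static void ring_unlink(struct thread_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
}

/*
 * Walk the ring from start until we come back to start, in the style of
 * do { } while_each_thread(). Returns the number of nodes visited, or -1
 * if the walk did not return to start within max_steps.
 */
static int ring_walk(const struct thread_node *start, int max_steps)
{
	const struct thread_node *t = start;
	int steps = 0;

	do {
		if (++steps > max_steps)
			return -1;
		t = t->next;
	} while (t != start);

	return steps;
}
```

Starting the walk from a node that has already been unlinked never comes
back to that node, because no remaining node points to it any more.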


Re: [PATCH 1/5] serial: 8250_pci: use DEFINE_PCI_DEVICE_TABLE macro

2013-11-27 Thread Joe Perches
On Wed, 2013-11-27 at 21:53 -0800, 'Greg Kroah-Hartman' wrote:
> On Wed, Nov 27, 2013 at 09:40:13PM -0800, Joe Perches wrote:
> > On Thu, 2013-11-28 at 14:29 +0900, Jingoo Han wrote:
> > > On Thursday, November 28, 2013 1:08 PM, Greg Kroah-Hartman wrote:
> > > > On Thu, Nov 28, 2013 at 10:55:35AM +0900, Jingoo Han wrote:
> > > > > This macro is used to create a struct pci_device_id array.
> > > > 
> > > > Yeah, and it's a horrid macro that deserves to be removed, please don't
> > > > use it in more places.
> > > > 
> > > > Actually, if you could just remove it, that would be best, sorry, I'm
> > > > not going to take these patches.
> > > 
> > > (+cc Joe Perches, Andrew Morton, Andy Whitcroft)
> > > 
> > > Hi Joe Perches,
> > > 
> > > Would you fix checkpatch.pl about DEFINE_PCI_DEVICE_TABLE?
> > > Currently, checkpatch.pl guides to use DEFINE_PCI_DEVICE_TABLE
> > > as below.
> > > 
> > >   WARNING: Use DEFINE_PCI_DEVICE_TABLE for struct pci_device_id
> > >   #331: FILE: drivers/usb/host/ehci-pci.c:331:
> > >   +static const struct pci_device_id pci_ids [] = { {
> > > 
> > > However, Greg Kroah-Hartman mentioned that DEFINE_PCI_DEVICE_TABLE
> > > shouldn't be used anymore.
> > > 
> > > So, would you change checkpatch.pl in order to guide to use
> > > struct pci_device_id instead of DEFINE_PCI_DEVICE_TABLE?
> > > 
> > > For example,
> > >   WARNING: Use struct pci_device_id instead of DEFINE_PCI_DEVICE_TABLE
> > 
> > The documentation doesn't agree with Greg.
[]
> I say just remove it, I should have done that years ago when I was the
> PCI maintainer, just never got around to it.  No other bus has something
> like this for their device ids, why should PCI be "special"?

Anyone else have an opinion?

I don't care one way or another, but please, one way
not two.

Changing checkpatch is a trifle, but there are a _lot_
of maintainers to work through if it's to be removed.

It'll probably take several releases.

$ git grep --name-only -w DEFINE_PCI_DEVICE_TABLE | \
  cut -f1,2 -d/ | uniq -c
  1 Documentation/PCI
  1 arch/x86
  1 drivers/bcma
  3 drivers/block
  1 drivers/char
  1 drivers/cpufreq
  2 drivers/dma
 18 drivers/edac
  6 drivers/gpio
  6 drivers/gpu
  6 drivers/hwmon
 20 drivers/i2c
  2 drivers/infiniband
  1 drivers/ipack
  1 drivers/leds
  3 drivers/media
 10 drivers/mfd
  2 drivers/misc
  1 drivers/mmc
  1 drivers/mtd
132 drivers/net
  1 drivers/ntb
  1 drivers/pci
  5 drivers/pcmcia
  2 drivers/platform
  1 drivers/ptp
  1 drivers/rapidio
  7 drivers/scsi
  3 drivers/spi
 65 drivers/staging
  3 drivers/tty
  1 drivers/uio
  5 drivers/usb
  1 drivers/video
  1 drivers/virtio
  3 drivers/vme
  9 drivers/watchdog
  1 drivers/xen
  1 include/linux
  1 scripts/checkpatch.pl
  1 scripts/tags.sh
  1 sound/oss
 67 sound/pci




Q: use vlan in container

2013-11-27 Thread Libo Chen
Hello LXC experts,

I have run into a problem. When using vlan as the network device in a SUSE 11
system container, I cannot use halt to stop the container. It hangs, printing
"eth0 is still used from interfaces eth0" in a loop.

The config file:

lxc.network.type = vlan
lxc.network.flags = up
lxc.network.link = eth0
lxc.network.name = eth0
lxc.network.vlan.id = 1301
lxc.network.ipv4 = 128.5.131.100/24


The cause is in the shell script /sbin/ifdown; see below:

##
# Shut down depending interfaces
#
# Check if there are interfaces which depend on this interface. If yes these
# have to be shut down first.
# For example these might be bonding or vlan interfaces. Note that we don't
# catch all types of depending interfaces currently. See function
# 'get_depending_ifaces' in file 'functions' for details.
#
test "$SCRIPTNAME" = ifdown && DEP_IFACES=`get_depending_ifaces $INTERFACE`
if [ "$?" = 0 -a "$NODEPS" != yes ] ; then
    message "`printf "%-9s is still used from interfaces %s" \
                     $INTERFACE "$DEP_IFACES"`"
    for DI in $DEP_IFACES; do
        ifdown $DI -o $OPTIONS
    done

    message "`printf "%-9s now going down itself" $INTERFACE`"
    # check if iface is (still) available
    # [bonding master may go down itself
    #  while the last slave gets removed]
    if ! is_iface_available $INTERFACE; then
        exit $R_SUCCESS
    fi
fi


In this scenario $DEP_IFACES is also eth0, so ifdown calls itself again and
again.

If we set lxc.network.name = eth1, everything works. So could we add a check
to the lxc-start command that makes sure lxc.network.link and lxc.network.name
are not equal?

A simple implementation could look like:

if [ lxc.network.type == vlan ] ; then
    if [ lxc.network.link == lxc.network.name ] ; then
        return false
    fi
fi


Is this reasonable, or is there another way to achieve this?
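
Since lxc-start is a C program, the check proposed above might look roughly
like this in C. It is a sketch only: struct net_conf_sketch and
vlan_name_conflicts() are hypothetical and not taken from the lxc source;
the three fields simply mirror the config keys shown earlier.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical view of the three config keys; not an lxc structure. */
struct net_conf_sketch {
	const char *type;	/* lxc.network.type */
	const char *link;	/* lxc.network.link */
	const char *name;	/* lxc.network.name */
};

/* Return true if a vlan device would get the same name as its link. */
static bool vlan_name_conflicts(const struct net_conf_sketch *c)
{
	if (!c->type || !c->link || !c->name)
		return false;

	return strcmp(c->type, "vlan") == 0 && strcmp(c->link, c->name) == 0;
}
```

With the config above (type vlan, link eth0, name eth0) the function
returns true; changing the name to eth1 makes it return false.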





RE: [PATCH 2/2 v2] irqchip: mmp: add dt support for wakeup

2013-11-27 Thread Neil Zhang
Mark,

Sorry for the late reply.

> -Original Message-
> From: Mark Rutland [mailto:mark.rutl...@arm.com]
> Sent: November 15, 2013 20:50
> To: Neil Zhang
> Cc: Haojian Zhuang; devicet...@vger.kernel.org; linux-kernel@vger.kernel.org;
> Thomas Gleixner
> Subject: Re: [PATCH 2/2 v2] irqchip: mmp: add dt support for wakeup
> 
> On Fri, Nov 15, 2013 at 11:49:20AM +, Neil Zhang wrote:
> >
> > > -Original Message-
> > > From: Mark Rutland [mailto:mark.rutl...@arm.com]
> > > Sent: November 14, 2013 20:28
> > > To: Haojian Zhuang
> > > Cc: Neil Zhang; devicet...@vger.kernel.org;
> > > linux-kernel@vger.kernel.org; Thomas Gleixner
> > > Subject: Re: [PATCH 2/2 v2] irqchip: mmp: add dt support for wakeup
> > >
> > > On Thu, Nov 14, 2013 at 10:28:53AM +, Haojian Zhuang wrote:
> > > > On Fri, Oct 11, 2013 at 4:23 PM, Neil Zhang 
> wrote:
> > > > > Some of the Marvell SoCs use GIC as its interrupt controller,and
> > > > > ICU only used as wakeup logic. When AP subsystem is powered off,
> > > > > GIC will lose its context, the PMU will need ICU to wakeup the AP
> subsystem.
> > > > > So add wakeup entry for such kind of usage.
> > > > >
> > > > > Signed-off-by: Neil Zhang 
> > > > > ---
> > > > >  .../devicetree/bindings/arm/mrvl/intc.txt  |   14 ++-
> > > > >  drivers/irqchip/irq-mmp.c  |  124
> > > 
> > > > >  include/linux/irqchip/mmp.h|   13 ++
> > > > >  3 files changed, 150 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/Documentation/devicetree/bindings/arm/mrvl/intc.txt
> > > > > b/Documentation/devicetree/bindings/arm/mrvl/intc.txt
> > > > > index 8b53273..4180928 100644
> > > > > --- a/Documentation/devicetree/bindings/arm/mrvl/intc.txt
> > > > > +++ b/Documentation/devicetree/bindings/arm/mrvl/intc.txt
> > > > > @@ -2,7 +2,7 @@
> > > > >
> > > > >  Required properties:
> > > > >  - compatible : Should be "mrvl,mmp-intc", "mrvl,mmp2-intc" or
> > > > > -  "mrvl,mmp2-mux-intc"
> > > > > +  "mrvl,mmp2-mux-intc", "mrvl,mmp-intc-wakeupgen"
> > >
> > > Why do we need a new compatible string?
> >
> > As the patch comments said, we don't use the ICU as an interrupt
> > controller in some Marvell Socs, Just use them to wakeup CPU when GIC is
> powered off.
> 
> Hmm. Is it possible to use the ICU as an interrupt controller in those SoCs?

No, we have to use the GIC as the interrupt controller since these are SMP systems.

> 
> >
> > >
> > > > >  - reg : Address and length of the register set of the interrupt 
> > > > > controller.
> > > > >If the interrupt controller is intc, address and length means the 
> > > > > range
> > > > >of the whold interrupt controller. If the interrupt
> > > > > controller is mux-intc, @@ -15,6 +15,9 @@ Required properties:
> > > > >  - interrupt-controller : Identifies the node as an interrupt 
> > > > > controller.
> > > > >  - #interrupt-cells : Specifies the number of cells needed to encode 
> > > > > an
> > > > >interrupt source.
> > > > > +- mrvl,intc-gbl-mask : Specifies the address and value for
> > > > > +global mask in the
> > > > > +  interrupt controller.
> > > >
> > > > As my understanding, we should avoid to write register settings in DTS 
> > > > file.
> > >
> > > In general, yes. We should describe the hardware and let Linux
> > > choose how to configure it as far as possible.
> > >
> > > What is this global mask? What is it used for? Why do there seem to
> > > be multiple global masks (judging by the example)?
> > >
> >
> > Global mask will prevent distributing interrupt from ICU to GIC.
> > Since we will use GIC as the interrupt controller, so we need to mask the 
> > ICU
> global mask.
> > ICU has connection to every core in the system, so we need to mask all 
> > global
> mask registers for each core.
> 
> Why can the driver not figure out these masks for itself?

Different SoCs have different global mask registers, so it is not appropriate
to hard-code them in the driver.

> 
> >
> > > >
> > > > Loop devicetree guys.
> > > >
> > > > > +- mrvl,intc-for-cp : Specifies the irqs that will be routed to
> > > > > +cp
> > >
> > > cp?
> > >
> > > _why_ do we need this, and what exactly does routing the irqs to the cp
> imply?
> >
> > Communication processor.
> > Kernel should avoid to handle the irq lines that has been routed to
> communication processor.
> 
> Ok. Does this just tell the kernel the set of IRQs to ignore, or does this 
> imply that
> the kernel must configure something in the hardware based on this? If so, what
> specifically?
> 

As the patch does, we need to route these IRQs to the CP at init time, and the
kernel must not change them at runtime.

> >
> > >
> > > > >  - mrvl,intc-nr-irqs : Specifies the number of interrupts in the 
> > > > > interrupt
> > > > >controller.
> > > > >  - mrvl,clr-mfp-irq : Specifies the interrupt that needs to
> > > > > clear MFP edge @@ -39,6 +42,15 @@ Example:
> > > > > mrvl,intc-nr-irqs = <2>;
> > > > > };
> > > > 

Re: [PATCH] watchdog: davinci: rename platform driver to davinci-wdt

2013-11-27 Thread Sekhar Nori
On Wednesday 27 November 2013 09:27 PM, Guenter Roeck wrote:
> On 11/27/2013 06:00 AM, Sekhar Nori wrote:
>> On Wednesday 27 November 2013 07:01 PM, Ivan Khoronzhuk wrote:
>>> As we switch to use the watchdog core which permits more than one
>>> active watchdog in the system, rename platform driver to
>>> "davinci-wdt" to be identifiable.
>>>
>>> Signed-off-by: Ivan Khoronzhuk 
>>
>> Looks good to me. Since bulk of this patch touches mach-davinci, I would
>> like to take this through my tree to avoid conflicts with other
>> mach-davinci patches I accept.
>>
> 
> Good idea, and makes sense.

Added to v3.14/soc with your Reviewed-by

Thanks,
Sekhar

[1]
https://git.kernel.org/cgit/linux/kernel/git/nsekhar/linux-davinci.git/log/?h=v3.14/soc


Re: [PATCH 2/2] virtio-net: make all RX paths handle errors consistently

2013-11-27 Thread Jason Wang
On 11/28/2013 12:31 AM, Michael S. Tsirkin wrote:
> receive mergeable now handles errors internally.
> Do same for big and small packet paths, otherwise
> the logic is too hard to follow.
>
> Signed-off-by: Michael S. Tsirkin 
> ---
>
> While I can't point at a bug this fixes, I'm not sure
> there's no bug in the existing logic.
> So not exactly a bug fix but I think it's justified for net.
>
>  drivers/net/virtio_net.c | 53 
> +---
>  1 file changed, 37 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 0e6ea69..97c6212 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -299,6 +299,35 @@ static struct sk_buff *page_to_skb(struct receive_queue *rq,
>   return skb;
>  }
>  
> +static struct sk_buff *receive_small(void *buf, unsigned int len)
> +{
> + struct sk_buff * skb = buf;
> +
> + len -= sizeof(struct virtio_net_hdr);
> + skb_trim(skb, len);
> +
> + return skb;
> +}
> +
> +static struct sk_buff *receive_big(struct net_device *dev,
> +struct receive_queue *rq,
> +void *buf,
> +unsigned int len)
> +{
> + struct page *page = buf;
> + struct sk_buff *skb = page_to_skb(rq, page, 0, len, PAGE_SIZE);
> +
> + if (unlikely(!skb))
> + goto err;
> +
> + return skb;
> +
> +err:
> + dev->stats.rx_dropped++;
> + give_pages(rq, page);
> + return NULL;
> +}
> +
>  static struct sk_buff *receive_mergeable(struct net_device *dev,
>struct receive_queue *rq,
>void *buf,
> @@ -407,23 +436,15 @@ static void receive_buf(struct receive_queue *rq, void *buf, unsigned int len)
>   return;
>   }
>  
> - if (!vi->mergeable_rx_bufs && !vi->big_packets) {
> - skb = buf;
> - len -= sizeof(struct virtio_net_hdr);
> - skb_trim(skb, len);
> - } else if (vi->mergeable_rx_bufs) {
> + if (vi->mergeable_rx_bufs)
>   skb = receive_mergeable(dev, rq, buf, len);
> - if (unlikely(!skb))
> - return;
> - } else {
> - page = buf;
> - skb = page_to_skb(rq, page, 0, len, PAGE_SIZE);
> - if (unlikely(!skb)) {
> - dev->stats.rx_dropped++;
> - give_pages(rq, page);
> - return;
> - }
> - }
> + else if (vi->big_packets)
> + skb = receive_big(dev, rq, buf, len);
> + else
> + skb = receive_small(buf, len);
> +
> + if (unlikely(!skb))
> + return;
>  
>   hdr = skb_vnet_hdr(skb);
>  

Acked-by: Jason Wang 


Re: [PATCH v4 09/41] sysfs, kernfs: introduce kernfs_create_dir[_ns]()

2013-11-27 Thread Greg KH
On Sat, Nov 23, 2013 at 06:07:04PM -0500, Tejun Heo wrote:
> Introduce kernfs interface to manipulate a directory which takes and
> returns sysfs_dirents.
> 
> create_dir() is renamed to kernfs_create_dir_ns() and its arguments
> and return value are updated.  create_dir() usages are replaced with
> kernfs_create_dir_ns() and sysfs_create_subdir() usages are replaced
> with kernfs_create_dir().  Dup warnings are handled explicitly by
> sysfs users of the kernfs interface.
> 
> sysfs_enable_ns() is renamed to kernfs_enable_ns().
> 
> This patch doesn't introduce any behavior changes.
> 
> v2: Dummy implementation for !CONFIG_SYSFS updated to return -ENOSYS.
> 
> v3: kernfs_enable_ns() added.
> 
> v4: Refreshed on top of "sysfs: drop kobj_ns_type handling, take #2"
> so that this patch removes sysfs_enable_ns().

Care to go for v5?  This doesn't apply on my tree anymore, I'm pretty
sure due to the other fixes in fs/sysfs/dir.c.

I tried to do the merge myself, but I don't think I got it right at all,
so could you just check out my driver-core.git tree on the
driver-core-next branch and refresh this patch?

thanks so much, I've applied the other 8 patches in this series already,
they seem fine.

greg k-h


Re: [PATCH v11 7/7] ARM: tegra: support Trusted Foundations by default

2013-11-27 Thread Alexandre Courbot
On Wed, Nov 27, 2013 at 1:47 AM, Dave Martin  wrote:
> On Tue, Nov 26, 2013 at 10:35:58AM +0900, Alexandre Courbot wrote:
>> On Tue, Nov 26, 2013 at 9:06 AM, Olof Johansson  wrote:
>> > On Sun, Nov 24, 2013 at 03:30:52PM +0900, Alexandre Courbot wrote:
>> >> Support for Trusted Foundations is light and allows the kernel to run on
>> >> a wider range of devices, so enable it by default.
>> >>
>> >> Signed-off-by: Alexandre Courbot 
>> >> Reviewed-by: Tomasz Figa 
>> >> Reviewed-by: Stephen Warren 
>> >> ---
>> >>  arch/arm/configs/tegra_defconfig | 1 +
>> >>  1 file changed, 1 insertion(+)
>> >
>> > I think we want this enabled on multi_v7_defconfig too? Send a separate
>> > patch for that once this is merged though.
>>
>> Will do.
>
> Should it just be default y if one of the relevant
> CONFIG_ARCH_TEGRA_*_SOC is selected?
>
> That way, it's automatically included if relevant, and automatically
> excluded if not -- regardless of whether the kernel is multiplatform
> or not.

So basically, that would mean setting the default to 'y' since the
option is not available unless a supported platform is included?

I'm fine with doing it this way too, if Stephen also agrees.


RE: [f2fs-dev] [PATCH] f2fs: readahead contiguous pages for restore_node_summary

2013-11-27 Thread Chao Yu
Hi,

> -Original Message-
> From: Jaegeuk Kim [mailto:jaegeuk@samsung.com]
> Sent: Thursday, November 28, 2013 11:33 AM
> To: Chao Yu
> Cc: linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org; 
> linux-f2fs-de...@lists.sourceforge.net; '谭姝'
> Subject: RE: [f2fs-dev] [PATCH] f2fs: readahead contiguous pages for 
> restore_node_summary
> 
> Hi,
> 
> 2013-11-28 (Thu), 09:26 +0800, Chao Yu:
> > Hi Kim,
> >
> > > -Original Message-
> > > From: Jaegeuk Kim [mailto:jaegeuk@samsung.com]
> > > Sent: Wednesday, November 27, 2013 4:19 PM
> > > To: Chao Yu
> > > Cc: linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org; 
> > > linux-f2fs-de...@lists.sourceforge.net; '谭姝'
> > > Subject: RE: [f2fs-dev] [PATCH] f2fs: readahead contiguous pages for 
> > > restore_node_summary
> > >
> > > Hi,
> > >
> > > 2013-11-27 (Wed), 15:58 +0800, Chao Yu:
> > > > Hi Kim,
> > > >
> > > > > -Original Message-
> > > > > From: Jaegeuk Kim [mailto:jaegeuk@samsung.com]
> > > > > Sent: Wednesday, November 27, 2013 1:30 PM
> > > > > To: Chao Yu
> > > > > Cc: linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org; 
> > > > > linux-f2fs-de...@lists.sourceforge.net; 谭姝
> > > > > Subject: Re: [f2fs-dev] [PATCH] f2fs: readahead contiguous pages for 
> > > > > restore_node_summary
> > > > >
> > > > > Hi Chao,
> > > > >
> > > > > It seems that we already have a readahed function for node pages,
> > > > > ra_node_page().
> > > > > So, we don't make a page list for this, but can use the node_inode's
> > > > > page cache.
> > > >
> > > > So you mean it's waste to release page list with updated data after we
> > > > finish work in restore_node_summary, right?
> > >
> > > Right.
> >
> > So how about add all pages of page list to node_inode's address space by
> > add_to_page_cache_lru() with arg sum_entry->nid?
> 
> I don't think it's proper way to use add_to_page_cache_lru() directly.

This is the approach used in VM readahead (i.e. read_pages/mpage_readpages/
read_cache_pages).
So your concern is that calling add_to_page_cache_lru() on its own
may cause an exception, is that right?

> 
> >
> > >
> > > >
> > > > >
> > > > > So how about writing ra_node_pages() which use the node_inode's page
> > > > > cache?
> > > >
> > > > Hmm, so ra_node_pages is introduced for read node_inode's pages which 
> > > > are
> > > > logical contiguously? and it also could take place of ra_node_page?
> > >
> > > Ah. The ra_node_page() read a node page ahead for a given node id.
> > > So it doesn't match exactly between ra_node_page() and ra_node_pages()
> > > that I suggested.
> > > So how about reading node pages and then caching some of them in the
> > > page cache, node_inode's address space?
> >
> > Got it,
> > > If we do not use the method above, we would have to search the NAT by the
> > > specified node page blkaddr to find the nid that indexes node_inode's page,
> > > and that costs a lot.
> > > What do you think?
> 
> 1. grab_cache_page(node_footer->nid);
> 2. memcpy();
> 3. SetPageUptodate();
> 4. f2fs_put_page();

It could be.

This makes ra_node_pages() synchronous, because we have to read node_footer->nid
from the updated node page before we cache node pages, and we would still use a
page list to pass the updated pages.

Why not introduce f2fs_cache_node_pages(), containing your code, to cache node
pages after ra_node_pages()?

Thanks,
Yu
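
The four steps quoted above (grab the cache page by nid, copy the data, mark it up to date, put the page) can be sketched in user space. This is only a rough model under stated assumptions: struct mock_page, mock_cache[] and every mock_* function are hypothetical stand-ins invented for illustration, not f2fs or page cache APIs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MOCK_PAGE_SIZE 4096
#define MOCK_MAX_NID   8

/* Mock page; the real struct page looks nothing like this. */
struct mock_page {
	unsigned char data[MOCK_PAGE_SIZE];
	bool uptodate;
	int refcount;
};

/* Mock of node_inode's page cache, indexed directly by node id. */
static struct mock_page mock_cache[MOCK_MAX_NID];

/* Step 1: find the cached page for nid and take a reference on it. */
static struct mock_page *mock_grab_cache_page(unsigned int nid)
{
	if (nid >= MOCK_MAX_NID)
		return NULL;
	mock_cache[nid].refcount++;
	return &mock_cache[nid];
}

/* Step 4: drop the reference taken by mock_grab_cache_page(). */
static void mock_put_page(struct mock_page *page)
{
	page->refcount--;
}

/*
 * Copy one already-read node block into the cache slot for nid.
 * Returns 0 on success, -1 if nid is out of range.
 */
static int mock_cache_node_page(const unsigned char *src, unsigned int nid)
{
	struct mock_page *page = mock_grab_cache_page(nid);	/* step 1 */

	if (!page)
		return -1;
	memcpy(page->data, src, MOCK_PAGE_SIZE);		/* step 2 */
	page->uptodate = true;					/* step 3 */
	mock_put_page(page);					/* step 4 */
	return 0;
}
```

Note that the nid has to be known before step 1, which is the ordering constraint raised in the reply above: the node block must already have been read so that its footer can be consulted.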

> 
> Thanks,
> 
> >
> > >
> > > Thanks,
> > >
> > > >
> > > >
> > > > >
> > > > > Thanks,
> > > > >
> > > > > 2013-11-22 (Fri), 15:48 +0800, Chao Yu:
> > > > > > If cp has no CP_UMOUNT_FLAG, we will read all pages in the whole node
> > > > > > segment one by one, which performs poorly. So let's merge contiguous
> > > > > > pages and readahead for better performance.
> > > > > >
> > > > > > Signed-off-by: Chao Yu 
> > > > > > ---
> > > > > >  fs/f2fs/node.c |   89 
> > > > > > +++-
> > > > > >  1 file changed, 63 insertions(+), 26 deletions(-)
> > > > > >
> > > > > > diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> > > > > > index 4ac4150..81e704a 100644
> > > > > > --- a/fs/f2fs/node.c
> > > > > > +++ b/fs/f2fs/node.c
> > > > > > @@ -1572,47 +1572,84 @@ int recover_inode_page(struct f2fs_sb_info 
> > > > > > *sbi, struct page *page)
> > > > > > return 0;
> > > > > >  }
> > > > > >
> > > > > > +/*
> > > > > > + * ra_sum_pages() merge contiguous pages into one bio and submit.
> > > > > > + * these pre-readed pages are linked in pages list.
> > > > > > + */
> > > > > > +static int ra_sum_pages(struct f2fs_sb_info *sbi, struct list_head 
> > > > > > *pages,
> > > > > > +   int start, int nrpages)
> > > > > > +{
> > > > > > +   struct page *page;
> > > > > > +   int page_idx = start;
> > > > > > +
> > > > > > +   for (; page_idx < start + nrpages; page_idx++) {
> > > > > > +   /* alloc temporal page for read node summary info*/
> > > > > > +   page = alloc_page(GFP_NOFS | __GFP_ZERO);
> > > > > > +

Re: [PATCH v3 5/9] usb: gadget: s3c-hsotg: use generic phy_init()/phy_exit() support

2013-11-27 Thread Kishon Vijay Abraham I
On Thursday 28 November 2013 04:06 AM, Matt Porter wrote:
> On Wed, Nov 27, 2013 at 12:13:25PM -0500, Matt Porter wrote:
>> On Tue, Nov 26, 2013 at 03:53:32PM +0530, Kishon Vijay Abraham I wrote:
>>> Hi,
>>>
>>> On Monday 25 November 2013 11:46 PM, Matt Porter wrote:
>>>> If a generic phy is present, call phy_init()/phy_exit(). This supports
>>>> generic phys that must be soft reset before power on.
>>>>
>>>> Signed-off-by: Matt Porter 
>>>> ---
>>>>  drivers/usb/gadget/s3c-hsotg.c | 5 +
>>>>  1 file changed, 5 insertions(+)
>>>>
>>>> diff --git a/drivers/usb/gadget/s3c-hsotg.c b/drivers/usb/gadget/s3c-hsotg.c
>>>> index da3879b..8dfe33f 100644
>>>> --- a/drivers/usb/gadget/s3c-hsotg.c
>>>> +++ b/drivers/usb/gadget/s3c-hsotg.c
>>>> @@ -3622,6 +3622,9 @@ static int s3c_hsotg_probe(struct platform_device *pdev)
>>>>                 goto err_supplies;
>>>>         }
>>>>
>>>> +       if (hsotg->phy)
>>>
>>> IS_ERR? If your phy_get fails *phy* will have a error value..
>>
>> Yes, thanks. I'll fix these and also note that the same issue exists in
>> Kamil's patch for these same hsotg->phy conditional uses. I'll work with
>> Kamil to either get those addressed there or in a follow on fix.
> 
> I spoke too soon. If devm_phy_get fails, we don't set hsotg->phy and probe
> defer thus not reaching this point. Since hsotg->phy is either NULL or a
> valid struct phy *, this is correct as is throughout the driver.
> 
>>>
>>>> +               phy_init(hsotg->phy);
>>>> +
>>>>         /* usb phy enable */
>>>>         s3c_hsotg_phy_enable(hsotg);
>>>>
>>>> @@ -3715,6 +3718,8 @@ static int s3c_hsotg_remove(struct platform_device *pdev)
>>>>         }
>>>>
>>>>         s3c_hsotg_phy_disable(hsotg);
>>>> +       if (hsotg->phy)
>>>
>>> same here.
>>
>> Ok.
> 
> Same as above, this will be NULL on failure (but is only applicable at this
> point on the platform data path).

Ah ok.. Btw where is phy_get being called? Is it not part of this series?

Thanks
Kishon

> 
> -Matt
> 
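
The distinction this exchange turns on, a pointer that is NULL versus one that carries an encoded error value, can be sketched in user space. The helpers below are a simplified model of the kernel's error-pointer convention written from memory, so treat the bodies and the MAX_ERRNO value as assumptions rather than a copy of the kernel header; phy_usable() is a hypothetical helper that exists only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified user-space model of the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errno values. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline bool IS_ERR_OR_NULL(const void *ptr)
{
	return ptr == NULL || IS_ERR(ptr);
}

/*
 * Hypothetical helper: a plain "if (phy)" test is only safe when the
 * pointer is guaranteed to be either NULL or valid, never an ERR_PTR().
 */
static inline bool phy_usable(const void *phy)
{
	return !IS_ERR_OR_NULL(phy);
}
```

An encoded error is non-NULL, so "if (ptr)" alone would treat it as valid. That is why the bare check is acceptable only if the probe path never stores an error value in the field, as described in the quoted reply.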

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/5] serial: 8250_pci: use DEFINE_PCI_DEVICE_TABLE macro

2013-11-27 Thread 'Greg Kroah-Hartman'
On Wed, Nov 27, 2013 at 09:40:13PM -0800, Joe Perches wrote:
> On Thu, 2013-11-28 at 14:29 +0900, Jingoo Han wrote:
> > On Thursday, November 28, 2013 1:08 PM, Greg Kroah-Hartman wrote:
> > > On Thu, Nov 28, 2013 at 10:55:35AM +0900, Jingoo Han wrote:
> > > > This macro is used to create a struct pci_device_id array.
> > > 
> > > Yeah, and it's a horrid macro that deserves to be removed, please don't
> > > use it in more places.
> > > 
> > > Actually, if you could just remove it, that would be best, sorry, I'm
> > > not going to take these patches.
> > 
> > (+cc Joe Perches, Andrew Morton, Andy Whitcroft)
> > 
> > Hi Joe Perches,
> > 
> > Would you fix checkpatch.pl about DEFINE_PCI_DEVICE_TABLE?
> > Currently, checkpatch.pl guides to use DEFINE_PCI_DEVICE_TABLE
> > as below.
> > 
> >   WARNING: Use DEFINE_PCI_DEVICE_TABLE for struct pci_device_id
> >   #331: FILE: drivers/usb/host/ehci-pci.c:331:
> >   +static const struct pci_device_id pci_ids [] = { {
> > 
> > However, Greg Kroah-Hartman mentioned that DEFINE_PCI_DEVICE_TABLE
> > shouldn't be used anymore.
> > 
> > So, would you change checkpatch.pl in order to guide to use
> > struct pci_device_id instead of DEFINE_PCI_DEVICE_TABLE?
> > 
> > For example,
> >   WARNING: Use struct pci_device_id instead of DEFINE_PCI_DEVICE_TABLE
> 
> The documentation doesn't agree with Greg.
> 
> Documentation/PCI/pci.txt:
> 
> The ID table is an array of struct pci_device_id entries ending with an
> all-zero entry; use of the macro DEFINE_PCI_DEVICE_TABLE is the preferred
> method of declaring the table.

Then it should be fixed.

> Neither does the kernel tree:
> 
> $ git grep -w DEFINE_PCI_DEVICE_TABLE | wc -l
> 410
> 
> $ git grep -E "\bstruct\s+pci_device_id\s+\w+\s*\[\s*\]\s*=" | wc -l
> 376
> 
> Most of the 376 should be const and are not.
> 
> $ git grep -E "\bconst\s+struct\s+pci_device_id\s+\w+\s*\[\s*\]\s*=" | wc -l
> 155

Then fix those and make them const.

Hiding structures behind an odd (and misnamed) macro isn't the best
thing.

> Everything that uses DEFINE_PCI_DEVICE_TABLE is const.
> 
> $ git grep -A1 -E "define\s+DEFINE_PCI_DEVICE_TABLE"
> include/linux/pci.h:#define DEFINE_PCI_DEVICE_TABLE(_table) \
> include/linux/pci.h-const struct pci_device_id _table[]

I say just remove it, I should have done that years ago when I was the
PCI maintainer, just never got around to it.  No other bus has something
like this for their device ids, why should PCI be "special"?

thanks,

greg k-h
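
For readers unfamiliar with the macro under discussion, the following user-space sketch shows what it hides. The DEFINE_PCI_DEVICE_TABLE definition is taken from the git grep output quoted above; the two-field struct pci_device_id, the table names and the ID values are simplified stand-ins chosen for illustration, not the real kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in; the real struct pci_device_id has more fields. */
struct pci_device_id {
	uint32_t vendor;
	uint32_t device;
};

/* Definition as quoted from include/linux/pci.h in this thread. */
#define DEFINE_PCI_DEVICE_TABLE(_table) \
	const struct pci_device_id _table[]

/* Declared through the macro: the const is implicit. */
static DEFINE_PCI_DEVICE_TABLE(macro_tbl) = {
	{ 0x1234, 0x0001 },	/* made-up IDs for illustration */
	{ 0, 0 },		/* all-zero terminator */
};

/* The equivalent open-coded declaration. */
static const struct pci_device_id plain_tbl[] = {
	{ 0x1234, 0x0001 },
	{ 0, 0 },
};

/* Count entries up to, but not including, the all-zero terminator. */
static size_t pci_table_len(const struct pci_device_id *tbl)
{
	size_t n = 0;

	while (tbl[n].vendor != 0 || tbl[n].device != 0)
		n++;
	return n;
}
```

Both declarations produce the same object; the macro only saves typing "const struct pci_device_id" and the brackets, at the cost of hiding the type from anyone reading or grepping the driver.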


Re: [PATCH] ARM: asm: Configure caches as per the defconfig

2013-11-27 Thread Amit Virdi

On 11/27/2013 5:44 PM, Russell King - ARM Linux wrote:
> On Wed, Nov 27, 2013 at 05:24:04PM +0530, Amit Virdi wrote:
>> From: Amit VIRDI 
>>
>> In the current implementation of the decompression code, the caches are enabled
>> irrespective of their configuration in the defconfig. This makes setting the
>> ICACHE and DCACHE disable options from the menuconfig irrelevant. Change this
>> implementation to enable caches only if specified in the defconfig.
>
> NAK.  These options are provided more for ARM Ltd's validation of CPUs
> rather than for users, and it's not supposed to be used with the
> decompressor.



It is perfectly true that these options are used only during CPU 
validation and not in the end product. Still, that doesn't explain why 
these options are not to be used with the decompressor. Alternatively, why 
would a user intend to disable a cache when it has been implemented 
correctly and is stable? Without this change, the effect of disabling 
the caches is not fully reflected.


Regards
Amit Virdi


Re: [PATCH v2 01/03] clocksource: Add Kconfig entries for CMT, MTU2, TMU and STI

2013-11-27 Thread Magnus Damm
Hi John, everyone,

Let me get back to you all in a little while together with some code,
but before that let me just clarify this:

On Wed, Nov 13, 2013 at 5:47 AM, John Stultz  wrote:
> On Tue, Nov 12, 2013 at 4:26 AM, Magnus Damm  wrote:
>> On Sat, Nov 9, 2013 at 3:34 AM, John Stultz  wrote:

>> Let me get back to the kernel configuration. Of course, it would be
>> really nice if the kernel configuration was 100% fool proof, but what
>> happens if the user doesn't compile-in certain parts? That hardware
>> won't be used. What happens if wrong console device is passed on the
>> kernel command line? The friendly answer is usually "don't do that".
>>
>> So in case of the serial console, no driver - no output. You can still
>> use the network. If you have no timer then there won't be any timer
>> ticks. You can still get to user space though, but don't try to rely
>> on the timer. This CONFIG_TIMER=y/n case is pretty clear, but isn't
>> there a grey zone too?
>
> And on every new board, I have to fumble around with exactly those
> sorts of no-serial output issues. Its never something I consider a
> great use of my time :)
>
> And your example is a little flawed have no timer ticks, you're not
> getting to userspace. The system won't boot.

Correct me if I'm wrong here but I don't think so!

On some platforms that may very well be the case, but on mach-shmobile
we get to user space without any timers. If the timers are enabled
then they are regular platform devices these days. From my experience
the main blocker for going to user space without timer is on ARM
usually the udelay() calculation and/or the TWD delay, but those can
be handled with preset worst-case values for udelay() and getting the
rate via CCF for the TWD.

Also it of course depends on if some compiled-in driver needs to use
the timer. If we just enable the serial console then all our platforms
make it to initramfs-based user space without any clock source or
clock event.

I'd be happy to give you remote access to a board if you'd like to play. =)

Cheers,

/ magnus


Re: [PATCH 1/5] serial: 8250_pci: use DEFINE_PCI_DEVICE_TABLE macro

2013-11-27 Thread Joe Perches
On Thu, 2013-11-28 at 14:29 +0900, Jingoo Han wrote:
> On Thursday, November 28, 2013 1:08 PM, Greg Kroah-Hartman wrote:
> > On Thu, Nov 28, 2013 at 10:55:35AM +0900, Jingoo Han wrote:
> > > This macro is used to create a struct pci_device_id array.
> > 
> > Yeah, and it's a horrid macro that deserves to be removed, please don't
> > use it in more places.
> > 
> > Actually, if you could just remove it, that would be best, sorry, I'm
> > not going to take these patches.
> 
> (+cc Joe Perches, Andrew Morton, Andy Whitcroft)
> 
> Hi Joe Perches,
> 
> Would you fix checkpatch.pl about DEFINE_PCI_DEVICE_TABLE?
> Currently, checkpatch.pl guides to use DEFINE_PCI_DEVICE_TABLE
> as below.
> 
>   WARNING: Use DEFINE_PCI_DEVICE_TABLE for struct pci_device_id
>   #331: FILE: drivers/usb/host/ehci-pci.c:331:
>   +static const struct pci_device_id pci_ids [] = { {
> 
> However, Greg Kroah-Hartman mentioned that DEFINE_PCI_DEVICE_TABLE
> shouldn't be used anymore.
> 
> So, would you change checkpatch.pl in order to guide to use
> struct pci_device_id instead of DEFINE_PCI_DEVICE_TABLE?
> 
> For example,
>   WARNING: Use struct pci_device_id instead of DEFINE_PCI_DEVICE_TABLE

The documentation doesn't agree with Greg.

Documentation/PCI/pci.txt:

The ID table is an array of struct pci_device_id entries ending with an
all-zero entry; use of the macro DEFINE_PCI_DEVICE_TABLE is the preferred
method of declaring the table.

Neither does the kernel tree:

$ git grep -w DEFINE_PCI_DEVICE_TABLE | wc -l
410

$ git grep -E "\bstruct\s+pci_device_id\s+\w+\s*\[\s*\]\s*=" | wc -l
376

Most of the 376 should be const and are not.

$ git grep -E "\bconst\s+struct\s+pci_device_id\s+\w+\s*\[\s*\]\s*=" | wc -l
155

Everything that uses DEFINE_PCI_DEVICE_TABLE is const.

$ git grep -A1 -E "define\s+DEFINE_PCI_DEVICE_TABLE"
include/linux/pci.h:#define DEFINE_PCI_DEVICE_TABLE(_table) \
include/linux/pci.h-const struct pci_device_id _table[]




Re: [PATCH 1/5] serial: 8250_pci: use DEFINE_PCI_DEVICE_TABLE macro

2013-11-27 Thread Jingoo Han
On Thursday, November 28, 2013 1:08 PM, Greg Kroah-Hartman wrote:
> On Thu, Nov 28, 2013 at 10:55:35AM +0900, Jingoo Han wrote:
> > This macro is used to create a struct pci_device_id array.
> 
> Yeah, and it's a horrid macro that deserves to be removed, please don't
> use it in more places.
> 
> Actually, if you could just remove it, that would be best, sorry, I'm
> not going to take these patches.

(+cc Joe Perches, Andrew Morton, Andy Whitcroft)

Hi Joe Perches,

Would you fix checkpatch.pl about DEFINE_PCI_DEVICE_TABLE?
Currently, checkpatch.pl guides to use DEFINE_PCI_DEVICE_TABLE
as below.

  WARNING: Use DEFINE_PCI_DEVICE_TABLE for struct pci_device_id
  #331: FILE: drivers/usb/host/ehci-pci.c:331:
  +static const struct pci_device_id pci_ids [] = { {

However, Greg Kroah-Hartman mentioned that DEFINE_PCI_DEVICE_TABLE
shouldn't be used anymore.

So, would you change checkpatch.pl in order to guide to use
struct pci_device_id instead of DEFINE_PCI_DEVICE_TABLE?

For example,
  WARNING: Use struct pci_device_id instead of DEFINE_PCI_DEVICE_TABLE

Thank you.

Best regards,
Jingoo Han



Re: [PATCH 3.12 07/19] tcp: gso: fix truesize tracking

2013-11-27 Thread Ben Hutchings
On Mon, 2013-11-18 at 10:37 -0800, Greg Kroah-Hartman wrote:
> 3.12-stable review patch.  If anyone has any objections, please let me know.
> 
> --
> 
> From: Eric Dumazet 
> 
> [ Upstream commit 0d08c42cf9a71530fef5ebcfe368f38f2dd0476f ]
> 
> commit 6ff50cd55545 ("tcp: gso: do not generate out of order packets")
> had an heuristic that can trigger a warning in skb_try_coalesce(),
> because skb->truesize of the gso segments were exactly set to mss.
[...]

That commit went in 3.10, but this fix hasn't been applied to 3.10.y.

David, this code moved from net/ipv4/tcp.c to tcp_offload.c after 3.10
and this patch applies textually after adjusting the filename.  Please
consider including it in the next batch of stable fixes for 3.10.

Ben.

-- 
Ben Hutchings
Usenet is essentially a HUGE group of people passing notes in class.
  - Rachel Kadel, `A Quick Guide to Newsgroup Etiquette'


signature.asc
Description: This is a digitally signed message part


[PATCH] Fix race between oom kill and task exit

2013-11-27 Thread Ma, Xindong
From: Leon Ma 
Date: Thu, 28 Nov 2013 12:46:09 +0800
Subject: [PATCH] Fix race between oom kill and task exit

There is a race between oom kill and task exit. Scenario is:

   TASK  A                          TASK  B
TASK B is selected to oom kill
in oom_kill_process()
check PF_EXITING of TASK B
                                    task call do_exit()
                                    task set PF_EXITING flag
                                    write_lock_irq(&tasklist_lock);
                                    remove TASK B from thread group in
                                    __unhash_process()
                                    write_unlock_irq(&tasklist_lock);
read_lock(&tasklist_lock);
traverse threads of TASK B
read_unlock(&tasklist_lock);

After that, the following traversal of threads in TASK B will not end because 
TASK B is not in the thread group:
do {

} while_each_thread(p, t);

Signed-off-by: Leon Ma 
Signed-off-by: xiaobing tu 
---
 mm/oom_kill.c |   20 ++--
 1 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 1e4a600..32ec88d 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -412,16 +412,6 @@ void oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
static DEFINE_RATELIMIT_STATE(oom_rs, DEFAULT_RATELIMIT_INTERVAL,
  DEFAULT_RATELIMIT_BURST);
 
-   /*
-* If the task is already exiting, don't alarm the sysadmin or kill
-* its children or threads, just set TIF_MEMDIE so it can die quickly
-*/
-   if (p->flags & PF_EXITING) {
-   set_tsk_thread_flag(p, TIF_MEMDIE);
-   put_task_struct(p);
-   return;
-   }
-
if (__ratelimit(&oom_rs))
dump_header(p, gfp_mask, order, memcg, nodemask);
 
@@ -437,6 +427,16 @@ void oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
 * still freeing memory.
 */
read_lock(&tasklist_lock);
+   /*
+* If the task is already exiting, don't alarm the sysadmin or kill
+* its children or threads, just set TIF_MEMDIE so it can die quickly
+*/
+   if (p->flags & PF_EXITING) {
+   set_tsk_thread_flag(p, TIF_MEMDIE);
+   put_task_struct(p);
+   read_unlock(&tasklist_lock);
+   return;
+   }
do {
list_for_each_entry(child, &t->children, sibling) {
unsigned int child_points;
-- 
1.7.4.1
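
Why the traversal described in the changelog cannot terminate can be shown with a toy ring in user space. This is a sketch under one stated assumption: removing an entry rewires its neighbours but leaves the removed entry's own forward pointer intact, as RCU-style list deletion does. struct toy_task, toy_unhash() and toy_walk() are hypothetical names and are not kernel code.

```c
#include <stddef.h>

/* Toy singly linked ring standing in for a thread group. */
struct toy_task {
	struct toy_task *next;
};

/*
 * Unlink victim from the ring but leave victim->next untouched, which is
 * the assumption this sketch makes about RCU-style list removal.
 */
static void toy_unhash(struct toy_task *prev, struct toy_task *victim)
{
	prev->next = victim->next;
}

/*
 * Walk the ring from start until the walk comes back to start.
 * Returns the number of tasks visited, or -1 if limit steps pass without
 * returning to start (the loop in the report has no such limit).
 */
static int toy_walk(const struct toy_task *start, int limit)
{
	const struct toy_task *t = start;
	int visited = 0;

	do {
		if (visited == limit)
			return -1;
		visited++;
		t = t->next;
	} while (t != start);

	return visited;
}
```

With a ring p -> a -> b -> p, a walk from p visits three entries. After toy_unhash(&b, &p) the ring is a -> b -> a, yet p.next still points at a, so a walk that starts at p enters the ring and never sees p again.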



Re: 3.11.4: kernel BUG at fs/buffer.c:1268

2013-11-27 Thread George Spelvin
Well, it finally triggered.


Not *that* long before, I fiddled with a USB thumb drive, which
I'll mention here, but I don't think it's connected.

[2328294.996152] usb 1-1.3: new high-speed USB device number 6 using ehci-pci
[2328295.080347] usb 1-1.3: New USB device found, idVendor=0781, idProduct=556c
[2328295.080351] usb 1-1.3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[2328295.080352] usb 1-1.3: Product: Ultra
[2328295.080353] usb 1-1.3: Manufacturer: SanDisk
[2328295.080354] usb 1-1.3: SerialNumber: 20054861120C8D407604
[2328295.829526] usb-storage 1-1.3:1.0: USB Mass Storage device detected
[2328295.829571] scsi6 : usb-storage 1-1.3:1.0
[2328295.829615] usbcore: registered new interface driver usb-storage
[2328296.832215] scsi 6:0:0:0: Direct-Access SanDisk  Ultra1.20 
PQ: 0 ANSI: 5
[2328296.832343] sd 6:0:0:0: Attached scsi generic sg3 type 0
[2328296.833579] sd 6:0:0:0: [sdc] 15633408 512-byte logical blocks: (8.00 GB/7.45 GiB)
[2328296.834942] sd 6:0:0:0: [sdc] Write Protect is off
[2328296.834944] sd 6:0:0:0: [sdc] Mode Sense: 43 00 00 00
[2328296.835947] sd 6:0:0:0: [sdc] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[2328296.848345]  sdc: sdc1
[2328296.851338] sd 6:0:0:0: [sdc] Attached SCSI removable disk
[2328361.332585] FAT-fs (sdc1): utf8 is not a recommended IO charset for FAT filesystems, filesystem will be case sensitive!
[2335705.838834] usb 1-1.3: USB disconnect, device number 6


The next thing in the kernel log is the might_sleep() warning (followed by
the oops):

[2348070.539862] BUG: sleeping function called from invalid context at fs/ext4/mballoc.c:4791
[2348070.539865] in_atomic(): 0, irqs_disabled(): 1, pid: 4635, name: iceweasel
[2348070.539867] CPU: 4 PID: 4635 Comm: iceweasel Tainted: GW
3.11.5-9-g06a2442 #100
[2348070.539868] Hardware name: Gigabyte Technology Co., Ltd. 
Z68A-D3H-B3/Z68A-D3H-B3, BIOS F13 03/20/2012
[2348070.539870]  88011379f4e0 8801a30179a8 81561017 
0002
[2348070.539872]  8801a30179b8 8106a06f 8801a3017a90 
81197a3e
[2348070.539874]  880011306f68 8801a3017fd8 0001a30179f8 
00f15fd1
[2348070.539876] Call Trace:
[2348070.539881]  [] dump_stack+0x54/0x74
[2348070.539884]  [] __might_sleep+0xcf/0xf0
[2348070.539887]  [] ext4_free_blocks+0x53e/0xa90
[2348070.539889]  [] ext4_ext_remove_space+0x806/0xe20
[2348070.539891]  [] ext4_ext_truncate+0xb8/0xe0
[2348070.539894]  [] ext4_truncate+0x2b5/0x300
[2348070.539895]  [] ext4_evict_inode+0x3f8/0x430
[2348070.539898]  [] evict+0xba/0x1c0
[2348070.539899]  [] iput+0x10b/0x1b0
[2348070.539901]  [] dput+0x278/0x350
[2348070.539904]  [] __fput+0x16a/0x240
[2348070.539905]  [] fput+0x9/0x10
[2348070.539909]  [] task_work_run+0x9c/0xd0
[2348070.539911]  [] do_exit+0x2a7/0x9d0
[2348070.539914]  [] ? __sigqueue_free.part.13+0x2e/0x40
[2348070.539915]  [] do_group_exit+0x3e/0xb0
[2348070.539917]  [] get_signal_to_deliver+0x1b0/0x5f0
[2348070.539919]  [] do_signal+0x43/0x940
[2348070.539921]  [] ? do_send_sig_info+0x58/0x80
[2348070.539923]  [] do_notify_resume+0x5d/0x80
[2348070.539925]  [] int_signal+0x12/0x17
[2348070.539931] ------------[ cut here ]------------
[2348070.539950] kernel BUG at fs/buffer.c:1268!
[2348070.539962] invalid opcode:  [#1] SMP 
[2348070.539976] Modules linked in: nls_utf8 nls_cp437 vfat fat usb_storage 
fuse pl2303 ftdi_sio usbserial iTCO_wdt
[2348070.540018] CPU: 4 PID: 4635 Comm: iceweasel Tainted: GW
3.11.5-9-g06a2442 #100
[2348070.540040] Hardware name: Gigabyte Technology Co., Ltd. 
Z68A-D3H-B3/Z68A-D3H-B3, BIOS F13 03/20/2012
[2348070.540063] task: 88021688cf00 ti: 8801a3016000 task.ti: 
8801a3016000
[2348070.540082] RIP: 0010:[]  [] 
check_irqs_on.part.16+0x4/0x6
[2348070.540108] RSP: 0018:8801a3017798  EFLAGS: 00010046
[2348070.540122] RAX: 0082 RBX: 8801a3017928 RCX: 
8802162bd000
[2348070.540141] RDX: 1000 RSI: 00980080 RDI: 
8802171b6b00
[2348070.540159] RBP: 8801a3017798 R08: 0002 R09: 
0002
[2348070.540177] R10: 8802162bd000 R11: 8801a301751e R12: 
8802171b6b00
[2348070.540195] R13: 1000 R14: 880213a70600 R15: 
880215228c00
[2348070.540214] FS:  () GS:88021fb0() 
knlGS:
[2348070.540235] CS:  0010 DS: 002b ES: 002b CR0: 80050033
[2348070.540250] CR2: f4832a3c CR3: 0180c000 CR4: 
000407e0
[2348070.540268] Stack:
[2348070.540274]  8801a3017808 8112e547 8801a30177b0 
0092
[2348070.540297]  81568f60 81732ff4 81568f60 
810987bd
[2348070.540320]  ffe743c4 8801a30177f0 8801a3017928 
8802171b6b00
[2348070.540342] Call Trace:
[2348070.540352]  [] __find_get_block+0x1d7/0x1e0
[2348070.540369]  [] ? int_signal+0x12/0x17
[2348070.540384]  [] ? 

[git pull] drm minor fixes

2013-11-27 Thread Dave Airlie

Hi Linus,

just two minor fixes, as people keep resending them since they are such
low-hanging fruit.

Dave.

The following changes since commit 8ae516aa8b8161254d3e402b3348b2a9b8d1efd0:

  Merge tag 'trace-fixes-v3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace (2013-11-26 18:04:21 -0800)

are available in the git repository at:


  git://people.freedesktop.org/~airlied/linux drm-fixes

for you to fetch changes up to eec99016e38b740662509f097effb90abc7a1376:

  drm/nouveau/hwmon: fix compilation without CONFIG_HWMON (2013-11-28 14:38:09 +1000)


David Herrmann (1):
  drm/sysfs: fix OOM verification

Ilia Mirkin (1):
  drm/nouveau/hwmon: fix compilation without CONFIG_HWMON

 drivers/gpu/drm/drm_sysfs.c | 2 +-
 drivers/gpu/drm/nouveau/nouveau_hwmon.c | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)


Re: [PATCH] WIFI: handle a neglected case in nl80211_new_interface()

2013-11-27 Thread Chao Bi
On Thu, 2013-11-28 at 11:53 +0800, Chao Bi wrote:
> On Wed, 2013-11-27 at 20:43 +0530, Ujjal Roy wrote:
> > Hi,
> > 
> > 
> > We can use IS_ERR_OR_NULL(wdev) to check NULL as well as error value.
> > 
> > Thanks,
> > UjjaL
> > 
> > On Wed, Nov 27, 2013 at 8:30 AM, Chao Bi  wrote:
> > In nl80211_new_interface(), it calls rdev_add_virtual_intf() to create
> > a new interface, however, it only checks whether returned value is err
> > code, but doesn't check if returned value is NULL. The returned 
> 
> Thanks Ujjal. I'll update it.
> 
Hi all,
This patch is not valid. After checking the nl80211 API documentation further,
there is no need to handle a NULL return value, because cfg80211.h does not
allow the add_virtual_intf() API to return NULL.



Re: [RESEND PATCH net-next 0/4] net: several cleanups

2013-11-27 Thread Zhi Yong Wu
On Thu, Nov 28, 2013 at 12:43 PM, David Miller  wrote:
> From: Zhi Yong Wu 
> Date: Thu, 28 Nov 2013 09:31:29 +0800
>
>> Per David's request, it's time to resend them now.
>
> No, it is not the time.
>
> You should not submit these kinds of patches until the net-next
> tree is open again, and I make an announcement here when that
> is the case.  I have yet to make such an announcement, and I
> do not plan to do so for several days as I am travelling and
> will be busy dealing with my backlog of patches once I get
> home.
ok, will wait for your announcement to be out.



-- 
Regards,

Zhi Yong Wu


Re: [PATCH] w1: matrox: use DEFINE_PCI_DEVICE_TABLE macro

2013-11-27 Thread Jingoo Han
On Thursday, November 28, 2013 12:46 PM, Jingoo Han wrote:
> 
> This macro is used to create a struct pci_device_id array.
> 
> Signed-off-by: Jingoo Han 

Please ignore these patches.
According to Greg Kroah-Hartman:

"Yeah, and it's a horrid macro that deserves to be removed, please don't
use it in more places.

Actually, if you could just remove it, that would be best, sorry, I'm
not going to take these patches."

So, I will send a patch to remove 'DEFINE_PCI_DEVICE_TABLE' instead.
Sorry for the noise. :-)

Best regards,
Jingoo Han


> ---
>  drivers/w1/masters/matrox_w1.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/w1/masters/matrox_w1.c b/drivers/w1/masters/matrox_w1.c
> index d8667b0..9361cea 100644
> --- a/drivers/w1/masters/matrox_w1.c
> +++ b/drivers/w1/masters/matrox_w1.c
> @@ -42,7 +42,7 @@ MODULE_LICENSE("GPL");
>  MODULE_AUTHOR("Evgeniy Polyakov ");
>  MODULE_DESCRIPTION("Driver for transport(Dallas 1-wire prtocol) over VGA 
> DDC(matrox gpio).");
> 
> -static struct pci_device_id matrox_w1_tbl[] = {
> +static DEFINE_PCI_DEVICE_TABLE(matrox_w1_tbl) = {
>   { PCI_DEVICE(PCI_VENDOR_ID_MATROX, PCI_DEVICE_ID_MATROX_G400) },
>   { },
>  };
> --
> 1.7.10.4




Re: [patch 0/9] mm: thrash detection-based file cache sizing v6

2013-11-27 Thread Johannes Weiner
Here are the changes to the series I have accumulated so far.  Mainly:

o truncate_inode_pages_final() that sets mapping_set_exiting() and
  uses ordered but unlocked nrshadows & nrpages reads to skip the tree
  lock acquisition and IRQ disabling on empty page cache trees.

o revert all efforts to make the lru_lock IRQ-safe just to silence
  lockdep.  Also solves the problem of the list_lru_init() key API.

o in the shadow shrinker, drop the lru_lock after mapping->tree_lock
  has been acquired.  The latter pins the inode by preventing the
  final truncate from removing shadow entries, so we can safely
  release the lru lock once mapping->tree_lock is acquired and the
  node is taken off the list.

o changed radix_tree_node member names and documented them better

o fixed typos

As we agreed to keep the shadow node lru management non-lazy for now,
we don't need to worry about the lifetime of radix tree nodes in the
shrinker beyond taking it off the lru list with the lru_lock and
mapping->tree_lock held.  No complicated RCU scheme required.

---

diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting
index f089058..fc0de70 100644
--- a/Documentation/filesystems/porting
+++ b/Documentation/filesystems/porting
@@ -295,9 +295,9 @@ in the beginning of ->setattr unconditionally.
->clear_inode() and ->delete_inode() are gone; ->evict_inode() should
 be used instead.  It gets called whenever the inode is evicted, whether it has
 remaining links or not.  Caller does *not* evict the pagecache or inode-associated
-metadata buffers; getting rid of those is responsibility of method, as it had
-been for ->delete_inode(). Caller makes sure async writeback cannot be running
-for the inode while (or after) ->evict_inode() is called.
+metadata buffers; the method has to use truncate_inode_pages_final() to get rid
+of those. Caller makes sure async writeback cannot be running for the inode while
+(or after) ->evict_inode() is called.
 
->drop_inode() returns int now; it's called on final iput() with
 inode->i_lock held and it returns true if filesystems wants the inode to be
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index b868c2b..79cbc9c 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -1817,7 +1817,7 @@ void ll_delete_inode(struct inode *inode)
cl_sync_file_range(inode, 0, OBD_OBJECT_EOF,
   CL_FSYNC_DISCARD, 1);
 
-   truncate_inode_pages(&inode->i_data, 0);
+   truncate_inode_pages_final(&inode->i_data);
 
/* Workaround for LU-118 */
if (inode->i_data.nrpages) {
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 94de6d1..e6716c2 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -444,7 +444,7 @@ void v9fs_evict_inode(struct inode *inode)
 {
struct v9fs_inode *v9inode = V9FS_I(inode);
 
-   truncate_inode_pages(inode->i_mapping, 0);
+   truncate_inode_pages_final(inode->i_mapping);
clear_inode(inode);
filemap_fdatawrite(inode->i_mapping);
 
diff --git a/fs/affs/inode.c b/fs/affs/inode.c
index 0e092d0..96df91e 100644
--- a/fs/affs/inode.c
+++ b/fs/affs/inode.c
@@ -259,7 +259,7 @@ affs_evict_inode(struct inode *inode)
 {
unsigned long cache_page;
pr_debug("AFFS: evict_inode(ino=%lu, nlink=%u)\n", inode->i_ino, inode->i_nlink);
-   truncate_inode_pages(&inode->i_data, 0);
+   truncate_inode_pages_final(&inode->i_data);
 
if (!inode->i_nlink) {
inode->i_size = 0;
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 789bc25..2bbe60e 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -422,7 +422,7 @@ void afs_evict_inode(struct inode *inode)
 
ASSERTCMP(inode->i_ino, ==, vnode->fid.vnode);
 
-   truncate_inode_pages(&inode->i_data, 0);
+   truncate_inode_pages_final(&inode->i_data);
clear_inode(inode);
 
afs_give_up_callback(vnode);
diff --git a/fs/bfs/inode.c b/fs/bfs/inode.c
index 8defc6b..29aa5cf 100644
--- a/fs/bfs/inode.c
+++ b/fs/bfs/inode.c
@@ -172,7 +172,7 @@ static void bfs_evict_inode(struct inode *inode)
 
dprintf("ino=%08lx\n", ino);
 
-   truncate_inode_pages(&inode->i_data, 0);
+   truncate_inode_pages_final(&inode->i_data);
invalidate_inode_buffers(inode);
clear_inode(inode);
 
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 391ffe5..c7a7def 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -419,7 +419,7 @@ static void bdev_evict_inode(struct inode *inode)
 {
struct block_device *bdev = _I(inode)->bdev;
struct list_head *p;
-   truncate_inode_pages(>i_data, 0);
+   truncate_inode_pages_final(>i_data);
invalidate_inode_buffers(inode); /* is it needed here? */
clear_inode(inode);
spin_lock(_lock);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 51e3afa..d3e4983 100644
--- a/fs/btrfs/inode.c
+++ 

Re: [RESEND PATCH net-next 0/4] net: several cleanups

2013-11-27 Thread David Miller
From: Zhi Yong Wu 
Date: Thu, 28 Nov 2013 09:31:29 +0800

> Per David's request, it's time to resend them now.

No, it is not the time.

You should not submit these kinds of patches until the net-next
tree is open again, and I make an announcement here when that
is the case.  I have yet to make such an announcement, and I
do not plan to do so for several days as I am travelling and
will be busy dealing with my backlog of patches once I get
home.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2] drivers: input: touchscreen: sur40: remove stack variable 'packet_id' from sur40_poll()

2013-11-27 Thread Chen Gang
The driver processes all blobs in one pass, so there is no need to
preserve the value of packet_id between calls to sur40_poll().

The original implementation may also leave 'packet_id' uninitialized,
which produces the following warning (with allmodconfig under hexagon):

  drivers/input/touchscreen/sur40.c: In function 'sur40_poll':
  drivers/input/touchscreen/sur40.c:297:6: warning: 'packet_id' may be used 
uninitialized in this function [-Wuninitialized]


Signed-off-by: Chen Gang 
---
 drivers/input/touchscreen/sur40.c |   10 --
 1 files changed, 0 insertions(+), 10 deletions(-)

diff --git a/drivers/input/touchscreen/sur40.c 
b/drivers/input/touchscreen/sur40.c
index cfd1b7e..2ca32cb 100644
--- a/drivers/input/touchscreen/sur40.c
+++ b/drivers/input/touchscreen/sur40.c
@@ -251,7 +251,6 @@ static void sur40_poll(struct input_polled_dev *polldev)
struct sur40_state *sur40 = polldev->private;
struct input_dev *input = polldev->input;
int result, bulk_read, need_blobs, packet_blobs, i;
-   u32 packet_id;
 
struct sur40_header *header = >bulk_in_buffer->header;
struct sur40_blob *inblob = >bulk_in_buffer->blobs[0];
@@ -286,17 +285,8 @@ static void sur40_poll(struct input_polled_dev *polldev)
if (need_blobs == -1) {
need_blobs = le16_to_cpu(header->count);
dev_dbg(sur40->dev, "need %d blobs\n", need_blobs);
-   packet_id = header->packet_id;
}
 
-   /*
-* Sanity check. when video data is also being retrieved, the
-* packet ID will usually increase in the middle of a series
-* instead of at the end.
-*/
-   if (packet_id != header->packet_id)
-   dev_warn(sur40->dev, "packet ID mismatch\n");
-
packet_blobs = result / sizeof(struct sur40_blob);
dev_dbg(sur40->dev, "received %d blobs\n", packet_blobs);
 
-- 
1.7.7.6


Re: [PATCH] pch_gbe: Remove tx_queue_len setting from the driver

2013-11-27 Thread David Miller

When submitting a series of patches to the same files, you must number
them so that the order in which the patches need to be applied is
explicit.

Please read Documentation/SubmittingPatches in the kernel sources.

Thank you.


Re: page fault deadlock

2013-11-27 Thread Xiaotian Feng
On Thu, Nov 28, 2013 at 12:11 PM, Greg KH  wrote:
> On Thu, Nov 28, 2013 at 11:25:32AM +0800, Xiaotian Feng wrote:
>> Hi,
>>
>> When I upgraded to the latest kernel, my system hung. It is
>> reproducible in my VirtualBox VM: each time I mounted my RAID6
>> partition and tried to run vi or build a kernel, the whole system
>> locked up very soon.
>>
>> After turning on lockdep, I found the following lockdep warning:
>>
>> [   27.848462]
>> [   27.848471] ==
>> [   27.848477] [ INFO: possible circular locking dependency detected ]
>> [   27.848484] 3.13.0-rc1+ #1 Tainted: GF   W
>> [   27.848490] ---
>> [   27.848496] Xorg/1268 is trying to acquire lock:
>> [   27.848501]  (>mutex){+.+.+.}, at: []
>> sysfs_bin_mmap+0x4f/0x120
>> [   27.848516]
>> [   27.848516] but task is already holding lock:
>> [   27.848521]  (>mmap_sem){++}, at: []
>> vm_mmap_pgoff+0x6f/0xc0
>> [   27.848534]
>> [   27.848534] which lock already depends on the new lock.
>> [   27.848534]
>> [   27.848541]
>> [   27.848541] the existing dependency chain (in reverse order) is:
>> [   27.848547]
>> [   27.848547] -> #2 (>mmap_sem){++}:
>> [   27.848556][] lock_acquire+0xb0/0x160
>> [   27.848564][] might_fault+0x8c/0xb0
>> [   27.848572][] md_ioctl+0xa78/0x19b0
>> [   27.848580][] blkdev_ioctl+0x234/0x840
>> [   27.848588][] block_ioctl+0x41/0x50
>> [   27.848597][] do_vfs_ioctl+0x300/0x520
>> [   27.848605][] SyS_ioctl+0x81/0xa0
>> [   27.848613][] tracesys+0xe1/0xe6
>> [   27.848622]
>> [   27.848622] -> #1 (>reconfig_mutex){+.+.+.}:
>> [   27.848630][] lock_acquire+0xb0/0x160
>> [   27.848637][]
>> mutex_lock_interruptible_nested+0x78/0x610
>> [   27.848646][] rdev_attr_show+0x40/0x90
>> [   27.848654][] sysfs_seq_show+0xda/0x170
>> [   27.848662][] seq_read+0x164/0x3e0
>> [   27.848671][] vfs_read+0x95/0x160
>> [   27.848680][] SyS_read+0x49/0xa0
>> [   27.848687][] tracesys+0xe1/0xe6
>> [   27.848695]
>> [   27.848695] -> #0 (>mutex){+.+.+.}:
>> [   27.848703][] __lock_acquire+0x1587/0x1ca0
>> [   27.848711][] lock_acquire+0xb0/0x160
>> [   27.848718][] mutex_lock_nested+0x68/0x510
>> [   27.848725][] sysfs_bin_mmap+0x4f/0x120
>> [   27.848732][] mmap_region+0x3ed/0x5d0
>> [   27.848741][] do_mmap_pgoff+0x34e/0x3d0
>> [   27.848748][] vm_mmap_pgoff+0x90/0xc0
>> [   27.848755][] SyS_mmap_pgoff+0x1d5/0x270
>> [   27.848763][] SyS_mmap+0x22/0x30
>> [   27.848771][] tracesys+0xe1/0xe6
>> [   27.848778]
>> [   27.848778] other info that might help us debug this:
>> [   27.848778]
>> [   27.848785] Chain exists of:
>> [   27.848785]   >mutex --> >reconfig_mutex --> >mmap_sem
>> [   27.848785]
>> [   27.848795]  Possible unsafe locking scenario:
>> [   27.848795]
>> [   27.848800]CPU0CPU1
>> [   27.848805]
>> [   27.848810]   lock(>mmap_sem);
>> [   27.848817]lock(>reconfig_mutex);
>> [   27.848824]lock(>mmap_sem);
>> [   27.848830]   lock(>mutex);
>> [   27.848837]
>> [   27.848837]  *** DEADLOCK ***
>> [   27.848837]
>> [   27.848844] 1 lock held by Xorg/1268:
>> [   27.848849]  #0:  (>mmap_sem){++}, at: []
>> vm_mmap_pgoff+0x6f/0xc0
>> [   27.848861]
>> [   27.848861] stack backtrace:
>> [   27.848868] CPU: 1 PID: 1268 Comm: Xorg Tainted: GF   W
>> 3.13.0-rc1+ #1
>> [   27.848873] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
>> VirtualBox 12/01/2006
>> [   27.848879]  822daa00 8800d0371bc8 817725f7
>> 822cbdc0
>> [   27.848901]  8800d0371c08 8176d9eb 8800d0371c60
>> 880115b42a78
>> [   27.848909]   880115b42a78 880115b422a0
>> 0001
>> [   27.848918] Call Trace:
>> [   27.848930]  [] dump_stack+0x4e/0x7a
>> [   27.848942]  [] print_circular_bug+0x1f9/0x208
>> [   27.848952]  [] __lock_acquire+0x1587/0x1ca0
>> [   27.848964]  [] ? print_context_stack+0x8f/0x100
>> [   27.848975]  [] lock_acquire+0xb0/0x160
>> [   27.848986]  [] ? sysfs_bin_mmap+0x4f/0x120
>> [   27.848996]  [] ? sysfs_bin_mmap+0x4f/0x120
>> [   27.849007]  [] mutex_lock_nested+0x68/0x510
>> [   27.849016]  [] ? sysfs_bin_mmap+0x4f/0x120
>> [   27.849027]  [] ? kmemleak_alloc+0x4e/0xb0
>> [   27.849038]  [] sysfs_bin_mmap+0x4f/0x120
>> [   27.849048]  [] mmap_region+0x3ed/0x5d0
>> [   27.849058]  [] do_mmap_pgoff+0x34e/0x3d0
>> [   27.849070]  [] vm_mmap_pgoff+0x90/0xc0
>> [   27.849080]  [] SyS_mmap_pgoff+0x1d5/0x270
>> [   27.849092]  [] ? syscall_trace_enter+0x145/0x270
>> [   27.849102]  [] SyS_mmap+0x22/0x30
>> [   27.849112]  [] tracesys+0xe1/0xe6
>>
>>
>> I think it is a real deadlock, and it is caused 

[PATCH 3/4] spi: omap-100k: Fixed spacing on commas

2013-11-27 Thread Thomas Behan
checkpatch.pl generated 5 errors due to missing spaces after commas. These
errors have been fixed.

Signed-off-by: Thomas Behan 
---
 drivers/spi/spi-omap-100k.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/spi/spi-omap-100k.c b/drivers/spi/spi-omap-100k.c
index 5db8d2a..fcd783c 100644
--- a/drivers/spi/spi-omap-100k.c
+++ b/drivers/spi/spi-omap-100k.c
@@ -155,7 +155,7 @@ static void spi100k_write_data(struct spi_master *master, 
int len, int data)
 
 static int spi100k_read_data(struct spi_master *master, int len)
 {
-   int dataH,dataL;
+   int dataH, dataL;
struct omap1_spi100k *spi100k = spi_master_get_devdata(master);
 
/* Always do at least 16 bits */
@@ -236,9 +236,9 @@ omap1_spi100k_txrx_pio(struct spi_device *spi, struct 
spi_transfer *xfer)
do {
c -= 2;
if (xfer->tx_buf != NULL)
-   spi100k_write_data(spi->master,word_len, *tx++);
+   spi100k_write_data(spi->master, word_len, 
*tx++);
if (xfer->rx_buf != NULL)
-   *rx++ = spi100k_read_data(spi->master,word_len);
+   *rx++ = spi100k_read_data(spi->master, 
word_len);
} while (c);
} else if (word_len <= 32) {
u32 *rx;
@@ -249,9 +249,9 @@ omap1_spi100k_txrx_pio(struct spi_device *spi, struct 
spi_transfer *xfer)
do {
c -= 4;
if (xfer->tx_buf != NULL)
-   spi100k_write_data(spi->master,word_len, *tx);
+   spi100k_write_data(spi->master, word_len, *tx);
if (xfer->rx_buf != NULL)
-   *rx = spi100k_read_data(spi->master,word_len);
+   *rx = spi100k_read_data(spi->master, word_len);
} while (c);
}
return count - c;
-- 
1.8.1.2



[PATCH 1/4] spi: omap-100k: Fixed spacing on while loops

2013-11-27 Thread Thomas Behan
checkpatch.pl generated several errors from "while(" statements which should
have read "while (" to comply with the style guide. These errors have been
fixed.

Signed-off-by: Thomas Behan 
---
 drivers/spi/spi-omap-100k.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/spi/spi-omap-100k.c b/drivers/spi/spi-omap-100k.c
index b6ed82b..bdf6696 100644
--- a/drivers/spi/spi-omap-100k.c
+++ b/drivers/spi/spi-omap-100k.c
@@ -147,7 +147,7 @@ static void spi100k_write_data(struct spi_master *master, 
int len, int data)
   spi100k->base + SPI_CTRL);
 
/* Wait for bit ack send change */
-   while((readw(spi100k->base + SPI_STATUS) & SPI_STATUS_WE) != 
SPI_STATUS_WE);
+   while ((readw(spi100k->base + SPI_STATUS) & SPI_STATUS_WE) != 
SPI_STATUS_WE);
udelay(1000);
 
spi100k_disable_clock(master);
@@ -168,7 +168,7 @@ static int spi100k_read_data(struct spi_master *master, int 
len)
   SPI_CTRL_RD,
   spi100k->base + SPI_CTRL);
 
-   while((readw(spi100k->base + SPI_STATUS) & SPI_STATUS_RD) != 
SPI_STATUS_RD);
+   while ((readw(spi100k->base + SPI_STATUS) & SPI_STATUS_RD) != 
SPI_STATUS_RD);
udelay(1000);
 
dataL = readw(spi100k->base + SPI_RX_LSB);
@@ -226,7 +226,7 @@ omap1_spi100k_txrx_pio(struct spi_device *spi, struct 
spi_transfer *xfer)
spi100k_write_data(spi->master, word_len, 
*tx++);
if (xfer->rx_buf != NULL)
*rx++ = spi100k_read_data(spi->master, 
word_len);
-   } while(c);
+   } while (c);
} else if (word_len <= 16) {
u16 *rx;
const u16   *tx;
@@ -239,7 +239,7 @@ omap1_spi100k_txrx_pio(struct spi_device *spi, struct 
spi_transfer *xfer)
spi100k_write_data(spi->master,word_len, *tx++);
if (xfer->rx_buf != NULL)
*rx++ = spi100k_read_data(spi->master,word_len);
-   } while(c);
+   } while (c);
} else if (word_len <= 32) {
u32 *rx;
const u32   *tx;
@@ -252,7 +252,7 @@ omap1_spi100k_txrx_pio(struct spi_device *spi, struct 
spi_transfer *xfer)
spi100k_write_data(spi->master,word_len, *tx);
if (xfer->rx_buf != NULL)
*rx = spi100k_read_data(spi->master,word_len);
-   } while(c);
+   } while (c);
}
return count - c;
 }
-- 
1.8.1.2



[PATCH 2/4] spi: omap-100k: Fixed spacing on -= operators

2013-11-27 Thread Thomas Behan
checkpatch.pl generated 3 errors from "c-=N" statements, which should have read
"c -= N" to comply with the style guide. These errors have been fixed.

Signed-off-by: Thomas Behan 
---
 drivers/spi/spi-omap-100k.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/spi/spi-omap-100k.c b/drivers/spi/spi-omap-100k.c
index bdf6696..5db8d2a 100644
--- a/drivers/spi/spi-omap-100k.c
+++ b/drivers/spi/spi-omap-100k.c
@@ -221,7 +221,7 @@ omap1_spi100k_txrx_pio(struct spi_device *spi, struct 
spi_transfer *xfer)
rx = xfer->rx_buf;
tx = xfer->tx_buf;
do {
-   c-=1;
+   c -= 1;
if (xfer->tx_buf != NULL)
spi100k_write_data(spi->master, word_len, 
*tx++);
if (xfer->rx_buf != NULL)
@@ -234,7 +234,7 @@ omap1_spi100k_txrx_pio(struct spi_device *spi, struct 
spi_transfer *xfer)
rx = xfer->rx_buf;
tx = xfer->tx_buf;
do {
-   c-=2;
+   c -= 2;
if (xfer->tx_buf != NULL)
spi100k_write_data(spi->master,word_len, *tx++);
if (xfer->rx_buf != NULL)
@@ -247,7 +247,7 @@ omap1_spi100k_txrx_pio(struct spi_device *spi, struct 
spi_transfer *xfer)
rx = xfer->rx_buf;
tx = xfer->tx_buf;
do {
-   c-=4;
+   c -= 4;
if (xfer->tx_buf != NULL)
spi100k_write_data(spi->master,word_len, *tx);
if (xfer->rx_buf != NULL)
-- 
1.8.1.2



[PATCH 4/4] spi: omap-100k: Fixed style of sizeof operators

2013-11-27 Thread Thomas Behan
checkpatch.pl generated 2 warnings due to "sizeof x" being used instead of
"sizeof(x)". These warnings have been fixed.

Signed-off-by: Thomas Behan 
---
 drivers/spi/spi-omap-100k.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/spi/spi-omap-100k.c b/drivers/spi/spi-omap-100k.c
index fcd783c..2516379 100644
--- a/drivers/spi/spi-omap-100k.c
+++ b/drivers/spi/spi-omap-100k.c
@@ -294,7 +294,7 @@ static int omap1_spi100k_setup(struct spi_device *spi)
spi100k = spi_master_get_devdata(spi->master);
 
if (!cs) {
-   cs = kzalloc(sizeof *cs, GFP_KERNEL);
+   cs = kzalloc(sizeof(*cs), GFP_KERNEL);
if (!cs)
return -ENOMEM;
cs->base = spi100k->base + spi->chip_select * 0x14;
@@ -411,7 +411,7 @@ static int omap1_spi100k_probe(struct platform_device *pdev)
if (!pdev->id)
return -EINVAL;
 
-   master = spi_alloc_master(>dev, sizeof *spi100k);
+   master = spi_alloc_master(>dev, sizeof(*spi100k));
if (master == NULL) {
dev_dbg(>dev, "master allocation failed\n");
return -ENOMEM;
-- 
1.8.1.2



[PATCH 0/4] spi: omap-100k: Style Fixes

2013-11-27 Thread Thomas Behan
First-time contributor here submitting some style fixes.

I watched Greg Kroah-Hartman's youtube video about getting started with
submitting patches and thought I would give it a try. I fixed up some errors
and warnings that checkpatch.pl was giving. All patches are based off of
linux-next.

Any feedback is much appreciated.

Thomas Behan (4):
  spi: omap-100k: Fixed spacing on while loops
  spi: omap-100k: Fixed spacing on -= operators
  spi: omap-100k: Fixed spacing on commas
  spi: omap-100k: Fixed style of sizeof operators

 drivers/spi/spi-omap-100k.c | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

-- 
1.8.1.2



Re: [PATCH] drivers: input: touchscreen: sur40: use static variable instead of stack variable for 'packet_id'

2013-11-27 Thread Chen Gang
On 11/28/2013 12:07 PM, Dmitry Torokhov wrote:
> Hi Chen,
> 
> On Wed, Nov 27, 2013 at 10:15:33AM +0800, Chen Gang wrote:
>> > 'packet_id' is used to check whether packets arrive in order, so it
>> > needs to be a static variable that persists across sur40_poll() calls.
>> > 
>> > The related warning (with allmodconfig under hexagon):
>> > 
>> >   drivers/input/touchscreen/sur40.c: In function 'sur40_poll':
>> >   drivers/input/touchscreen/sur40.c:297:6: warning: 'packet_id' may be 
>> > used uninitialized in this function [-Wuninitialized]
>> > 
>> > 
>> > Signed-off-by: Chen Gang 
>> > ---
>> >  drivers/input/touchscreen/sur40.c |2 +-
>> >  1 files changed, 1 insertions(+), 1 deletions(-)
>> > 
>> > diff --git a/drivers/input/touchscreen/sur40.c 
>> > b/drivers/input/touchscreen/sur40.c
>> > index cfd1b7e..5dfd01a 100644
>> > --- a/drivers/input/touchscreen/sur40.c
>> > +++ b/drivers/input/touchscreen/sur40.c
>> > @@ -251,7 +251,7 @@ static void sur40_poll(struct input_polled_dev 
>> > *polldev)
>> >struct sur40_state *sur40 = polldev->private;
>> >struct input_dev *input = polldev->input;
>> >int result, bulk_read, need_blobs, packet_blobs, i;
>> > -  u32 packet_id;
>> > +  static u32 packet_id;
> It is usually not a good idea to use statics in device drivers as it
> does not work well when you have several devices of the same type
> present in a system. Also, we process all blobs in one pass so there is
> no need to preserve value of packet_id between calls to sur40_poll().

OK, thanks, I will/should send patch v2 for it.

-- 
Chen Gang


Re: Re: [PATCH 02/17] tracing/probes: Fix basic print type functions

2013-11-27 Thread Masami Hiramatsu
(2013/11/27 23:39), Namhyung Kim wrote:
> Hi Masami,
> 
> 2013-11-27 (수), 20:57 +0900, Masami Hiramatsu:
>> (2013/11/27 15:19), Namhyung Kim wrote:
>>>  
>>> -DEFINE_BASIC_PRINT_TYPE_FUNC(u8, "%x", unsigned int)
>>> -DEFINE_BASIC_PRINT_TYPE_FUNC(u16, "%x", unsigned int)
>>> -DEFINE_BASIC_PRINT_TYPE_FUNC(u32, "%lx", unsigned long)
>>> -DEFINE_BASIC_PRINT_TYPE_FUNC(u64, "%llx", unsigned long long)
>>> -DEFINE_BASIC_PRINT_TYPE_FUNC(s8, "%d", int)
>>> -DEFINE_BASIC_PRINT_TYPE_FUNC(s16, "%d", int)
>>> -DEFINE_BASIC_PRINT_TYPE_FUNC(s32, "%ld", long)
>>> -DEFINE_BASIC_PRINT_TYPE_FUNC(s64, "%lld", long long)
>>> +DEFINE_BASIC_PRINT_TYPE_FUNC(u8 , "%#x")
>>> +DEFINE_BASIC_PRINT_TYPE_FUNC(u16, "%#x")
>>> +DEFINE_BASIC_PRINT_TYPE_FUNC(u32, "%#x")
>>> +DEFINE_BASIC_PRINT_TYPE_FUNC(u64, "%#Lx")
>>
>> As I said, I'd like to ask you to change it to %x.
>>
>> I just checked on Fedora 18, and %#x is not supported by its glibc-2.17.
>> Since this format is exported via debugfs (format file), I think %x is
>> better.
> 
> Hmm.. but in most cases it's used for printf() not scanf(), right?  In
> that case adding 0x prefix will help human readers a lot.
> 
> How about mandating the prefix with "0x%x"?  This way it can be used
> both for printf() and scanf() IMHO.

Agreed, you can just use "0x%x" in the above case instead of "%#x". :)
For other trace events, from a human readability point of view,
I think all event formats should use 0x%x instead of %x, because a bare
hex value sometimes confuses users (e.g. 100 => 0x64; without the 0x
prefix it is just "64").
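
To make the difference between the three format strings concrete, here is a
minimal userspace C sketch. It is not the tracing code itself: the helper
names fmt_plain, fmt_prefixed, fmt_alt and parse_prefixed are made up for
illustration, and it only assumes a standard C library snprintf()/sscanf().

```c
#include <stddef.h>
#include <stdio.h>

/* "%x": 100 is rendered as "64", which is easy to misread as decimal. */
int fmt_plain(char *buf, size_t len, unsigned int v)
{
	return snprintf(buf, len, "%x", v);
}

/* "0x%x": the prefix is literal text, so it is always emitted (0 -> "0x0"). */
int fmt_prefixed(char *buf, size_t len, unsigned int v)
{
	return snprintf(buf, len, "0x%x", v);
}

/* "%#x": the alternate form adds "0x" only for nonzero values (0 -> "0"). */
int fmt_alt(char *buf, size_t len, unsigned int v)
{
	return snprintf(buf, len, "%#x", v);
}

/*
 * Parse with the same "0x%x" string that was used for printing.
 * Returns 1 when the literal "0x" and a hex number were both matched.
 */
int parse_prefixed(const char *s, unsigned int *v)
{
	return sscanf(s, "0x%x", v);
}
```

With these helpers, 100 formats as "64", "0x64" and "0x64" respectively,
while 0 formats as "0", "0x0" and "0", so only the literal "0x%x" form
always carries the prefix.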

Thank you,

-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com




Re: page fault deadlock

2013-11-27 Thread Greg KH
On Thu, Nov 28, 2013 at 11:25:32AM +0800, Xiaotian Feng wrote:
> Hi,
> 
> When I upgraded to the latest kernel, my system hung. It is
> reproducible in my VirtualBox VM: each time I mounted my RAID6
> partition and tried to run vi or build a kernel, the whole system
> locked up very soon.
> 
> After turning on lockdep, I found the following lockdep warning:
> 
> [   27.848462]
> [   27.848471] ==
> [   27.848477] [ INFO: possible circular locking dependency detected ]
> [   27.848484] 3.13.0-rc1+ #1 Tainted: GF   W
> [   27.848490] ---
> [   27.848496] Xorg/1268 is trying to acquire lock:
> [   27.848501]  (>mutex){+.+.+.}, at: []
> sysfs_bin_mmap+0x4f/0x120
> [   27.848516]
> [   27.848516] but task is already holding lock:
> [   27.848521]  (>mmap_sem){++}, at: []
> vm_mmap_pgoff+0x6f/0xc0
> [   27.848534]
> [   27.848534] which lock already depends on the new lock.
> [   27.848534]
> [   27.848541]
> [   27.848541] the existing dependency chain (in reverse order) is:
> [   27.848547]
> [   27.848547] -> #2 (>mmap_sem){++}:
> [   27.848556][] lock_acquire+0xb0/0x160
> [   27.848564][] might_fault+0x8c/0xb0
> [   27.848572][] md_ioctl+0xa78/0x19b0
> [   27.848580][] blkdev_ioctl+0x234/0x840
> [   27.848588][] block_ioctl+0x41/0x50
> [   27.848597][] do_vfs_ioctl+0x300/0x520
> [   27.848605][] SyS_ioctl+0x81/0xa0
> [   27.848613][] tracesys+0xe1/0xe6
> [   27.848622]
> [   27.848622] -> #1 (>reconfig_mutex){+.+.+.}:
> [   27.848630][] lock_acquire+0xb0/0x160
> [   27.848637][]
> mutex_lock_interruptible_nested+0x78/0x610
> [   27.848646][] rdev_attr_show+0x40/0x90
> [   27.848654][] sysfs_seq_show+0xda/0x170
> [   27.848662][] seq_read+0x164/0x3e0
> [   27.848671][] vfs_read+0x95/0x160
> [   27.848680][] SyS_read+0x49/0xa0
> [   27.848687][] tracesys+0xe1/0xe6
> [   27.848695]
> [   27.848695] -> #0 (>mutex){+.+.+.}:
> [   27.848703][] __lock_acquire+0x1587/0x1ca0
> [   27.848711][] lock_acquire+0xb0/0x160
> [   27.848718][] mutex_lock_nested+0x68/0x510
> [   27.848725][] sysfs_bin_mmap+0x4f/0x120
> [   27.848732][] mmap_region+0x3ed/0x5d0
> [   27.848741][] do_mmap_pgoff+0x34e/0x3d0
> [   27.848748][] vm_mmap_pgoff+0x90/0xc0
> [   27.848755][] SyS_mmap_pgoff+0x1d5/0x270
> [   27.848763][] SyS_mmap+0x22/0x30
> [   27.848771][] tracesys+0xe1/0xe6
> [   27.848778]
> [   27.848778] other info that might help us debug this:
> [   27.848778]
> [   27.848785] Chain exists of:
> [   27.848785]   >mutex --> >reconfig_mutex --> >mmap_sem
> [   27.848785]
> [   27.848795]  Possible unsafe locking scenario:
> [   27.848795]
> [   27.848800]CPU0CPU1
> [   27.848805]
> [   27.848810]   lock(>mmap_sem);
> [   27.848817]lock(>reconfig_mutex);
> [   27.848824]lock(>mmap_sem);
> [   27.848830]   lock(>mutex);
> [   27.848837]
> [   27.848837]  *** DEADLOCK ***
> [   27.848837]
> [   27.848844] 1 lock held by Xorg/1268:
> [   27.848849]  #0:  (>mmap_sem){++}, at: []
> vm_mmap_pgoff+0x6f/0xc0
> [   27.848861]
> [   27.848861] stack backtrace:
> [   27.848868] CPU: 1 PID: 1268 Comm: Xorg Tainted: GF   W3.13.0-rc1+ 
> #1
> [   27.848873] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
> VirtualBox 12/01/2006
> [   27.848879]  822daa00 8800d0371bc8 817725f7
> 822cbdc0
> [   27.848901]  8800d0371c08 8176d9eb 8800d0371c60
> 880115b42a78
> [   27.848909]   880115b42a78 880115b422a0
> 0001
> [   27.848918] Call Trace:
> [   27.848930]  [] dump_stack+0x4e/0x7a
> [   27.848942]  [] print_circular_bug+0x1f9/0x208
> [   27.848952]  [] __lock_acquire+0x1587/0x1ca0
> [   27.848964]  [] ? print_context_stack+0x8f/0x100
> [   27.848975]  [] lock_acquire+0xb0/0x160
> [   27.848986]  [] ? sysfs_bin_mmap+0x4f/0x120
> [   27.848996]  [] ? sysfs_bin_mmap+0x4f/0x120
> [   27.849007]  [] mutex_lock_nested+0x68/0x510
> [   27.849016]  [] ? sysfs_bin_mmap+0x4f/0x120
> [   27.849027]  [] ? kmemleak_alloc+0x4e/0xb0
> [   27.849038]  [] sysfs_bin_mmap+0x4f/0x120
> [   27.849048]  [] mmap_region+0x3ed/0x5d0
> [   27.849058]  [] do_mmap_pgoff+0x34e/0x3d0
> [   27.849070]  [] vm_mmap_pgoff+0x90/0xc0
> [   27.849080]  [] SyS_mmap_pgoff+0x1d5/0x270
> [   27.849092]  [] ? syscall_trace_enter+0x145/0x270
> [   27.849102]  [] SyS_mmap+0x22/0x30
> [   27.849112]  [] tracesys+0xe1/0xe6
> 
> 
> I think it is a real deadlock, and it is caused by commit
> 3124eb1679b28726 "sysfs: merge regular and bin file handling".
> 
> With that commit, sysfs_bin_mmap will hold of->mutex.
> 
> So assume cpu0 
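
The circular dependency in the lockdep report can be sketched as a tiny
lock-order graph. This is only a simplified illustration of the idea, not how
lockdep is implemented; the enum names are shorthand for the three locks in
the report, and struct lock_graph and lock_graph_add() are hypothetical.

```c
#include <stddef.h>

/* Shorthand for the three locks named in the report above. */
enum { LOCK_OF_MUTEX, LOCK_RECONFIG_MUTEX, LOCK_MMAP_SEM, NR_LOCKS };

/* edge[a][b] != 0 means "b was acquired while a was held". */
struct lock_graph {
	unsigned char edge[NR_LOCKS][NR_LOCKS];
};

/* Depth-first search: can 'to' be reached from 'from' via recorded edges? */
static int reaches(const struct lock_graph *g, int from, int to,
		   unsigned char *seen)
{
	int next;

	if (from == to)
		return 1;
	seen[from] = 1;
	for (next = 0; next < NR_LOCKS; next++)
		if (g->edge[from][next] && !seen[next] &&
		    reaches(g, next, to, seen))
			return 1;
	return 0;
}

/*
 * Record that 'wanted' is acquired while 'held' is held.
 * Returns 1 if 'wanted' already leads back to 'held', i.e. the new
 * edge closes a cycle and a deadlock becomes possible.
 */
int lock_graph_add(struct lock_graph *g, int held, int wanted)
{
	unsigned char seen[NR_LOCKS] = { 0 };
	int cycle = reaches(g, wanted, held, seen);

	g->edge[held][wanted] = 1;
	return cycle;
}
```

Feeding in the three chains from the trace (of->mutex then reconfig_mutex,
reconfig_mutex then mmap_sem, mmap_sem then of->mutex) makes the third call
report the cycle, which matches the "Chain exists of" line in the report.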

[PATCH] staging/lustre: remove wirehdr.c

2013-11-27 Thread Peng Tao
It is not used.

Cc: Andreas Dilger 
Signed-off-by: Peng Tao 
---
 drivers/staging/lustre/lustre/ptlrpc/wirehdr.c |   47 
 1 file changed, 47 deletions(-)
 delete mode 100644 drivers/staging/lustre/lustre/ptlrpc/wirehdr.c

diff --git a/drivers/staging/lustre/lustre/ptlrpc/wirehdr.c 
b/drivers/staging/lustre/lustre/ptlrpc/wirehdr.c
deleted file mode 100644
index 93bc40b..000
--- a/drivers/staging/lustre/lustre/ptlrpc/wirehdr.c
+++ /dev/null
@@ -1,47 +0,0 @@
-/*
- * GPL HEADER START
- *
- * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 only,
- * as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License version 2 for more details (a copy is included
- * in the LICENSE file that accompanied this code).
- *
- * You should have received a copy of the GNU General Public License
- * version 2 along with this program; If not, see
- * http://www.sun.com/software/products/lustre/docs/GPLv2.pdf
- *
- * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara,
- * CA 95054 USA or visit www.sun.com if you need additional information or
- * have any questions.
- *
- * GPL HEADER END
- */
-/*
- * Copyright (c) 2007, 2010, Oracle and/or its affiliates. All rights reserved.
- * Use is subject to license terms.
- *
- * Copyright (c) 2011, Intel Corporation.
- */
-/*
- * This file is part of Lustre, http://www.lustre.org/
- * Lustre is a trademark of Sun Microsystems, Inc.
- */
-
-#define DEBUG_SUBSYSTEM S_RPC
-
-# ifdef CONFIG_FS_POSIX_ACL
-#  include 
-#  include 
-# endif
-
-#include 
-#include 
-#include 
-#include 
-- 
1.7.9.5



Re: [PATCH v2] Input: ads7846: Convert to hwmon_device_register_with_groups

2013-11-27 Thread Dmitry Torokhov
On Tue, Nov 26, 2013 at 05:20:13PM -0800, Guenter Roeck wrote:
> On 11/26/2013 10:58 AM, Dmitry Torokhov wrote:
> >Hi Guenter,
> >
> >On Mon, Nov 25, 2013 at 08:39:04PM -0800, Guenter Roeck wrote:
> >>Simplify the code and create mandatory 'name' attribute by using
> >>new hwmon API.
> >
> >So this moves hwmon attributes from the parent i2c device to the hwmon
> >device, right? Would not that break userspace which expects to find the
> >attributes where they were?
> >
> 
> In addition to Jean's earlier comments ... s/i2c/spi/, I assume. spi devices
> don't create the mandatory name attribute automatically, which means
> that the created hwmon device was not recognized by standard user space
> applications (eg the sensors command or anything else using libsensors)
> in the first place. Which in turn means that only applications which don't
> support the standard hwmon ABI - if there are any - would be affected.
> What we are more concerned about is to make sure that applications
> which _do_ follow the hwmon ABI are working.

OK, fair enough, I'll apply this then.

-- 
Dmitry


Re: [PATCH] drivers: input: touchscreen: sur40: use static variable instead of stack variable for 'packet_id'

2013-11-27 Thread Dmitry Torokhov
Hi Chen,

On Wed, Nov 27, 2013 at 10:15:33AM +0800, Chen Gang wrote:
> 'packet_id' is used to check whether packets arrive in order, so it
> needs to be a static variable that persists across sur40_poll() calls.
> 
> The related warning (with allmodconfig under hexagon):
> 
>   drivers/input/touchscreen/sur40.c: In function 'sur40_poll':
>   drivers/input/touchscreen/sur40.c:297:6: warning: 'packet_id' may be used 
> uninitialized in this function [-Wuninitialized]
> 
> 
> Signed-off-by: Chen Gang 
> ---
>  drivers/input/touchscreen/sur40.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/input/touchscreen/sur40.c 
> b/drivers/input/touchscreen/sur40.c
> index cfd1b7e..5dfd01a 100644
> --- a/drivers/input/touchscreen/sur40.c
> +++ b/drivers/input/touchscreen/sur40.c
> @@ -251,7 +251,7 @@ static void sur40_poll(struct input_polled_dev *polldev)
>   struct sur40_state *sur40 = polldev->private;
>   struct input_dev *input = polldev->input;
>   int result, bulk_read, need_blobs, packet_blobs, i;
> - u32 packet_id;
> + static u32 packet_id;

It is usually not a good idea to use statics in device drivers as it
does not work well when you have several devices of the same type
present in a system. Also, we process all blobs in one pass so there is
no need to preserve value of packet_id between calls to sur40_poll().
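
The problem with statics can be illustrated with a small standalone C sketch.
It is not the sur40 driver code: struct demo_dev and both check functions are
hypothetical, and the sketch only assumes that each device reports its own
sequence of packet IDs.

```c
#include <stdint.h>

/* Hypothetical per-device state; not the real struct sur40_state. */
struct demo_dev {
	uint32_t last_id;
	int have_id;
};

/*
 * Anti-pattern: the static slot is shared by every device instance.
 * Returns 1 if 'id' differs from the last ID seen by *any* device.
 */
int check_id_static(uint32_t id)
{
	static uint32_t last_id;
	static int have_id;
	int mismatch = have_id && last_id != id;

	last_id = id;
	have_id = 1;
	return mismatch;
}

/*
 * Per-device variant: state lives in the device structure.
 * Returns 1 only if 'id' differs from the last ID seen by this device.
 */
int check_id_per_dev(struct demo_dev *dev, uint32_t id)
{
	int mismatch = dev->have_id && dev->last_id != id;

	dev->last_id = id;
	dev->have_id = 1;
	return mismatch;
}
```

With two devices that alternate polls and report IDs 7 and 9, the static
variant flags a mismatch on every poll after the first, while the per-device
variant never does.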

Thanks.

-- 
Dmitry


[PATCH] staging/lustre: fix build error when CONFIG_FS_POSIX_ACL is off

2013-11-27 Thread Peng Tao
We need to include  regardless of whether
CONFIG_FS_POSIX_ACL is set or not. Otherwise the build fails, as
reported by the kbuild robot:

>> drivers/staging/lustre/lustre/llite/file.c:2965:2: error: implicit 
>> declaration of function 'posix_acl_dup' 
>> [-Werror=implicit-function-declaration]
 acl = posix_acl_dup(lli->lli_posix_acl);


Reported-by: Fengguang Wu 
Cc: Andreas Dilger 
Signed-off-by: Peng Tao 
---
 .../lustre/lustre/include/linux/lustre_acl.h   |   18 +-
 drivers/staging/lustre/lustre/include/lustre_mdc.h |9 +++--
 .../staging/lustre/lustre/llite/llite_internal.h   |1 +
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c|6 ++
 4 files changed, 15 insertions(+), 19 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/linux/lustre_acl.h 
b/drivers/staging/lustre/lustre/include/linux/lustre_acl.h
index ff4fc4f..778b123 100644
--- a/drivers/staging/lustre/lustre/include/linux/lustre_acl.h
+++ b/drivers/staging/lustre/lustre/include/linux/lustre_acl.h
@@ -47,17 +47,17 @@
 #error Shoud not include direectly. use #include  instead
 #endif
 
-# include 
-# include 
-# ifdef CONFIG_FS_POSIX_ACL
-#  include 
-#  define LUSTRE_POSIX_ACL_MAX_ENTRIES 32
-#  define LUSTRE_POSIX_ACL_MAX_SIZE\
+#include 
+#include 
+
+#include 
+#define LUSTRE_POSIX_ACL_MAX_ENTRIES   32
+#define LUSTRE_POSIX_ACL_MAX_SIZE  \
(sizeof(posix_acl_xattr_header) +   \
 LUSTRE_POSIX_ACL_MAX_ENTRIES * sizeof(posix_acl_xattr_entry))
-# endif /* CONFIG_FS_POSIX_ACL */
-# include 
-# include  /* XATTR_{REPLACE,CREATE} */
+
+#include 
+#include  /* XATTR_{REPLACE,CREATE} */
 
 #ifndef LUSTRE_POSIX_ACL_MAX_SIZE
 # define LUSTRE_POSIX_ACL_MAX_SIZE   0
diff --git a/drivers/staging/lustre/lustre/include/lustre_mdc.h 
b/drivers/staging/lustre/lustre/include/lustre_mdc.h
index 1900025..c1e0270 100644
--- a/drivers/staging/lustre/lustre/include/lustre_mdc.h
+++ b/drivers/staging/lustre/lustre/include/lustre_mdc.h
@@ -48,12 +48,9 @@
  * @{
  */
 
-# include 
-# include 
-# ifdef CONFIG_FS_POSIX_ACL
-#  include 
-# endif /* CONFIG_FS_POSIX_ACL */
-# include 
+#include 
+#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h 
b/drivers/staging/lustre/lustre/llite/llite_internal.h
index c326ff2..6d15e5c 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -47,6 +47,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifndef FMODE_EXEC
 #define FMODE_EXEC 0
diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c 
b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
index e3f02c7..6cc0d6e 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
@@ -36,10 +36,8 @@
 
 #define DEBUG_SUBSYSTEM S_RPC
 
-# ifdef CONFIG_FS_POSIX_ACL
-#  include 
-#  include 
-# endif
+#include 
+#include 
 
 #include 
 #include 
-- 
1.7.9.5



Re: Supporting 4 way connections in LKSCTP

2013-11-27 Thread Sun Paul
How does LKSCTP select which source address to use for the INIT_ACK or
HB_ACK? Below is the test result with a router located in the
middle.

Before starting the application, the packets on eth1 and eth2 are:

eth1
0 packets dropped by kernel
[root@localhost ~]# tcpdump -i eth1 -s 0 -nn
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 65535 bytes
11:24:14.262489 IP 12.1.1.1.2905 > 110.1.1.1.2905: sctp (1) [INIT]
[init tag: 28362903] [rwnd: 102400] [OS: 16] [MIS: 16] [init TSN: 0]
11:24:14.262522 IP 110.1.1.1.2905 > 12.1.1.1.2905: sctp (1) [ABORT]
11:24:14.539486
11:24:16.262488 IP 12.1.1.1.2905 > 110.1.1.1.2905: sctp (1) [INIT]
[init tag: 29391734] [rwnd: 102400] [OS: 16] [MIS: 16] [init TSN: 0]
11:24:16.262520 IP 110.1.1.1.2905 > 12.1.1.1.2905: sctp (1) [ABORT]

eth2
[root@localhost ~]# tcpdump -i eth2 -s 0 -nn
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes

When starting the application, the packets are as shown below.

eth1
11:26:02.261511 IP 12.1.1.1.2905 > 110.1.1.1.2905: sctp (1) [INIT]
[init tag: 26256828] [rwnd: 102400] [OS: 16] [MIS: 16] [init TSN: 0]
11:26:02.263513 IP 12.1.1.1.2905 > 110.1.1.1.2905: sctp (1) [COOKIE ECHO]
11:26:02.264518 IP 12.1.1.1.2905 > 110.1.1.1.2905: sctp (1) [HB REQ]
11:26:02.563511 IP 12.1.1.1.2905 > 110.1.1.1.2905: sctp (1) [HB REQ]

eth2
11:26:02.261604 IP 120.1.1.1.2905 > 12.1.1.1.2905: sctp (1) [INIT ACK]
[init tag: 3478239387] [rwnd: 131072] [OS: 5] [MIS: 5] [init TSN:
2330749678]
11:26:02.263583 IP 120.1.1.1.2905 > 12.1.1.1.2905: sctp (1) [COOKIE ACK]
11:26:02.264548 IP 120.1.1.1.2905 > 12.1.1.1.2905: sctp (1) [HB ACK]
11:26:02.264652 IP 11.1.1.1.2905 > 120.1.1.1.2905: sctp (1) [HB REQ]
11:26:02.264705 IP 120.1.1.1.2905 > 11.1.1.1.2905: sctp (1) [HB ACK]
11:26:02.563543 IP 120.1.1.1.2905 > 12.1.1.1.2905: sctp (1) [HB ACK]

From the above result, you can see that the INIT, COOKIE ECHO and
HB_REQ originated from 12.1.1.1 on eth1, but the ACKs (INIT_ACK,
COOKIE_ACK, HB_ACK) are returned on eth2 using source address
120.1.1.1 instead of 110.1.1.1.

Why does LKSCTP use 120.1.1.1 as the source instead of 110.1.1.1?

A simple ICMP ping test behaves normally, but SCTP does not.

eth1
11:30:02.824548 IP 12.1.1.1 > 110.1.1.1: ICMP echo request, id 37178,
seq 12, length 64
11:30:02.824559 IP 110.1.1.1 > 12.1.1.1: ICMP echo reply, id 37178,
seq 12, length 64
11:30:03.825551 IP 12.1.1.1 > 110.1.1.1: ICMP echo request, id 37178,
seq 13, length 64
11:30:03.825561 IP 110.1.1.1 > 12.1.1.1: ICMP echo reply, id 37178,
seq 13, length 64

eth2
11:30:34.027687 IP 11.1.1.1 > 120.1.1.1: ICMP echo request, id 46138,
seq 2, length 64
11:30:34.027697 IP 120.1.1.1 > 11.1.1.1: ICMP echo reply, id 46138,
seq 2, length 64
11:30:35.027686 IP 11.1.1.1 > 120.1.1.1: ICMP echo request, id 46138,
seq 3, length 64
11:30:35.027694 IP 120.1.1.1 > 11.1.1.1: ICMP echo reply, id 46138,
seq 3, length 64

Below is the route information
#route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
110.1.1.0       0.0.0.0         255.255.255.0   U     0      0        0 eth1
120.1.1.0       0.0.0.0         255.255.255.0   U     0      0        0 eth2

# ip route show
110.1.1.0/24 dev eth1  proto kernel  scope link  src 110.1.1.1
120.1.1.0/24 dev eth2  proto kernel  scope link  src 120.1.1.1

Since we are using iproute2, we have a dedicated routing table
per interface:

# ip route show table SCTP1
default via 110.1.1.254 dev eth1

# ip route show table SCTP2
default via 120.1.1.254 dev eth2

# ip rule ls
0: from all lookup local
101: from 110.1.1.1 lookup SCTP1
102: from 120.1.1.1 lookup SCTP2
32766: from all lookup main
32767: from all lookup default

How does LKSCTP select the source address for replies? If we know how it
works, then we may be able to tell what is going wrong.

On Wed, Nov 27, 2013 at 8:45 PM, Neil Horman  wrote:
> On Wed, Nov 27, 2013 at 07:10:49AM +0800, Sun Paul wrote:
>> Hi Vlad
>>
>> Thank for your reply. If it is based on the destination IP to find the
>> best route, why the problem didn't happen on single-homing sample?
>>
> Because You only ever use one address from NODE A (12.1.1.1)
>
>> In the single-homing sample that provided in the original email, both
>> of the interfaces (eth1 and eth2) are presented on NODE-B during the
>> test. However, the LKSCTP library know to use the interface eth1 to
>> respond to the SCTP request.
>>
> Yes, because it does a route lookup to each of the two ip addresses to NODE B,
> and in both lookups, the route indicates that only one source address should 
> be
> used (12.1.1.1).  If you issue a ip route show command, you'll see that routes
> to both address on NODE B match on a rule that specifies the same src address
> and interface be used.
>
> Neil
>
>> - PS
>>
>> On Wed, Nov 27, 2013 at 7:09 AM, Sun Paul  wrote:
>> > Hi Vlad
>> >
>> > Thank for your 

Re: [PATCH v4 2/5] extcon: max14577: Add extcon-max14577 driver to support MUIC device

2013-11-27 Thread Chanwoo Choi
On 11/23/2013 12:51 AM, Krzysztof Kozlowski wrote:
> From: Chanwoo Choi 
> 
> This patch supports Maxim MAX14577 MUIC(Micro USB Interface Controller)
> device by using EXTCON subsystem to handle various external connectors.
> The max14577 device uses regmap method for i2c communication and
> supports irq domain.
> 
> Signed-off-by: Chanwoo Choi 
> Signed-off-by: Krzysztof Kozlowski 
> Signed-off-by: Kyungmin Park 
> ---
>  drivers/extcon/Kconfig   |   10 +
>  drivers/extcon/Makefile  |1 +
>  drivers/extcon/extcon-max14577.c |  752 
> ++
>  3 files changed, 763 insertions(+)
>  create mode 100644 drivers/extcon/extcon-max14577.c
> 

Applied it.

Thanks,
Chanwoo Choi



Re: [merged] mm-memcg-handle-non-error-oom-situations-more-gracefully.patch removed from -mm tree

2013-11-27 Thread Johannes Weiner
On Wed, Nov 27, 2013 at 07:20:37PM -0800, David Rientjes wrote:
> On Wed, 27 Nov 2013, Johannes Weiner wrote:
> 
> > > It appears as though this work is being developed in Linus's tree rather 
> > > than -mm, so I'm asking if we should consider backing some of it out for 
> > > 3.14 instead.
> > 
> > The changes fix a deadlock problem.  Are they creating problems that
> > are worse than deadlocks, that would justify their revert?
> > 
> 
> None that I am currently aware of, I'll continue to try them out.  I'd 
> suggest just dropping the sta...@kernel.org from the whole series though 
> unless there is another report of such a problem that people are running 
> into.

The series has long been merged, how do we drop sta...@kernel.org from
it?

> > Since we can't physically draw a perfect line, we should strive for a
> > reasonable and intuitive line.  After that it's rapidly diminishing
> > returns.  Killing something after that much reclaim effort without
> > success is a completely reasonable and intuitive line to draw.  It's
> > also the line that has been drawn a long time ago and we're not
> > breaking this because of a micro optimization.
> > 
> 
> You don't think something like this is helpful after scanning a memcg with 
> a large number of processes?
> 
> We've had this patch internally since we started using memcg, it has 
> avoided some unnecessary oom killing.

Do you have quantified data that OOM kills are reduced over a longer
sampling period?  How many kills are skipped?  How many of them are
deferred temporarily but the VM ended up having to kill something
anyway?  My theory still being that several loops of failed direct
reclaim and charge attempts likely say more about the machine state
than somebody randomly releasing some memory in the last minute...

> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1836,6 +1836,13 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup 
> *memcg, gfp_t gfp_mask,
>   if (!chosen)
>   return;
>   points = chosen_points * 1000 / totalpages;
> +
> + /* One last chance to see if we really need to kill something */
> + if (mem_cgroup_margin(memcg) >= (1 << order)) {
> + put_task_struct(chosen);
> + return;
> + }
> +
>   oom_kill_process(chosen, gfp_mask, order, points, totalpages, memcg,
>NULL, "Memory cgroup out of memory");
>  }
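
For readers following the thread, the check in the quoted hunk can be
sketched in userspace as below. This is only an illustration of the
arithmetic, assuming the margin is simply the number of pages that can
still be charged below the limit. struct fake_memcg, fake_margin() and
should_skip_oom_kill() are hypothetical stand-ins, not the kernel's
mem_cgroup_margin() implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a memcg: only the two counters the check needs. */
struct fake_memcg {
	unsigned long limit_pages;	/* hard limit, in pages */
	unsigned long usage_pages;	/* currently charged, in pages */
};

/* Pages that can still be charged before hitting the limit (0 if at/over). */
unsigned long fake_margin(const struct fake_memcg *memcg)
{
	if (memcg->usage_pages >= memcg->limit_pages)
		return 0;
	return memcg->limit_pages - memcg->usage_pages;
}

/*
 * Mirrors the "one last chance" test from the quoted hunk: an order-N
 * charge needs 1 << N pages, so the kill is skipped if at least that
 * many pages have become available in the meantime.
 */
bool should_skip_oom_kill(const struct fake_memcg *memcg, unsigned int order)
{
	assert(order < 8 * sizeof(unsigned long));
	return fake_margin(memcg) >= (1UL << order);
}
```

With a limit of 1024 pages and 1020 pages charged, an order-2 charge
(4 pages) would skip the kill while an order-3 charge (8 pages) would
not. Whether such a late check pays off in practice is exactly the
question raised above.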


[PATCH] w1: matrox: use DEFINE_PCI_DEVICE_TABLE macro

2013-11-27 Thread Jingoo Han
This macro is used to create a struct pci_device_id array.

Signed-off-by: Jingoo Han 
---
 drivers/w1/masters/matrox_w1.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/w1/masters/matrox_w1.c b/drivers/w1/masters/matrox_w1.c
index d8667b0..9361cea 100644
--- a/drivers/w1/masters/matrox_w1.c
+++ b/drivers/w1/masters/matrox_w1.c
@@ -42,7 +42,7 @@ MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Evgeniy Polyakov ");
 MODULE_DESCRIPTION("Driver for transport(Dallas 1-wire prtocol) over VGA 
DDC(matrox gpio).");
 
-static struct pci_device_id matrox_w1_tbl[] = {
+static DEFINE_PCI_DEVICE_TABLE(matrox_w1_tbl) = {
{ PCI_DEVICE(PCI_VENDOR_ID_MATROX, PCI_DEVICE_ID_MATROX_G400) },
{ },
 };
-- 
1.7.10.4




Re: [PATCH] ARM: l2x0: add prefetch and power ctrl registers configuration support

2013-11-27 Thread Jisheng Zhang
Hi Mark,

On Wed, 27 Nov 2013 09:42:04 -0800
Mark Rutland  wrote:

> On Thu, Nov 07, 2013 at 05:07:52AM +, Jisheng Zhang wrote:
> > PL310 supports Prefetch offset/control register from r2p0 and Power
> > control register from r3p0. This patch adds the support to configure
> > these two registers if there are. The dt binding document is also updated.
> 
> I'd like to see a reasoning as to _why_ these should be in the DT.
> 
> While we have tag and data RAM latency information, those are hardware
> properties that we cannot probe. I'm not so clear on the filter-ranges
> property.
> 
> These bits seem to be configuration rather than a hardware description.
> If there are some portions of this that we can describe with higher
> level properties, I'd prefer to do that.
> 
> But primarily, the question to answer is do we need these, and if so do
> they belong in the DT?


After more consideration, configuring these two registers in the bootloader or
TrustZone firmware is more reasonable, since the prefetch control register can
only be written in the Secure World.

PS: the tag/RAM latency must also be written in the Secure World, and IMHO the
latency and filter-ranges are also configuration.

Thanks for having a look at this patch,
Jisheng

> 
> > 
> > Signed-off-by: Jisheng Zhang 
> > ---
> >  Documentation/devicetree/bindings/arm/l2cc.txt |  4 
> >  arch/arm/mm/cache-l2x0.c   | 19 +++
> >  2 files changed, 23 insertions(+)
> > 
> > diff --git a/Documentation/devicetree/bindings/arm/l2cc.txt
> > b/Documentation/devicetree/bindings/arm/l2cc.txt index c0c7626..32cd08c
> > 100644 --- a/Documentation/devicetree/bindings/arm/l2cc.txt
> > +++ b/Documentation/devicetree/bindings/arm/l2cc.txt
> > @@ -39,6 +39,10 @@ Optional properties:
> >  - arm,filter-ranges :  Starting address and length of
> > window to filter. Addresses in the filter window are directed to the M1
> > port. Other addresses will go to the M0 port.
> > +- arm,prefetch-ctrl : The value for Prefetch Offset/Control Register if
> > there
> > +  is. This is a single cell.
> > +- arm,pwr-ctrl : The value for Power Control Register if there is. This
> > is a
> > +  single cell.
> >  - interrupts : 1 combined interrupt.
> >  - cache-id-part: cache id part number to be used if it is not present
> >on hardware
> > diff --git a/arch/arm/mm/cache-l2x0.c b/arch/arm/mm/cache-l2x0.c
> > index 447da6f..8f536ea 100644
> > --- a/arch/arm/mm/cache-l2x0.c
> > +++ b/arch/arm/mm/cache-l2x0.c
> > @@ -704,6 +704,8 @@ static void __init pl310_of_setup(const struct
> > device_node *np, u32 data[3] = { 0, 0, 0 };
> > u32 tag[3] = { 0, 0, 0 };
> > u32 filter[2] = { 0, 0 };
> > +   u32 l2x0_revision = readl_relaxed(l2x0_base + L2X0_CACHE_ID) &
> > +   L2X0_CACHE_ID_RTL_MASK;
> >  
> > of_property_read_u32_array(np, "arm,tag-latency", tag,
> > ARRAY_SIZE(tag)); if (tag[0] && tag[1] && tag[2])
> > @@ -730,6 +732,23 @@ static void __init pl310_of_setup(const struct
> > device_node *np, writel_relaxed((filter[0] & ~(SZ_1M - 1)) |
> > L2X0_ADDR_FILTER_EN, l2x0_base + L2X0_ADDR_FILTER_START);
> > }
> > +
> > +   if (l2x0_revision >= L2X0_CACHE_ID_RTL_R2P0) {
> > +   u32 prefetch_ctrl = 0;
> > +
> > +   of_property_read_u32(np, "arm,prefetch-ctrl",
> > +_ctrl);
> > +   if (prefetch_ctrl)
> > +   writel_relaxed(prefetch_ctrl, l2x0_base +
> > +   L2X0_PREFETCH_CTRL);
> 
> Some of the prefetch control regsiter bits are reserved, and the
> prefetch offset may only take a subset of possible values. I wouldn't
> want to poke the hardware without performing some sanity checking on
> these values.
> 
> > +   if (l2x0_revision >= L2X0_CACHE_ID_RTL_R3P0) {
> > +   u32 pwr_ctrl = 0;
> > +   of_property_read_u32(np, "arm,pwr-ctrl",
> > _ctrl);
> > +   if (pwr_ctrl)
> > +   writel_relaxed(pwr_ctrl, l2x0_base +
> > +   L2X0_POWER_CTRL);
> 
> Similarly, all but the lower 2 bits are reserved...
> 
> Thanks,
> Mark.
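
The sanity check asked for in the review could look roughly like the
sketch below for the power control value. It relies only on the
statement above that all but the lower 2 bits are reserved; the name
PWR_CTRL_VALID_MASK and the helper pwr_ctrl_sanitize() are hypothetical
and not part of the posted patch. The prefetch register is left out
because its valid bits are not spelled out in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Per the review comment: only the lower 2 bits are defined. Hypothetical name. */
#define PWR_CTRL_VALID_MASK 0x3u

/*
 * Returns true and stores the value to write in *out if val touches no
 * reserved bits. Returns false and leaves *out alone otherwise, so the
 * caller can warn and skip the register write.
 */
bool pwr_ctrl_sanitize(uint32_t val, uint32_t *out)
{
	assert(out != NULL);

	if (val & ~PWR_CTRL_VALID_MASK)
		return false;
	*out = val;
	return true;
}
```

A caller would only issue the register write when the helper returns
true. Rejecting the value instead of silently masking it is a design
choice made here for illustration; masking would be the other option.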



RE: [f2fs-dev] [PATCH] f2fs: readahead contiguous pages for restore_node_summary

2013-11-27 Thread Jaegeuk Kim
Hi,

2013-11-28 (Thu), 09:26 +0800, Chao Yu:
> Hi Kim,
> 
> > -Original Message-
> > From: Jaegeuk Kim [mailto:jaegeuk@samsung.com]
> > Sent: Wednesday, November 27, 2013 4:19 PM
> > To: Chao Yu
> > Cc: linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org; 
> > linux-f2fs-de...@lists.sourceforge.net; '谭姝'
> > Subject: RE: [f2fs-dev] [PATCH] f2fs: readahead contiguous pages for 
> > restore_node_summary
> > 
> > Hi,
> > 
> > 2013-11-27 (Wed), 15:58 +0800, Chao Yu:
> > > Hi Kim,
> > >
> > > > -Original Message-
> > > > From: Jaegeuk Kim [mailto:jaegeuk@samsung.com]
> > > > Sent: Wednesday, November 27, 2013 1:30 PM
> > > > To: Chao Yu
> > > > Cc: linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org; 
> > > > linux-f2fs-de...@lists.sourceforge.net; 谭姝
> > > > Subject: Re: [f2fs-dev] [PATCH] f2fs: readahead contiguous pages for 
> > > > restore_node_summary
> > > >
> > > > Hi Chao,
> > > >
> > > > It seems that we already have a readahed function for node pages,
> > > > ra_node_page().
> > > > So, we don't make a page list for this, but can use the node_inode's
> > > > page cache.
> > >
> > > So you mean it's waste to release page list with updated data after we
> > > finish work in restore_node_summary, right?
> > 
> > Right.
> 
> So how about add all pages of page list to node_inode's address space by
> add_to_page_cache_lru() with arg sum_entry->nid?

I don't think it's a proper way to use add_to_page_cache_lru() directly.

> 
> > 
> > >
> > > >
> > > > So how about writing ra_node_pages() which use the node_inode's page
> > > > cache?
> > >
> > > Hmm, so ra_node_pages is introduced for read node_inode's pages which are
> > > logical contiguously? and it also could take place of ra_node_page?
> > 
> > Ah. The ra_node_page() read a node page ahead for a given node id.
> > So it doesn't match exactly between ra_node_page() and ra_node_pages()
> > that I suggested.
> > So how about reading node pages and then caching some of them in the
> > page cache, node_inode's address space?
> 
> Got it,
> If we do not use the method above, we should search the NAT for nid number
> as the index of node_inode's page by the specified node page blkaddr, that 
> costs
> a lot.
> How do you think?

1. grab_cache_page(node_footer->nid);
2. memcpy();
3. SetPageUptodate();
4. f2fs_put_page();
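
The four steps above can be modelled in userspace as follows. This is a
rough sketch of the flow only (look up or create the cached page for a
nid, copy the read-ahead data in, mark it valid, drop the reference).
Every fake_* name is a hypothetical stand-in, and page locking, which the
real helpers also deal with, is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define FAKE_PAGE_SIZE   64	/* kept small for illustration */
#define FAKE_CACHE_SLOTS 16

/* Hypothetical stand-in for a cached page; not struct page. */
struct fake_page {
	bool in_use;
	bool uptodate;
	unsigned int nid;	/* cache index: the node id */
	int refcount;
	unsigned char data[FAKE_PAGE_SIZE];
};

/* Must be zero-initialised before first use. */
struct fake_cache {
	struct fake_page slot[FAKE_CACHE_SLOTS];
};

/* Step 1: find the page for nid, or create it; returned with a reference held. */
struct fake_page *fake_grab_cache_page(struct fake_cache *c, unsigned int nid)
{
	struct fake_page *free_slot = NULL;

	assert(c != NULL);
	for (size_t i = 0; i < FAKE_CACHE_SLOTS; i++) {
		struct fake_page *p = &c->slot[i];

		if (p->in_use && p->nid == nid) {
			p->refcount++;
			return p;
		}
		if (!p->in_use && !free_slot)
			free_slot = p;
	}
	if (!free_slot)
		return NULL;	/* cache full */

	memset(free_slot, 0, sizeof(*free_slot));
	free_slot->in_use = true;
	free_slot->nid = nid;
	free_slot->refcount = 1;
	return free_slot;
}

/* Steps 2-4: copy the read-ahead buffer in, mark it valid, drop the reference. */
int fake_cache_node_page(struct fake_cache *c, unsigned int nid,
			 const unsigned char *src)
{
	struct fake_page *p = fake_grab_cache_page(c, nid);

	if (!p)
		return -1;
	memcpy(p->data, src, FAKE_PAGE_SIZE);	/* 2. memcpy() */
	p->uptodate = true;			/* 3. SetPageUptodate() */
	p->refcount--;				/* 4. f2fs_put_page() */
	return 0;
}
```

The point of the model is that the nid comes from the footer of the page
that was just read, so no NAT search is needed to find the cache index.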

Thanks,

> 
> > 
> > Thanks,
> > 
> > >
> > >
> > > >
> > > > Thanks,
> > > >
> > > > 2013-11-22 (Fri), 15:48 +0800, Chao Yu:
> > > > > If cp has no CP_UMOUNT_FLAG, we will read all pages in whole node 
> > > > > segment
> > > > > one by one, it makes low performance. So let's merge contiguous pages 
> > > > > and
> > > > > readahead for better performance.
> > > > >
> > > > > Signed-off-by: Chao Yu 
> > > > > ---
> > > > >  fs/f2fs/node.c |   89 
> > > > > +++-
> > > > >  1 file changed, 63 insertions(+), 26 deletions(-)
> > > > >
> > > > > diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> > > > > index 4ac4150..81e704a 100644
> > > > > --- a/fs/f2fs/node.c
> > > > > +++ b/fs/f2fs/node.c
> > > > > @@ -1572,47 +1572,84 @@ int recover_inode_page(struct f2fs_sb_info 
> > > > > *sbi, struct page *page)
> > > > >   return 0;
> > > > >  }
> > > > >
> > > > > +/*
> > > > > + * ra_sum_pages() merge contiguous pages into one bio and submit.
> > > > > + * these pre-readed pages are linked in pages list.
> > > > > + */
> > > > > +static int ra_sum_pages(struct f2fs_sb_info *sbi, struct list_head 
> > > > > *pages,
> > > > > + int start, int nrpages)
> > > > > +{
> > > > > + struct page *page;
> > > > > + int page_idx = start;
> > > > > +
> > > > > + for (; page_idx < start + nrpages; page_idx++) {
> > > > > + /* alloc temporal page for read node summary info*/
> > > > > + page = alloc_page(GFP_NOFS | __GFP_ZERO);
> > > > > + if (!page) {
> > > > > + struct page *tmp;
> > > > > + list_for_each_entry_safe(page, tmp, pages, lru) 
> > > > > {
> > > > > + list_del(>lru);
> > > > > + unlock_page(page);
> > > > > + __free_pages(page, 0);
> > > > > + }
> > > > > + return -ENOMEM;
> > > > > + }
> > > > > +
> > > > > + lock_page(page);
> > > > > + page->index = page_idx;
> > > > > + list_add_tail(>lru, pages);
> > > > > + }
> > > > > +
> > > > > + list_for_each_entry(page, pages, lru)
> > > > > + submit_read_page(sbi, page, page->index, READ_SYNC);
> > > > > +
> > > > > + f2fs_submit_read_bio(sbi, READ_SYNC);
> > > > > + return 0;
> > > > > +}
> > > > > +
> > > > >  int restore_node_summary(struct f2fs_sb_info *sbi,
> > > > >   unsigned int segno, struct f2fs_summary_block 
> > > > > *sum)
> > > > >  {
> > > > >   struct f2fs_node *rn;
> > > > 

RE: [PATCHv1] ASoC: SGTL5000: Fix kernel failed while getting regulator consumers

2013-11-27 Thread Li Xiubo

> Subject: Re: [PATCHv1] ASoC: SGTL5000: Fix kernel failed while getting
> regulator consumers
> 
> On Wed, Nov 27, 2013 at 08:13:03AM +, Li Xiubo wrote:
> 
> Please fix your mailer to word wrap within paragraphs, it makes your mail
> much more legible.
> 

Yes, I will.


> > There is one dependency patch: "regulator: core: Provide a dummy
> regulator with full constraints".
> 
> > From the dependency patch, we can see that using regulator_get_optional()
> instead can resovle the problem you descripted above.
> 
> > When or will this dependency patch be merged into the -next tree ?
> 
> That patch is already in mainline.
> 

Okay.

> > > I'd expect to see a commit description that describes how the driver
> > > currently tries to handle this, why it doesn't work and how the
> > > patch fixes it.
> 
> > The SGTL5000 requires 2 external power supplies: VDDA and VDDIO. An
> > optional third external power supply VDDD may be provided externally
> 
> You need to put your working through of this stuff in the commit message.

I did it already, please see the next version.

--
Best Regards,




Re: [PATCH 00/23] mtd: st_spi_fsm: Add new device

2013-11-27 Thread Huang Shijie

On 2013-11-27 19:52, Lee Jones wrote:

> However, as we send entire 'message sequences' to the FSM Controller
> as opposed to merely OPCODEs we would have to extract the OPCODE from
> flash->command[0] and call our own functions to craft the correct
> 'message sequence' for the task. For this reason we rejected the idea
> and went with a stand-alone driver.

Could you send me the datasheet of your SPI NOR controller?
I can change my code if it is really not good enough.

We can store the opcode in a field, such as spi_nor_write_op.

> The framework which Huang is proposing suffers from the same issues.
> Only providing read(), write(), read_reg() and write_reg() doesn't
> work for our use-case, as we'd have to decode the flash->command[0] and
> invoke our own internal routines as before.
>
> The only framework which would work for us would consist of almost all
> of the important functions such as; read(), write(), erase(),
> wait_busy(), read_jedec(), read_status_reg(), write_status_reg(),
> read_control_reg(), write_control_reg(), etc. However, this approach

read_jedec() can be replaced by read_reg(0x9f);

read_status() can be replaced by read_reg(0x5);

write_control_reg() can be replaced by write_reg(xx).

Please correct me if I am wrong.

IMHO, the current four hooks for spi-nor{} can do all the things:

 read/write/read_reg/write_reg.

thanks
Huang Shijie
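
To illustrate the argument that the specific helpers reduce to
read_reg() with a fixed opcode, here is a minimal userspace sketch. The
struct nor_hooks layout, the nor_* wrappers and the fake_flash backend
are all hypothetical and are not the proposed spi-nor interface. Only
the two opcodes (0x9f for the JEDEC ID, 0x05 for the status register)
come from the mail above. It does not address the objection that some
controllers want whole message sequences rather than opcodes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OPCODE_RDSR 0x05	/* read status register */
#define OPCODE_RDID 0x9f	/* read JEDEC ID */

/* Hypothetical hook table; only the register-read hook is sketched. */
struct nor_hooks {
	int (*read_reg)(void *priv, uint8_t opcode, uint8_t *buf, size_t len);
	void *priv;
};

/* read_jedec() expressed as read_reg(0x9f): three ID bytes. */
int nor_read_jedec(const struct nor_hooks *h, uint8_t id[3])
{
	assert(h != NULL && h->read_reg != NULL);
	return h->read_reg(h->priv, OPCODE_RDID, id, 3);
}

/* read_status() expressed as read_reg(0x5): one status byte. */
int nor_read_status(const struct nor_hooks *h, uint8_t *status)
{
	assert(h != NULL && h->read_reg != NULL);
	return h->read_reg(h->priv, OPCODE_RDSR, status, 1);
}

/* Fake controller backend, used only to exercise the wrappers. */
struct fake_flash {
	uint8_t jedec[3];
	uint8_t status;
	uint8_t last_opcode;	/* records what the "hardware" was asked */
};

int fake_read_reg(void *priv, uint8_t opcode, uint8_t *buf, size_t len)
{
	struct fake_flash *f = priv;

	f->last_opcode = opcode;
	switch (opcode) {
	case OPCODE_RDID:
		if (len != sizeof(f->jedec))
			return -1;
		memcpy(buf, f->jedec, sizeof(f->jedec));
		return 0;
	case OPCODE_RDSR:
		if (len != 1)
			return -1;
		buf[0] = f->status;
		return 0;
	default:
		return -1;	/* opcode not modelled */
	}
}
```

A driver would fill in nor_hooks once with its own read_reg()
implementation, and the generic wrappers stay controller-independent.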











[PATCH] ARM : Kill CONFIG_MTD_PARTITIONS

2013-11-27 Thread Eunbong Song
This patch removes CONFIG_MTD_PARTITIONS from the ARM defconfig files,
because CONFIG_MTD_PARTITIONS was removed by commit
6a8a98b22b10f1560d5f90aded4a54234b9b2724.


Signed-off-by: Eunbong Song 
Acked-by: Tony Lindgren 
---
I am resending this patch because I forgot the sign-off. I am really sorry
for annoying you...
--
 arch/arm/configs/acs5k_defconfig|1 -
 arch/arm/configs/acs5k_tiny_defconfig   |1 -
 arch/arm/configs/assabet_defconfig  |1 -
 arch/arm/configs/at91x40_defconfig  |1 -
 arch/arm/configs/badge4_defconfig   |1 -
 arch/arm/configs/cerfcube_defconfig |1 -
 arch/arm/configs/cm_x300_defconfig  |1 -
 arch/arm/configs/cns3420vb_defconfig|1 -
 arch/arm/configs/collie_defconfig   |1 -
 arch/arm/configs/corgi_defconfig|1 -
 arch/arm/configs/davinci_all_defconfig  |1 -
 arch/arm/configs/h5000_defconfig|1 -
 arch/arm/configs/iop13xx_defconfig  |1 -
 arch/arm/configs/iop32x_defconfig   |1 -
 arch/arm/configs/iop33x_defconfig   |1 -
 arch/arm/configs/ixp4xx_defconfig   |1 -
 arch/arm/configs/ks8695_defconfig   |1 -
 arch/arm/configs/lart_defconfig |1 -
 arch/arm/configs/lpd270_defconfig   |1 -
 arch/arm/configs/lubbock_defconfig  |1 -
 arch/arm/configs/mackerel_defconfig |1 -
 arch/arm/configs/magician_defconfig |1 -
 arch/arm/configs/mainstone_defconfig|1 -
 arch/arm/configs/mini2440_defconfig |1 -
 arch/arm/configs/mv78xx0_defconfig  |1 -
 arch/arm/configs/neponset_defconfig |1 -
 arch/arm/configs/netx_defconfig |1 -
 arch/arm/configs/nuc910_defconfig   |1 -
 arch/arm/configs/nuc950_defconfig   |1 -
 arch/arm/configs/nuc960_defconfig   |1 -
 arch/arm/configs/omap1_defconfig|1 -
 arch/arm/configs/pcm027_defconfig   |1 -
 arch/arm/configs/pleb_defconfig |1 -
 arch/arm/configs/pxa255-idp_defconfig   |1 -
 arch/arm/configs/raumfeld_defconfig |1 -
 arch/arm/configs/realview-smp_defconfig |1 -
 arch/arm/configs/realview_defconfig |1 -
 arch/arm/configs/shannon_defconfig  |1 -
 arch/arm/configs/simpad_defconfig   |1 -
 arch/arm/configs/spitz_defconfig|1 -
 arch/arm/configs/tct_hammer_defconfig   |1 -
 arch/arm/configs/versatile_defconfig|1 -
 42 files changed, 0 insertions(+), 42 deletions(-)

diff --git a/arch/arm/configs/acs5k_defconfig b/arch/arm/configs/acs5k_defconfig
index 92b0f90..27ca89d 100644
--- a/arch/arm/configs/acs5k_defconfig
+++ b/arch/arm/configs/acs5k_defconfig
@@ -35,7 +35,6 @@ CONFIG_IP_PNP_DHCP=y
 CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
 CONFIG_MTD=y
 CONFIG_MTD_CONCAT=y
-CONFIG_MTD_PARTITIONS=y
 CONFIG_MTD_CHAR=y
 CONFIG_MTD_BLOCK=y
 CONFIG_MTD_CFI=y
diff --git a/arch/arm/configs/acs5k_tiny_defconfig 
b/arch/arm/configs/acs5k_tiny_defconfig
index 2a27a14..1f663ca 100644
--- a/arch/arm/configs/acs5k_tiny_defconfig
+++ b/arch/arm/configs/acs5k_tiny_defconfig
@@ -30,7 +30,6 @@ CONFIG_INET=y
 CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
 CONFIG_MTD=y
 CONFIG_MTD_CONCAT=y
-CONFIG_MTD_PARTITIONS=y
 CONFIG_MTD_CHAR=y
 CONFIG_MTD_BLOCK=y
 CONFIG_MTD_CFI=y
diff --git a/arch/arm/configs/assabet_defconfig 
b/arch/arm/configs/assabet_defconfig
index 558ecd8..bdf6f9c 100644
--- a/arch/arm/configs/assabet_defconfig
+++ b/arch/arm/configs/assabet_defconfig
@@ -22,7 +22,6 @@ CONFIG_IRDA=m
 CONFIG_IRLAN=m
 CONFIG_SA1100_FIR=m
 CONFIG_MTD=y
-CONFIG_MTD_PARTITIONS=y
 CONFIG_MTD_REDBOOT_PARTS=y
 CONFIG_MTD_CHAR=y
 CONFIG_MTD_BLOCK=y
diff --git a/arch/arm/configs/at91x40_defconfig 
b/arch/arm/configs/at91x40_defconfig
index c55e921..5886aea 100644
--- a/arch/arm/configs/at91x40_defconfig
+++ b/arch/arm/configs/at91x40_defconfig
@@ -29,7 +29,6 @@ CONFIG_BINFMT_FLAT=y
 # CONFIG_SUSPEND is not set
 # CONFIG_FW_LOADER is not set
 CONFIG_MTD=y
-CONFIG_MTD_PARTITIONS=y
 CONFIG_MTD_CHAR=y
 CONFIG_MTD_BLOCK=y
 CONFIG_MTD_RAM=y
diff --git a/arch/arm/configs/badge4_defconfig 
b/arch/arm/configs/badge4_defconfig
index 5b54abb..b21bd0a 100644
--- a/arch/arm/configs/badge4_defconfig
+++ b/arch/arm/configs/badge4_defconfig
@@ -30,7 +30,6 @@ CONFIG_BT_HCIVHCI=m
 # CONFIG_FW_LOADER is not set
 CONFIG_MTD=y
 CONFIG_MTD_DEBUG=y
-CONFIG_MTD_PARTITIONS=y
 CONFIG_MTD_CHAR=y
 CONFIG_MTD_BLOCK=y
 CONFIG_MTD_CFI=y
diff --git a/arch/arm/configs/cerfcube_defconfig 
b/arch/arm/configs/cerfcube_defconfig
index dce912d..dcee643 100644
--- a/arch/arm/configs/cerfcube_defconfig
+++ b/arch/arm/configs/cerfcube_defconfig
@@ -29,7 +29,6 @@ CONFIG_IP_PNP_BOOTP=y
 CONFIG_IP_PNP_RARP=y
 # CONFIG_IPV6 is not set
 CONFIG_MTD=y
-CONFIG_MTD_PARTITIONS=y
 CONFIG_MTD_REDBOOT_PARTS=y
 CONFIG_MTD_CMDLINE_PARTS=y
 CONFIG_MTD_CHAR=m
diff --git a/arch/arm/configs/cm_x300_defconfig 
b/arch/arm/configs/cm_x300_defconfig
index f4b7672..1bddbd9 100644
--- a/arch/arm/configs/cm_x300_defconfig
+++ b/arch/arm/configs/cm_x300_defconfig

page fault deadlock

2013-11-27 Thread Xiaotian Feng
Hi,

When I upgraded to the latest kernel, I found that my system hangs. It
is reproducible in my VirtualBox guest: each time I mounted my
RAID6 partition and tried to run vi or build a kernel, the whole system
locked up very soon.

After turning on lockdep, I found the following lockdep warning:

[   27.848462]
[   27.848471] ==
[   27.848477] [ INFO: possible circular locking dependency detected ]
[   27.848484] 3.13.0-rc1+ #1 Tainted: GF   W
[   27.848490] ---
[   27.848496] Xorg/1268 is trying to acquire lock:
[   27.848501]  (>mutex){+.+.+.}, at: []
sysfs_bin_mmap+0x4f/0x120
[   27.848516]
[   27.848516] but task is already holding lock:
[   27.848521]  (>mmap_sem){++}, at: []
vm_mmap_pgoff+0x6f/0xc0
[   27.848534]
[   27.848534] which lock already depends on the new lock.
[   27.848534]
[   27.848541]
[   27.848541] the existing dependency chain (in reverse order) is:
[   27.848547]
[   27.848547] -> #2 (>mmap_sem){++}:
[   27.848556][] lock_acquire+0xb0/0x160
[   27.848564][] might_fault+0x8c/0xb0
[   27.848572][] md_ioctl+0xa78/0x19b0
[   27.848580][] blkdev_ioctl+0x234/0x840
[   27.848588][] block_ioctl+0x41/0x50
[   27.848597][] do_vfs_ioctl+0x300/0x520
[   27.848605][] SyS_ioctl+0x81/0xa0
[   27.848613][] tracesys+0xe1/0xe6
[   27.848622]
[   27.848622] -> #1 (>reconfig_mutex){+.+.+.}:
[   27.848630][] lock_acquire+0xb0/0x160
[   27.848637][]
mutex_lock_interruptible_nested+0x78/0x610
[   27.848646][] rdev_attr_show+0x40/0x90
[   27.848654][] sysfs_seq_show+0xda/0x170
[   27.848662][] seq_read+0x164/0x3e0
[   27.848671][] vfs_read+0x95/0x160
[   27.848680][] SyS_read+0x49/0xa0
[   27.848687][] tracesys+0xe1/0xe6
[   27.848695]
[   27.848695] -> #0 (>mutex){+.+.+.}:
[   27.848703][] __lock_acquire+0x1587/0x1ca0
[   27.848711][] lock_acquire+0xb0/0x160
[   27.848718][] mutex_lock_nested+0x68/0x510
[   27.848725][] sysfs_bin_mmap+0x4f/0x120
[   27.848732][] mmap_region+0x3ed/0x5d0
[   27.848741][] do_mmap_pgoff+0x34e/0x3d0
[   27.848748][] vm_mmap_pgoff+0x90/0xc0
[   27.848755][] SyS_mmap_pgoff+0x1d5/0x270
[   27.848763][] SyS_mmap+0x22/0x30
[   27.848771][] tracesys+0xe1/0xe6
[   27.848778]
[   27.848778] other info that might help us debug this:
[   27.848778]
[   27.848785] Chain exists of:
[   27.848785]   >mutex --> >reconfig_mutex --> >mmap_sem
[   27.848785]
[   27.848795]  Possible unsafe locking scenario:
[   27.848795]
[   27.848800]CPU0CPU1
[   27.848805]
[   27.848810]   lock(>mmap_sem);
[   27.848817]lock(>reconfig_mutex);
[   27.848824]lock(>mmap_sem);
[   27.848830]   lock(>mutex);
[   27.848837]
[   27.848837]  *** DEADLOCK ***
[   27.848837]
[   27.848844] 1 lock held by Xorg/1268:
[   27.848849]  #0:  (>mmap_sem){++}, at: []
vm_mmap_pgoff+0x6f/0xc0
[   27.848861]
[   27.848861] stack backtrace:
[   27.848868] CPU: 1 PID: 1268 Comm: Xorg Tainted: GF   W3.13.0-rc1+ #1
[   27.848873] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
VirtualBox 12/01/2006
[   27.848879]  822daa00 8800d0371bc8 817725f7
822cbdc0
[   27.848901]  8800d0371c08 8176d9eb 8800d0371c60
880115b42a78
[   27.848909]   880115b42a78 880115b422a0
0001
[   27.848918] Call Trace:
[   27.848930]  [] dump_stack+0x4e/0x7a
[   27.848942]  [] print_circular_bug+0x1f9/0x208
[   27.848952]  [] __lock_acquire+0x1587/0x1ca0
[   27.848964]  [] ? print_context_stack+0x8f/0x100
[   27.848975]  [] lock_acquire+0xb0/0x160
[   27.848986]  [] ? sysfs_bin_mmap+0x4f/0x120
[   27.848996]  [] ? sysfs_bin_mmap+0x4f/0x120
[   27.849007]  [] mutex_lock_nested+0x68/0x510
[   27.849016]  [] ? sysfs_bin_mmap+0x4f/0x120
[   27.849027]  [] ? kmemleak_alloc+0x4e/0xb0
[   27.849038]  [] sysfs_bin_mmap+0x4f/0x120
[   27.849048]  [] mmap_region+0x3ed/0x5d0
[   27.849058]  [] do_mmap_pgoff+0x34e/0x3d0
[   27.849070]  [] vm_mmap_pgoff+0x90/0xc0
[   27.849080]  [] SyS_mmap_pgoff+0x1d5/0x270
[   27.849092]  [] ? syscall_trace_enter+0x145/0x270
[   27.849102]  [] SyS_mmap+0x22/0x30
[   27.849112]  [] tracesys+0xe1/0xe6


I think it is a real deadlock, and it is caused by commit
3124eb1679b28726 "sysfs: merge regular and bin file handling".

With that commit, sysfs_bin_mmap takes of->mutex.

So assume CPU0 called sysfs_bin_mmap, acquired mmap_sem, and is trying
to get of->mutex.

CPU1 called sysfs_seq_show, acquired of->mutex, and is trying to
get mddev->reconfig_mutex.

CPU2 called md_ioctl, acquired mddev->reconfig_mutex, and later
called copy_from_user, which page-faulted and is trying to get mmap_sem.
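
The three-way cycle above can be sketched in plain userspace C. This is
only an illustration of the lock-ordering argument, not how lockdep is
implemented: the enum labels and the helpers record_dependency(),
reaches() and would_deadlock() are made-up names, and the small matrix
stands in for the dependency chain printed in the report.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the three locks named in the report. */
enum lock_id { MMAP_SEM, OF_MUTEX, RECONFIG_MUTEX, NR_LOCKS };

/* dep[a][b] is true when some path takes lock b while holding lock a. */
static bool dep[NR_LOCKS][NR_LOCKS];

static void record_dependency(enum lock_id held, enum lock_id wanted)
{
	dep[held][wanted] = true;
}

/* Depth-first search: is 'target' reachable from 'from' along dep edges? */
static bool reaches(enum lock_id from, enum lock_id target, bool *seen)
{
	if (from == target)
		return true;
	if (seen[from])
		return false;
	seen[from] = true;

	for (int next = 0; next < NR_LOCKS; next++) {
		if (dep[from][next] &&
		    reaches((enum lock_id)next, target, seen))
			return true;
	}
	return false;
}

/*
 * Taking 'wanted' while holding 'held' closes a cycle if 'wanted'
 * already leads back to 'held' through recorded dependencies.
 */
static bool would_deadlock(enum lock_id held, enum lock_id wanted)
{
	bool seen[NR_LOCKS] = { false };

	return reaches(wanted, held, seen);
}
```

Recording of->mutex --> reconfig_mutex (the sysfs_seq_show path) and
reconfig_mutex --> mmap_sem (the md_ioctl page fault), and then asking
would_deadlock(MMAP_SEM, OF_MUTEX) for the sysfs_bin_mmap path, reports
the cycle. Asking in the opposite direction does not.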

Re: [PATCH] WIFI: handle a neglected case in nl80211_new_interface()

2013-11-27 Thread Chao Bi
On Wed, 2013-11-27 at 20:43 +0530, Ujjal Roy wrote:
> Hi,
> 
> 
> We can use IS_ERR_OR_NULL(wdev) to check NULL as well as error value.
> 
> Thanks,
> UjjaL
> 
> On Wed, Nov 27, 2013 at 8:30 AM, Chao Bi  wrote:
> In nl80211_new_interface(), it calls rdev_add_virtual_intf() to create
> a new interface, however, it only checks whether returned value is err
> code, but doesn't check if returned value is NULL. The returned 

Thanks Ujjal. I'll update it.
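
For reference, a rough userspace sketch of what such a combined check
looks like. The lowercase helpers below are stand-ins written for this
example (the kernel's own macros live in include/linux/err.h), struct
fake_wdev and check_new_interface() are hypothetical, and returning
-EINVAL for the NULL case is an assumption, not what the final patch
necessarily does. It also assumes a long fits in a pointer.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's ERR_PTR-style helpers. */
#define MAX_ERRNO 4095

static inline void *err_ptr(long error)
{
	return (void *)error;
}

static inline long ptr_err(const void *ptr)
{
	return (long)ptr;
}

/* Error codes are encoded in the top MAX_ERRNO values of the address space. */
static inline bool is_err(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

static inline bool is_err_or_null(const void *ptr)
{
	return !ptr || is_err(ptr);
}

/* Hypothetical stand-in for the returned wireless device. */
struct fake_wdev {
	int id;
};

/*
 * Hypothetical caller-side check. NULL needs its own error code:
 * ptr_err(NULL) is 0 and would otherwise look like success.
 * -EINVAL here is an assumption made for the example.
 */
static int check_new_interface(const struct fake_wdev *wdev)
{
	if (is_err_or_null(wdev))
		return wdev ? (int)ptr_err(wdev) : -EINVAL;
	return 0;
}
```

The point to watch is the NULL branch: a single combined test is
convenient, but the caller still has to pick an error code for NULL,
since converting a NULL pointer to an error value gives 0.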




Re: [PATCH V2] cpufreq: Make sure CPU is running on a freq from freq-table

2013-11-27 Thread Viresh Kumar
On 28 November 2013 01:51, Rafael J. Wysocki  wrote:
> I have a concern that on some systems you can't really say what frequency
> you're running at the moment, however.

Which ones? I know ACPI tries to play smart by handling the frequency stuff
itself, marking CPUs as not related to each other for the kernel where they
might actually be sharing a clock line. But probably in these cases as well,
at least the cpufreq core should believe that it is running on a valid
frequency, even if the actual hardware is running at something different.

Any other platforms you are aware of that implement ->target/target_index
and where we can't say what freq they are running at?

> So there should be a flag for
> drivers indicating whether or not frequencies (or operation points in
> general) are directly testable and the check should only be done for
> the drivers with the flag set.

Probably a flag with properties exactly opposite to what you mentioned,
so that we don't need to modify most of the drivers.


Re: [merged] mm-memcg-handle-non-error-oom-situations-more-gracefully.patch removed from -mm tree

2013-11-27 Thread David Rientjes
On Wed, 27 Nov 2013, Johannes Weiner wrote:

> > It appears as though this work is being developed in Linus's tree rather 
> > than -mm, so I'm asking if we should consider backing some of it out for 
> > 3.14 instead.
> 
> The changes fix a deadlock problem.  Are they creating problems that
> are worse than deadlocks, that would justify their revert?
> 

None that I am currently aware of, I'll continue to try them out.  I'd 
suggest just dropping the sta...@kernel.org from the whole series though 
unless there is another report of such a problem that people are running 
into.

> Since we can't physically draw a perfect line, we should strive for a
> reasonable and intuitive line.  After that it's rapidly diminishing
> returns.  Killing something after that much reclaim effort without
> success is a completely reasonable and intuitive line to draw.  It's
> also the line that has been drawn a long time ago and we're not
> breaking this because of a micro-optimization.
> 

You don't think something like this is helpful after scanning a memcg with
a large number of processes?

We've had this patch internally since we started using memcg; it has
avoided some unnecessary oom killing.
---
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1836,6 +1836,13 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	if (!chosen)
 		return;
 	points = chosen_points * 1000 / totalpages;
+
+	/* One last chance to see if we really need to kill something */
+	if (mem_cgroup_margin(memcg) >= (1 << order)) {
+		put_task_struct(chosen);
+		return;
+	}
+
 	oom_kill_process(chosen, gfp_mask, order, points, totalpages, memcg,
 			 NULL, "Memory cgroup out of memory");
 }


Re: [PATCH] ARM : Kill CONFIG_MTD_PARTITIONS

2013-11-27 Thread Baruch Siach
Hi Eunbong,

On Thu, Nov 28, 2013 at 01:18:45AM +, Eunbong Song wrote:
> This patch removes CONFIG_MTD_PARTITIONS in config files for arm.
> Because CONFIG_MTD_PARTITIONS was removed by commit 
> 6a8a98b22b10f1560d5f90aded4a54234b9b2724.
> ---
> I resend this patch because I forgot the signoff.

Well, it seems you forgot it again.

baruch

-- 
 http://baruch.siach.name/blog/  ~. .~   Tk Open Systems
=}ooO--U--Ooo{=
   - bar...@tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il -


Re: [patch 1/2] mm, memcg: avoid oom notification when current needs access to memory reserves

2013-11-27 Thread Johannes Weiner
On Wed, Nov 27, 2013 at 06:52:10PM -0800, David Rientjes wrote:
> On Wed, 27 Nov 2013, Johannes Weiner wrote:
> 
> > The long-standing, user-visible definition of the current line agrees
> > with me.  You can't just redefine this, period.
> > 
> > I tried to explain to you how insane the motivation for this patch is,
> > but it does not look like you are reading what I write.  But you don't
> > get to change user-visible behavior just like that anyway, much less
> > so without a sane reason, so this was a complete waste of time :-(
> > 
> 
> If you would like to leave this to Andrew's decision, that's fine.  
> Michal has already agreed with my patch and has acked it in -mm.
> 
> If userspace is going to handle oom conditions, which is possible today 
> and will be extended in the future, then it should only wakeup as a last 
> resort when there is no possibility of future memory freeing.

I'll ack a patch that accomplishes that.


Re: [PATCH 1/2] virtio_net: fix error handling for mergeable buffers

2013-11-27 Thread Jason Wang
On 11/28/2013 12:31 AM, Michael S. Tsirkin wrote:
> Eric Dumazet noticed that if we encounter an error
> when processing a mergeable buffer, we don't
> dequeue all of the buffers from this packet,
> the result is almost sure to be loss of networking.
>
> Jason Wang noticed that we also leak a page and that we don't decrement
> the rq buf count, so we won't repost buffers (a resource leak).
>
> Fix both issues, and also make the logic a bit more
> robust against device errors by not looping when e.g. because of a leak
> like the one we are fixing here the number of buffers is 0.
>
> Cc: Rusty Russell 
> Cc: Michael Dalton 
> Reported-by: Eric Dumazet 
> Reported-by: Jason Wang 
> Signed-off-by: Michael S. Tsirkin 
> ---
>
> Note: this bugfix is needed on stable too, but backport
> might not be trivial.
> I'll send a backport for stable separately.

That will be fine.
>
>  drivers/net/virtio_net.c | 84 ++--
>  1 file changed, 52 insertions(+), 32 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 7bab4de..0e6ea69 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -299,41 +299,53 @@ static struct sk_buff *page_to_skb(struct receive_queue *rq,
>   return skb;
>  }
>  
> -static int receive_mergeable(struct receive_queue *rq, struct sk_buff *head_skb)
> +static struct sk_buff *receive_mergeable(struct net_device *dev,
> +  struct receive_queue *rq,
> +  void *buf,
> +  unsigned int len)
>  {
> - struct skb_vnet_hdr *hdr = skb_vnet_hdr(head_skb);
> + struct skb_vnet_hdr *hdr = buf;
> + int num_buf = hdr->mhdr.num_buffers;
> + struct page *page = virt_to_head_page(buf);
> + int offset = buf - page_address(page);
> + struct sk_buff *head_skb = page_to_skb(rq, page, offset, len,
> +MERGE_BUFFER_LEN);
>   struct sk_buff *curr_skb = head_skb;
> - char *buf;
> - struct page *page;
> - int num_buf, len, offset;
>  
> - num_buf = hdr->mhdr.num_buffers;
> + if (unlikely(!curr_skb))
> + goto err_skb;
> +
>   while (--num_buf) {
> - int num_skb_frags = skb_shinfo(curr_skb)->nr_frags;
> + int num_skb_frags;
> +
>   buf = virtqueue_get_buf(rq->vq, &len);
>   if (unlikely(!buf)) {
> - pr_debug("%s: rx error: %d buffers missing\n",
> -  head_skb->dev->name, hdr->mhdr.num_buffers);
> - head_skb->dev->stats.rx_length_errors++;
> - return -EINVAL;
> + pr_debug("%s: rx error: %d buffers out of %d missing\n",
> +  dev->name, num_buf, hdr->mhdr.num_buffers);
> + dev->stats.rx_length_errors++;
> + goto err_buf;

Not sure this is correct. Since we break immediately in err_buf if a
buffer is missing, rx_length_errors will be miscounted.

Maybe an ERR_PTR(-EINVAL) is better.
>   }
>   if (unlikely(len > MERGE_BUFFER_LEN)) {
>   pr_debug("%s: rx error: merge buffer too long\n",
> -  head_skb->dev->name);
> +  dev->name);
>   len = MERGE_BUFFER_LEN;
>   }
> +
> + page = virt_to_head_page(buf);
> + --rq->num;
> +
> + num_skb_frags = skb_shinfo(curr_skb)->nr_frags;
>   if (unlikely(num_skb_frags == MAX_SKB_FRAGS)) {
>   struct sk_buff *nskb = alloc_skb(0, GFP_ATOMIC);
> - if (unlikely(!nskb)) {
> - head_skb->dev->stats.rx_dropped++;
> - return -ENOMEM;
> - }
> +
> + if (unlikely(!nskb))
> + goto err_skb;
>   if (curr_skb == head_skb)
>   skb_shinfo(curr_skb)->frag_list = nskb;
>   else
>   curr_skb->next = nskb;
> - curr_skb = nskb;
>   head_skb->truesize += nskb->truesize;
> + curr_skb = nskb;

This change seems unnecessary.

Otherwise looks good.
>   num_skb_frags = 0;
>   }
>   if (curr_skb != head_skb) {
> @@ -341,8 +353,7 @@ static int receive_mergeable(struct receive_queue *rq, struct sk_buff *head_skb)
>   head_skb->len += len;
>   head_skb->truesize += MERGE_BUFFER_LEN;
>   }
> - page = virt_to_head_page(buf);
> - offset = buf - (char *)page_address(page);
> + offset = buf - page_address(page);
>   if (skb_can_coalesce(curr_skb, num_skb_frags, 

Re: [merged] mm-memcg-handle-non-error-oom-situations-more-gracefully.patch removed from -mm tree

2013-11-27 Thread Johannes Weiner
On Wed, Nov 27, 2013 at 06:38:31PM -0800, David Rientjes wrote:
> On Wed, 27 Nov 2013, Johannes Weiner wrote:
> 
> > > The task that is bypassing the memcg charge to the root memcg may not be 
> > > the process that is chosen by the oom killer, and it's possible the 
> > > amount 
> > > of memory freed by killing the victim is less than the amount of memory 
> > > bypassed.
> > 
> > That's true, though unlikely.
> > 
> 
> Well, the "goto bypass" allows it and it's trivial to cause by 
> manipulating /proc/pid/oom_score_adj values to prefer processes with very 
> little rss.  It will just continue looping and killing processes as they 
> are forked and never cause the memcg to free memory below its limit.  At 
> least the "goto nomem" allows us to free some memory instead of leaking to 
> the root memcg.

Yes, that's the better way of doing it, I'll send the patch.  Thanks.

> > > Were you targeting these to 3.13 instead?  If so, it would have already 
> > > appeared in 3.13-rc1 anyway.  Is it still a work in progress?
> > 
> > I don't know how to answer this question.
> > 
> 
> It appears as though this work is being developed in Linus's tree rather 
> than -mm, so I'm asking if we should consider backing some of it out for 
> 3.14 instead.

The changes fix a deadlock problem.  Are they creating problems that
are worse than deadlocks, that would justify their revert?

> > > Should we be checking mem_cgroup_margin() here to ensure 
> > > task_in_memcg_oom() is still accurate and we haven't raced by freeing 
> > > memory?
> > 
> > We would have invoked the OOM killer long before this point prior to
> > my patches.  There is a line we draw and from that point on we start
> > killing things.  I tried to explain multiple times now that there is
> > no race-free OOM killing and I'm tired of it.  Convince me otherwise
> > or stop repeating this non-sense.
> > 
> 
> In our internal kernel we call mem_cgroup_margin() with the order of the 
> charge immediately prior to sending the SIGKILL to see if it's still 
> needed even after selecting the victim.  It makes the race smaller.
> 
> It's obvious that after the SIGKILL is sent, either from the kernel or 
> from userspace, that memory might subsequently be freed or another process 
> might exit before the process killed could even wake up.  There's nothing 
> we can do about that since we don't have psychic abilities.  I think we 
> should try to reduce the chance for unnecessary oom killing as much as 
> possible, however.

Since we can't physically draw a perfect line, we should strive for a
reasonable and intuitive line.  After that it's rapidly diminishing
returns.  Killing something after that much reclaim effort without
success is a completely reasonable and intuitive line to draw.  It's
also the line that has been drawn a long time ago and we're not
breaking this because of a micro-optimization.


Re: [tip:sched/urgent] sched: Check sched_domain before computing group power

2013-11-27 Thread David Rientjes
On Thu, 21 Nov 2013, Yinghai Lu wrote:

> original one in linus's tree:
> 
> [8.952728] NMI watchdog: enabled on all CPUs, permanently consumes
> one hw-PMU counter.
> [8.965697] BUG: unable to handle kernel NULL pointer dereference
> at 0010
> [8.969495] IP: [] update_group_power+0x1d3/0x250

This should have been fixed by Srikar's patch, no?


Re: [PATCH 00/13] mfd: menelaus: a few cleanups

2013-11-27 Thread Felipe Balbi
Hi,

On Wed, Nov 27, 2013 at 10:46:21PM +0200, Aaro Koskinen wrote:
> Hi,
> 
> On Wed, Nov 27, 2013 at 02:11:49PM -0600, Felipe Balbi wrote:
> > On Wed, Nov 27, 2013 at 10:02:47PM +0200, Aaro Koskinen wrote:
> > > On Wed, Nov 27, 2013 at 01:06:44PM -0600, Felipe Balbi wrote:
> > > > few cleanups on the old menelaus driver. I don't have
> > > > HW to test these patches, maybe Aaro can help here ?
> > > 
> > > Hmm, I got:
> > > 
> > > [1.33] Unable to handle kernel NULL pointer dereference at 
> > > virtual address 
> > > [1.34] pgd = c0004000
> > > [1.34] [] *pgd=
> > > [1.35] Internal error: Oops: 17 [#1] ARM
> > > [1.35] CPU: 0 PID: 1 Comm: swapper Not tainted 
> > > 3.13.0-rc1-n8x0_tiny-los.git-729021f-00018-g74a0f39 #2
> > > [1.35] task: c782c000 ti: c782e000 task.ti: c782e000
> > > [1.35] PC is at mutex_lock+0x0/0x20
> > > [1.35] LR is at __irq_get_desc_lock+0x6c/0x88
> [...]
> > > [1.35] [] (mutex_lock+0x0/0x20) from [] 
> > > (__irq_get_desc_lock+0x6c/0x88)
> > > [1.35] [] (__irq_get_desc_lock+0x6c/0x88) from 
> > > [] (__irq_set_handler+0x24/0x128)
> > > [1.35] [] (__irq_set_handler+0x24/0x128) from 
> > > [] (menelaus_probe+0xbc/0x280)
> > > [1.35] [] (menelaus_probe+0xbc/0x280) from [] 
> > > (i2c_device_probe+0x98/0xc0)
> 
> [...]
> 
> > hmm, irq_set_chip_and_handler() will call back into the irq_chip we just
> > registered, so my ->irq_bus_lock needs to have everything set up
> > (chip_data, my mutex); this should solve it:
> 
> Yes, that fixes it. Seems to work fine now.

Awesome, should I add your Tested-by? I also added a few extra patches
on top, which I'll send soon.

-- 
balbi

