Re: [PATCH v3] xfs: introduce object readahead to log recovery

2013-08-13 Thread Zhi Yong Wu
On Wed, Aug 14, 2013 at 1:35 PM, Dave Chinner wrote:
> On Wed, Jul 31, 2013 at 04:42:45PM +0800, zwu.ker...@gmail.com wrote:
>> From: Zhi Yong Wu 
>>
>>   It can take a long time to run log recovery operation because it is
>> single threaded and is bound by read latency. We can find that it took
>> most of the time to wait for the read IO to occur, so if one object
>> readahead is introduced to log recovery, it will obviously reduce the
>> log recovery time.
>>
>> Log recovery time stat:
>>
>>               w/o this patch    w/ this patch
>>
>> real:         0m15.023s         0m7.802s
>> user:         0m0.001s          0m0.001s
>> sys:          0m0.246s          0m0.107s
>
> This version works as advertised as well.
>
>> @@ -3216,6 +3351,18 @@ xlog_recover_commit_trans(
>>   goto out;
>>   }
>>
>> + if (!list_empty(&ra_list)) {
>> + error = xlog_recover_items_pass2(log, trans,
>> + &buffer_list, &ra_list);
>> + if (error)
>> + goto out;
>> +
>> + list_splice_tail_init(&ra_list, &done_list);
>> + }
>> +
>> + if (!list_empty(&done_list))
>> + list_splice_init(&done_list, &trans->r_itemq);
>> +
>>   xlog_recover_free_trans(trans);
>
> I think this still leaks the trans structure when an error occurs.
> Indeed, I think this is a pre-existing leak, as the current code
> will skip freeing the trans structure on item recovery failure and
> nothing else frees it.  So it appears to me to be busted before this
> patch is added.
Yes, I also noticed this and agree.
>
> Hence on a xlog_recover_items_pass2() error we need to splice the
> ra-list to the done_list and free trans. i.e. the "if (error) goto
> out;" lines in the above hunk do not need to be there, and the
> "out:" label moved to above the call to xlog_recover_free_trans() so
> the main loop does the right thing when an error occurs.
Do you want me to draft a separate patch to fix the trans leak, or can
it be fixed in this patch? Or will you draft one yourself?
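
For readers following the thread, the restructuring Dave describes above
(drop the early "goto out" and send both success and failure through one
exit that splices the items back and frees trans) can be sketched in
userspace C. This is only a minimal sketch under stated assumptions, not the
XFS code: struct item, struct trans, xcalloc(), trans_alloc(), trans_free(),
recover_items(), commit_trans() and the frees counter are all hypothetical
stand-ins made up for illustration.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; not the XFS types. */
struct item {
	struct item *next;
	bool bad;			/* simulate a recovery failure on this item */
};

struct trans {
	struct item *items;		/* items owned by the transaction */
};

static int frees;			/* number of trans_free() calls, for checking */

static void *xcalloc(size_t n, size_t size)
{
	void *p = calloc(n, size);

	if (!p)
		abort();		/* keep the sketch free of allocation-failure paths */
	return p;
}

/* Build a transaction with nitems items; item bad_index fails (-1: none). */
static struct trans *trans_alloc(int nitems, int bad_index)
{
	struct trans *t = xcalloc(1, sizeof(*t));

	for (int i = 0; i < nitems; i++) {
		struct item *it = xcalloc(1, sizeof(*it));

		it->bad = (i == bad_index);
		it->next = t->items;
		t->items = it;
	}
	return t;
}

/* Free the transaction and every item still attached to it. */
static void trans_free(struct trans *t)
{
	struct item *it = t->items;

	while (it) {
		struct item *next = it->next;

		free(it);
		it = next;
	}
	free(t);
	frees++;
}

/* Returns 0 on success, -1 on the first failing item. */
static int recover_items(const struct item *list)
{
	for (; list; list = list->next)
		if (list->bad)
			return -1;
	return 0;
}

static int commit_trans(struct trans *t)
{
	struct item *ra_list = t->items;	/* detach onto a local list */
	int error;

	t->items = NULL;
	error = recover_items(ra_list);

	/* No early exit on error: always hand the items back ... */
	t->items = ra_list;

	/* ... so the single exit path frees trans and its items. */
	trans_free(t);
	return error;
}
```

With this shape, a failing item still results in exactly one trans_free()
call, which is the property the review asks for.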

>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> da...@fromorbit.com



-- 
Regards,

Zhi Yong Wu
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v6 4/5] mm: export unmap_kernel_range

2013-08-13 Thread Minchan Kim
zsmalloc now needs unmap_kernel_range to be exported so that it can be
built as a module. See the earlier discussion for details:
https://lkml.org/lkml/2013/1/18/487

We didn't send a patch to export unmap_kernel_range at that time because
zram was staging code and we didn't think exporting a VM function for
staging code made sense, so we gave up on build=m for zsmalloc. But
zsmalloc has since moved under the zram directory, so if we can't build
zsmalloc as a module, we can't build zram as a module either.
In addition, its counterpart map_vm_area is already exported.

Signed-off-by: Minchan Kim 
---
 mm/vmalloc.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 93d3182..0e9a9f8 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1254,6 +1254,7 @@ void unmap_kernel_range(unsigned long addr, unsigned long size)
 	vunmap_page_range(addr, end);
 	flush_tlb_kernel_range(addr, end);
 }
+EXPORT_SYMBOL_GPL(unmap_kernel_range);
 
 int map_vm_area(struct vm_struct *area, pgprot_t prot, struct page ***pages)
 {
-- 
1.7.9.5



[PATCH v6 0/5] zram/zsmalloc promotion

2013-08-13 Thread Minchan Kim
This is the 6th attempt at zram/zsmalloc promotion.
[patch 5, zram: promote zram from staging] explains why we need zram.

The main thing blocking promotion was that the zsmalloc part had not been
reviewed, while Jens had already acked the zram part.

At that time, zsmalloc was used by zram, zcache and zswap, so everybody
wanted to make it general, and at last Mel reviewed it.
Most of the review concerned the zswap dumping mechanism, which can page
out compressed pages to swap at runtime; zswap gave up on zsmalloc and
invented a new wheel, zbud. The other review comments were not major.
http://lkml.indiana.edu/hypermail/linux/kernel/1304.1/04334.html

Zcache doesn't use zsmalloc either, so the only zsmalloc user is zram now.
So I think there is no concern any more.

Patch 1 adds a new Kconfig option for zram to use the page table method
instead of copying. Andrew suggested it.

Patch 2 adds lots of comments to zsmalloc.

Patch 3 moves zsmalloc under drivers/staging/zram because zram is now the
only user of zsmalloc.

Patch 4 exports unmap_kernel_range. zsmalloc already uses map_vm_area,
which is exported, but it also needs unmap_kernel_range in order to be
built as a module.

Patch 5 moves zram from drivers/staging to drivers/block, finally.

It touches mm, staging and block, so I am not sure who the right
maintainer is; I will Cc Andrew, Jens and Greg.

This patch is based on next-20130813.

Thanks.

Minchan Kim (4):
  zsmalloc: add Kconfig for enabling page table method
  zsmalloc: move it under zram
  mm: export unmap_kernel_range
  zram: promote zram from staging

Nitin Gupta (1):
  zsmalloc: add more comment

 drivers/block/Kconfig                    |    2 +
 drivers/block/Makefile                   |    1 +
 drivers/block/zram/Kconfig               |   37 +
 drivers/block/zram/Makefile              |    3 +
 drivers/block/zram/zram.txt              |   71 ++
 drivers/block/zram/zram_drv.c            |  987 +++
 drivers/block/zram/zsmalloc.c            | 1084 ++
 drivers/staging/Kconfig                  |    4 -
 drivers/staging/Makefile                 |    2 -
 drivers/staging/zram/Kconfig             |   25 -
 drivers/staging/zram/Makefile            |    3 -
 drivers/staging/zram/zram.txt            |   77 ---
 drivers/staging/zram/zram_drv.c          |  984 ---
 drivers/staging/zram/zram_drv.h          |  125 
 drivers/staging/zsmalloc/Kconfig         |   10 -
 drivers/staging/zsmalloc/Makefile        |    3 -
 drivers/staging/zsmalloc/zsmalloc-main.c | 1063 -
 drivers/staging/zsmalloc/zsmalloc.h      |   43 --
 include/linux/zram.h                     |  123 
 include/linux/zsmalloc.h                 |   52 ++
 mm/vmalloc.c                             |    1 +
 21 files changed, 2361 insertions(+), 2339 deletions(-)
 create mode 100644 drivers/block/zram/Kconfig
 create mode 100644 drivers/block/zram/Makefile
 create mode 100644 drivers/block/zram/zram.txt
 create mode 100644 drivers/block/zram/zram_drv.c
 create mode 100644 drivers/block/zram/zsmalloc.c
 delete mode 100644 drivers/staging/zram/Kconfig
 delete mode 100644 drivers/staging/zram/Makefile
 delete mode 100644 drivers/staging/zram/zram.txt
 delete mode 100644 drivers/staging/zram/zram_drv.c
 delete mode 100644 drivers/staging/zram/zram_drv.h
 delete mode 100644 drivers/staging/zsmalloc/Kconfig
 delete mode 100644 drivers/staging/zsmalloc/Makefile
 delete mode 100644 drivers/staging/zsmalloc/zsmalloc-main.c
 delete mode 100644 drivers/staging/zsmalloc/zsmalloc.h
 create mode 100644 include/linux/zram.h
 create mode 100644 include/linux/zsmalloc.h

-- 
1.7.9.5



[PATCH v6 2/5] zsmalloc: add more comment

2013-08-13 Thread Minchan Kim
From: Nitin Gupta

This patch adds lots of comments, which will help others
review and enhance the code.

Signed-off-by: Seth Jennings 
Signed-off-by: Nitin Gupta 
Signed-off-by: Minchan Kim 
---
 drivers/staging/zsmalloc/zsmalloc-main.c |   66 +-
 drivers/staging/zsmalloc/zsmalloc.h  |9 +++-
 2 files changed, 64 insertions(+), 11 deletions(-)

diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
index f57258fa..52ebddd 100644
--- a/drivers/staging/zsmalloc/zsmalloc-main.c
+++ b/drivers/staging/zsmalloc/zsmalloc-main.c
@@ -10,16 +10,14 @@
  * Released under the terms of GNU General Public License Version 2.0
  */
 
-
 /*
- * This allocator is designed for use with zcache and zram. Thus, the
- * allocator is supposed to work well under low memory conditions. In
- * particular, it never attempts higher order page allocation which is
- * very likely to fail under memory pressure. On the other hand, if we
- * just use single (0-order) pages, it would suffer from very high
- * fragmentation -- any object of size PAGE_SIZE/2 or larger would occupy
- * an entire page. This was one of the major issues with its predecessor
- * (xvmalloc).
+ * This allocator is designed for use with zram. Thus, the allocator is
+ * supposed to work well under low memory conditions. In particular, it
+ * never attempts higher order page allocation which is very likely to
+ * fail under memory pressure. On the other hand, if we just use single
+ * (0-order) pages, it would suffer from very high fragmentation --
+ * any object of size PAGE_SIZE/2 or larger would occupy an entire page.
+ * This was one of the major issues with its predecessor (xvmalloc).
  *
  * To overcome these issues, zsmalloc allocates a bunch of 0-order pages
  * and links them together using various 'struct page' fields. These linked
@@ -27,6 +25,21 @@
  * page boundaries. The code refers to these linked pages as a single entity
  * called zspage.
  *
+ * For simplicity, zsmalloc can only allocate objects of size up to PAGE_SIZE
+ * since this satisfies the requirements of all its current users (in the
+ * worst case, page is incompressible and is thus stored "as-is" i.e. in
+ * uncompressed form). For allocation requests larger than this size, failure
+ * is returned (see zs_malloc).
+ *
+ * Additionally, zs_malloc() does not return a dereferenceable pointer.
+ * Instead, it returns an opaque handle (unsigned long) which encodes actual
+ * location of the allocated object. The reason for this indirection is that
+ * zsmalloc does not keep zspages permanently mapped since that would cause
+ * issues on 32-bit systems where the VA region for kernel space mappings
+ * is very small. So, before using the allocating memory, the object has to
+ * be mapped using zs_map_object() to get a usable pointer and subsequently
+ * unmapped using zs_unmap_object().
+ *
  * Following is how we use various fields and flags of underlying
  * struct page(s) to form a zspage.
  *
@@ -98,7 +111,7 @@
 
 /*
  * Object location (<PFN>, <obj_idx>) is encoded as
- * as single (void *) handle value.
+ * as single (unsigned long) handle value.
  *
  * Note that object index <obj_idx> is relative to system
  * page <PFN> it is stored in, so for each sub-page belonging
@@ -264,6 +277,13 @@ static void set_zspage_mapping(struct page *page, unsigned int class_idx,
page->mapping = (struct address_space *)m;
 }
 
+/*
+ * zsmalloc divides the pool into various size classes where each
+ * class maintains a list of zspages where each zspage is divided
+ * into equal sized chunks. Each allocation falls into one of these
+ * classes depending on its size. This function returns index of the
+ * size class which has chunk size big enough to hold the give size.
+ */
 static int get_size_class_index(int size)
 {
int idx = 0;
@@ -275,6 +295,13 @@ static int get_size_class_index(int size)
return idx;
 }
 
+/*
+ * For each size class, zspages are divided into different groups
+ * depending on how "full" they are. This was done so that we could
+ * easily find empty or nearly empty zspages when we try to shrink
+ * the pool (not yet implemented). This function returns fullness
+ * status of the given page.
+ */
 static enum fullness_group get_fullness_group(struct page *page)
 {
int inuse, max_objects;
@@ -296,6 +323,12 @@ static enum fullness_group get_fullness_group(struct page *page)
return fg;
 }
 
+/*
+ * Each size class maintains various freelists and zspages are assigned
+ * to one of these freelists based on the number of live objects they
+ * have. This functions inserts the given zspage into the freelist
+ * identified by <class, fullness_group>.
+ */
 static void insert_zspage(struct page *page, struct size_class *class,
enum fullness_group fullness)
 {
@@ -313,6 +346,10 @@ static void insert_zspage(struct page *page, struct size_class *class,
*head = page;
 }
 
+/*
+ * This function removes 
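
The handle scheme described in the comments above (an opaque unsigned long
that encodes the object's page frame number and its index within that page)
can be illustrated with a small userspace sketch. This is a sketch under
stated assumptions, not the zsmalloc implementation: the 12-bit split
(OBJ_INDEX_BITS) and the helper names handle_encode() and handle_decode()
are chosen here purely for illustration.

```c
/* Assumed layout: the low OBJ_INDEX_BITS bits hold the object index,
 * the remaining high bits hold the page frame number (PFN). */
#define OBJ_INDEX_BITS	12
#define OBJ_INDEX_MASK	((1UL << OBJ_INDEX_BITS) - 1)

/* Pack (pfn, obj_idx) into one opaque handle value.
 * The caller must keep obj_idx <= OBJ_INDEX_MASK and pfn small enough
 * that the shift does not overflow unsigned long. */
static unsigned long handle_encode(unsigned long pfn, unsigned long obj_idx)
{
	return (pfn << OBJ_INDEX_BITS) | (obj_idx & OBJ_INDEX_MASK);
}

/* Recover the page frame number and object index from a handle. */
static void handle_decode(unsigned long handle,
			  unsigned long *pfn, unsigned long *obj_idx)
{
	*pfn = handle >> OBJ_INDEX_BITS;
	*obj_idx = handle & OBJ_INDEX_MASK;
}
```

Because the handle is not a pointer, callers cannot dereference it directly;
it only becomes usable after a map step translates it back to a location,
which is the indirection the comment motivates for 32-bit systems.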

[PATCH v6 1/5] zsmalloc: add Kconfig for enabling page table method

2013-08-13 Thread Minchan Kim
Zsmalloc has two methods for accessing objects that span two pages:
1) copy-based and 2) pte-based.
For the history of why we support both approaches, see [1].

But hard-coding which architectures use the pte-based method was a bad
choice: there are lots of SoCs within an architecture, and they can have
different cache sizes, CPU speeds and so on. It is better to expose the
choice to the user as a selectable Kconfig option, as Andrew Morton
suggested.

[1] https://lkml.org/lkml/2012/7/11/58

Signed-off-by: Minchan Kim 
---
 drivers/staging/zsmalloc/Kconfig |   13 +
 drivers/staging/zsmalloc/zsmalloc-main.c |   19 ---
 2 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/drivers/staging/zsmalloc/Kconfig b/drivers/staging/zsmalloc/Kconfig
index 7fab032..e75611a 100644
--- a/drivers/staging/zsmalloc/Kconfig
+++ b/drivers/staging/zsmalloc/Kconfig
@@ -8,3 +8,16 @@ config ZSMALLOC
  non-standard allocator interface where a handle, not a pointer, is
  returned by an alloc().  This handle must be mapped in order to
  access the allocated space.
+
+config PGTABLE_MAPPING
+   bool "Use page table mapping to access object in zsmalloc"
+   depends on ZSMALLOC
+   help
+ By default, zsmalloc uses a copy-based object mapping method to
+ access allocations that span two pages. However, if a particular
+ architecture (ex, ARM) performs VM mapping faster than copying,
+ then you should select this. This causes zsmalloc to use page table
+ mapping rather than copying for object mapping.
+
+ You can check speed with zsmalloc benchmark[1].
+ [1] https://github.com/spartacus06/zsmalloc
diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
index 1a67537..f57258fa 100644
--- a/drivers/staging/zsmalloc/zsmalloc-main.c
+++ b/drivers/staging/zsmalloc/zsmalloc-main.c
@@ -218,19 +218,8 @@ struct zs_pool {
 #define CLASS_IDX_MASK ((1 << CLASS_IDX_BITS) - 1)
 #define FULLNESS_MASK  ((1 << FULLNESS_BITS) - 1)
 
-/*
- * By default, zsmalloc uses a copy-based object mapping method to access
- * allocations that span two pages. However, if a particular architecture
- * performs VM mapping faster than copying, then it should be added here
- * so that USE_PGTABLE_MAPPING is defined. This causes zsmalloc to use
- * page table mapping rather than copying for object mapping.
- */
-#if defined(CONFIG_ARM) && !defined(MODULE)
-#define USE_PGTABLE_MAPPING
-#endif
-
 struct mapping_area {
-#ifdef USE_PGTABLE_MAPPING
+#ifdef CONFIG_PGTABLE_MAPPING
struct vm_struct *vm; /* vm area for mapping object that span pages */
 #else
char *vm_buf; /* copy buffer for objects that span pages */
@@ -622,7 +611,7 @@ static struct page *find_get_zspage(struct size_class *class)
return page;
 }
 
-#ifdef USE_PGTABLE_MAPPING
+#ifdef CONFIG_PGTABLE_MAPPING
 static inline int __zs_cpu_up(struct mapping_area *area)
 {
/*
@@ -660,7 +649,7 @@ static inline void __zs_unmap_object(struct mapping_area *area,
unmap_kernel_range(addr, PAGE_SIZE * 2);
 }
 
-#else /* USE_PGTABLE_MAPPING */
+#else /* CONFIG_PGTABLE_MAPPING */
 
 static inline int __zs_cpu_up(struct mapping_area *area)
 {
@@ -738,7 +727,7 @@ out:
pagefault_enable();
 }
 
-#endif /* USE_PGTABLE_MAPPING */
+#endif /* CONFIG_PGTABLE_MAPPING */
 
 static int zs_cpu_notifier(struct notifier_block *nb, unsigned long action,
void *pcpu)
-- 
1.7.9.5
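
As background for the option above, the copy-based method for an object that
spans two pages can be sketched in userspace C as follows. This is a minimal
sketch, not the zsmalloc code: PAGE_SZ, map_object_copy() and
unmap_object_copy() are hypothetical names, and the two "pages" are ordinary
buffers supplied by the caller.

```c
#include <stddef.h>
#include <string.h>

#define PAGE_SZ 4096	/* assumed page size for this sketch */

/*
 * Return a contiguous view of an object of `size` bytes that starts at
 * offset `off` in pages[0] and may continue into pages[1].
 * Preconditions: off < PAGE_SZ, size <= PAGE_SZ, buf holds PAGE_SZ bytes.
 */
static void *map_object_copy(char *pages[2], size_t off, size_t size,
			     char *buf)
{
	size_t first;

	if (off + size <= PAGE_SZ)
		return pages[0] + off;	/* fits in one page: no copy needed */

	first = PAGE_SZ - off;		/* bytes that live in the first page */
	memcpy(buf, pages[0] + off, first);
	memcpy(buf + first, pages[1], size - first);
	return buf;
}

/* Write a spanning object back from the copy buffer to its two pages. */
static void unmap_object_copy(char *pages[2], size_t off, size_t size,
			      const char *buf)
{
	size_t first;

	if (off + size <= PAGE_SZ)
		return;			/* object was accessed in place */

	first = PAGE_SZ - off;
	memcpy(pages[0] + off, buf, first);
	memcpy(pages[1], buf + first, size - first);
}
```

The pte-based alternative avoids both copies by mapping the two pages at
adjacent virtual addresses instead; which one is faster depends on the
hardware, which is the argument for making it a Kconfig choice.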



Re: [PATCH v3] iio: add Bosch BMA180 acceleration sensor driver

2013-08-13 Thread Oleksandr Kravchenko
On Tue, Aug 13, 2013 at 9:28 PM, Jonathan Cameron wrote:
> On 08/13/13 16:44, Oleksandr Kravchenko wrote:
>> This patch adds IIO driver for Bosch BMA180 triaxial
>> acceleration sensor.
>> http://dlnmh9ip6v2uc.cloudfront.net/datasheets/Sensors/Accelerometers/BST-BMA180-DS000-07_2.pdf
>>
>> Signed-off-by: Oleksandr Kravchenko 
>> ---
> To play the lazy / busy maintainer.  What changed since v2?
>
> This is where that information should be.

In general: fixed a possible failure to unlock the mutex in
bma180_trigger_handler(), and moved range, bandwidth and mode
from the device tree to sysfs attributes.


Re: [PATCH 3/3] mce: acpi/apei: trace: Enable ghes memory error trace event

2013-08-13 Thread Borislav Petkov
On Tue, Aug 13, 2013 at 08:13:56PM +0000, Luck, Tony wrote:
> Generic tracepoints are architected to be able to fire at very high
> rates and log huge amounts of information. So we'd need something
> special to say just log these special tracepoints to network/serial.
>
> > Which reminds me, pstore could also be a good thing to use, in addition.
> > Only put error info there as it is limited anyway.
> 
> Yes - space is very limited.  I don't know how to assign priority for logging
> the dmesg data vs. some error logs.

Didn't we say at some point, "log only the panic message which kills
the machine"?

However, we could probably use more of the messages before that
catastrophic event because they could give us hints about what led to
the panic, but in that case maybe a limited pstore is the wrong logging
medium.

Actually, I can imagine the full serial/network logs of "special"
tracepoints + dmesg to be the optimal thing.

> If we just "printk()" the most important parts - then that data will
> automatically flow to the serial console and to pstore.

Actually, does the pstore act like a circular buffer? Because if it
contains the last N relevant messages (for an arbitrary definition of
relevant) before the system dies, then that could be more helpful than
only the error messages.

And with the advent of UEFI, pretty much every system has a pstore. Too
bad that we have to limit it to 50% of size so that the boxes don't
brick. :-P

> Then we have multiple paths for the critical bits of the error log
> - and the tracepoints give us more details for the cases where the
> machine doesn't spontaneously explode.

Ok, let's sort:

* First we have the not-so-critical hw error messages. We want to carry
those out-of-band, i.e. not in dmesg so that people don't have to parse
and collect dmesg but have a specialized solution which gives them
structured logs and tools can analyze, collect and ... those errors.

* When a critical error happens, the above usage is not necessarily
advantageous anymore in the sense that, in order to debug what caused
the machine to crash, we don't necessarily want only the crash
message but also the whole system activity that led to it.

In which case, we probably actually want to turn off/ignore the error
logging tracepoints and write *only* to dmesg which goes out over serial
and to pstore. Right?

Because in such cases I want to have *all* *relevant* messages that led
to the explosion + the explosion message itself.

Makes sense? Yes, no? Aspects I've missed?

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH v3] xfs: introduce object readahead to log recovery

2013-08-13 Thread Dave Chinner
On Wed, Jul 31, 2013 at 04:42:45PM +0800, zwu.ker...@gmail.com wrote:
> From: Zhi Yong Wu 
> 
>   It can take a long time to run log recovery operation because it is
> single threaded and is bound by read latency. We can find that it took
> most of the time to wait for the read IO to occur, so if one object
> readahead is introduced to log recovery, it will obviously reduce the
> log recovery time.
> 
> Log recovery time stat:
> 
>                 w/o this patch   w/ this patch
> 
> real:           0m15.023s        0m7.802s
> user:           0m0.001s         0m0.001s
> sys:            0m0.246s         0m0.107s

This version works as advertised as well.

> @@ -3216,6 +3351,18 @@ xlog_recover_commit_trans(
>   goto out;
>   }
>  
> + if (!list_empty(_list)) {
> + error = xlog_recover_items_pass2(log, trans,
> + _list, _list);
> + if (error)
> + goto out;
> +
> + list_splice_tail_init(_list, _list);
> + }
> +
> + if (!list_empty(_list))
> + list_splice_init(_list, >r_itemq);
> +
>   xlog_recover_free_trans(trans);

I think this still leaks the trans structure when an error occurs.
Indeed, I think this is a pre-existing leak, as the current code
will skip freeing the trans structure on item recovery failure and
nothing else frees it.  So it appears to me to be busted before this
patch is added.

Hence on a xlog_recover_items_pass2() error we need to splice the
ra-list to the done_list and free trans. i.e. the "if (error) goto
out;" lines in the above hunk do not need to be there, and the
"out:" label moved to above the call to xlog_recover_free_trans() so
the main loop does the right thing when an error occurs.
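
A minimal user-space sketch of the control flow described above, assuming
hypothetical stand-ins (struct sketch_trans, sketch_items_pass2(),
sketch_free_trans()) rather than the real XFS types and functions. It only
shows the shape of the fix: the "out:" label sits directly above the free
call, so a failure in the pass-2 step needs no early exit and the
transaction is freed on both paths.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the recovery transaction; not the XFS type. */
struct sketch_trans {
	int nr_items;
};

int sketch_frees;	/* number of transactions freed so far */

static void sketch_free_trans(struct sketch_trans *trans)
{
	sketch_frees++;
	free(trans);
}

/* Hypothetical pass-2 step: returns a negative error when asked to fail. */
static int sketch_items_pass2(struct sketch_trans *trans, int fail)
{
	assert(trans != NULL);
	return fail ? -5 : 0;
}

int sketch_commit_trans(struct sketch_trans *trans, int fail)
{
	int error;

	if (trans->nr_items == 0) {
		error = 0;	/* empty transaction: nothing to replay */
		goto out;
	}

	/* No early exit here: an error simply falls through to "out". */
	error = sketch_items_pass2(trans, fail);
out:
	sketch_free_trans(trans);	/* runs on success and on failure */
	return error;
}
```

With this layout the caller never has to free the transaction itself,
whatever sketch_commit_trans() returns.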

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [PATCH 1/2] extcon: palmas: Added a new compatible type *ti,palmas-usb-vid*

2013-08-13 Thread Kishon Vijay Abraham I
Hi,

On Wednesday 14 August 2013 12:43 AM, Stephen Warren wrote:
> On 08/12/2013 11:37 PM, Kishon Vijay Abraham I wrote:
>> The Palmas device contains only a USB VID detector, so added a
>> compatible type *ti,palmas-usb-vid*. Dint remove the existing compatible
> 
> s/Dint/Didn't/
> 
>> diff --git a/Documentation/devicetree/bindings/extcon/extcon-twl.txt 
>> b/Documentation/devicetree/bindings/extcon/extcon-twl.txt
> 
>>  PALMAS USB COMPARATOR
>>  Required Properties:
>> - - compatible : Should be "ti,palmas-usb" or "ti,twl6035-usb"
>> + - compatible : Should be "ti,palmas-usb" or "ti,twl6035-usb" or
>> +   "ti,palmas-usb-vid".
> 
> So are ti,palmas-usb and ti,twl6035-usb full EHCI controllers then?

No. I thought I shouldn't remove those if someone is already using those
compatible value.

Thanks
Kishon



Re: [PATCH 2/2] cpufreq: Only do governor start after successful stop

2013-08-13 Thread Viresh Kumar
On 13 August 2013 12:39, Xiaoguang Chen  wrote:
> cpufreq_add_policy_cpu, __cpufreq_remove_dev and __cpufreq_set_policy
> have operations for governor stop and start.
> Only do the start operation when the previous stop operation succeeds.
>
> Signed-off-by: Xiaoguang Chen 
> ---
>  drivers/cpufreq/cpufreq.c | 25 +++--
>  1 file changed, 15 insertions(+), 10 deletions(-)

I hope you have seen this patch, which is already in Rafael's tree?

https://lkml.org/lkml/2013/8/6/357


[PATCH v4] pinctrl: pinmux: Don't free pins requested by other devices in pinmux_disable_setting.

2013-08-13 Thread Sonic Zhang
From: Sonic Zhang 

One peripheral may share some of its pins with a 2nd peripheral and
the rest with a 3rd. If it requests all of its pins when some of them
have already been requested and are owned by the 2nd peripheral, the
request fails and pinmux_disable_setting() is called.
pinmux_disable_setting() then frees all pins of the first peripheral
without checking whether each pin is owned by it or by the 2nd, which
causes the 2nd peripheral's driver to malfunction.

Signed-off-by: Sonic Zhang 
---
 drivers/pinctrl/pinmux.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/pinctrl/pinmux.c b/drivers/pinctrl/pinmux.c
index 88cc509..5f51588 100644
--- a/drivers/pinctrl/pinmux.c
+++ b/drivers/pinctrl/pinmux.c
@@ -482,13 +482,13 @@ void pinmux_disable_setting(struct pinctrl_setting const 
*setting)
 pins[i]);
continue;
}
-   desc->mux_setting = NULL;
+   if (desc->mux_setting == &(setting->data.mux)) {
+   desc->mux_setting = NULL;
+   /* And release the pin */
+   pin_free(pctldev, pins[i], NULL);
+   }
}
 
-   /* And release the pins */
-   for (i = 0; i < num_pins; i++)
-   pin_free(pctldev, pins[i], NULL);
-
if (ops->disable)
ops->disable(pctldev, setting->data.mux.func, 
setting->data.mux.group);
 }
-- 
1.8.2.3
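
The ownership check in the hunk above can be sketched in user space as
follows. This is only an illustration under stated assumptions: struct
sketch_setting, struct sketch_pin and sketch_disable_setting() are
hypothetical simplifications, not the pinctrl core structures or API. The
point is that a pin is released only when its recorded owner is the setting
being disabled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified models; not the pinctrl core structures. */
struct sketch_setting {
	int id;
};

struct sketch_pin {
	const struct sketch_setting *owner;	/* NULL when the pin is free */
};

/*
 * Release only the pins owned by "setting"; pins claimed by another
 * setting are left untouched. Returns the number of pins released.
 */
int sketch_disable_setting(struct sketch_pin *pins, size_t num_pins,
			   const struct sketch_setting *setting)
{
	int released = 0;
	size_t i;

	assert(pins != NULL && setting != NULL);
	for (i = 0; i < num_pins; i++) {
		if (pins[i].owner == setting) {
			pins[i].owner = NULL;
			released++;
		}
	}
	return released;
}
```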




Re: [PATCH 1/2] cpufreq: Add governor operation ongoing flag

2013-08-13 Thread Viresh Kumar
On 13 August 2013 12:39, Xiaoguang Chen  wrote:
> __cpufreq_governor operation needs to be executed one by one.
> If one operation is ongoing, the other operation can't be executed.
> If the order is not guaranteed, there may be unexpected behavior.

What order??

>  For example, governor is in enable state, and one process
> tries to stop the governor, but it is scheduled out before policy->
> governor->governor() is executed, but the governor enable flag is
> set to false already. Then one other process tries to start governor,
> It finds enable flag is false, and it can process down to do governor
> start operation, So the governor is started twice.

That's not possible. A process will not and should not call START
before calling STOP. And so the order of calling these routines must
be forced.

Hence, we may not need your patch.


Re: [PATCH] userns: initialize the depth of user_namespace chain

2013-08-13 Thread Andy Lutomirski
On Tue, Aug 13, 2013 at 10:04 PM, Rui Xiang  wrote:
> The level of init_user_ns should be 1.

What's wrong with zero?

--Andy

>
> Signed-off-by: Rui Xiang 
> ---
>  kernel/user.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/kernel/user.c b/kernel/user.c
> index 69b4c3d..32da187 100644
> --- a/kernel/user.c
> +++ b/kernel/user.c
> @@ -48,6 +48,7 @@ struct user_namespace init_user_ns = {
> },
> },
> .count = ATOMIC_INIT(3),
> +   .level = 1,
> .owner = GLOBAL_ROOT_UID,
> .group = GLOBAL_ROOT_GID,
> .proc_inum = PROC_USER_INIT_INO,
> --
> 1.8.2.2
>



-- 
Andy Lutomirski
AMA Capital Management, LLC


[PATCH] userns: initialize the depth of user_namespace chain

2013-08-13 Thread Rui Xiang
The level of init_user_ns should be 1.

Signed-off-by: Rui Xiang 
---
 kernel/user.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/user.c b/kernel/user.c
index 69b4c3d..32da187 100644
--- a/kernel/user.c
+++ b/kernel/user.c
@@ -48,6 +48,7 @@ struct user_namespace init_user_ns = {
},
},
.count = ATOMIC_INIT(3),
+   .level = 1,
.owner = GLOBAL_ROOT_UID,
.group = GLOBAL_ROOT_GID,
.proc_inum = PROC_USER_INIT_INO,
-- 
1.8.2.2



[GIT PULL] clk: fixes for 3.11-rc6

2013-08-13 Thread Mike Turquette
The following changes since commit d4e4ab86bcba5a72779c43dc1459f71fea3d89c8:

  Linux 3.11-rc5 (2013-08-11 18:04:20 -0700)

are available in the git repository at:

  git://git.linaro.org/people/mturquette/linux.git tags/clk-fixes-for-linus

for you to fetch changes up to a701fe3851d9c7f6bd27bc0b92ca1668a42c8406:

  clk: exynos4: Add CLK_GET_RATE_NOCACHE flag for the Exynos4x12 ISP clocks 
(2013-08-13 10:01:56 -0700)


Two small fixes for the Zynq clock controller introduced in 3.11-rc1 and
another Exynos clock patch which fixes a regression that prevents the
video pipeline from functioning on that platform.


Soren Brinkmann (2):
  clk/zynq/clkc: Add dedicated spinlock for the SWDT
  clk/zynq/clkc: Add CLK_SET_RATE_PARENT flag to ethernet muxes

Sylwester Nawrocki (1):
  clk: exynos4: Add CLK_GET_RATE_NOCACHE flag for the Exynos4x12 ISP clocks

 drivers/clk/samsung/clk-exynos4.c | 64 +--
 drivers/clk/zynq/clkc.c   | 13 +---
 2 files changed, 42 insertions(+), 35 deletions(-)


[PATCH] mm: skip the page buddy block instead of one page

2013-08-13 Thread Xishi Qiu
A large free buddy block makes the scan loop hit "continue" once per
page, so if the page is free, skip the whole buddy block instead of
one page.

Signed-off-by: Xishi Qiu 
---
 mm/compaction.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 05ccb4c..874bae1 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -520,9 +520,10 @@ isolate_migratepages_range(struct zone *zone, struct 
compact_control *cc,
goto next_pageblock;
 
/* Skip if free */
-   if (PageBuddy(page))
+   if (PageBuddy(page)) {
+   low_pfn += (1 << page_order(page)) - 1;
continue;
-
+   }
/*
 * For async migration, also only scan in MOVABLE blocks. Async
 * migration is optimistic to see if the minimum amount of work
-- 
1.7.1



Re: [PATCH v2 0/4] GPIO DT support for da850

2013-08-13 Thread Sekhar Nori
Hi Prabhakar,

On 8/11/2013 3:04 PM, Prabhakar Lad wrote:
> Hi Sekhar,
> 
> On Fri, Jun 14, 2013 at 3:50 PM, Philip, Avinash  wrote:
>> On Fri, Jun 14, 2013 at 15:13:36, Philip, Avinash wrote:
>>> With conversion of GPIO davinci driver to platform driver, gpio-davinci 
>>> driver
>>> can support DT boot.
>>> This patch series
>>> - adds dt binding support for gpio-davinci.
>>> - da850 dt support gpio.
>>>
>>> This patch series is based on Linux 3.10-rc4 and is available at [1].
>>>
>>> 1. 
>>> https://github.com/avinashphilip/am335x_linux/tree/linux_davinci_v3.10_soc_gpio_v310-rc4
>>
>>
>> This patch series has dependency on [PATCH v2 0/7] Convert GPIO Davinci to 
>> platform driver
>>
>> https://lkml.org/lkml/2013/6/14/120
>>
> What is the status of this patch series? Is anyone taking care of it, or
> else I can take care of the review comments and repost them.

Avinash is not working for TI anymore. It will be great if you can take
this up.

Thanks,
Sekhar


[tip:x86/mm] x86: Use memblock_set_current_limit() to set limit for memblock.

2013-08-13 Thread tip-bot for Tang Chen
Commit-ID:  2449f343e4adc778de1c3d45b5aa14fe788663f5
Gitweb: http://git.kernel.org/tip/2449f343e4adc778de1c3d45b5aa14fe788663f5
Author: Tang Chen 
AuthorDate: Wed, 14 Aug 2013 11:44:04 +0800
Committer:  H. Peter Anvin 
CommitDate: Tue, 13 Aug 2013 21:27:02 -0700

x86: Use memblock_set_current_limit() to set limit for memblock.

setup_arch() on x86 sets memblock.current_limit directly.
We should use memblock_set_current_limit() instead, so that the code
stays easy to maintain if the implementation changes.

Signed-off-by: Tang Chen 
Link: 
http://lkml.kernel.org/r/1376451844-15682-1-git-send-email-tangc...@cn.fujitsu.com
Signed-off-by: H. Peter Anvin 
---
 arch/x86/kernel/setup.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index f8ec578..de33798 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1070,7 +1070,7 @@ void __init setup_arch(char **cmdline_p)
 
cleanup_highmap();
 
-   memblock.current_limit = ISA_END_ADDRESS;
+   memblock_set_current_limit(ISA_END_ADDRESS);
memblock_x86_fill();
 
/*
@@ -1103,7 +1103,7 @@ void __init setup_arch(char **cmdline_p)
 
setup_real_mode();
 
-   memblock.current_limit = get_max_mapped();
+   memblock_set_current_limit(get_max_mapped());
dma_contiguous_reserve(0);
 
/*


[PATCH 2/2] ASoC: wm8960: Fix shared LRCLK support

2013-08-13 Thread Ma Haijun
Shared LRCLK initialization does not survive wm8960_reset,
so place it after the reset.

Signed-off-by: Ma Haijun 
---
 sound/soc/codecs/wm8960.c | 14 +++---
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/sound/soc/codecs/wm8960.c b/sound/soc/codecs/wm8960.c
index 368d39f..b606dd3 100644
--- a/sound/soc/codecs/wm8960.c
+++ b/sound/soc/codecs/wm8960.c
@@ -990,6 +990,9 @@ static int wm8960_probe(struct snd_soc_codec *codec)
 
wm8960->set_bias_level(codec, SND_SOC_BIAS_STANDBY);
 
+   if (pdata && pdata->shared_lrclk)
+   snd_soc_update_bits(codec, WM8960_ADDCTL2, 0x4, 0x4);
+
/* Latch the update bits */
snd_soc_update_bits(codec, WM8960_LINVOL, 0x100, 0x100);
snd_soc_update_bits(codec, WM8960_RINVOL, 0x100, 0x100);
@@ -1041,7 +1044,6 @@ static const struct regmap_config wm8960_regmap = {
 static int wm8960_i2c_probe(struct i2c_client *i2c,
const struct i2c_device_id *id)
 {
-   struct wm8960_data *pdata = dev_get_platdata(>dev);
struct wm8960_priv *wm8960;
int ret;
 
@@ -1054,16 +1056,6 @@ static int wm8960_i2c_probe(struct i2c_client *i2c,
if (IS_ERR(wm8960->regmap))
return PTR_ERR(wm8960->regmap);
 
-   if (pdata && pdata->shared_lrclk) {
-   ret = regmap_update_bits(wm8960->regmap, WM8960_ADDCTL2,
-0x4, 0x4);
-   if (ret != 0) {
-   dev_err(>dev, "Failed to enable LRCM: %d\n",
-   ret);
-   return ret;
-   }
-   }
-
i2c_set_clientdata(i2c, wm8960);
 
ret = snd_soc_register_codec(>dev,
-- 
1.8.1.2



[PATCH 1/2] ASoC: wm8960: Fix ADC volume bits

2013-08-13 Thread Ma Haijun
Signed-off-by: Ma Haijun 
---
 sound/soc/codecs/wm8960.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/sound/soc/codecs/wm8960.c b/sound/soc/codecs/wm8960.c
index 0a4ffdd..368d39f 100644
--- a/sound/soc/codecs/wm8960.c
+++ b/sound/soc/codecs/wm8960.c
@@ -263,8 +263,8 @@ SOC_SINGLE("ALC Attack", WM8960_ALC3, 0, 15, 0),
 SOC_SINGLE("Noise Gate Threshold", WM8960_NOISEG, 3, 31, 0),
 SOC_SINGLE("Noise Gate Switch", WM8960_NOISEG, 0, 1, 0),
 
-SOC_DOUBLE_R("ADC PCM Capture Volume", WM8960_LINPATH, WM8960_RINPATH,
-   0, 127, 0),
+SOC_DOUBLE_R_TLV("ADC PCM Capture Volume", WM8960_LADC, WM8960_RADC,
+   0, 255, 0, adc_tlv),
 
 SOC_SINGLE_TLV("Left Output Mixer Boost Bypass Volume",
   WM8960_BYPASS1, 4, 7, 1, bypass_tlv),
-- 
1.8.1.2



[PATCH] perf tests: Fix compile failure on do_sort_something - v2

2013-08-13 Thread David Ahern
Commit b55ae0a9 added code-reading.c which fails to compile on Fedora 16
with compiler version:
$ gcc --version
gcc (GCC) 4.6.3 20120306 (Red Hat 4.6.3-2)

Failure message is:

tests/code-reading.c: In function ‘do_sort_something’:
tests/code-reading.c:305:13: error: stack protector not protecting local 
variables: variable length buffer [-Werror=stack-protector]
cc1: all warnings being treated as errors
make: *** [/tmp/junk/tests/code-reading.o] Error 1
make: *** Waiting for unfinished jobs

v2: changed sizeof to ARRAY_SIZE, as Adrian noticed

Signed-off-by: David Ahern 
Cc: Adrian Hunter 
Cc: Jiri Olsa 
---
 tools/perf/tests/code-reading.c |   11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/tools/perf/tests/code-reading.c b/tools/perf/tests/code-reading.c
index eec1421..df9afd9 100644
--- a/tools/perf/tests/code-reading.c
+++ b/tools/perf/tests/code-reading.c
@@ -304,15 +304,14 @@ static int comp(const void *a, const void *b)
 
 static void do_sort_something(void)
 {
-   size_t sz = 40960;
-   int buf[sz], i;
+   int buf[40960], i;
 
-   for (i = 0; i < (int)sz; i++)
-   buf[i] = sz - i - 1;
+   for (i = 0; i < (int)ARRAY_SIZE(buf); i++)
+   buf[i] = ARRAY_SIZE(buf) - i - 1;
 
-   qsort(buf, sz, sizeof(int), comp);
+   qsort(buf, ARRAY_SIZE(buf), sizeof(int), comp);
 
-   for (i = 0; i < (int)sz; i++) {
+   for (i = 0; i < (int)ARRAY_SIZE(buf); i++) {
if (buf[i] != i) {
pr_debug("qsort failed\n");
break;
-- 
1.7.10.1
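
A stand-alone sketch of the same idea as the hunk above: declare the buffer
with a constant size and derive the element count from the array itself, so
no variable length array is involved. SKETCH_ARRAY_SIZE() is a local
stand-in for the ARRAY_SIZE() helper, the function names are hypothetical,
and the buffer is deliberately smaller than in the patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Local stand-in for the ARRAY_SIZE() helper used in the patch. */
#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static int sketch_comp(const void *a, const void *b)
{
	int x = *(const int *)a, y = *(const int *)b;

	return (x > y) - (x < y);
}

/* Returns 0 if the buffer ends up sorted, -1 otherwise. */
int sketch_sort_something(void)
{
	int buf[1024], i;	/* constant size: not a variable length array */
	const int n = (int)SKETCH_ARRAY_SIZE(buf);

	for (i = 0; i < n; i++)
		buf[i] = n - i - 1;	/* fill in descending order */

	qsort(buf, SKETCH_ARRAY_SIZE(buf), sizeof(buf[0]), sketch_comp);

	for (i = 0; i < n; i++)
		if (buf[i] != i)
			return -1;
	return 0;
}
```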



linux-next: manual merge of the tip tree with the omap_dss2 tree

2013-08-13 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the tip tree got a conflict in
drivers/video/simplefb.c between commit dbb5ff4c2300 ("simplefb: add
support for a8b8g8r8 pixel format") from the omap_dss2 tree and commit
5ef76da644bf ("fbdev: simplefb: add init through platform_data") from the
tip tree.

I fixed it up (I think - see below, please check - I used the latter
version of the above file) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --git a/include/linux/platform_data/simplefb.h 
b/include/linux/platform_data/simplefb.h
index 53774b0..c395d4c 100644
--- a/include/linux/platform_data/simplefb.h
+++ b/include/linux/platform_data/simplefb.h
@@ -27,6 +27,7 @@
{ "a8r8g8b8", 32, {16, 8}, {8, 8}, {0, 8}, {24, 8}, DRM_FORMAT_ARGB 
}, \
{ "x2r10g10b10", 32, {20, 10}, {10, 10}, {0, 10}, {0, 0}, 
DRM_FORMAT_XRGB2101010 }, \
{ "a2r10g10b10", 32, {20, 10}, {10, 10}, {0, 10}, {30, 2}, 
DRM_FORMAT_ARGB2101010 }, \
+   { "a8b8g8r8", 32, {0, 8}, {8, 8}, {16, 8}, {24, 8}, DRM_FORMAT_ABGR 
}, \
 }
 
 /*



[PATCH] x86: Use memblock_set_current_limit() to set limit for memblock.

2013-08-13 Thread Tang Chen
setup_arch() on x86 sets memblock.current_limit directly.
We should use memblock_set_current_limit() instead, so that the code
stays easy to maintain if the implementation changes.

Signed-off-by: Tang Chen 
---
 arch/x86/kernel/setup.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index f8ec578..de33798 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1070,7 +1070,7 @@ void __init setup_arch(char **cmdline_p)
 
cleanup_highmap();
 
-   memblock.current_limit = ISA_END_ADDRESS;
+   memblock_set_current_limit(ISA_END_ADDRESS);
memblock_x86_fill();
 
/*
@@ -1103,7 +1103,7 @@ void __init setup_arch(char **cmdline_p)
 
setup_real_mode();
 
-   memblock.current_limit = get_max_mapped();
+   memblock_set_current_limit(get_max_mapped());
dma_contiguous_reserve(0);
 
/*
-- 
1.7.1



[PATCH RESEND 1/8] net: fsl_pq_mdio: use platform_{get,set}_drvdata()

2013-08-13 Thread Libo Chen

We can use the wrapper functions platform_{get,set}_drvdata() instead of
calling dev_{get,set}_drvdata() on the embedded struct device; this is
more convenient for the user.

Also, unnecessary dev_set_drvdata() is removed, because the driver core
clears the driver data to NULL after device_release or on probe failure.

Signed-off-by: Libo Chen 
---
 drivers/net/ethernet/freescale/fsl_pq_mdio.c |3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fsl_pq_mdio.c 
b/drivers/net/ethernet/freescale/fsl_pq_mdio.c
index c93a056..995a3ab 100644
--- a/drivers/net/ethernet/freescale/fsl_pq_mdio.c
+++ b/drivers/net/ethernet/freescale/fsl_pq_mdio.c
@@ -409,7 +409,7 @@ static int fsl_pq_mdio_probe(struct platform_device *pdev)
priv->regs = priv->map + data->mii_offset;

new_bus->parent = >dev;
-   dev_set_drvdata(>dev, new_bus);
+   platform_set_drvdata(pdev, new_bus);

if (data->get_tbipa) {
for_each_child_of_node(np, tbi) {
@@ -468,7 +468,6 @@ static int fsl_pq_mdio_remove(struct platform_device *pdev)

mdiobus_unregister(bus);

-   dev_set_drvdata(device, NULL);

iounmap(priv->map);
mdiobus_free(bus);
-- 
1.7.1
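
For readers unfamiliar with the wrappers, here is a user-space model of the
relationship the commit message relies on: the platform-level accessors
simply forward to the device-level ones using the device embedded in the
platform device. The sketch_ types and functions are hypothetical
stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct device and struct platform_device. */
struct sketch_device {
	void *driver_data;
};

struct sketch_platform_device {
	struct sketch_device dev;	/* embedded, as in the kernel layout */
};

static void *sketch_dev_get_drvdata(const struct sketch_device *dev)
{
	return dev->driver_data;
}

static void sketch_dev_set_drvdata(struct sketch_device *dev, void *data)
{
	dev->driver_data = data;
}

/* The wrappers only save the caller from spelling out &pdev->dev. */
void *sketch_platform_get_drvdata(const struct sketch_platform_device *pdev)
{
	assert(pdev != NULL);
	return sketch_dev_get_drvdata(&pdev->dev);
}

void sketch_platform_set_drvdata(struct sketch_platform_device *pdev,
				 void *data)
{
	assert(pdev != NULL);
	sketch_dev_set_drvdata(&pdev->dev, data);
}
```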






[PATCH RESEND 7/8] net: xilinx_emaclite: use platform_{get,set}_drvdata()

2013-08-13 Thread Libo Chen

We can use the wrapper functions platform_{get,set}_drvdata() instead of
calling dev_{get,set}_drvdata() on the embedded struct device; this is
more convenient for the user.

Also, unnecessary dev_set_drvdata() is removed, because the driver core
clears the driver data to NULL after device_release or on probe failure.

Signed-off-by: Libo Chen 
---
 drivers/net/ethernet/xilinx/xilinx_emaclite.c |4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/xilinx/xilinx_emaclite.c 
b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
index fd4dbda..4c619ea 100644
--- a/drivers/net/ethernet/xilinx/xilinx_emaclite.c
+++ b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
@@ -1230,8 +1230,7 @@ error:
  */
 static int xemaclite_of_remove(struct platform_device *of_dev)
 {
-   struct device *dev = _dev->dev;
-   struct net_device *ndev = dev_get_drvdata(dev);
+   struct net_device *ndev = platform_get_drvdata(of_dev);

struct net_local *lp = netdev_priv(ndev);

@@ -1250,7 +1249,6 @@ static int xemaclite_of_remove(struct platform_device 
*of_dev)
lp->phy_node = NULL;

xemaclite_remove_ndev(ndev, of_dev);
-   dev_set_drvdata(dev, NULL);

return 0;
 }
-- 
1.7.1





[PATCH RESEND 8/8] net: davinci_mdio: use platform_{get,set}_drvdata()

2013-08-13 Thread Libo Chen

We can use the wrapper functions platform_{get,set}_drvdata() instead of
calling dev_{get,set}_drvdata() on the embedded struct device; this is
more convenient for the user.

Also, unnecessary dev_set_drvdata() is removed, because the driver core
clears the driver data to NULL after device_release or on probe failure.


Signed-off-by: Libo Chen 
---
 drivers/net/ethernet/ti/davinci_mdio.c |4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/ti/davinci_mdio.c 
b/drivers/net/ethernet/ti/davinci_mdio.c
index 16ddfc3..2249b2c 100644
--- a/drivers/net/ethernet/ti/davinci_mdio.c
+++ b/drivers/net/ethernet/ti/davinci_mdio.c
@@ -421,8 +421,7 @@ bail_out:

 static int davinci_mdio_remove(struct platform_device *pdev)
 {
-   struct device *dev = >dev;
-   struct davinci_mdio_data *data = dev_get_drvdata(dev);
+   struct davinci_mdio_data *data = platform_get_drvdata(pdev);

if (data->bus) {
mdiobus_unregister(data->bus);
@@ -434,7 +433,6 @@ static int davinci_mdio_remove(struct platform_device *pdev)
pm_runtime_put_sync(>dev);
pm_runtime_disable(>dev);

-   dev_set_drvdata(dev, NULL);

kfree(data);

-- 
1.7.1





[PATCH RESEND 4/8] net: fs_enet: remove unnecessary dev_set_drvdata()

2013-08-13 Thread Libo Chen

The unnecessary dev_set_drvdata() call is removed, because the driver core
clears the driver data to NULL after device_release or on probe failure.

Signed-off-by: Libo Chen 
---
 drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c  |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c 
b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c
index 8de53a1..7e3de10 100644
--- a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c
+++ b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c
@@ -1122,7 +1122,6 @@ static int fs_enet_remove(struct platform_device *ofdev)

fep->ops->free_bd(ndev);
fep->ops->cleanup_data(ndev);
-   dev_set_drvdata(fep->dev, NULL);
of_node_put(fep->fpi->phy_node);
free_netdev(ndev);
return 0;
-- 
1.7.1





[PATCH RESEND 6/8] net: sunhme: use platform_{get,set}_drvdata()

2013-08-13 Thread Libo Chen

We can use the wrapper functions platform_{get,set}_drvdata() instead of
calling dev_{get,set}_drvdata() on the embedded struct device; this is
more convenient for the user.

Also, unnecessary dev_set_drvdata() is removed, because the driver core
clears the driver data to NULL after device_release or on probe failure.

Signed-off-by: Libo Chen 
---
 drivers/net/ethernet/sun/sunhme.c |8 +++-
 1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/sun/sunhme.c 
b/drivers/net/ethernet/sun/sunhme.c
index 171f5b0..a111f29 100644
--- a/drivers/net/ethernet/sun/sunhme.c
+++ b/drivers/net/ethernet/sun/sunhme.c
@@ -3111,7 +3111,7 @@ static int happy_meal_pci_probe(struct pci_dev *pdev,
goto err_out_iounmap;
}

-   dev_set_drvdata(>dev, hp);
+   platform_set_drvdata(pdev, hp);

if (!qfe_slot) {
struct pci_dev *qpdev = qp->quattro_dev;
@@ -3159,7 +3159,7 @@ err_out:

 static void happy_meal_pci_remove(struct pci_dev *pdev)
 {
-   struct happy_meal *hp = dev_get_drvdata(>dev);
+   struct happy_meal *hp = platform_get_drvdata(pdev);
struct net_device *net_dev = hp->dev;

unregister_netdev(net_dev);
@@ -3171,7 +3171,6 @@ static void happy_meal_pci_remove(struct pci_dev *pdev)

free_netdev(net_dev);

-   dev_set_drvdata(>dev, NULL);
 }

 static DEFINE_PCI_DEVICE_TABLE(happymeal_pci_ids) = {
@@ -3231,7 +3230,7 @@ static int hme_sbus_probe(struct platform_device *op)

 static int hme_sbus_remove(struct platform_device *op)
 {
-   struct happy_meal *hp = dev_get_drvdata(>dev);
+   struct happy_meal *hp = platform_get_drvdata(op);
struct net_device *net_dev = hp->dev;

unregister_netdev(net_dev);
@@ -3250,7 +3249,6 @@ static int hme_sbus_remove(struct platform_device *op)

free_netdev(net_dev);

-   dev_set_drvdata(>dev, NULL);

return 0;
 }
-- 
1.7.1





[PATCH RESEND 5/8] net: sunbmac: use platform_{get,set}_drvdata()

2013-08-13 Thread Libo Chen

We can use the wrapper functions platform_{get,set}_drvdata() instead of
calling dev_{get,set}_drvdata() on the embedded struct device; this is
more convenient for the user.

Also, unnecessary dev_set_drvdata() is removed, because the driver core
clears the driver data to NULL after device_release or on probe failure.

Signed-off-by: Libo Chen 
---
 drivers/net/ethernet/sun/sunbmac.c |3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/sun/sunbmac.c 
b/drivers/net/ethernet/sun/sunbmac.c
index 0d43fa9..34b94cc 100644
--- a/drivers/net/ethernet/sun/sunbmac.c
+++ b/drivers/net/ethernet/sun/sunbmac.c
@@ -1239,7 +1239,7 @@ static int bigmac_sbus_probe(struct platform_device *op)

 static int bigmac_sbus_remove(struct platform_device *op)
 {
-   struct bigmac *bp = dev_get_drvdata(>dev);
+   struct bigmac *bp = platform_get_drvdata(op);
struct device *parent = op->dev.parent;
struct net_device *net_dev = bp->dev;
struct platform_device *qec_op;
@@ -1259,7 +1259,6 @@ static int bigmac_sbus_remove(struct platform_device *op)

free_netdev(net_dev);

-   dev_set_drvdata(>dev, NULL);

return 0;
 }
-- 
1.7.1





[PATCH RESEND 3/8] net: fec_mpc52xx_phy: use platform_{get,set}_drvdata()

2013-08-13 Thread Libo Chen

We can use the wrapper functions platform_{get,set}_drvdata() instead of
calling dev_{get,set}_drvdata() on the embedded struct device; this is
more convenient for the user.

Also, unnecessary dev_set_drvdata() is removed, because the driver core
clears the driver data to NULL after device_release or on probe failure.

Signed-off-by: Libo Chen 
---
 drivers/net/ethernet/freescale/fec_mpc52xx_phy.c |4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fec_mpc52xx_phy.c b/drivers/net/ethernet/freescale/fec_mpc52xx_phy.c
index 360a578..e052890 100644
--- a/drivers/net/ethernet/freescale/fec_mpc52xx_phy.c
+++ b/drivers/net/ethernet/freescale/fec_mpc52xx_phy.c
@@ -123,12 +123,10 @@ static int mpc52xx_fec_mdio_probe(struct platform_device *of)

 static int mpc52xx_fec_mdio_remove(struct platform_device *of)
 {
-	struct device *dev = &of->dev;
-	struct mii_bus *bus = dev_get_drvdata(dev);
+	struct mii_bus *bus = platform_get_drvdata(of);
 	struct mpc52xx_fec_mdio_priv *priv = bus->priv;

 	mdiobus_unregister(bus);
-	dev_set_drvdata(dev, NULL);
 	iounmap(priv->regs);
 	kfree(priv);
 	mdiobus_free(bus);
-- 
1.7.1





[PATCH RESEND 2/8] net: ucc_geth: use platform_{get,set}_drvdata()

2013-08-13 Thread Libo Chen

We can use the wrapper functions platform_{get,set}_drvdata() instead of
dev_{get,set}_drvdata() with &pdev->dev, which is more convenient for the user.

Also, the unnecessary dev_set_drvdata() call is removed, because the driver core
clears the driver data to NULL after device_release or on probe failure.

Signed-off-by: Libo Chen 
---
 drivers/net/ethernet/freescale/ucc_geth.c |4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/freescale/ucc_geth.c b/drivers/net/ethernet/freescale/ucc_geth.c
index 3c43dac..5930c39 100644
--- a/drivers/net/ethernet/freescale/ucc_geth.c
+++ b/drivers/net/ethernet/freescale/ucc_geth.c
@@ -3911,14 +3911,12 @@ static int ucc_geth_probe(struct platform_device* ofdev)

 static int ucc_geth_remove(struct platform_device* ofdev)
 {
-	struct device *device = &ofdev->dev;
-	struct net_device *dev = dev_get_drvdata(device);
+	struct net_device *dev = platform_get_drvdata(ofdev);
 	struct ucc_geth_private *ugeth = netdev_priv(dev);

 	unregister_netdev(dev);
 	free_netdev(dev);
 	ucc_geth_memclean(ugeth);
-	dev_set_drvdata(device, NULL);

 	return 0;
 }
-- 
1.7.1





Re: [PATCH] mc13xxx-ts: use zero as default value if no pdata was defined

2013-08-13 Thread Michael Grzeschik
Hi Dmitry,

On Tue, Aug 13, 2013 at 09:46:09AM -0700, Dmitry Torokhov wrote:
> Hi Michael,
> 
> On Tue, Aug 13, 2013 at 02:14:30PM +0200, Michael Grzeschik wrote:
> > In case of devicetree, we currently don't have a way to append pdata for
> > the touchscreen. The current approach is to bail out in that case.
> > This patch makes it possible to probe the touchscreen without pdata
> > and use zero as default values for the atox and ato adc conversion.
> 
> I'd rather you added the devicetree support to the driver.

I know that we will need real devicetree glue that generates pdata in the
long run. I am working on that. Besides that, for now this patch makes
sense anyway.

Regards,
Michael

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-     |


[PATCH RESEND 0/8] use platform_{get,set}_drvdata()

2013-08-13 Thread Libo Chen

We can use the wrapper functions platform_{get,set}_drvdata() instead of
dev_{get,set}_drvdata() with &pdev->dev, which is more convenient for the user.

Also, the unnecessary dev_set_drvdata() call is removed, because the driver core
clears the driver data to NULL after device_release or on probe failure.

Changelog:
This version adds a note about the dev_set_drvdata() removal to the commit messages.

Libo Chen (8):
  net: fsl_pq_mdio: use platform_{get,set}_drvdata()
  net: ucc_geth: use platform_{get,set}_drvdata()
  net: fec_mpc52xx_phy: use platform_{get,set}_drvdata()
  net: fs_enet: remove unnecessary dev_set_drvdata()
  net: sunbmac: use platform_{get,set}_drvdata()
  net: sunhme: use platform_{get,set}_drvdata()
  net: xilinx_emaclite: use platform_{get,set}_drvdata()
  net: davinci_mdio: use platform_{get,set}_drvdata()

 drivers/net/ethernet/freescale/fec_mpc52xx_phy.c   |4 +---
 drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c  |1 -
 drivers/net/ethernet/freescale/fsl_pq_mdio.c   |3 +--
 drivers/net/ethernet/freescale/ucc_geth.c  |4 +---
 drivers/net/ethernet/sun/sunbmac.c |3 +--
 drivers/net/ethernet/sun/sunhme.c  |8 +++-
 drivers/net/ethernet/ti/davinci_mdio.c |4 +---
 drivers/net/ethernet/xilinx/xilinx_emaclite.c  |4 +---
 8 files changed, 9 insertions(+), 22 deletions(-)
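
For readers unfamiliar with the wrappers, the equivalence this series relies on can be sketched in user-space C. This is only an illustration: the struct layouts below are simplified stand-ins and not the kernel's real definitions; only the four function names come from the kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (assumption, not the real layout). */
struct device {
	void *driver_data;
};

struct platform_device {
	struct device dev;
};

static inline void *dev_get_drvdata(const struct device *dev)
{
	return dev->driver_data;
}

static inline void dev_set_drvdata(struct device *dev, void *data)
{
	dev->driver_data = data;
}

/* The platform wrappers simply forward to the embedded struct device. */
static inline void *platform_get_drvdata(const struct platform_device *pdev)
{
	return dev_get_drvdata(&pdev->dev);
}

static inline void platform_set_drvdata(struct platform_device *pdev, void *data)
{
	dev_set_drvdata(&pdev->dev, data);
}
```

With these definitions, platform_get_drvdata(pdev) and dev_get_drvdata(&pdev->dev) return the same pointer, so the conversion in each patch is purely cosmetic.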





Re: [PATCH RFC nohz_full 6/7] nohz_full: Add full-system-idle state machine

2013-08-13 Thread Paul E. McKenney
On Fri, Aug 09, 2013 at 06:20:59PM +0200, Frederic Weisbecker wrote:
> On Fri, Jul 26, 2013 at 04:19:23PM -0700, Paul E. McKenney wrote:
> > diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> > index 3edae39..ff84bed 100644
> > --- a/kernel/rcutree_plugin.h
> > +++ b/kernel/rcutree_plugin.h
> > @@ -28,7 +28,7 @@
> >  #include 
> >  #include 
> >  #include 
> > -#include 
> > +#include "time/tick-internal.h"
> >  
> >  #define RCU_KTHREAD_PRIO 1
> >  
> > @@ -2395,12 +2395,12 @@ static void rcu_kick_nohz_cpu(int cpu)
> >   * most active flavor of RCU.
> >   */
> >  #ifdef CONFIG_PREEMPT_RCU
> > -static struct rcu_state __maybe_unused *rcu_sysidle_state = &rcu_preempt_state;
> > +static struct rcu_state *rcu_sysidle_state = &rcu_preempt_state;
> >  #else /* #ifdef CONFIG_PREEMPT_RCU */
> > -static struct rcu_state __maybe_unused *rcu_sysidle_state = &rcu_sched_state;
> > +static struct rcu_state *rcu_sysidle_state = &rcu_sched_state;
> >  #endif /* #else #ifdef CONFIG_PREEMPT_RCU */
> 
> Ah you fixed it here. Ok :)

Bisectability and all that.  ;-)

> > -static int __maybe_unused full_sysidle_state; /* Current system-idle state. */
> > +static int full_sysidle_state; /* Current system-idle state. */
> >  #define RCU_SYSIDLE_NOT    0   /* Some CPU is not idle. */
> >  #define RCU_SYSIDLE_SHORT  1   /* All CPUs idle for brief period. */
> >  #define RCU_SYSIDLE_LONG   2   /* All CPUs idle for long enough. */
> [...]
> > +/*
> > + * Check to see if the system is fully idle, other than the timekeeping CPU.
> > + * The caller must have disabled interrupts.
> > + */
> > +bool rcu_sys_is_idle(void)
> > +{
> > +   static struct rcu_sysidle_head rsh;
> > +   int rss = ACCESS_ONCE(full_sysidle_state);
> > +
> > +   if (WARN_ON_ONCE(smp_processor_id() != tick_do_timer_cpu))
> > +   return false;
> > +
> > +   /* Handle small-system case by doing a full scan of CPUs. */
> > +   if (nr_cpu_ids <= RCU_SYSIDLE_SMALL) {
> 
> I don't understand how the nr_cpu_ids > RCU_SYSIDLE_SMALL is handled. There
> don't seem to be other calls of rcu_sysidle_check_cpu() than for small systems.

The other calls are from kernel/rcutree.c from the force-quiescent-state
code.  If we have a big system, we don't check until we have some other
reason to touch the cache lines.  If we have a small system, we just
dig through them on transition to idle.

Thanx, Paul



Re: [PATCH] Handle non ABS crc symbols

2013-08-13 Thread Rusty Russell
Michal Marek  writes:
> Added Rusty to CC.
>
> On 9.8.2013 21:45, Andi Kleen wrote:
>> From: Andi Kleen 
>> 
>> For some reason I managed to trick gcc into creating CRC symbols that
>> are not absolute anymore, but weak.
>> 
>> Make modpost handle this case.
>> 
>> Andrew, this should fix the bizarre warning. Seems like a toolchain
>> bug to me.
>> 
>> Signed-off-by: Andi Kleen 

Do you also end up with relocated CRCs, like ppc does?

See ARCH_RELOCATES_KCRCTAB in kernel/module.c.

Cheers,
Rusty.

>> ---
>>  scripts/mod/modpost.c | 15 +++
>>  1 file changed, 7 insertions(+), 8 deletions(-)
>> 
>> diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
>> index 6216434..b3dd735 100644
>> --- a/scripts/mod/modpost.c
>> +++ b/scripts/mod/modpost.c
>> @@ -599,18 +599,17 @@ static void handle_modversions(struct module *mod, struct elf_info *info,
>>  else
>>  export = export_from_sec(info, get_secindex(info, sym));
>>  
>> +/* CRC'd symbol */
>> +if (strncmp(symname, CRC_PFX, strlen(CRC_PFX)) == 0) {
>> +crc = (unsigned int) sym->st_value;
>> +sym_update_crc(symname + strlen(CRC_PFX), mod, crc,
>> +export);
>> +}
>> +
>>  switch (sym->st_shndx) {
>>  case SHN_COMMON:
>>  warn("\"%s\" [%s] is COMMON symbol\n", symname, mod->name);
>>  break;
>> -case SHN_ABS:
>> -/* CRC'd symbol */
>> -if (strncmp(symname, CRC_PFX, strlen(CRC_PFX)) == 0) {
>> -crc = (unsigned int) sym->st_value;
>> -sym_update_crc(symname + strlen(CRC_PFX), mod, crc,
>> -export);
>> -}
>> -break;
>>  case SHN_UNDEF:
>>  /* undefined symbol */
>>  if (ELF_ST_BIND(sym->st_info) != STB_GLOBAL &&
>> 


RE: [PATCH net-next 1/3] net/usb/r8152: support aggregation

2013-08-13 Thread hayeswang
 David Miller [mailto:da...@davemloft.net] 
> Sent: Wednesday, August 14, 2013 7:41 AM
> To: oneu...@suse.de
> Cc: Hayeswang; net...@vger.kernel.org; 
> linux-kernel@vger.kernel.org; linux-...@vger.kernel.org
> Subject: Re: [PATCH net-next 1/3] net/usb/r8152: support aggregation
> 
[...]
> > I don't understand what problem the function is supposed to fix. As long
> > as I don't understand it I cannot say for sure whether it is correct.
> > There seems no obvious reason for a memory barrier, but there may be a
> > hidden reason I don't see.
> 
> Hayes, when Oliver asks you "Against what is the memory barrier?" he is asking
> you which memory operations you are trying to order.
> 
> You do not explain this in your commit message, nor do you explain it with a
> suitable comment.  This is not acceptable.
> 
> It is absolutely critical, that any time you add a memory barrier, you add a
> comment above the new memory barrier explaining exactly what the barrier is
> trying to achieve.
> 
> In fact, this is required by our coding standards.

I just wanted to make sure the rx_desc and rx_data are set correctly before
they are used. However, I studied some examples and information from the
internet, and I think that the memory barriers are not necessary here.
Therefore, I will remove them later.
 
Best Regards,
Hayes
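
As an illustration of the kind of comment requested in the quoted review, here is a user-space sketch that uses C11 atomics as a stand-in for kernel barriers. The rx_slot type and both function names are hypothetical and are not taken from the r8152 driver; the point is only that each ordering operation states what it orders and what it pairs with.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical receive slot: a payload plus a "valid" flag. */
struct rx_slot {
	int data;           /* payload written by the producer */
	atomic_bool ready;  /* set once data is valid */
};

static void publish(struct rx_slot *slot, int value)
{
	slot->data = value;
	/*
	 * Release: orders the store to slot->data before the store to
	 * slot->ready. Pairs with the acquire load in try_consume().
	 */
	atomic_store_explicit(&slot->ready, true, memory_order_release);
}

static bool try_consume(struct rx_slot *slot, int *value)
{
	/*
	 * Acquire: pairs with the release store in publish(), so that
	 * slot->data is read only after it has been written.
	 */
	if (!atomic_load_explicit(&slot->ready, memory_order_acquire))
		return false;
	*value = slot->data;
	return true;
}
```

If no such pairing can be written down, that is usually a sign the barrier is not needed, which matches the conclusion reached above.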



Re: [PATCH 0/3] cpufreq: sa11x0: cleanups for 3.13

2013-08-13 Thread Viresh Kumar
On 14 August 2013 01:31, Rafael J. Wysocki  wrote:
> On Tuesday, August 13, 2013 07:01:04 PM Viresh Kumar wrote:

> Are the three patches in this series prerequisite for the big target_index
> one?  If so, I think they can go into 3.12 actually, if they are ACKed by the
> appropriate platform maintainers.

Yes, but they depend on cpufreq_table_validate_and_show(), which
will also go in 3.13, and so these will be made part of that series in
the future.


Re: [E1000-devel] 3.11-rc4 ixgbevf: endless "Last Request of type 00 to PF Nacked" messages

2013-08-13 Thread Bjorn Helgaas
On Tue, Aug 13, 2013 at 3:54 PM, Skidmore, Donald C
 wrote:

> We were unable to recreate your failure here locally so I have some 
> additional questions.  First off you mentioned it was failing as far back as 
> v3.9, was it ever working for you?  If so bisecting would be really helpful 
> as I mentioned we have been unable to cause the failure in house.

I'm not aware of any working version.  I'm exercising the sysfs
SR-IOV configuration, which I think appeared in v3.8 or so.

>  If not could you see if the problem still occurs without the external Magma 
> PCIe expansion chassis, this is of course assuming that you can plug the X540 
> into your system without it.

I played with this a little more and found this:

1) Magma card in z420, connected to chassis containing X540: fails
(original report)
2) X540 in z420, Magma card in z420, connected to empty chassis: fails
3) X540 in z420, Magma card in z420 but no cable to chassis: works

The only differences I've noticed so far between configs 2 & 3 are
different bus numbers and different IRQ assignments:

Config 2 (failing):
  pci :0c:00.0: [8086:1528] type 00 class 0x02
  pci :0c:00.0: reg 0x10: [mem 0xdac0-0xdadf 64bit pref]
  ixgbe :0c:00.0: irq 82 for MSI/MSI-X
  IRQ 79: 79
  IRQ 80: eth0
  IRQ 81: snd_hda_intel
  IRQ 82-93: eth1-TxRx-0 through eth1-TxRx-11
  IRQ 94: eth1

Config 3 (working):
  pci :04:00.0: [8086:1528] type 00 class 0x02
  pci :04:00.0: reg 0x10: [mem 0xdac0-0xdadf 64bit pref]
  ixgbe :04:00.0: irq 75 for MSI/MSI-X
  IRQ 72: ahci
  IRQ 73: eth0
  IRQ 74: snd_hda_intel
  IRQ 75-86: eth1-TxRx-0 through eth1-TxRx-11
  IRQ 87: eth1

I'll try to narrow this down a little more; I'm just giving you this
preliminary info in case it rings any bells for you.

>> -Original Message-
>> From: Bjorn Helgaas [mailto:bhelg...@google.com]
>> Sent: Friday, August 09, 2013 10:19 AM
>> To: e1000-de...@lists.sourceforge.net
>> Cc: linux-...@vger.kernel.org; linux-kernel@vger.kernel.org
>> Subject: [E1000-devel] 3.11-rc4 ixgbevf: endless "Last Request of type 00 to
>> PF Nacked" messages
>>
>> When I enable VFs via sysfs on an Intel X540-AT, I see an endless stream of
>>
>> ixgbevf :08:10.2: Last Request of type 03 to PF Nacked
>>
>> messages.  This on an HP z420 with the Intel X540-AT in external Magma PCIe
>> expansion chassis.  No cable is attached to the X540-AT.
>>
>> ixgbe is built as a module and is auto-loaded during boot, with no VFs
>> enabled.  The "Last request Nacked" messages start when I enable VFs
>> with:
>>
>> # echo -n 8 > /sys/bus/pci/devices/:08:00.0/sriov_numvfs
>> ixgbe :08:00.0 eth1: SR-IOV enabled with 8 VFs
>> pci :08:10.0: [8086:1515] type 00 class 0x02
>> pci :08:10.2: [8086:1515] type 00 class 0x02
>> ...
>> ixgbevf: Intel(R) 10 Gigabit PCI Express Virtual Function Network Driver - version 2.7.12-k
>> ...
>> ixgbevf :08:10.2: Last Request of type 03 to PF Nacked
>> ...
>>
>> This happens with v3.11-rc4, v3.10, and v3.9, which is as far back as I 
>> checked.
>> Complete console log and lspci output are here:
>>
>> http://helgaas.com/linux/ixgbe/z420.log
>> http://helgaas.com/linux/ixgbe/lspci
>>


[PATCH] x86: remove redundant local_irq_enable() after cpuidle_idle_call()

2013-08-13 Thread Li Wang
When cpuidle_idle_call() returns 0, the Linux system is using a cpuidle
framework driver, and local IRQs have already been enabled inside
cpuidle_idle_call(). So there is no need to enable local IRQs again when
it returns 0.

The code was introduced by commit
97a5b81fa4d3a11dcdf224befc577f2e0abadc0b ("x86: Fix idle consolidation fallout").
The defect fixed there did not involve a cpuidle framework driver; it just called
amd_e400_idle(). The problem was that amd_e400_idle() did not enable IRQs.

Signed-off-by: Li Wang 
---
 arch/x86/kernel/process.c |2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 83369e5..cb55ee4 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -300,8 +300,6 @@ void arch_cpu_idle(void)
 {
 	if (cpuidle_idle_call())
 		x86_idle();
-	else
-		local_irq_enable();
 }

 /*
-- 
1.7.9.5



Re: Bridge VLAN kernel/iproute2 incompatibility

2013-08-13 Thread David Miller
From: Asbjørn Sloth Tønnesen 
Date: Mon, 12 Aug 2013 16:24:06 +

> Let's start with a little history:

I've applied your kernel patch, but the detailed analysis you put
here belongs in the commit message.

Thanks.


Re: [PATCH 15/16] Add EFI stub for ARM

2013-08-13 Thread Roy Franz
On Tue, Aug 13, 2013 at 6:44 PM, Roy Franz  wrote:
>
> Thanks Dave - comments inline, and I have an updated head.S diff at the end.
>
> Roy
>
>
> On Tue, Aug 13, 2013 at 7:19 AM, Dave Martin  wrote:
>>
>> On Fri, Aug 09, 2013 at 04:26:16PM -0700, Roy Franz wrote:
>> > This patch adds EFI stub support for the ARM Linux kernel.  The EFI stub
>> > operates similarly to the x86 stub: it is a shim between the EFI firmware
>> > and the normal zImage entry point, and sets up the environment that the
>> > zImage is expecting.  This includes loading the initrd (optionally) and
>> > device tree from the system partition based on the kernel command line.
>> > The stub updates the device tree as necessary, including adding reserved
>> > memory regions and adding entries for EFI runtime services. The PE/COFF
>> > "MZ" header at offset 0 results in the first instruction being an add
>> > that corrupts r5, which is not used by the zImage interface.
>>
>> Thanks for the update -- a few more comments, nothing major.
>>
>> > Signed-off-by: Roy Franz 
>> > ---
>> >  arch/arm/boot/compressed/Makefile |   15 +-
>> >  arch/arm/boot/compressed/efi-header.S |  111 
>
> ...
>>
>>
>> > + goto fdt_set_fail;
>> > +
>> > + return EFI_SUCCESS;
>>
>> This looks better.
>>
>> > +
>> > +fdt_set_fail:
>> > + if (status == -FDT_ERR_NOSPACE)
>> > + return EFI_BUFFER_TOO_SMALL;
>> > +
>> > + return EFI_LOAD_ERROR;
>> > +}
>> > +
>> > +
>> > +
>>
>> Maybe add a comment to indicate that this returns the address of the
>> relocated fdt, or EFI_LOAD_ERROR.
>>
>> By default "int" feels more likely to return a status code.
>>
>> It is not common to return pointers using the "int" type: it may be
>> preferable to use unsigned long or void * instead.  This won't
>> change the functionality.
>>
>> Casts to (int) which could overflow the signed range can cause GCC
>> to generate bizarre code in some situations, because C doesn't
>> have to guarantee wrapping when casting to signed types.  Since we
>> just pass that value through without doing any arithmetic I think we're
>> unlikely to hit that here, but it's best avoided anyhow.
>
>
> The function now returns only status, not the FDT address, so I have changed
> it to an int. This changed when I changed the function to no longer do the
> memory allocation for the new FDT, but I missed changing the return type to
> int.
>
>
>>
>>
>> > +int efi_entry(void *handle, efi_system_table_t *sys_table,
>> > +   unsigned long *zimage_addr)
>> > +{
>> > + efi_loaded_image_t *image;
>> > + int status;
>> > + unsigned long nr_pages;
>> > + const struct fdt_region *region;
>> > +
>> > + void *fdt;
>> > + int err;
>> > + int node;
>> > + unsigned long zimage_size = 0;
>> > + unsigned long dram_base;
>> > + /* addr/point and size pairs for memory management*/
>> > + u64 initrd_addr;
>> > + u64 initrd_size = 0;
>> > + u64 fdt_addr;
>> > + u64 fdt_size = 0;
>> > + u64 kernel_reserve_addr;
>> > + u64 kernel_reserve_size = 0;
>> > + char *cmdline_ptr;
>> > + unsigned long cmdline_size = 0;
>> > +
>> > + unsigned long map_size, desc_size;
>> > + unsigned long mmap_key;
>> > + efi_memory_desc_t *memory_map;
>> > +
>> > + unsigned long new_fdt_size;
>> > + unsigned long new_fdt_addr;
>> > +
>> > + efi_guid_t proto = LOADED_IMAGE_PROTOCOL_GUID;
>> > +
>> > + /* Check if we were booted by the EFI firmware */
>> > + if (sys_table->hdr.signature != EFI_SYSTEM_TABLE_SIGNATURE)
>> > + goto fail;
>> > +
>> > + efi_printk(sys_table, PRINTK_PREFIX"Booting Linux using EFI stub.\n");
>> > +
>> > +
>> > + /* get the command line from EFI, using the LOADED_IMAGE protocol */
>> > + status = efi_call_phys3(sys_table->boottime->handle_protocol,
>> > + handle, , (void *));
>> > + if (status != EFI_SUCCESS) {
>> > + efi_printk(sys_table, PRINTK_PREFIX"ERROR: Failed to get handle for LOADED_IMAGE_PROTOCOL\n");
>> > + goto fail;
>> > + }
>> > +
>> > + /* We are going to copy this into device tree, so we don't care where in
>> > +  * memory it is.
>> > +  */
>> > + cmdline_ptr = convert_cmdline_to_ascii(sys_table, image,
>> > +_size, 0x);
>> > + if (!cmdline_ptr) {
>> > + efi_printk(sys_table, PRINTK_PREFIX"ERROR: converting command line to ascii failed.\n");
>>
>> The real reason for this failure is failure to allocate memory: there's
>> no other way it can fail.
>>
>> So, the error message could be "Unable to allocate memory for command line"
>
>
> done.
>>
>>
>> > + goto fail;
>> > + }
>> > +
>> > + /* We first load the device tree, as we need to get the base address of
>> > +  * DRAM from the device tree.  The zImage, device tree, and initrd
>> > +  * 

Re: [PATCH 2/2] cpuset: remove redundant checks in file write functions

2013-08-13 Thread Li Zefan
On 2013/8/13 23:05, Tejun Heo wrote:
> On Tue, Aug 13, 2013 at 09:17:53AM +0800, Li Zefan wrote:
>> Now cgroup core gets a reference to the css when a cgroup file is
>> opened(), and the reference is dropped at file release. so it's
>> guaranteed the cpuset is online during the write function.
> 
> Hmmm... it doesn't really guarantee that as css's can be offlined with
> residual css refcnts, so the css may well be offlined by the time it
> reaches the rw functions.  What's guaranteed is that their refcnts
> wouldn't be zero.

Oh, right.

But most controllers don't check this in those read/write functions.
It shouldn't do any harm not checking online/offline status.

> Eventually we need to implement proper sever
> semantics (probably by replacing the custom fs implementation with
> sysfs) but right now controllers still need to deal with offline
> css's.
> 
> Thanks.
> 



Re: [Ksummit-2013-discuss] DT bindings as ABI [was: Do we have people interested in device tree janitoring / cleanup?]

2013-08-13 Thread Guenter Roeck

On 08/13/2013 04:32 PM, H. Peter Anvin wrote:
> On 08/01/2013 08:50 PM, David Gibson wrote:
>> On Wed, Jul 31, 2013 at 05:26:47PM -0400, jonsm...@gmail.com wrote:
>>> On Wed, Jul 31, 2013 at 4:48 PM, Russell King - ARM Linux wrote:
>>>> On Wed, Jul 31, 2013 at 04:37:36PM -0400, jonsm...@gmail.com wrote:
>>>
>>> [snip]
>>>
>>> Alternatively you may be of the belief that it is impossible to
>>> get rid of the board specific code. But x86 doesn't have any of
>>> it, why should ARM?
>>
>> Sure x86 has board specific code.  It's just that x86 basically
>> only has one board - PC.
>
> That is one aspect (hardware standardization)... but it is more to it
> than that.

I have to deal with lots of embedded / non-PC x86 based systems. Worst one
I encountered so far was a board where the VGA memory space was re-used
for an eeprom. The upcoming next generation hardware I'll have to support
is so far off-standard that I'll probably have to define a new platform
type (similar to OLPC or CE4100).

No, it is not all PC. Not anymore. Intel has started to sell into
the embedded space, where PC compatibility is not a requirement.

Guenter



Re: [PATCH part1 0/5] acpica: Split acpi_gbl_root_table_list initialization into two parts.

2013-08-13 Thread Tang Chen

Hi Bob, Rafael,

I think these patches are unnecessary now. The mm folks have decided to do
memory hotplug in another, easier way that does not require touching
ACPICA code.

So please ignore this patch-set.

Thanks for the help and suggestions. :)


On 08/08/2013 11:39 AM, Tang Chen wrote:

[Problem]

Currently, Linux cannot migrate pages used by the kernel because
of the kernel direct mapping. In Linux kernel space, va = pa + PAGE_OFFSET.
When the pa is changed, we cannot simply update the pagetable and
keep the va unmodified. So the kernel pages are not migratable.
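
The linear relation va = pa + PAGE_OFFSET can be sketched in user-space C as follows. The PAGE_OFFSET value and the two helper names are assumptions chosen for illustration; the real constant depends on architecture and configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative value only (assumption); the real PAGE_OFFSET is arch-specific. */
#define PAGE_OFFSET UINT64_C(0xffff880000000000)

/* Direct mapping: every physical address has exactly one fixed virtual address. */
static inline uint64_t phys_to_virt_linear(uint64_t pa)
{
	return pa + PAGE_OFFSET;
}

static inline uint64_t virt_to_phys_linear(uint64_t va)
{
	return va - PAGE_OFFSET;
}
```

Because the mapping is a fixed offset, moving a page to a different pa necessarily changes its va, so any kernel pointer into that page would become stale.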

There are also some other issues that make kernel pages not migratable.
For example, the physical address may be cached somewhere and used later.
It is not feasible to update all the caches.

When doing memory hotplug in Linux, we first migrate all the pages in one
memory device somewhere else, and then remove the device. But if pages are
used by the kernel, they are not migratable. As a result, memory used by
the kernel cannot be hot-removed.

Modifying the kernel direct mapping mechanism is too difficult. It may also
degrade kernel performance and stability. So we use the following
way to do memory hotplug.


[What we are doing]

In Linux, memory in one numa node is divided into several zones. One of the
zones is ZONE_MOVABLE, which the kernel won't use.

In order to implement memory hotplug in Linux, we are going to arrange all
hotpluggable memory in ZONE_MOVABLE so that the kernel won't use this memory.

To do this, we need ACPI's help.


[How we do this]

In ACPI, SRAT(System Resource Affinity Table) contains NUMA info. The memory
affinities in SRAT record every memory range in the system, and also, flags
specifying if the memory range is hotpluggable.
(Please refer to ACPI spec 5.0 5.2.16)
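
A minimal sketch of how such a flag check might look is given below. The two flag bits follow the Memory Affinity Structure layout in the ACPI specification (bit 0 enabled, bit 1 hot pluggable); the struct and the function name are hypothetical, trimmed-down illustrations and not ACPICA's own definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag bits of the SRAT Memory Affinity Structure. */
#define SRAT_MEM_ENABLED        (1u << 0)
#define SRAT_MEM_HOT_PLUGGABLE  (1u << 1)

/* Hypothetical, trimmed-down view of one memory affinity entry. */
struct mem_affinity {
	uint64_t base;    /* start of the physical range */
	uint64_t length;  /* size of the range in bytes */
	uint32_t flags;   /* SRAT_MEM_* bits */
};

/* A range counts as hotpluggable only if the entry is enabled at all. */
static bool range_is_hotpluggable(const struct mem_affinity *ma)
{
	return (ma->flags & SRAT_MEM_ENABLED) &&
	       (ma->flags & SRAT_MEM_HOT_PLUGGABLE);
}
```

An early boot allocator could use a predicate like this to skip ranges it must not hand out to the kernel.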

With the help of SRAT, we have to do the following two things to achieve our
goal:

1. When doing memory hot-add, allow users to arrange hotpluggable memory as
ZONE_MOVABLE.
(This has been done by the MOVABLE_NODE functionality in Linux.)

2. When the system is booting, prevent the bootmem allocator from allocating
hotpluggable memory for the kernel before the memory initialization
finishes.
(This is what we are going to do. And we need to do some modification in
 ACPICA. See below.)


[About this patch-set]

There is a bootmem allocator named memblock in Linux. memblock starts working
very early, before SRAT has been parsed, so at that point we don't know which
memory is hotpluggable. In order to prevent memblock from allocating hotpluggable
memory for the kernel, we need to obtain SRAT memory affinity info earlier.

In the current Linux kernel, the ACPICA code iterates over acpi_gbl_root_table_list
and installs all the ACPI tables into it at boot time. Then it checks whether
there is any override table in the global array acpi_tables_addr. If there is,
it reinstalls the override table into acpi_gbl_root_table_list.

In Linux, the global array acpi_tables_addr can be filled by the
ACPI_INITRD_TABLE_OVERRIDE mechanism, which allows users to specify their own
ACPI tables in the initrd file and override the ones from firmware.

The whole procedure looks like the following:

setup_arch()
  |->..                                /* Setup direct mapping pagetables */
  |->acpi_initrd_override()            /* Store all override tables in acpi_tables_addr. */
  |...
  |->acpi_boot_table_init()
     |->acpi_table_init()
        |                                                   (Linux code)
        ..
        |                                                   (ACPICA code)
        |->acpi_initialize_tables()
           |->acpi_tb_parse_root_table()   /* Parse RSDT or XSDT, find all tables in firmware */
              |->for (each item in acpi_gbl_root_table_list)
                 |->acpi_tb_install_table()
                    |->..                         /* Install one single table */
                    |->acpi_tb_table_override()   /* Override one single table */

It does the table installation and overriding one by one.

In order to find SRAT earlier, we want to initialize acpi_gbl_root_table_list
earlier, while keeping the ACPI_INITRD_TABLE_OVERRIDE procedure working as well.

The basic idea is to split the acpi_gbl_root_table_list initialization procedure
into two steps:
1. Install all tables from firmware, not one by one.
2. Override any table if necessary, not one by one.
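
The two steps above can be sketched with a toy table list. Everything here (the struct names, the helper functions, the "firmware"/"initrd" source strings) is hypothetical and only models the control flow; it is not ACPICA code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TABLES 8

struct table {
	char sig[5];         /* 4-character ACPI signature plus NUL */
	const char *source;  /* "firmware" or "initrd" */
};

struct table_list {
	struct table tables[MAX_TABLES];
	size_t count;
};

/* Step 1: install every firmware table, with no override lookup. */
static void install_firmware_tables(struct table_list *list,
				    const char *const sigs[], size_t n)
{
	for (size_t i = 0; i < n && list->count < MAX_TABLES; i++) {
		struct table *t = &list->tables[list->count++];

		strncpy(t->sig, sigs[i], 4);
		t->sig[4] = '\0';
		t->source = "firmware";
	}
}

/* Step 2: replace installed tables that have an override; returns how many. */
static size_t override_tables(struct table_list *list,
			      const char *const overrides[], size_t n)
{
	size_t replaced = 0;

	for (size_t i = 0; i < list->count; i++) {
		for (size_t j = 0; j < n; j++) {
			if (strncmp(list->tables[i].sig, overrides[j], 4) == 0) {
				list->tables[i].source = "initrd";
				replaced++;
				break;
			}
		}
	}
	return replaced;
}

/* Look up a table by signature, e.g. to read SRAT between the two steps. */
static const struct table *find_table(const struct table_list *list,
				      const char *sig)
{
	for (size_t i = 0; i < list->count; i++)
		if (strncmp(list->tables[i].sig, sig, 4) == 0)
			return &list->tables[i];
	return NULL;
}
```

The benefit of the split is that find_table() can already locate the firmware SRAT after step 1, before the override pass in step 2 has run.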

After this patch-set, it will work like this:

setup_arch()
  |->  ..   /* Install all tables from firmware (Step 1) */
  |->  ..   /* Try to find if any override SRAT in initrd file, if yes, use it */
  |->  ..   /* Use the SRAT from firmware */
  

[PATCH] drivers/rtc/rtc-max77686.c: Fix wrong register

2013-08-13 Thread Sangjung Woo
Read the correct register (STATUS2 instead of STATUS1) when checking whether
the RTC timer has reached the alarm time.

Signed-off-by: Sangjung Woo 
Signed-off-by: Myugnjoo Ham 
Reviewed-by: Jonghwa Lee 
---
 drivers/rtc/rtc-max77686.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/rtc/rtc-max77686.c b/drivers/rtc/rtc-max77686.c
index 9915cb9..9efe118 100644
--- a/drivers/rtc/rtc-max77686.c
+++ b/drivers/rtc/rtc-max77686.c
@@ -240,9 +240,9 @@ static int max77686_rtc_read_alarm(struct device *dev, struct rtc_wkalrm *alrm)
}
 
alrm->pending = 0;
-   ret = regmap_read(info->max77686->regmap, MAX77686_REG_STATUS1, );
+   ret = regmap_read(info->max77686->regmap, MAX77686_REG_STATUS2, );
if (ret < 0) {
-   dev_err(info->dev, "%s:%d fail to read status1 reg(%d)\n",
+   dev_err(info->dev, "%s:%d fail to read status2 reg(%d)\n",
__func__, __LINE__, ret);
goto out;
}
-- 
1.7.9.5



linux-next: manual merge of the arm-current tree with Linus' tree

2013-08-13 Thread Stephen Rothwell
Hi Russell,

Today's linux-next merge of the arm-current tree got a conflict in
arch/arm/kernel/perf_event.c between commit b88a2595b6d8 ("perf/arm: Fix
armpmu_map_hw_event()") from Linus' tree and commit d9f966357b14 ("ARM:
7810/1: perf: Fix array out of bounds access in armpmu_map_hw_event()")
from the arm-current tree.

These are the same patch except for the return code (and the much more
comprehensive commit message in the arm-current tree version).  I fixed it up
(using the arm-current tree version - return -EINVAL instead of -ENOENT - I have
no way to guess which is right) and can carry the fix as necessary (no
action is required).

P.S. the version in Linus' tree has no Signed-off-by from the author.
-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




Re: [PATCH 0/3] module: Allow parameters without arguments

2013-08-13 Thread Lucas De Marchi
On Tue, Aug 13, 2013 at 9:17 PM, Steven Rostedt  wrote:
> On Tue, 13 Aug 2013 20:34:58 -0300
> Lucas De Marchi  wrote:
>
>
>> so in kcmdline we would have modulename.param instead of modulename.param=1?
>>
>> I guess we need to update kmod then, because currently we ignore and
>> treat this case as a wrong token. From a quick look, allowing it in
>> kmod would be as simple as removing a condition check.
>>
>> Lucas De Marchi
>
> Note, both will still work. And it didn't change much. Today, anything
> that uses "module_param()" with bool type (a quick git grep shows 570
> users), already do not require a value.
>
> Randomly looking at one... drivers/input/mouse/synaptics_i2c.c, you can
> just do:
>
>  insmod synaptics_i2c.ko no_filter
>
> no need to add a "=1" to that.
>
> But anything else will still require a value. I just want to allow
> other parameters that act like a boolean to not require one.

true... but currently "modprobe synaptics_i2c" doesn't get the
parameter correctly from the kernel command line if it doesn't have a
value. And I agree this is not something that changed but rather a bug in
kmod waiting to be fixed.
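
A minimal userspace sketch of splitting a kernel command line token of the
form modulename.param[=value], where a token without "=value" is accepted
instead of being rejected as a wrong token. This is not kmod's actual code;
split_kcmdline_token() and struct kcmd_param are hypothetical names used only
for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type; not part of kmod. */
struct kcmd_param {
	char *module;	/* text before the first '.' */
	char *param;	/* text between '.' and '=' (or end of token) */
	char *value;	/* text after '=', or NULL when no value was given */
};

/*
 * Split "module.param[=value]" in place.  Returns 0 on success and -1 if
 * the token has no '.' before any '=' or has an empty module or parameter
 * name.  A token without '=' is valid and leaves out->value set to NULL.
 */
static int split_kcmdline_token(char *token, struct kcmd_param *out)
{
	char *eq = strchr(token, '=');
	char *dot = strchr(token, '.');

	/* The '.' must exist and come before any '='. */
	if (!dot || (eq && dot > eq))
		return -1;

	*dot = '\0';
	if (eq)
		*eq = '\0';

	out->module = token;
	out->param = dot + 1;
	out->value = eq ? eq + 1 : NULL;

	if (out->module[0] == '\0' || out->param[0] == '\0')
		return -1;
	return 0;
}
```

With this shape the caller decides what a NULL value means; for a boolean
parameter it would be treated the same as "=1".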

Lucas De Marchi


Re: [PATCH 0/3] module: Allow parameters without arguments

2013-08-13 Thread Lucas De Marchi
On Tue, Aug 13, 2013 at 10:00 PM, Lucas De Marchi
 wrote:
> On Tue, Aug 13, 2013 at 9:17 PM, Steven Rostedt  wrote:
>> On Tue, 13 Aug 2013 20:34:58 -0300
>> Lucas De Marchi  wrote:
>>
>>
>>> so in kcmdline we would have modulename.param instead of modulename.param=1?
>>>
>>> I guess we need to update kmod then, because currently we ignore and
>>> treat this case as a wrong token. From a quick look, allowing it in
>>> kmod would be as simple as removing a condition check.
>>>
>>> Lucas De Marchi
>>
>> Note, both will still work. And it didn't change much. Today, anything
>> that uses "module_param()" with bool type (a quick git grep shows 570
>> users), already do not require a value.
>>
>> Randomly looking at one... drivers/input/mouse/synaptics_i2c.c, you can
>> just do:
>>
>>  insmod synaptics_i2c.ko no_filter
>>
>> no need to add a "=1" to that.
>>
>> But anything else will still require a value. I just want to allow
>> other parameters that act like a boolean to not require one.
>
> true... but currently "modprobe synaptics_i2c" doesn't get the
> parameter correctly from the kernel command line if it doesn't have a
> value. And I agree this is not something that changed but rather a bug in
> kmod waiting to be fixed.

And it's fixed now with a proper test added.

thanks

Lucas De Marchi


Re: [PATCH 02/10] KVM: PPC: reserve a capability number for multitce support

2013-08-13 Thread Benjamin Herrenschmidt
On Thu, 2013-08-01 at 14:44 +1000, Alexey Kardashevskiy wrote:
> This is to reserve a capability number for upcoming support
> of H_PUT_TCE_INDIRECT and H_STUFF_TCE pseries hypercalls
> which support multiple DMA map/unmap operations per call.

Gleb, any chance you can put this (and the next one) into a tree to
"lock in" the numbers ?

I've been wanting to apply the whole series to powerpc-next; that
stuff has been simmering for way too long and is in good enough shape
imho, but I need the capability and ioctl numbers locked into your tree
first.

Cheers,
Ben.

> Signed-off-by: Alexey Kardashevskiy 
> ---
> Changes:
> 2013/07/16:
> * changed the number
> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
>  include/uapi/linux/kvm.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index acccd08..99c2533 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -667,6 +667,7 @@ struct kvm_ppc_smmu_info {
>  #define KVM_CAP_PPC_RTAS 91
>  #define KVM_CAP_IRQ_XICS 92
>  #define KVM_CAP_ARM_EL1_32BIT 93
> +#define KVM_CAP_SPAPR_MULTITCE 94
>  
>  #ifdef KVM_CAP_IRQ_ROUTING
>  




[PATCH v2 08/14] cgroup: move cgroup->subsys[] assignment to online_css()

2013-08-13 Thread Tejun Heo
From ae7f164a09408bf21ab3c82a9e80a3ff37aa9e3e Mon Sep 17 00:00:00 2001
From: Tejun Heo 
Date: Tue, 13 Aug 2013 20:22:50 -0400

Currently, css (cgroup_subsys_state) lifetime is tied to that of the
associated cgroup.  With the planned unified hierarchy, css's will be
dynamically created and destroyed within the lifetime of a cgroup.  To
enable such usages, css's will be individually RCU protected instead
of being tied to the cgroup.

In preparation, this patch moves cgroup->subsys[] assignment from
init_css() to online_css().  As this means that a newly initialized
css should be remembered separately and that cgroup_css() returns NULL
between init and online, cgroup_create() is updated so that it stores
newly created css's in a local array css_ar[] and
cgroup_init/load_subsys() are updated to use local variable @css
instead of using cgroup_css().  This change also slightly simplifies
error path of cgroup_create().

While this patch changes when cgroup->subsys[] is initialized, this
change isn't visible to subsystems or userland.
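
The "publish only when online" ordering described above can be sketched in
userspace as follows. This is an illustration under assumptions, not the
cgroup code: struct state, struct group, NR_SLOTS and the three helpers are
hypothetical names, and a C11 release store stands in for
rcu_assign_pointer().

```c
#include <stdatomic.h>
#include <stddef.h>

#define NR_SLOTS 4	/* stand-in for CGROUP_SUBSYS_COUNT */

struct state {
	int id;
	int online;
};

struct group {
	/* Readers only ever see states that finished going online. */
	_Atomic(struct state *) slots[NR_SLOTS];
};

/* Initialise a state without making it visible through the group. */
static void init_state(struct state *st, int id)
{
	st->id = id;
	st->online = 0;
}

/*
 * Publish the state only if the (hypothetical) online hook succeeded;
 * hook_ret plays the role of the ->css_online() return value.
 */
static int online_state(struct group *grp, struct state *st, int hook_ret)
{
	if (hook_ret)
		return hook_ret;
	st->online = 1;
	atomic_store_explicit(&grp->slots[st->id], st, memory_order_release);
	return 0;
}

/* Returns NULL between init and online, and after a failed online. */
static struct state *lookup_state(struct group *grp, int id)
{
	return atomic_load_explicit(&grp->slots[id], memory_order_acquire);
}
```

Because lookups return NULL until online succeeds, the creating code has to
keep its own reference to each new state, which is the role css_ar[] plays
in cgroup_create().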

v2: This patch wasn't updated accordingly after the previous "cgroup:
reorganize css init / exit paths" was updated leading to missing a
css_ar[] conversion in cgroup_create() and thus boot failure.  Fix
it.

Signed-off-by: Tejun Heo 
Acked-by: Li Zefan 
---
Oops, this needed to be updated too.  git branches updated accordingly.

Thanks.

 kernel/cgroup.c | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index a1ebc44..b9f736c 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4321,7 +4321,6 @@ static void init_css(struct cgroup_subsys_state *css, struct cgroup_subsys *ss,
css->flags |= CSS_ROOT;
 
BUG_ON(cgroup_css(cgrp, ss->subsys_id));
-   rcu_assign_pointer(cgrp->subsys[ss->subsys_id], css);
 }
 
 /* invoke ->css_online() on a new CSS and mark it online if successful */
@@ -4334,8 +4333,10 @@ static int online_css(struct cgroup_subsys_state *css)
 
if (ss->css_online)
ret = ss->css_online(css);
-   if (!ret)
+   if (!ret) {
css->flags |= CSS_ONLINE;
+   rcu_assign_pointer(css->cgroup->subsys[ss->subsys_id], css);
+   }
return ret;
 }
 
@@ -4366,6 +4367,7 @@ static void offline_css(struct cgroup_subsys_state *css)
 static long cgroup_create(struct cgroup *parent, struct dentry *dentry,
 umode_t mode)
 {
+   struct cgroup_subsys_state *css_ar[CGROUP_SUBSYS_COUNT] = { };
struct cgroup *cgrp;
struct cgroup_name *name;
struct cgroupfs_root *root = parent->root;
@@ -4433,12 +4435,11 @@ static long cgroup_create(struct cgroup *parent, struct dentry *dentry,
err = PTR_ERR(css);
goto err_free_all;
}
+   css_ar[ss->subsys_id] = css;
 
err = percpu_ref_init(>refcnt, css_release);
-   if (err) {
-   ss->css_free(css);
+   if (err)
goto err_free_all;
-   }
 
init_css(css, ss, cgrp);
 
@@ -4467,7 +4468,7 @@ static long cgroup_create(struct cgroup *parent, struct dentry *dentry,
 
/* each css holds a ref to the cgroup's dentry and the parent css */
for_each_root_subsys(root, ss) {
-   struct cgroup_subsys_state *css = cgroup_css(cgrp, ss->subsys_id);
+   struct cgroup_subsys_state *css = css_ar[ss->subsys_id];
 
dget(dentry);
percpu_ref_get(>parent->refcnt);
@@ -4478,7 +4479,7 @@ static long cgroup_create(struct cgroup *parent, struct dentry *dentry,
 
/* creation succeeded, notify subsystems */
for_each_root_subsys(root, ss) {
-   struct cgroup_subsys_state *css = cgroup_css(cgrp, ss->subsys_id);
+   struct cgroup_subsys_state *css = css_ar[ss->subsys_id];
 
err = online_css(css);
if (err)
@@ -4511,7 +4512,7 @@ static long cgroup_create(struct cgroup *parent, struct dentry *dentry,
 
 err_free_all:
for_each_root_subsys(root, ss) {
-   struct cgroup_subsys_state *css = cgroup_css(cgrp, ss->subsys_id);
+   struct cgroup_subsys_state *css = css_ar[ss->subsys_id];
 
if (css) {
percpu_ref_cancel_init(>refcnt);
@@ -4793,7 +4794,7 @@ static void __init cgroup_init_subsys(struct cgroup_subsys *ss)
 * need to invoke fork callbacks here. */
BUG_ON(!list_empty(_task.tasks));
 
-   BUG_ON(online_css(cgroup_css(cgroup_dummy_top, ss->subsys_id)));
+   BUG_ON(online_css(css));
 
mutex_unlock(_mutex);
 
@@ -4897,7 +4898,7 @@ int __init_or_module cgroup_load_subsys(struct cgroup_subsys *ss)
}
write_unlock(_set_lock);
 
-   ret = online_css(cgroup_css(cgroup_dummy_top, ss->subsys_id));
+   ret = 

Re: [PATCH 1/8] x86: Add 1/2/4/8 byte optimization to 64bit __copy_{from,to}_user_inatomic

2013-08-13 Thread Linus Torvalds
On Tue, Aug 13, 2013 at 5:07 PM, Andi Kleen  wrote:
>
> 32bit already did this correctly by duplicating the code.

I wonder how much of this could be in asm/uaccess.h? We already do all
the fixed-size get_user/put_user stuff in that generic x86 code, and
I'm wondering why we don't do the "__builtin_constant_p()" cases of
copy_to/from_user() there too?

It's independent of this patch-series, but it might be good to take a
look at unifying these things..

 Linus


Re: [PATCH 0/3] module: Allow parameters without arguments

2013-08-13 Thread Steven Rostedt
On Tue, 13 Aug 2013 20:34:58 -0300
Lucas De Marchi  wrote:

 
> so in kcmdline we would have modulename.param instead of modulename.param=1?
> 
> I guess we need to update kmod then, because currently we ignore and
> treat this case as a wrong token. From a quick look, allowing it in
> kmod would be as simple as removing a condition check.
> 
> Lucas De Marchi

Note, both will still work. And it didn't change much. Today, anything
that uses "module_param()" with bool type (a quick git grep shows 570
users), already do not require a value.

Randomly looking at one... drivers/input/mouse/synaptics_i2c.c, you can
just do:

 insmod synaptics_i2c.ko no_filter

no need to add a "=1" to that.

But anything else will still require a value. I just want to allow
other parameters that act like a boolean to not require one.
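
A minimal userspace sketch of the behaviour described above, where a
parameter given with no "=value" is treated as set. It is modelled loosely
on how a bool parameter handles a missing value, but parse_bool_param() is a
hypothetical name and this is not the kernel's implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Parse a boolean parameter value.  A NULL value (parameter given with no
 * "=value") is treated as true.  Returns 0 on success and -1 on a value
 * that is not recognised.
 */
static int parse_bool_param(const char *val, bool *res)
{
	if (!val)		/* bare "no_filter" means "no_filter=1" */
		val = "1";

	switch (val[0]) {
	case 'y': case 'Y': case '1':
		*res = true;
		return 0;
	case 'n': case 'N': case '0':
		*res = false;
		return 0;
	default:
		return -1;
	}
}
```

A parameter type that is not boolean-like would keep rejecting a NULL value
in its own parser, which matches "anything else will still require a value".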

-- Steve


Re: [RFC 0/3] Pin page control subsystem

2013-08-13 Thread Minchan Kim
Hello Christoph,

On Tue, Aug 13, 2013 at 04:21:30PM +, Christoph Lameter wrote:
> On Tue, 13 Aug 2013, Minchan Kim wrote:
> 
> > VM sometimes wants to migrate and/or reclaim pages for CMA, memory-hotplug,
> > THP and so on, but at the moment it can handle only userspace pages,
> > so if one of the above example subsystems has pinned a page in a range the VM
> > wants to migrate, migration fails and the above examples cannot work well.
> 
> Dont we have the mmu_notifiers that could help in that case? You could get
> a callback which could prepare the pages for migration?

I'm not familiar with mmu_notifier yet, so could you please elaborate a
bit so that I can dive into it?

Thanks!

> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"d...@kvack.org"> em...@kvack.org </a>

-- 
Kind regards,
Minchan Kim


Re: [Xen-devel] [PATCH] xen/hvc: If we use xen_raw_printk let it also work on HVM guests.

2013-08-13 Thread Konrad Rzeszutek Wilk
On Tue, Aug 13, 2013 at 11:20:05PM +0100, Ian Campbell wrote:
> On Tue, 2013-08-13 at 21:59 +0100, Ian Campbell wrote:
> 
> > > What it should be is:
> > > > >  
> > > > >  void xen_raw_console_write(const char *str)
> > > > >  {
> > > > > - dom0_write_console(0, str, strlen(str));
> > > > > + if (!xen_domain())
> > > > > + return;
> > > > > +
> > > > > + if (xen_pv_domain())
> > > xen_domain()
> > > 
> > > > > + dom0_write_console(0, str, strlen(str));
> > > > > + else if (xen_hvm_domain() || xen_cpuid_base()) {
> > >  
> > >   else if (xen_cpuid_base()) {
> > > 
> > > > > + /* The hyperpage has not been setup yet. */
> > > > > + int i, len = strlen(str);
> > > > > + for (i = 0; i < len; i++)
> > > > > +  outb(str[i], 0xe9);
> > > > > + }
> > > > >  }
> > > 
> > > And then that should adhere to what I wrote up.
> > 
> > I think it does too.
> 
> Except as Daniel notes in <520a7145.5010...@tycho.nsa.gov> for unrelated
> reasons:
> 
> > HVM guests can still use the PV output - they just need to use the 
> > console
> > write hypercall instead of the HVM I/O port. I would think that PVH 
> > guests
> > would default to using the hypercall as it is more efficient (it 
> > takes a
> > string rather than one character per write).
> > 
> > Actually, checking... the console_io hypercall would need to be 
> > added to
> > the hvm_hypercall{32,64}_table for an HVM guest to be able to use 
> > it; they
> > currently must use the I/O port. I didn't check the PVH patches.
> 
> Or did you actually try this code and it worked?

The one I typed up above - no. The one I had sent - yes.

But with that above mentioned comment from Daniel I think it is still
worth trying to do dom0_write_console and if the hypercall returns -ENOSYS 
then fall back on 0xe9.
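
The fallback order suggested above can be sketched in userspace like this.
Everything here is an assumption for illustration: raw_console_write() and
the function pointer types are hypothetical, and the stub back ends stand in
for the console hypercall and for outb() to port 0xe9.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical back ends: a string-based hypercall and a per-byte port write. */
typedef int (*hypercall_write_fn)(const char *str, size_t len);
typedef void (*port_write_fn)(char c);

/*
 * Try the string-based hypercall first; if it reports -ENOSYS, emit the
 * string one character at a time through the port back end.
 * Returns 0 if the hypercall path was used, 1 if the fallback was used,
 * or the hypercall's negative error code for any other failure.
 */
static int raw_console_write(const char *str, hypercall_write_fn hcall,
			     port_write_fn port)
{
	size_t i, len = strlen(str);
	int ret = hcall(str, len);

	if (ret != -ENOSYS)
		return ret < 0 ? ret : 0;

	for (i = 0; i < len; i++)
		port(str[i]);
	return 1;
}

/* Stub back ends for trying the sketch outside a guest. */
static char port_log[64];
static size_t port_len;

static int stub_hcall_ok(const char *str, size_t len)
{
	(void)str;
	(void)len;
	return 0;
}

static int stub_hcall_enosys(const char *str, size_t len)
{
	(void)str;
	(void)len;
	return -ENOSYS;
}

static void stub_port(char c)
{
	if (port_len < sizeof(port_log) - 1)
		port_log[port_len++] = c;
}
```

Probing the hypercall on every write costs one failed call per message on
guests that lack it; caching the -ENOSYS result would avoid that, at the
price of some extra state.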

And lastly send a patch to make hypercall_io work under HVM.

> 
> Ian.
> 


Re-tune x86 uaccess code for PREEMPT_VOLUNTARY v2

2013-08-13 Thread Andi Kleen
The x86 user access functions (*_user) were originally very well tuned,
with partial inline code and other optimizations.

Then over time various new checks -- particularly the sleep checks for
a voluntary preempt kernel -- destroyed a lot of the tuning.

A typical user access operation now does multiple useless
function calls. Also, without forced inlining, gcc's inlining
policy makes it even worse by adding more unnecessary calls.

Here's a typical example from ftrace:

 10)               |  might_fault() {
 10)               |    _cond_resched() {
 10)               |      should_resched() {
 10)               |        need_resched() {
 10)   0.063 us    |          test_ti_thread_flag();
 10)   0.643 us    |        }
 10)   1.238 us    |      }
 10)   1.845 us    |    }
 10)   2.438 us    |  }

So we spend about 2.5us doing nothing (ok, it's a bit less without
ftrace, but still pretty bad).

Then in other cases we would have an out of line function,
but would actually do the might_sleep() checks in the inlined
caller. This doesn't make any sense at all.

There were also a few other problems, for example the x86-64 uaccess
code regularly falls back to string functions, even though a simple
mov would be enough. For example every futex access to the lock
variable would actually use string instructions, even though 
it's just 4 bytes.
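
A rough userspace sketch of the idea: dispatch on the size so that the
common 1/2/4/8-byte cases become a single fixed-size load and store, and
only other sizes take the generic copy. copy_small() is a hypothetical
name; the real uaccess code uses inline asm with exception handling, which
this sketch leaves out entirely.

```c
#include <stdint.h>
#include <string.h>

/*
 * Copy n bytes.  For n of 1, 2, 4 or 8 the fixed-size memcpy() calls are
 * normally compiled to plain loads and stores; every other size goes
 * through the generic memcpy() path.
 */
static inline void copy_small(void *dst, const void *src, size_t n)
{
	switch (n) {
	case 1: { uint8_t v;  memcpy(&v, src, 1); memcpy(dst, &v, 1); break; }
	case 2: { uint16_t v; memcpy(&v, src, 2); memcpy(dst, &v, 2); break; }
	case 4: { uint32_t v; memcpy(&v, src, 4); memcpy(dst, &v, 4); break; }
	case 8: { uint64_t v; memcpy(&v, src, 8); memcpy(dst, &v, 8); break; }
	default:
		memcpy(dst, src, n);	/* generic path for all other sizes */
		break;
	}
}
```

When n is a compile-time constant, as with a 4-byte futex word, the
compiler can drop the switch and keep only the matching case.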

This patch kit is an attempt to get us back to sane code,
mostly by doing proper inlining and doing sleep checks in the right
place. Unfortunately I had to add one tree sweep to avoid a nasty
include loop.

v2: Now completely remove reschedule checks for uaccess functions.


[PATCH 4/8] Move might_sleep and friends from kernel.h to sched.h

2013-08-13 Thread Andi Kleen
From: Andi Kleen 

These are really related to scheduling, so they should be in sched.h.
Users usually will need to schedule anyways.

The advantage of having them there is that we can access some of the
scheduler inlines to make their fast path more efficient. This will come
in a follow-on patch.

Signed-off-by: Andi Kleen 
---
 include/linux/kernel.h | 35 ---
 include/linux/sched.h  | 38 ++
 2 files changed, 38 insertions(+), 35 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 482ad2d..badcc13 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -141,35 +141,6 @@ struct completion;
 struct pt_regs;
 struct user;
 
-#ifdef CONFIG_PREEMPT_VOLUNTARY
-extern int _cond_resched(void);
-# define might_resched() _cond_resched()
-#else
-# define might_resched() do { } while (0)
-#endif
-
-#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
-  void __might_sleep(const char *file, int line, int preempt_offset);
-/**
- * might_sleep - annotation for functions that can sleep
- *
- * this macro will print a stack trace if it is executed in an atomic
- * context (spinlock, irq-handler, ...).
- *
- * This is a useful debugging help to be able to catch problems early and not
- * be bitten later when the calling function happens to sleep when it is not
- * supposed to.
- */
-# define might_sleep() \
-   do { __might_sleep(__FILE__, __LINE__, 0); might_resched(); } while (0)
-#else
-  static inline void __might_sleep(const char *file, int line,
-  int preempt_offset) { }
-# define might_sleep() do { might_resched(); } while (0)
-#endif
-
-#define might_sleep_if(cond) do { if (cond) might_sleep(); } while (0)
-
 /*
  * abs() handles unsigned and signed longs, ints, shorts and chars.  For all
  * input types abs() returns a signed long.
@@ -193,12 +164,6 @@ extern int _cond_resched(void);
(__x < 0) ? -__x : __x; \
})
 
-#if defined(CONFIG_PROVE_LOCKING) || defined(CONFIG_DEBUG_ATOMIC_SLEEP)
-void might_fault(void);
-#else
-static inline void might_fault(void) { }
-#endif
-
 extern struct atomic_notifier_head panic_notifier_list;
 extern long (*panic_blink)(int state);
 __printf(1, 2)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index d722490..773f21d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2433,6 +2433,35 @@ extern int __cond_resched_softirq(void);
__cond_resched_softirq();   \
 })
 
+#ifdef CONFIG_PREEMPT_VOLUNTARY
+extern int _cond_resched(void);
+# define might_resched() _cond_resched()
+#else
+# define might_resched() do { } while (0)
+#endif
+
+#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
+ void __might_sleep(const char *file, int line, int preempt_offset);
+/**
+ * might_sleep - annotation for functions that can sleep
+ *
+ * this macro will print a stack trace if it is executed in an atomic
+ * context (spinlock, irq-handler, ...).
+ *
+ * This is a useful debugging help to be able to catch problems early and not
+ * be bitten later when the calling function happens to sleep when it is not
+ * supposed to.
+ */
+# define might_sleep() \
+   do { __might_sleep(__FILE__, __LINE__, 0); might_resched(); } while (0)
+#else
+  static inline void __might_sleep(const char *file, int line,
+  int preempt_offset) { }
+# define might_sleep() do { might_resched(); } while (0)
+#endif
+
+#define might_sleep_if(cond) do { if (cond) might_sleep(); } while (0)
+
 static inline void cond_resched_rcu(void)
 {
 #if defined(CONFIG_DEBUG_ATOMIC_SLEEP) || !defined(CONFIG_PREEMPT_RCU)
@@ -2442,6 +2471,15 @@ static inline void cond_resched_rcu(void)
 #endif
 }
 
+#ifdef CONFIG_PROVE_LOCKING
+void might_fault(void);
+#else
+static inline void might_fault(void)
+{
+   might_sleep();
+}
+#endif
+
 /*
  * Does a critical section need to be broken due to another
  * task waiting?: (technically does not depend on CONFIG_PREEMPT,
-- 
1.8.3.1



[PATCH 3/8] tree-sweep: Include linux/sched.h for might_sleep users

2013-08-13 Thread Andi Kleen
From: Andi Kleen 

might_sleep is moving from linux/kernel.h to linux/sched.h, so any users
need to include linux/sched.h.

This was done with a mechanistic script and some uses may be redundant
(already included in some other include file). However it's good practice
to always include any needed symbols from the top level .c file.

Tested with x86-64 allyesconfig. I used to do an x86-32 allyesconfig
on an old kernel, but since that is broken now I didn't retest.

Signed-off-by: Andi Kleen 
---
 arch/arm/common/mcpm_entry.c  | 1 +
 arch/arm/mach-omap2/omap_hwmod.c  | 1 +
 arch/arm/mm/highmem.c | 1 +
 arch/blackfin/kernel/bfin_gpio.c  | 1 +
 arch/frv/mm/highmem.c | 1 +
 arch/m32r/include/asm/uaccess.h   | 1 +
 arch/microblaze/include/asm/highmem.h | 1 +
 arch/mn10300/include/asm/uaccess.h| 1 +
 arch/parisc/include/asm/cacheflush.h  | 1 +
 arch/powerpc/include/asm/highmem.h| 1 +
 arch/powerpc/kernel/rtas.c| 1 +
 arch/powerpc/lib/checksum_wrappers_64.c   | 1 +
 arch/powerpc/lib/usercopy_64.c| 1 +
 arch/tile/mm/highmem.c| 1 +
 arch/x86/include/asm/checksum_32.h| 1 +
 arch/x86/lib/csum-wrappers_64.c   | 1 +
 arch/x86/mm/highmem_32.c  | 1 +
 arch/x86/mm/mmio-mod.c| 1 +
 block/blk-cgroup.c| 1 +
 block/blk-core.c  | 1 +
 block/genhd.c | 1 +
 drivers/base/dma-buf.c| 1 +
 drivers/block/rsxx/dev.c  | 1 +
 drivers/dma/ipu/ipu_irq.c | 1 +
 drivers/gpio/gpiolib.c| 1 +
 drivers/ide/ide-io.c  | 1 +
 drivers/infiniband/hw/amso1100/c2_cq.c| 1 +
 drivers/infiniband/hw/cxgb3/iwch_cm.c | 1 +
 drivers/infiniband/hw/cxgb4/cm.c  | 1 +
 drivers/md/dm.c   | 1 +
 drivers/md/raid5.c| 1 +
 drivers/mmc/core/core.c   | 1 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c  | 1 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c| 1 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h | 1 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.c | 1 +
 drivers/net/ethernet/intel/e1000e/netdev.c| 1 +
 drivers/net/ethernet/intel/igbvf/netdev.c | 1 +
 drivers/net/ethernet/sfc/falcon.c | 1 +
 drivers/net/ieee802154/at86rf230.c| 1 +
 drivers/net/ieee802154/fakelb.c   | 1 +
 drivers/net/wireless/ath/carl9170/usb.c   | 1 +
 drivers/net/wireless/ath/wil6210/wmi.c| 1 +
 drivers/net/wireless/b43/dma.c| 1 +
 drivers/net/wireless/b43/main.c   | 1 +
 drivers/net/wireless/b43/phy_a.c  | 1 +
 drivers/net/wireless/b43/phy_g.c  | 1 +
 drivers/net/wireless/b43legacy/dma.c  | 1 +
 drivers/net/wireless/b43legacy/radio.c| 1 +
 drivers/net/wireless/cw1200/cw1200_spi.c  | 1 +
 drivers/net/wireless/iwlwifi/dvm/sta.c| 1 +
 drivers/net/wireless/iwlwifi/iwl-op-mode.h| 1 +
 drivers/net/wireless/iwlwifi/iwl-trans.h  | 1 +
 drivers/net/wireless/libertas_tf/cmd.c| 1 +
 drivers/pci/iov.c | 1 +
 drivers/pci/pci.c | 1 +
 drivers/platform/olpc/olpc-ec.c   | 1 +
 drivers/ssb/driver_pcicore.c  | 1 +
 drivers/staging/lustre/lustre/llite/remote_perm.c | 1 +
 drivers/staging/lustre/lustre/obdclass/cl_lock.c  | 1 +
 drivers/staging/lustre/lustre/obdclass/cl_object.c| 1 +
 drivers/staging/lustre/lustre/obdclass/cl_page.c  | 1 +
 drivers/staging/lustre/lustre/osc/osc_lock.c  | 1 +
 drivers/staging/lustre/lustre/osc/osc_page.c  | 1 +
 drivers/staging/lustre/lustre/ptlrpc/client.c | 1 +
 drivers/staging/lustre/lustre/ptlrpc/gss/gss_cli_upcall.c | 1 +
 drivers/staging/lustre/lustre/ptlrpc/gss/gss_pipefs.c | 1 +
 drivers/staging/lustre/lustre/ptlrpc/sec.c| 1 +
 drivers/staging/lustre/lustre/ptlrpc/sec_config.c | 1 +
 

Re: [RFC 0/3] Pin page control subsystem

2013-08-13 Thread Minchan Kim
Hello Benjamin,

On Tue, Aug 13, 2013 at 10:23:38AM -0400, Benjamin LaHaise wrote:
> On Tue, Aug 13, 2013 at 11:46:42AM +0200, Krzysztof Kozlowski wrote:
> > Hi Minchan,
> > 
> > On wto, 2013-08-13 at 16:04 +0900, Minchan Kim wrote:
> > > patch 2 introduce pinpage control
> > > subsystem. So, subsystems want to control pinpage should implement own
> > > pinpage_xxx functions because each subsystem would have other character
> > > so what kinds of data structure for managing pinpage information depends
> > > on them. Otherwise, they can use general functions defined in pinpage
> > > subsystem. patch 3 hacks migration.c so that migration is
> > > aware of pinpage now and migrate them with pinpage subsystem.
> > 
> > I wonder why don't we use page->mapping and a_ops? Is there any
> > disadvantage of such mapping/a_ops?
> 
> That's what the pending aio patches do, and I think this is a better 
> approach for those use-cases that the technique works for.

I had a rough look at your implementation and I think it's not a generic solution.
How could it handle the example mentioned in the reply to Krzysztof?

> 
> The biggest problem I see with the pinpage approach is that it's based on a
> single page at a time.  I'd venture a guess that many pinned pages are done 
> in groups of pages, not single ones.

In the case of the z* family, most allocations are single pages, but I agree
that many GUP users allocate groups of pages. We can cover that by expanding
the API like this:

int set_pinpage(struct pinpage_system *psys, struct page **pages,
unsigned long nr_pages, void **privates);

so we can handle pages in batches, and the subsystem can manage pinpage_info
with an interval tree rather than the default radix tree.
That's why the pinpage control subsystem leaves room for subsystem-specific
metadata handling.

> 
>   -ben
> 
> > Best regards,
> > Krzysztof
> 
> -- 
> "Thought is the essence of where you are now."
> 

-- 
Kind regards,
Minchan Kim


[PATCH 8/8] sched: Inline the need_resched test into the caller for _cond_resched

2013-08-13 Thread Andi Kleen
From: Andi Kleen 

_cond_resched is very common in kernel calls, e.g. it's used in every user
access. Usually it does at least two explicit calls just to decide to do
nothing: _cond_resched and should_resched(). Inline a need_resched()
into the caller to avoid these calls in the common case of no reschedule
being needed.

Previously this would have been very expensive in terms of binary size
because there were a lot of inlined cond_resched()s in copy_*_user()
and put/get_user().  But with the newest changes to x86 uaccess.h
these are not inlined anymore, so we can use a slightly bigger, but much
faster fast path version.
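
The pattern can be shown outside the kernel with a small sketch: test a
cheap flag inline in the caller and only make the function call when work is
actually pending. All names below are hypothetical stand-ins (the flag for
need_resched(), cond_resched_slow() for _cond_resched()), and the call
counter exists only to make the effect observable.

```c
#include <stdbool.h>

static bool resched_pending;	/* stand-in for the need_resched() flag */
static int slow_path_calls;	/* counts slow-path calls, for illustration */

/* Slow path; stand-in for the out-of-line _cond_resched(). */
static int cond_resched_slow(void)
{
	slow_path_calls++;
	resched_pending = false;
	return 1;
}

/* Cheap test that is evaluated inline in the caller. */
static inline bool need_resched_flag(void)
{
	return resched_pending;
}

/* Only pay for the function call when a reschedule is actually pending. */
#define might_resched_sketch() \
	(need_resched_flag() ? cond_resched_slow() : 0)
```

In the common case the macro costs one load and one branch at each call
site, which is the size increase the description above accepts in exchange
for skipping the calls.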

Signed-off-by: Andi Kleen 
---
 include/linux/sched.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index bb7a08a..9e0efa9 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2435,7 +2435,7 @@ extern int __cond_resched_softirq(void);
 
 #ifdef CONFIG_PREEMPT_VOLUNTARY
 extern int _cond_resched(void);
-# define might_resched() _cond_resched()
+# define might_resched() (need_resched() ? _cond_resched() : 0)
 #else
 # define might_resched() do { } while (0)
 #endif
-- 
1.8.3.1



[PATCH 2/8] x86: Include linux/sched.h in asm/uaccess.h

2013-08-13 Thread Andi Kleen
From: Andi Kleen 

uaccess.h uses might_sleep, but there is currently no explicit include for it.
Since an upcoming patch moves might_sleep into sched.h, include sched.h here.

Signed-off-by: Andi Kleen 
---
 arch/x86/include/asm/uaccess.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 5ee2687..8fa3bd6 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -3,6 +3,7 @@
 /*
  * User space memory access functions
  */
+#include <linux/sched.h>
 #include 
 #include 
 #include 
-- 
1.8.3.1



[PATCH 7/8] x86: Remove cond_resched() from uaccess code

2013-08-13 Thread Andi Kleen
From: Andi Kleen 

As suggested by Linus, remove cond_resched() from the x86 uaccess code.
Now we only do might_fault() in debug kernels.

This means *_user() is not a reschedule point anymore for
CONFIG_PREEMPT_VOLUNTARY, only explicit cond_resched()s are.

Even in the debug kernels we should probably move
it out of line where possible, but that's left for future patches.

I did some tests with ftrace's max wakeup latency tracer
and CONFIG_PREEMPT_VOLUNTARY:

            no-resched   resched
aim7        45 us        319 us
ebizzy      123 us       117 us
hackbench   416 us       50 us
kbench      14960 us     19741 us

I'm not sure the results are very conclusive, as they go both
ways. Most likely it costs a bit.

Signed-off-by: Andi Kleen 
---
 arch/x86/include/asm/uaccess.h|  4 ++--
 arch/x86/include/asm/uaccess_32.h |  6 +++---
 arch/x86/include/asm/uaccess_64.h | 12 ++--
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 8fa3bd6..c860ebe 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -165,7 +165,7 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL))
int __ret_gu;   \
register __inttype(*(ptr)) __val_gu asm("%edx");\
__chk_user_ptr(ptr);\
-   might_fault();  \
+   might_fault_debug_only();   \
asm volatile("call __get_user_%P3"  \
 : "=a" (__ret_gu), "=r" (__val_gu) \
 : "0" (ptr), "i" (sizeof(*(ptr;\
@@ -246,7 +246,7 @@ extern void __put_user_8(void);
int __ret_pu;   \
__typeof__(*(ptr)) __pu_val;\
__chk_user_ptr(ptr);\
-   might_fault();  \
+   might_fault_debug_only();   \
__pu_val = x;   \
switch (sizeof(*(ptr))) {   \
case 1: \
diff --git a/arch/x86/include/asm/uaccess_32.h b/arch/x86/include/asm/uaccess_32.h
index 7f760a9..e656ee9 100644
--- a/arch/x86/include/asm/uaccess_32.h
+++ b/arch/x86/include/asm/uaccess_32.h
@@ -81,7 +81,7 @@ __copy_to_user_inatomic(void __user *to, const void *from, unsigned long n)
 static __always_inline unsigned long __must_check
 __copy_to_user(void __user *to, const void *from, unsigned long n)
 {
-   might_fault();
+   might_fault_debug_only();
return __copy_to_user_inatomic(to, from, n);
 }
 
@@ -136,7 +136,7 @@ __copy_from_user_inatomic(void *to, const void __user *from, unsigned long n)
 static __always_inline unsigned long
 __copy_from_user(void *to, const void __user *from, unsigned long n)
 {
-   might_fault();
+   might_fault_debug_only();
if (__builtin_constant_p(n)) {
unsigned long ret;
 
@@ -158,7 +158,7 @@ __copy_from_user(void *to, const void __user *from, unsigned long n)
 static __always_inline unsigned long __copy_from_user_nocache(void *to,
const void __user *from, unsigned long n)
 {
-   might_fault();
+   might_fault_debug_only();
if (__builtin_constant_p(n)) {
unsigned long ret;
 
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
index 64476bb..5a3bb4e 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -58,7 +58,7 @@ static inline unsigned long __must_check copy_from_user(void *to,
 {
int sz = __compiletime_object_size(to);
 
-   might_fault();
+   might_fault_debug_only();
if (likely(sz == -1 || sz >= n))
n = _copy_from_user(to, from, n);
 #ifdef CONFIG_DEBUG_VM
@@ -71,7 +71,7 @@ static inline unsigned long __must_check copy_from_user(void *to,
 static __always_inline __must_check
 int copy_to_user(void __user *dst, const void *src, unsigned size)
 {
-   might_fault();
+   might_fault_debug_only();
 
return _copy_to_user(dst, src, size);
 }
@@ -122,7 +122,7 @@ int __copy_from_user_nocheck(void *dst, const void __user *src, unsigned size)
 static __always_inline __must_check
 int __copy_from_user(void *dst, const void __user *src, unsigned size)
 {
-   might_fault();
+   might_fault_debug_only();
return __copy_from_user_nocheck(dst, src, size);
 }
 
@@ -172,7 +172,7 @@ int __copy_to_user_nocheck(void __user *dst, const void *src, unsigned size)
 static __always_inline __must_check
 int __copy_to_user(void __user *dst, const void *src, unsigned size)
 {
-   might_fault();
+ 

[PATCH 5/8] sched: mark should_resched() __always_inline

2013-08-13 Thread Andi Kleen
From: Andi Kleen 

At least gcc 4.6 and some earlier versions do not inline this function.
Since it is small and sits on relatively hot paths, force it inline.

Signed-off-by: Andi Kleen 
---
 kernel/sched/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 74d7c04..23df96a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3767,7 +3767,7 @@ SYSCALL_DEFINE0(sched_yield)
return 0;
 }
 
-static inline int should_resched(void)
+static __always_inline int should_resched(void)
 {
return need_resched() && !(preempt_count() & PREEMPT_ACTIVE);
 }
-- 
1.8.3.1
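
For illustration only: a minimal userspace sketch of what forcing inlining
looks like, assuming GCC or Clang. The names my_always_inline,
FAKE_PREEMPT_ACTIVE and should_resched_sketch() are hypothetical stand-ins,
not kernel symbols; the kernel's own __always_inline is defined in its
compiler headers.

```c
#include <assert.h>

/* Hypothetical userspace stand-in for the kernel's __always_inline. */
#define my_always_inline inline __attribute__((__always_inline__))

/* Made-up bit value, chosen only for this illustration. */
#define FAKE_PREEMPT_ACTIVE 0x10000000u

static int fake_need_resched_flag;	/* stands in for need_resched() */
static unsigned int fake_preempt_count;	/* stands in for preempt_count() */

/* Same shape as should_resched(): tiny, so forcing it inline is cheap. */
static my_always_inline int should_resched_sketch(void)
{
	return fake_need_resched_flag &&
	       !(fake_preempt_count & FAKE_PREEMPT_ACTIVE);
}
```

A plain "inline" is only a hint, which is why a compiler can still emit an
out-of-line call; the attribute removes that choice.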

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 6/8] Add might_fault_debug_only()

2013-08-13 Thread Andi Kleen
From: Andi Kleen 

Add a might_fault_debug_only() that only does something in the PROVE_LOCKING
case, but does not call cond_resched() for PREEMPT_VOLUNTARY. This is for
cases where the cond_resched() is done elsewhere.

Signed-off-by: Andi Kleen 
---
 include/linux/sched.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 773f21d..bb7a08a 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2473,11 +2473,13 @@ static inline void cond_resched_rcu(void)
 
 #ifdef CONFIG_PROVE_LOCKING
 void might_fault(void);
+#define might_fault_debug_only() might_fault()
 #else
 static inline void might_fault(void)
 {
might_sleep();
 }
+#define might_fault_debug_only() do {} while(0)
 #endif
 
 /*
-- 
1.8.3.1
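
For illustration only: a userspace sketch of the pattern in the hunk above,
where the debug-only variant expands to the full check in one configuration
and to nothing in the other. SKETCH_PROVE_LOCKING and every *_sketch name
are hypothetical; the counter merely stands in for the work might_sleep()
would do.

```c
#include <assert.h>

static int sleep_checks;	/* counts calls to the might_sleep() stand-in */

static inline void might_sleep_sketch(void)
{
	sleep_checks++;
}

/* Both configurations keep a real might_fault(), as in the patch. */
static inline void might_fault_sketch(void)
{
	might_sleep_sketch();
}

/* Build with -DSKETCH_PROVE_LOCKING to model CONFIG_PROVE_LOCKING=y. */
#ifdef SKETCH_PROVE_LOCKING
#define might_fault_debug_only_sketch() might_fault_sketch()
#else
#define might_fault_debug_only_sketch() do { } while (0)
#endif
```

Without the define, the debug-only call compiles away entirely, so callers
that reschedule elsewhere pay nothing for it.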



[PATCH 1/8] x86: Add 1/2/4/8 byte optimization to 64bit __copy_{from,to}_user_inatomic

2013-08-13 Thread Andi Kleen
From: Andi Kleen 

The 64bit __copy_{from,to}_user_inatomic always called
copy_from_user_generic, but skipped the special optimizations for 1/2/4/8
byte accesses.

This especially hurts the futex call, which accesses the 4 byte futex
user value with a complicated fast string operation in a function call,
instead of a single movl.

Use __copy_{from,to}_user for _inatomic instead to get the same
optimizations. The only problem was the might_fault() in those functions.
So move that to new wrapper and call __copy_{f,t}_user_nocheck()
from *_inatomic directly.

32bit already did this correctly by duplicating the code.

Signed-off-by: Andi Kleen 
---
 arch/x86/include/asm/uaccess_64.h | 24 ++--
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
index 4f7923d..64476bb 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -77,11 +77,10 @@ int copy_to_user(void __user *dst, const void *src, unsigned size)
 }
 
 static __always_inline __must_check
-int __copy_from_user(void *dst, const void __user *src, unsigned size)
+int __copy_from_user_nocheck(void *dst, const void __user *src, unsigned size)
 {
int ret = 0;
 
-   might_fault();
if (!__builtin_constant_p(size))
return copy_user_generic(dst, (__force void *)src, size);
switch (size) {
@@ -121,11 +120,17 @@ int __copy_from_user(void *dst, const void __user *src, unsigned size)
 }
 
 static __always_inline __must_check
-int __copy_to_user(void __user *dst, const void *src, unsigned size)
+int __copy_from_user(void *dst, const void __user *src, unsigned size)
+{
+   might_fault();
+   return __copy_from_user_nocheck(dst, src, size);
+}
+
+static __always_inline __must_check
+int __copy_to_user_nocheck(void __user *dst, const void *src, unsigned size)
 {
int ret = 0;
 
-   might_fault();
if (!__builtin_constant_p(size))
return copy_user_generic((__force void *)dst, src, size);
switch (size) {
@@ -165,6 +170,13 @@ int __copy_to_user(void __user *dst, const void *src, unsigned size)
 }
 
 static __always_inline __must_check
+int __copy_to_user(void __user *dst, const void *src, unsigned size)
+{
+   might_fault();
+   return __copy_to_user_nocheck(dst, src, size);
+}
+
+static __always_inline __must_check
 int __copy_in_user(void __user *dst, const void __user *src, unsigned size)
 {
int ret = 0;
@@ -220,13 +232,13 @@ int __copy_in_user(void __user *dst, const void __user *src, unsigned size)
 static __must_check __always_inline int
 __copy_from_user_inatomic(void *dst, const void __user *src, unsigned size)
 {
-   return copy_user_generic(dst, (__force const void *)src, size);
+   return __copy_from_user_nocheck(dst, (__force const void *)src, size);
 }
 
 static __must_check __always_inline int
 __copy_to_user_inatomic(void __user *dst, const void *src, unsigned size)
 {
-   return copy_user_generic((__force void *)dst, src, size);
+   return __copy_to_user_nocheck((__force void *)dst, src, size);
 }
 
 extern long __copy_user_nocache(void *dst, const void __user *src,
-- 
1.8.3.1
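
For illustration only: a userspace sketch of the size dispatch the commit
message describes, assuming GCC or Clang. copy_generic_sketch() stands in
for copy_user_generic() and copy_nocheck_sketch() for the new *_nocheck
helpers; both names are hypothetical, and the real code uses inline asm
with fault handling rather than memcpy().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static int generic_calls;	/* how often the generic fallback ran */

/* Stand-in for copy_user_generic(): handles any size via a function call. */
static int copy_generic_sketch(void *dst, const void *src, unsigned size)
{
	generic_calls++;
	memcpy(dst, src, size);
	return 0;
}

/*
 * Hypothetical analogue of __copy_from_user_nocheck(). When the size is a
 * compile-time constant of 1/2/4/8 bytes, the fixed-size memcpy() can be
 * turned into a single move. Whether __builtin_constant_p() sees the
 * constant depends on inlining and optimization level.
 */
static inline __attribute__((__always_inline__))
int copy_nocheck_sketch(void *dst, const void *src, unsigned size)
{
	assert(dst && src);
	if (!__builtin_constant_p(size))
		return copy_generic_sketch(dst, src, size);
	switch (size) {
	case 1: memcpy(dst, src, 1); return 0;
	case 2: memcpy(dst, src, 2); return 0;
	case 4: memcpy(dst, src, 4); return 0;
	case 8: memcpy(dst, src, 8); return 0;
	default:
		return copy_generic_sketch(dst, src, size);
	}
}
```

A 4-byte futex-style read such as copy_nocheck_sketch(&val, uaddr, 4) is
the case the patch targets; a size only known at run time still takes the
generic path.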



Re: [PATCH] ethernet/arc/arc_emac - fix NAPI "work > weight" warning

2013-08-13 Thread David Miller
From: Alexey Brodkin 
Date: Tue, 13 Aug 2013 17:04:36 +0400

> Initially I improperly set a boundary for maximum number of input
> packets to process on NAPI poll ("work") so it might be more than
> expected amount ("weight").
> 
> This was really harmless but seeing WARN_ON_ONCE on every device boot is
> not nice. So trivial fix ("<" instead of "<=") is here.
> 
> Signed-off-by: Alexey Brodkin 

Applied, thanks.
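
For illustration only: a userspace sketch of the off-by-one described in
the quoted commit message. poll_sketch() and its parameters are
hypothetical and not taken from the arc_emac driver; the point is only that
"<" caps the work at the budget, while "<=" would allow one packet more.

```c
#include <assert.h>

/* Hypothetical poll loop: handle at most `budget` packets per call. */
static int poll_sketch(int pending, int budget)
{
	int work_done;

	assert(budget > 0);
	/* "<" stops at exactly budget; "<=" would permit budget + 1. */
	for (work_done = 0; work_done < budget; work_done++) {
		if (pending == 0)
			break;
		pending--;	/* stand-in for receiving one packet */
	}
	return work_done;
}
```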


Re: [PATCH] idr: Document ida tree sections

2013-08-13 Thread Tejun Heo
Hey, Kent.

On Tue, Aug 13, 2013 at 04:51:33PM -0700, Kent Overstreet wrote:
> Should probably be almost as good, yeah... in theory, but the space
> efficiency still isn't going to be as good, and it'll probably be more
> code... and at this point I really just don't want to futz with it more.
> At this point unless there's something really wrong with this code I
> just want to move onto something else :P

I think it probably would be okay in most cases but don't feel
confident about acking as it's making trade-offs which are unnecessary
and unusual.  So, ummm, I really don't know.  Maybe it's better enough
than what we have now but at the same time if you want to reimplement
the whole thing you should be persistent / reliable enough to see it
through this time around too, right?

Thanks.

-- 
tejun


Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Linus Torvalds
On Tue, Aug 13, 2013 at 4:10 PM, Nathan Zimmer  wrote:
>
> The only mm structure we are adding to is a new flag in page->flags.
> That didn't seem too much.

I don't agree.

I see only downsides, and no upsides. Doing the same thing *without*
the downsides seems straightforward, so I simply see no reason for any
extra flags or tests at runtime.

   Linus


Re: [RFC 0/3] Pin page control subsystem

2013-08-13 Thread Minchan Kim
Hello Krzysztof,

On Tue, Aug 13, 2013 at 11:46:42AM +0200, Krzysztof Kozlowski wrote:
> Hi Minchan,
> 
> On wto, 2013-08-13 at 16:04 +0900, Minchan Kim wrote:
> > patch 2 introduce pinpage control
> > subsystem. So, subsystems want to control pinpage should implement own
> > pinpage_xxx functions because each subsystem would have other character
> > so what kinds of data structure for managing pinpage information depends
> > on them. Otherwise, they can use general functions defined in pinpage
> > subsystem. patch 3 hacks migration.c so that migration is
> > aware of pinpage now and migrate them with pinpage subsystem.
> 
> I wonder why don't we use page->mapping and a_ops? Is there any
> disadvantage of such mapping/a_ops?

The main concern with that approach is how to handle the nested pin case.
For example, driver A and driver B may happen to pin the same file-backed
page via get_user_pages.
For the migration, we need the following operations.

1. [buffer]'s migrate_page for the file-backed page
2. [driver A]'s migrate_page 
3. [driver B]'s migrate_page

But the page's mapping is only one. How can we handle it?

If we give up on a pinpage subsystem that unifies userspace pages (e.g. GUP)
and kernel-space pages (e.g. zswap, zram and zcache), we can go with
address_space's migratepage, but we might lose the abstraction, so every
user would have to implement its own pinpage manager. That's not hard,
I guess, but it's more error-prone and harder to maintain in the future.

> 
> Best regards,
> Krzysztof
> 

-- 
Kind regards,
Minchan Kim


Re: [PATCH] idr: Document ida tree sections

2013-08-13 Thread Kent Overstreet
On Tue, Aug 13, 2013 at 07:22:11PM -0400, Tejun Heo wrote:
> Hello,
> 
> On Tue, Aug 13, 2013 at 03:59:27PM -0700, Kent Overstreet wrote:
> > > Well, it's not necessarily about requiring it but more about surviving
> > > it with some grace when things don't go as expected, which is an
> > > important characteristic for common library stuff.
> > 
> > The patch I posted should solve the high order allocations stuff, and
> > sparseness from cyclic allocations was already solved.
> 
> I don't know.  Yeah, using vmalloc would be able to work around the
> issue for most cases, I suppose.  It's iffy to consume vmalloc space
> from ida, which functionally is such a basic algorithmic construct.
> It probably won't worsen things noticeably but vmalloc area can be a
> very precious resource on 32bit configs.

This is only using it for the array of pointers to sections though, not
the bitmap itself - and only when that allocation is > 16k. For INT_MAX
allocated ids (absolute worst case) we'd be using 256k of vmalloc memory
on 64 bit, half that on 32 bit.

> 
> > Whatever caching optimizations you do with a radix tree version I could
> > apply to this bitmap tree version, and my bitmap tree code is simpler
> > and _considerably_ faster than the existing code.
> 
> But the difference won't really matter.  Cached performance would be
> the same and that's likely to cover most cases, right?  It's not like
> radix tree is orders of magnitude slower.

Should probably be almost as good, yeah... in theory, but the space
efficiency still isn't going to be as good, and it'll probably be more
code... and at this point I really just don't want to futz with it more.
At this point unless there's something really wrong with this code I
just want to move onto something else :P
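
For illustration only: the worst-case vmalloc figures above, worked through
in code. The 256k and "half that" numbers come from the message; the
resulting count of sections and the ids-per-section value are inferred from
those numbers and are not stated anywhere in the thread. The helper names
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of section pointers that fit in a pointer array of this size. */
static uint64_t sections_from_bytes(uint64_t bytes, unsigned ptr_size)
{
	assert(ptr_size != 0);
	return bytes / ptr_size;
}

/* Size of the pointer array for a given number of sections. */
static uint64_t pointer_array_bytes(uint64_t sections, unsigned ptr_size)
{
	return sections * ptr_size;
}
```

With 8-byte pointers, 256k holds 32768 section pointers; the same 32768
pointers take 128k at 4 bytes each, which matches "half that on 32 bit".
Spreading 2^31 ids over 32768 sections would imply 65536 ids per section.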


Re: [PATCH v7 2/2] mm: make lru_add_drain_all() selective

2013-08-13 Thread Tejun Heo
On Tue, Aug 13, 2013 at 07:44:55PM -0400, Chris Metcalf wrote:
> int lru_add_drain_all(void)
> {
> static struct cpumask mask;

Instead of cpumask,

> static DEFINE_MUTEX(lock);

you can DEFINE_PER_CPU(struct work_struct, ...).

> 	for_each_online_cpu(cpu) {
> 		if (pagevec_count(&per_cpu(lru_add_pvec, cpu)) ||
> 		    pagevec_count(&per_cpu(lru_rotate_pvecs, cpu)) ||
> 		    pagevec_count(&per_cpu(lru_deactivate_pvecs, cpu)) ||
> 		    need_activate_page_drain(cpu))
> 			cpumask_set_cpu(cpu, &mask);

and schedule the work items directly.

> 	}
> 
> 	rc = schedule_on_cpu_mask(lru_add_drain_per_cpu, &mask);

Open coding flushing can be a bit bothersome but you can create a
per-cpu workqueue and schedule work items on it and then flush the
workqueue instead too.

No matter how flushing is implemented, the path wouldn't have any
memory allocation, which I thought was the topic of the thread, no?

Thanks.

-- 
tejun


Re: [PATCH v7 2/2] mm: make lru_add_drain_all() selective

2013-08-13 Thread Chris Metcalf
On 8/13/2013 7:29 PM, Tejun Heo wrote:
> It won't nest and doing it simultaneously won't buy anything, right?
> Wouldn't it be better to protect it with a mutex and define all
> necessary resources statically (yeah, cpumask is pain in the ass and I
> think we should un-deprecate cpumask_t for static use cases)?  Then,
> there'd be no allocation to worry about on the path.

Here's what lru_add_drain_all() looks like with a guarding mutex.
Pretty much the same code complexity as when we have to allocate the
cpumask, and there really aren't any issues from locking, since we can assume
all is well and return immediately if we fail to get the lock.

int lru_add_drain_all(void)
{
	static struct cpumask mask;
	static DEFINE_MUTEX(lock);
	int cpu, rc;

	if (!mutex_trylock(&lock))
		return 0;  /* already ongoing elsewhere */

	cpumask_clear(&mask);
	get_online_cpus();

	/*
	 * Figure out which cpus need flushing.  It's OK if we race
	 * with changes to the per-cpu lru pvecs, since it's no worse
	 * than if we flushed all cpus, since a cpu could still end
	 * up putting pages back on its pvec before we returned.
	 * And this avoids interrupting other cpus unnecessarily.
	 */
	for_each_online_cpu(cpu) {
		if (pagevec_count(&per_cpu(lru_add_pvec, cpu)) ||
		    pagevec_count(&per_cpu(lru_rotate_pvecs, cpu)) ||
		    pagevec_count(&per_cpu(lru_deactivate_pvecs, cpu)) ||
		    need_activate_page_drain(cpu))
			cpumask_set_cpu(cpu, &mask);
	}

	rc = schedule_on_cpu_mask(lru_add_drain_per_cpu, &mask);

	put_online_cpus();
	mutex_unlock(&lock);
	return rc;
}
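
For illustration only: a userspace model of the same shape (trylock, build
a mask of cpus with pending work, drain only those), using a C11
atomic_flag in place of the kernel mutex and a plain array in place of the
per-cpu pagevecs. NR_CPUS_SKETCH, pending[] and drain_all_sketch() are
hypothetical names; nothing here is kernel API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 8

static unsigned int pending[NR_CPUS_SKETCH];	/* per-cpu queued items */
static atomic_flag drain_busy = ATOMIC_FLAG_INIT;

/* Returns the mask of cpus drained; 0 if nothing to do or already running. */
static uint32_t drain_all_sketch(void)
{
	uint32_t mask = 0;
	int cpu;

	/* Trylock: if someone else is draining, just report success. */
	if (atomic_flag_test_and_set(&drain_busy))
		return 0;

	/* Only cpus with pending work are selected. */
	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		if (pending[cpu])
			mask |= UINT32_C(1) << cpu;

	/* Stand-in for scheduling the drain work on the selected cpus. */
	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		if (mask & (UINT32_C(1) << cpu))
			pending[cpu] = 0;

	atomic_flag_clear(&drain_busy);
	return mask;
}
```

Because the mask and the lock are static, the path needs no allocation,
which is the property being discussed.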

-- 
Chris Metcalf, Tilera Corp.
http://www.tilera.com



Re: [PATCH net-next 1/3] net/usb/r8152: support aggregation

2013-08-13 Thread David Miller
From: Oliver Neukum 
Date: Tue, 13 Aug 2013 17:17:10 +0200

> On Tue, 2013-08-13 at 20:32 +0800, hayeswang wrote:
>>  Oliver Neukum [mailto:oneu...@suse.de] 
>> > Sent: Tuesday, August 13, 2013 4:49 PM
>> > To: Hayeswang
>> > Cc: net...@vger.kernel.org; linux-kernel@vger.kernel.org; 
>> > linux-...@vger.kernel.org
>> > Subject: Re: [PATCH net-next 1/3] net/usb/r8152: support aggregation
>> > 
>> [...]
>> > > +   len_used = 0;
>> > > +   rx_desc = agg->head;
>> > > +   rx_data = agg->head;
>> > > +   smp_wmb();
>> > > +   pkt_len = le32_to_cpu(rx_desc->opts1) & RX_LEN_MASK;
>> > > +   len_used += sizeof(struct rx_desc) + pkt_len;
>> > > +
>> > > +   while (urb->actual_length >= len_used) {
>> > > +   if (pkt_len < ETH_ZLEN)
>> > > +   break;
>> > > +
>> > > +   pkt_len -= 4; /* CRC */
>> > > +   rx_data += sizeof(struct rx_desc);
>> > > +
>> > > +   skb = netdev_alloc_skb_ip_align(netdev,
>> > > pkt_len);
>> > > +   if (!skb) {
>> > > +   stats->rx_dropped++;
>> > > +   break;
>> > > +   }
>> > > +   memcpy(skb->data, rx_data, pkt_len);
>> > > +   skb_put(skb, pkt_len);
>> > > +   skb->protocol = eth_type_trans(skb, netdev);
>> > > +   netif_rx(skb);
>> > > +   stats->rx_packets++;
>> > > +   stats->rx_bytes += pkt_len;
>> > > +
>> > > +   rx_data = rx_agg_align(rx_data + 
>> > pkt_len + 4);
>> > > +   rx_desc = (struct rx_desc *)rx_data;
>> > > +   smp_wmb();
>> > 
>> > Against what is the memory barrier?
>> 
>> Excuse me. I don't understand your question. Do you mean the function should 
>> not
>> be used here?
> 
> I don't understand what problem the function is supposed to fix. As long
> as I don't understand it I cannot say for sure whether it is correct.
> There seems no obvious reason for a memory barrier, but there may be a
> hidden reason I don't see.

Hayes, when Oliver asks you "Against what is the memory barrier?" he is asking
you which memory operations you are trying to order.

You do not explain this in your commit message, nor do you explain it with a
suitable comment.  This is not acceptable.

It is absolutely critical that, any time you add a memory barrier, you add a
comment above the new memory barrier explaining exactly what the barrier is
trying to achieve.

In fact, this is required by our coding standards.
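
For illustration only: a userspace sketch of what such a comment can look
like, using C11 fences rather than the kernel's smp_wmb()/smp_rmb(). The
names payload, ready, publish_sketch() and consume_sketch() are
hypothetical and unrelated to the r8152 driver; the sketch only shows each
barrier naming the accesses it orders and the barrier it pairs with.

```c
#include <assert.h>
#include <stdatomic.h>

static int payload;		/* data being handed over */
static atomic_int ready;	/* set once payload is valid */

static void publish_sketch(int value)
{
	payload = value;
	/*
	 * Order the payload store before the ready store, so a reader
	 * that sees ready == 1 also sees the payload. Pairs with the
	 * acquire fence in consume_sketch().
	 */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

static int consume_sketch(int *out)
{
	assert(out != 0);
	if (!atomic_load_explicit(&ready, memory_order_relaxed))
		return 0;
	/*
	 * Order the ready load before the payload load. Pairs with the
	 * release fence in publish_sketch().
	 */
	atomic_thread_fence(memory_order_acquire);
	*out = payload;
	return 1;
}
```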


Re: [PATCH 0/3] module: Allow parameters without arguments

2013-08-13 Thread Lucas De Marchi
On Tue, Aug 13, 2013 at 6:02 PM, Steven Rostedt  wrote:
> Rusty,
>
> I'm looking at porting my "enable tracepoints in module load" patches
> and one of the comments you gave me (long ago) was to not have:
>
>  trace_foo=1
>
> but to just have:
>
>  trace_foo
>
> as a parameter name. I went and implemented this but discovered that the
> functions that allow no arguments are hard coded in the params.c file.
>
> I changed this to allow other "set" functions to be given no arguments,
> and even noticed that a few already exist in the kernel. So I'm sending
> you this patch set that implements a modification to the parameter
> parsing to allow other kernel_param_ops to not bother with arguments
> passed in.
>
> What do you think?

So on the kernel command line we would have modulename.param instead of
modulename.param=1?

I guess we need to update kmod then, because currently we ignore and
treat this case as a wrong token. From a quick look, allowing it in
kmod would be as simple as removing a condition check.

Lucas De Marchi
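
For illustration only: a userspace sketch of a parser that accepts a bare
parameter name when the parameter has opted in through a flag, in the
spirit of the NOARG flag from the quoted series. struct param_sketch,
PARAM_FL_NOARG_SKETCH and parse_param_sketch() are hypothetical; this is
neither the params.c code nor the kmod tokenizer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PARAM_FL_NOARG_SKETCH (1u << 0)	/* "foo" allowed instead of "foo=1" */

struct param_sketch {
	const char *name;
	unsigned int flags;
	int value;
};

/* Parse "name" or "name=0|1"; returns 0 on success, -1 on a bad token. */
static int parse_param_sketch(struct param_sketch *p, const char *token)
{
	size_t len;
	const char *rest;

	assert(p != NULL && token != NULL);
	len = strlen(p->name);
	if (strncmp(token, p->name, len) != 0)
		return -1;		/* different parameter */
	rest = token + len;
	if (*rest == '\0') {
		/* Bare name: only valid when the parameter opted in. */
		if (!(p->flags & PARAM_FL_NOARG_SKETCH))
			return -1;
		p->value = 1;
		return 0;
	}
	if (*rest != '=')
		return -1;		/* longer name sharing the prefix */
	if (strcmp(rest + 1, "1") == 0)
		p->value = 1;
	else if (strcmp(rest + 1, "0") == 0)
		p->value = 0;
	else
		return -1;
	return 0;
}
```

A parameter without the flag still rejects the bare form, so existing
parameters keep their current behaviour.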


Re: [PATCH v1 2/2] perf tools: add attr->mmap2 support

2013-08-13 Thread Andi Kleen
On Tue, Aug 13, 2013 at 01:55:57PM +0200, Stephane Eranian wrote:
> This patch adds support for the new PERF_RECORD_MMAP2
> record type exposed by the kernel. This is an extended
> PERF_RECORD_MMAP record. It adds for each file-backed
> mapping the device major, minor number and the inode
> number. This triplet uniquely identifies the source
> of a file-backed mapping. It can be used to detect
> identical virtual mappings between processes for instance.

Can you also add the generation number please?
That would make it even more unique.

-Andi
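
For illustration only: a userspace sketch of comparing two mappings by the
identifying fields discussed above, with the generation number included.
struct mmap2_id_sketch and its field names are hypothetical and are not the
layout of the PERF_RECORD_MMAP2 record.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical identity of a file-backed mapping. */
struct mmap2_id_sketch {
	uint32_t maj;			/* device major */
	uint32_t min;			/* device minor */
	uint64_t ino;			/* inode number */
	uint64_t ino_generation;	/* the extra field requested here */
};

/* Returns 1 when both mappings refer to the same backing file. */
static int same_backing_file_sketch(const struct mmap2_id_sketch *a,
				    const struct mmap2_id_sketch *b)
{
	assert(a && b);
	return a->maj == b->maj && a->min == b->min &&
	       a->ino == b->ino &&
	       a->ino_generation == b->ino_generation;
}
```

The generation matters when an inode number has been reused: the triplet
alone would then compare equal for two different files.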


Re: [Ksummit-2013-discuss] DT bindings as ABI [was: Do we have people interested in device tree janitoring / cleanup?]

2013-08-13 Thread H. Peter Anvin
On 08/01/2013 08:50 PM, David Gibson wrote:
> On Wed, Jul 31, 2013 at 05:26:47PM -0400, jonsm...@gmail.com
> wrote:
>> On Wed, Jul 31, 2013 at 4:48 PM, Russell King - ARM Linux 
>>  wrote:
>>> On Wed, Jul 31, 2013 at 04:37:36PM -0400, jonsm...@gmail.com
>> wrote:
> [snip]
>> Alternatively you may be of the belief that it is impossible to
>> get rid of the board specific code. But x86 doesn't have any of
>> it, why should ARM?
> 
> Sure x86 has board specific code.  It's just that x86 basically
> only has one board - PC.
> 

That is one aspect (hardware standardization)... but there is more to it
than that.

-hpa



Re: [PATCH v7 2/2] mm: make lru_add_drain_all() selective

2013-08-13 Thread Chris Metcalf
On 8/13/2013 7:29 PM, Tejun Heo wrote:
> Hello,
>
> On Tue, Aug 13, 2013 at 06:53:32PM -0400, Chris Metcalf wrote:
>>  int lru_add_drain_all(void)
>>  {
>> -return schedule_on_each_cpu(lru_add_drain_per_cpu);
>> +return schedule_on_each_cpu_cond(lru_add_drain_per_cpu,
>> + lru_add_drain_cond, NULL);
> It won't nest and doing it simultaneously won't buy anything, right?

Correct on both counts, I think.

> Wouldn't it be better to protect it with a mutex and define all
> necessary resources statically (yeah, cpumask is pain in the ass and I
> think we should un-deprecate cpumask_t for static use cases)?  Then,
> there'd be no allocation to worry about on the path.

If allocation is a real problem on this path, I think this is probably
OK, though I don't want to speak for Andrew.  You could just guard it
with a trylock and any caller that tried to start it while it was
locked could just return happy that it was going on.

I'll put out a version that does that and see how that looks
for comparison's sake.

-- 
Chris Metcalf, Tilera Corp.
http://www.tilera.com



Re: [PATCH] rtnetlink: rtnl_bridge_getlink: Call nlmsg_find_attr() with ifinfomsg header

2013-08-13 Thread David Miller
From: Vlad Yasevich 
Date: Tue, 13 Aug 2013 19:06:37 -0400

> ast explained it in his header message (Bridge VLAN kernel/iproute2
> incompatibility)

That's not a header message.

Header messages have a subject prefix of the form "[PATCH 0/N]".
If he had done this I wouldn't have had to ask such silly
questions.


Re: [PATCH v7 2/2] mm: make lru_add_drain_all() selective

2013-08-13 Thread Tejun Heo
Hello,

On Tue, Aug 13, 2013 at 06:53:32PM -0400, Chris Metcalf wrote:
>  int lru_add_drain_all(void)
>  {
> - return schedule_on_each_cpu(lru_add_drain_per_cpu);
> + return schedule_on_each_cpu_cond(lru_add_drain_per_cpu,
> +  lru_add_drain_cond, NULL);

It won't nest and doing it simultaneously won't buy anything, right?
Wouldn't it be better to protect it with a mutex and define all
necessary resources statically (yeah, cpumask is pain in the ass and I
think we should un-deprecate cpumask_t for static use cases)?  Then,
there'd be no allocation to worry about on the path.

Thanks.

-- 
tejun


[PATCH] i915: Add a Kconfig option to turn on i915.preliminary_hw_support by default

2013-08-13 Thread Josh Triplett
When building kernels for a preliminary hardware target, having to add a
kernel command-line option can prove inconvenient.  Add a Kconfig option
that changes the default of this option to 1.

Signed-off-by: Josh Triplett 
---

I dropped the indication of the default in the module parameter
documentation, but I could also change it to show the default for the
current kernel via ifdef.

 drivers/gpu/drm/Kconfig | 9 +
 drivers/gpu/drm/i915/i915_drv.c | 4 ++--
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index a7c54c8..35d57ed 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -168,6 +168,15 @@ config DRM_I915_KMS
  the driver to bind to PCI devices, which precludes loading things
  like intelfb.
 
+config DRM_I915_PRELIMINARY_HW_SUPPORT
+   bool "Enable preliminary support for prerelease Intel hardware by default"
+   depends on DRM_I915
+   help
+ Choose this option if you have prerelease Intel hardware and want the
+ i915 driver to support it by default.  You can enable such support at
+ runtime with the module option i915.preliminary_hw_support=1; this
+ option changes the default for that module option.
+
 config DRM_MGA
tristate "Matrox g200/g400"
depends on DRM && PCI
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 45b3c03..594e06c 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -118,10 +118,10 @@ module_param_named(i915_enable_ppgtt, i915_enable_ppgtt, int, 0600);
 MODULE_PARM_DESC(i915_enable_ppgtt,
"Enable PPGTT (default: true)");
 
-unsigned int i915_preliminary_hw_support __read_mostly = 0;
+unsigned int i915_preliminary_hw_support __read_mostly = IS_ENABLED(CONFIG_DRM_I915_PRELIMINARY_HW_SUPPORT);
 module_param_named(preliminary_hw_support, i915_preliminary_hw_support, int, 0600);
 MODULE_PARM_DESC(preliminary_hw_support,
-   "Enable preliminary hardware support. (default: false)");
+   "Enable preliminary hardware support.");
 
 int i915_disable_power_well __read_mostly = 1;
 module_param_named(disable_power_well, i915_disable_power_well, int, 0600);
-- 
1.8.4.rc2
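
For illustration only: a userspace sketch of a build-time switch that only
changes the default of a run-time option, which is what the Kconfig entry
above is meant to do. It uses a plain #ifdef rather than the kernel's
IS_ENABLED() macro, and SKETCH_PRELIMINARY_HW_SUPPORT and the *_sketch
names are hypothetical.

```c
#include <assert.h>

/* Build with -DSKETCH_PRELIMINARY_HW_SUPPORT to model the option being set. */
#ifdef SKETCH_PRELIMINARY_HW_SUPPORT
#define PRELIM_DEFAULT_SKETCH 1
#else
#define PRELIM_DEFAULT_SKETCH 0
#endif

/* Stands in for the module parameter's initial value. */
static unsigned int preliminary_hw_support_sketch = PRELIM_DEFAULT_SKETCH;

/* An explicit run-time setting always wins over the build-time default. */
static unsigned int effective_support_sketch(int have_override,
					     unsigned int override)
{
	assert(have_override == 0 || have_override == 1);
	return have_override ? override : preliminary_hw_support_sketch;
}
```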



Re: [PATCH] idr: Document ida tree sections

2013-08-13 Thread Tejun Heo
Hello,

On Tue, Aug 13, 2013 at 03:59:27PM -0700, Kent Overstreet wrote:
> > Well, it's not necessarily about requiring it but more about surviving
> > it with some grace when things don't go as expected, which is an
> > important characteristic for common library stuff.
> 
> The patch I posted should solve the high order allocations stuff, and
> sparseness from cyclic allocations was already solved.

I don't know.  Yeah, using vmalloc would be able to work around the
issue for most cases, I suppose.  It's iffy to consume vmalloc space
from ida, which functionally is such a basic algorithmic construct.
It probably won't worsen things noticeably but vmalloc area can be a
very precious resource on 32bit configs.

> Whatever caching optimizations you do with a radix tree version I could
> apply to this bitmap tree version, and my bitmap tree code is simpler
> and _considerably_ faster than the existing code.

But the difference won't really matter.  Cached performance would be
the same and that's likely to cover most cases, right?  It's not like
radix tree is orders of magnitude slower.

Thanks.

-- 
tejun


Re: [PATCH 0/3] module: Allow parameters without arguments

2013-08-13 Thread Steven Rostedt
On Tue, 13 Aug 2013 17:02:28 -0400
Steven Rostedt  wrote:

> Rusty,
> 
> I'm looking at porting my "enable tracepoints in module load" patches
> and one of the comments you gave me (long ago) was to not have:
> 
>  trace_foo=1
> 
> but to just have:
> 
>  trace_foo
> 
> as a parameter name. I went and implemented this but discovered that the
> functions that allow no arguments are hard coded in the params.c file.
> 
> I changed this to allow other "set" functions to be given no arguments,
> and even noticed that a few already exist in the kernel. So I'm sending
> you this patch set that implements a modification to the parameter
> parsing to allow other kernel_param_ops to not bother with arguments
> passed in.
> 
> What do you think?
> 
> -- Steve
> 
> Steven Rostedt (1):
>   tracing: Enable tracepoints via module parameters
> 

OK, this is what I get for using my scripts along with manually sending
out patches via quilt. I only wanted to send out the three patches
below, but then used my scripts to make this header. The above patch
commit (along with the complete change set below) was not what I
intended on sending :-p

-- Steve



> Steven Rostedt (Red Hat) (3):
>   module: Add flag to allow mod params to have no arguments
>   module: Add NOARG flag for ops with param_set_bool_enable_only() set 
> function
>   module/lsm: Have apparmor module parameters work with no args
> 
> 
>  include/linux/ftrace_event.h |4 +++
>  include/linux/moduleparam.h  |   13 +++-
>  include/trace/ftrace.h   |   23 --
>  kernel/module.c  |1 +
>  kernel/params.c  |6 ++--
>  kernel/trace/trace_events.c  |   71 
> ++
>  security/apparmor/lsm.c  |2 ++
>  7 files changed, 115 insertions(+), 5 deletions(-)
> ---
> diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
> index 120d57a..0395182 100644
> --- a/include/linux/ftrace_event.h
> +++ b/include/linux/ftrace_event.h
> @@ -164,6 +164,8 @@ void tracing_record_cmdline(struct task_struct *tsk);
>  
>  struct event_filter;
>  
> +extern struct kernel_param_ops ftrace_mod_ops;
> +
>  enum trace_reg {
>   TRACE_REG_REGISTER,
>   TRACE_REG_UNREGISTER,
> @@ -202,6 +204,7 @@ enum {
>   TRACE_EVENT_FL_NO_SET_FILTER_BIT,
>   TRACE_EVENT_FL_IGNORE_ENABLE_BIT,
>   TRACE_EVENT_FL_WAS_ENABLED_BIT,
> + TRACE_EVENT_FL_MOD_ENABLE_BIT,
>  };
>  
>  /*
> @@ -220,6 +223,7 @@ enum {
>   TRACE_EVENT_FL_NO_SET_FILTER  = (1 << TRACE_EVENT_FL_NO_SET_FILTER_BIT),
>   TRACE_EVENT_FL_IGNORE_ENABLE  = (1 << TRACE_EVENT_FL_IGNORE_ENABLE_BIT),
>   TRACE_EVENT_FL_WAS_ENABLED  = (1 << TRACE_EVENT_FL_WAS_ENABLED_BIT),
> + TRACE_EVENT_FL_MOD_ENABLE   = (1 << TRACE_EVENT_FL_MOD_ENABLE_BIT),
>  };
>  
>  struct ftrace_event_call {
> diff --git a/include/linux/moduleparam.h b/include/linux/moduleparam.h
> index 27d9da3..c3eb102 100644
> --- a/include/linux/moduleparam.h
> +++ b/include/linux/moduleparam.h
> @@ -36,7 +36,18 @@ static const char __UNIQUE_ID(name)[] \
>  
>  struct kernel_param;
>  
> +/*
> + * Flags available for kernel_param_ops
> + *
> + * NOARG - the parameter allows for no argument (foo instead of foo=1)
> + */
> +enum {
> + KERNEL_PARAM_FL_NOARG = (1 << 0)
> +};
> +
>  struct kernel_param_ops {
> + /* How the ops should behave */
> + unsigned int flags;
>   /* Returns 0, or -errno.  arg is in kp->arg. */
>   int (*set)(const char *val, const struct kernel_param *kp);
>   /* Returns length written or -errno.  Buffer is 4k (ie. be short!) */
> @@ -187,7 +198,7 @@ struct kparam_array
>  /* Obsolete - use module_param_cb() */
>  #define module_param_call(name, set, get, arg, perm) \
>   static struct kernel_param_ops __param_ops_##name = \
> -  { (void *)set, (void *)get };  \
> + { 0, (void *)set, (void *)get };\
>   __module_param_call(MODULE_PARAM_PREFIX,\
>   name, &__param_ops_##name, arg, \
>   (perm) + sizeof(__check_old_set_param(set))*0, -1)
> diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
> index 41a6643..d6029ed 100644
> --- a/include/trace/ftrace.h
> +++ b/include/trace/ftrace.h
> @@ -17,6 +17,7 @@
>   */
>  
>  #include 
> +#include 
>  
>  /*
>   * DECLARE_EVENT_CLASS can be used to add a generic function
> @@ -577,6 +578,22 @@ static inline void ftrace_test_probe_##call(void) \
>  #undef __get_dynamic_array
>  #undef __get_str
>  
> +/*
> + * Add ftrace trace points in modules to be set by module
> + * parameters. This adds "trace_##call" as a module parameter.
> + * The user could enable trace points on module load with:
> + *  trace_##call=1 as a module parameter.
> + 

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Nathan Zimmer
On Tue, Aug 13, 2013 at 10:51:37AM -0700, Linus Torvalds wrote:
> I realize that benchmarking cares, and yes, I also realize that some
> benchmarks actually want to reboot the machine between some runs just
> to get repeatability, but if you're benchmarking a 16TB machine I'm
> guessing any serious benchmark that actually uses that much memory is
> going to take many hours to a few days to run anyway? Having some way
> to wait until the memory is all done (which might even be just a silly
> shell script that does "ps" and waits for the kernel threads to all go
> away) isn't going to kill the benchmark - and the benchmark itself
> will then not have to worry about hitting the "oops, I need to
> initialize 2GB of RAM now because I hit an uninitialized page".
> 
I am not overly concerned with the cost of having to set up a page struct on
first touch, but what I need to avoid is adding more permanent cost to page
faults on a system that is already "primed".

> Ok, so I don't know all the issues, and in many ways I don't even
> really care. You could do it other ways, I don't think this is a big
> deal. The part I hate is the runtime hook into the core MM page
> allocation code, so I'm just throwing out any random thing that comes
> to my mind that could be used to avoid that part.
> 

The only mm structure we are adding to is a new flag in page->flags.
That didn't seem too much.

I had hoped to restrict the core mm changes to check_new_page and
free_pages_check but I haven't gotten there yet.

Not putting uninitialized pages onto the lru would work, but then I
would be concerned about any calculations based on totalpages.  I might be
too paranoid there, but having that be incorrect until after a system is
booted worries me.


Nate
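
A minimal userspace sketch of the "one flag in page->flags" idea, to show
why the fast path stays cheap on a primed system. Everything here is an
assumption made for illustration: PG_UNINIT, struct page as laid out below,
init_page() and prepare_new_page() are hypothetical names, not the RFC's
actual code.

```c
#include <stdbool.h>

enum { PG_UNINIT = 1 << 0 };	/* hypothetical flag bit */

struct page {
	unsigned int flags;
	int refcount;
};

int init_calls;	/* counts how often the slow path ran */

/* Slow path: do the deferred setup once and clear the marker. */
void init_page(struct page *p)
{
	p->refcount = 0;
	p->flags &= ~(unsigned int)PG_UNINIT;
	init_calls++;
}

/* Allocation-time check: a single flag test once the page is primed. */
struct page *prepare_new_page(struct page *p)
{
	if (p->flags & PG_UNINIT)
		init_page(p);
	p->refcount = 1;
	return p;
}
```

After the first touch the only remaining cost in this sketch is the flag
test, which is the permanent cost the message above is worried about.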
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][RFC] tracing: Enable tracepoints via module parameters

2013-08-13 Thread Steven Rostedt
On Tue, 13 Aug 2013 18:34:53 -0400
Mathieu Desnoyers  wrote:


> What I like about this approach, if applied to kernel modules, is that
> it does not require users to interact with module load parameters to
> specify which tracepoints should be enabled: this is all done through
> the regular tracer UI, thus greatly improving user experience.
> 

I have thought about adding a file that would let you enable generic
tracepoints as they are created. Doesn't even need to be specifically
modules, but kprobes as well.

But that's agnostic to this patch. One thing I like about the patch is
that it has modinfo show you the available tracepoints in a module.

-- Steve



[PATCH v7 2/2] mm: make lru_add_drain_all() selective

2013-08-13 Thread Chris Metcalf
This change makes lru_add_drain_all() only selectively interrupt
the cpus that have per-cpu free pages that can be drained.

This is important in nohz mode where calling mlockall(), for
example, otherwise will interrupt every core unnecessarily.

Signed-off-by: Chris Metcalf 
---
v7: try a version with callbacks instead of cpu masks.
Either this or v6 seem like reasonable solutions.

v6: add Tejun's Acked-by, and add missing get/put_cpu_online to
lru_add_drain_all().

v5: provide validity checking on the cpumask for schedule_on_cpu_mask.
By providing an all-or-nothing EINVAL check, we impose the requirement
that the calling code actually know clearly what it's trying to do.
(Note: no change to the mm/swap.c commit)

v4: don't lose possible -ENOMEM in schedule_on_each_cpu()
(Note: no change to the mm/swap.c commit)

v3: split commit into two, one for workqueue and one for mm, though both
should probably be taken through -mm.

 mm/swap.c | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/mm/swap.c b/mm/swap.c
index 4a1d0d2..fe3a488 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -405,6 +405,11 @@ static void activate_page_drain(int cpu)
pagevec_lru_move_fn(pvec, __activate_page, NULL);
 }
 
+static bool need_activate_page_drain(int cpu)
+{
+   return pagevec_count(&per_cpu(activate_page_pvecs, cpu)) != 0;
+}
+
 void activate_page(struct page *page)
 {
if (PageLRU(page) && !PageActive(page) && !PageUnevictable(page)) {
@@ -422,6 +427,11 @@ static inline void activate_page_drain(int cpu)
 {
 }
 
+static bool need_activate_page_drain(int cpu)
+{
+   return false;
+}
+
 void activate_page(struct page *page)
 {
struct zone *zone = page_zone(page);
@@ -673,6 +683,14 @@ void lru_add_drain(void)
put_cpu();
 }
 
+static bool lru_add_drain_cond(void *data, int cpu)
+{
+   return pagevec_count(&per_cpu(lru_add_pvec, cpu)) ||
+   pagevec_count(&per_cpu(lru_rotate_pvecs, cpu)) ||
+   pagevec_count(&per_cpu(lru_deactivate_pvecs, cpu)) ||
+   need_activate_page_drain(cpu);
+}
+
 static void lru_add_drain_per_cpu(struct work_struct *dummy)
 {
lru_add_drain();
@@ -683,7 +701,8 @@ static void lru_add_drain_per_cpu(struct work_struct *dummy)
  */
 int lru_add_drain_all(void)
 {
-   return schedule_on_each_cpu(lru_add_drain_per_cpu);
+   return schedule_on_each_cpu_cond(lru_add_drain_per_cpu,
+lru_add_drain_cond, NULL);
 }
 
 /*
-- 
1.8.3.1
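
The predicate added by this patch can be modelled in userspace as follows.
This is a sketch only: struct cpu_batches, batches[], cpu_needs_drain() and
cpus_to_interrupt() are hypothetical names, and plain counters stand in for
the per-cpu pagevecs.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical per-cpu batch sizes standing in for the pagevec counts. */
struct cpu_batches {
	unsigned int add, rotate, deactivate, activate;
};

struct cpu_batches batches[NR_CPUS];

/* Same shape as lru_add_drain_cond(): true if anything is queued. */
bool cpu_needs_drain(void *data, int cpu)
{
	const struct cpu_batches *b = &batches[cpu];

	(void)data;	/* unused, as in the patch where NULL is passed */
	return b->add || b->rotate || b->deactivate || b->activate;
}

/* Count how many cpus a selective drain would have to interrupt. */
int cpus_to_interrupt(void)
{
	int cpu, n = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_needs_drain(NULL, cpu))
			n++;
	return n;
}
```

A cpu that has spent its time purely in userspace has all four counters at
zero, so in this model it is never counted.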



[PATCH v7 1/2] workqueue: add schedule_on_each_cpu_cond

2013-08-13 Thread Chris Metcalf
This API supports running work on a subset of the online
cpus determined by a callback function.

Signed-off-by: Chris Metcalf 
---
v7: try a version with callbacks instead of cpu masks.
Either this or v6 seem like reasonable solutions.

v6: add Tejun's Acked-by, and add missing get/put_cpu_online to
lru_add_drain_all().

v5: provide validity checking on the cpumask for schedule_on_cpu_mask.
By providing an all-or-nothing EINVAL check, we impose the requirement
that the calling code actually know clearly what it's trying to do.
(Note: no change to the mm/swap.c commit)

v4: don't lose possible -ENOMEM in schedule_on_each_cpu()
(Note: no change to the mm/swap.c commit)

v3: split commit into two, one for workqueue and one for mm, though both
should probably be taken through -mm.

 include/linux/workqueue.h |  3 +++
 kernel/workqueue.c        | 54 +++
 2 files changed, 57 insertions(+)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index a0ed78a..c5ee29f 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -17,6 +17,7 @@ struct workqueue_struct;
 
 struct work_struct;
 typedef void (*work_func_t)(struct work_struct *work);
+typedef bool (*work_cond_func_t)(void *data, int cpu);
 void delayed_work_timer_fn(unsigned long __data);
 
 /*
@@ -471,6 +472,8 @@ extern void drain_workqueue(struct workqueue_struct *wq);
 extern void flush_scheduled_work(void);
 
 extern int schedule_on_each_cpu(work_func_t func);
+extern int schedule_on_each_cpu_cond(work_func_t func, work_cond_func_t cond,
+void *data);
 
 int execute_in_process_context(work_func_t fn, struct execute_work *);
 
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index f02c4a4..5c5b534 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2999,6 +2999,60 @@ int schedule_on_each_cpu(work_func_t func)
 }
 
 /**
+ * schedule_on_each_cpu_cond - execute a function synchronously on each
+ *   online CPU if requested by a condition callback.
+ * @func: the function to call
+ * @cond: the callback function to determine whether to schedule the work
+ * @data: opaque data passed to the callback function
+ *
+ * schedule_on_each_cpu_cond() calls @cond for each online cpu (in the
+ * context of the current cpu), and for each cpu for which @cond returns
+ * true, it executes @func using the system workqueue.  The function
+ * blocks until all CPUs on which work was scheduled have completed.
+ * schedule_on_each_cpu_cond() is very slow.
+ *
+ * The @cond callback is called in the same context as the original
+ * call to schedule_on_each_cpu_cond().
+ *
+ * RETURNS:
+ * 0 on success, -errno on failure.
+ */
+int schedule_on_each_cpu_cond(work_func_t func,
+ work_cond_func_t cond, void *data)
+{
+   int cpu;
+   struct work_struct __percpu *works;
+
+   works = alloc_percpu(struct work_struct);
+   if (!works)
+   return -ENOMEM;
+
+   get_online_cpus();
+
+   for_each_online_cpu(cpu) {
+   struct work_struct *work = per_cpu_ptr(works, cpu);
+
+   if (cond(data, cpu)) {
+   INIT_WORK(work, func);
+   schedule_work_on(cpu, work);
+   } else {
+   work->entry.next = NULL;
+   }
+   }
+
+   for_each_online_cpu(cpu) {
+   struct work_struct *work = per_cpu_ptr(works, cpu);
+
+   if (work->entry.next)
+   flush_work(work);
+   }
+
+   put_online_cpus();
+   free_percpu(works);
+   return 0;
+}
+
+/**
  * flush_scheduled_work - ensure that any scheduled work has run to completion.
  *
  * Forces execution of the kernel-global workqueue and blocks until its
-- 
1.8.3.1
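
The control flow of schedule_on_each_cpu_cond() can be sketched in
userspace like this. It is a simplified model, not the patch: work is run
by calling the function directly, an explicit scheduled[] array replaces
the work->entry.next marker, and the return value is the number of cpus
that were given work rather than 0 or -errno. All names below are
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4

typedef void (*work_fn)(int cpu);
typedef bool (*cond_fn)(void *data, int cpu);

/*
 * First pass: ask cond() about every cpu and "schedule" func where it
 * says yes. Second pass: visit only the cpus that were scheduled, which
 * is where the real code would call flush_work().
 */
int run_on_each_cpu_cond(work_fn func, cond_fn cond, void *data)
{
	bool scheduled[NR_CPUS] = { false };
	int cpu, count = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cond(data, cpu)) {
			scheduled[cpu] = true;
			func(cpu);
		}
	}

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (scheduled[cpu])
			count++;

	return count;
}

/* Example condition: data points at a bitmask of cpus that need work. */
bool cpu_in_mask(void *data, int cpu)
{
	return (*(const unsigned int *)data >> cpu) & 1u;
}

int visits[NR_CPUS];

/* Example work function: record that the cpu was visited. */
void record_visit(int cpu)
{
	visits[cpu]++;
}
```

Passing the opaque data pointer through to the condition is what lets the
caller keep its own state without the helper knowing about it.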



Re: [PATCH] rtnetlink: rtnl_bridge_getlink: Call nlmsg_find_attr() with ifinfomsg header

2013-08-13 Thread Vlad Yasevich

On 08/13/2013 06:54 PM, David Miller wrote:
> From: Vlad Yasevich
> Date: Mon, 12 Aug 2013 15:57:29 -0400
>
>> On 08/12/2013 12:30 PM, Asbjoern Sloth Toennesen wrote:
>>> Fix the iproute2 command `bridge vlan show`, after switching from
>>> rtgenmsg to ifinfomsg.
>>>
>>> Signed-off-by: Asbjoern Sloth Toennesen
>>
>> Thanks..  I've still been using an older iproute version and didn't
>> see this.
>>
>> Reviewed-by: Vlad Yasevich
>
> What introduced this regression?

ast explained it in his header message (Bridge VLAN kernel/iproute2
incompatibility)


-vlad


Re: [PATCH v4 2/2] mm: make lru_add_drain_all() selective

2013-08-13 Thread Chris Metcalf
On 8/13/2013 6:26 PM, Andrew Morton wrote:
> On Tue, 13 Aug 2013 18:13:48 -0400 Chris Metcalf  wrote:
>
>> On 8/13/2013 5:13 PM, Andrew Morton wrote:
>>> On Tue, 13 Aug 2013 16:59:54 -0400 Chris Metcalf wrote:
>>>
>>>>> Then again, why does this patchset exist?  It's a performance
>>>>> optimisation so presumably someone cares.  But not enough to perform
>>>>> actual measurements :(
>>>> The patchset exists because of the difference between zero overhead on
>>>> cpus that don't have drainable lrus, and non-zero overhead.  This turns
>>>> out to be important on workloads where nohz cores are handling 10 Gb
>>>> traffic in userspace and really, really don't want to be interrupted,
>>>> or they drop packets on the floor.
>>> But what is the effect of the patchset?  Has it been tested against the
>>> problematic workload(s)?
>> Yes.  The result is that syscalls such as mlockall(), which otherwise
>> interrupt every core, don't interrupt the cores that are running purely
>> in userspace.  Since they are purely in userspace they don't have any
>> drainable pagevecs, so the patchset means they don't get interrupted and
>> don't drop packets.
>>
>> I implemented this against Linux 2.6.38 and our home-grown version of nohz
>> cpusets back in July 2012, and we have been shipping it to customers
>> since then.
> argh.
>
> Those per-cpu LRU pagevecs were a nasty but very effective locking
> amortization hack back in, umm, 2002.  They have caused quite a lot of
> weird corner-case behaviour, resulting in all the lru_add_drain_all()
> calls sprinkled around the place.  I'd like to nuke the whole thing,
> but that would require a fundamental rethink/rework of all the LRU list
> locking.
>
> According to the 8891d6da17db0f changelog, the lru_add_drain_all() in
> sys_mlock() isn't really required: "it isn't must.  but it reduce the
> failure of moving to unevictable list.  its failure can rescue in
> vmscan later.  but reducing is better."
>
> I suspect we could just kill it.

That's probably true, but I suspect this change is still worthwhile for
nohz environments.  There are other calls of lru_add_drain_all(), and
you just don't want anything in the kernel that interrupts every core
when only a subset could be interrupted.  If the kernel can avoid
generating unnecessary interrupts to uninvolved cores, you can make
guarantees about jitter on cores that are running dedicated userspace code.

-- 
Chris Metcalf, Tilera Corp.
http://www.tilera.com



Re: [PATCH 01/15] drivers: phy: add generic PHY framework

2013-08-13 Thread Tomasz Figa
On Wednesday 14 of August 2013 00:19:28 Sylwester Nawrocki wrote:
> W dniu 2013-08-13 14:05, Kishon Vijay Abraham I pisze:
> > On Tuesday 13 August 2013 05:07 PM, Tomasz Figa wrote:
> >> On Tuesday 13 of August 2013 16:14:44 Kishon Vijay Abraham I wrote:
> >>> On Wednesday 31 July 2013 11:45 AM, Felipe Balbi wrote:
>  On Wed, Jul 31, 2013 at 11:14:32AM +0530, Kishon Vijay Abraham I wrote:
> >> IMHO we need a lookup method for PHYs, just like for clocks,
> >> regulators, PWMs or even i2c busses because there are complex
> >> cases when passing just a name using platform data will not
> >> work. I would second what Stephen said [1] and define a
> >> structure doing things in a DT-like way.
> >>
> >> Example:
> >>
> >> [platform code]
> >>
> >> static const struct phy_lookup my_phy_lookup[] = {
> >>
> >>    PHY_LOOKUP("s3c-hsotg.0", "otg", "samsung-usbphy.1", "phy.2"),
> > 
> > The only problem here is that if *PLATFORM_DEVID_AUTO* is used
> > while creating the device, the ids in the device name would
> > change and PHY_LOOKUP won't be useful.
>  
>  I don't think this is a problem. All the existing lookup methods
>  already use ID to identify devices (see regulators, clkdev, PWMs,
>  i2c, ...). You can simply add a requirement that the ID must be
>  assigned manually, without using PLATFORM_DEVID_AUTO to use PHY
>  lookup.
> >>> 
> >>> And I'm saying that this idea, of using a specific name and id,
> >>> is fraught with fragility and will break in the future in
> >>> various ways when devices get added to systems, making these
> >>> strings constantly have to be kept up to date with different
> >>> board configurations.
> >>>
> >>> People, NEVER, hardcode something like an id.  The fact that
> >>> this happens today with the clock code, doesn't make it right,
> >>> it makes the clock code wrong.  Others have already said that
> >>> this is wrong there as well, as systems change and dynamic ids
> >>> get used more and more.
> >>>
> >>> Let's not repeat the same mistakes of the past just because we
> >>> refuse to learn from them...
> >>>
> >>> So again, the "find a phy by a string" functions should be
> >>> removed, the device id should be automatically created by the
> >>> phy core just to make things unique in sysfs, and no driver code
> >>> should _ever_ be reliant on the number that is being created,
> >>> and the pointer to the phy structure should be used everywhere
> >>> instead.
> >>>
> >>> With those types of changes, I will consider merging this
> >>> subsystem, but without them, sorry, I will not.
> >> 
> >> I'll agree with Greg here, the very fact that we see people
> >> trying to add a requirement of *NOT* using PLATFORM_DEVID_AUTO
> >> already points to a big problem in the framework.
> >>
> >> The fact is that if we don't allow PLATFORM_DEVID_AUTO we will
> >> end up adding similar infrastructure to the drivers themselves to
> >> make sure we don't end up with duplicate names in sysfs in case
> >> we have multiple instances of the same IP in the SoC (or several
> >> of the same PCIe card). I really don't want to go back to that.
> > 
> > If we are using PLATFORM_DEVID_AUTO, then I don't see any way we
> > can give the correct binding information to the PHY framework. I
> > think we can drop having this non-dt support in PHY framework? I
> > see only one platform (OMAP3) going to be needing this non-dt
> > support and we can use the USB PHY library for it.
>  
>  you shouldn't drop support for non-DT platform, in any case we
>  lived without DT (and still do) for years. Gotta find a better
>  way ;-)
> >>> 
> >>> hmm..
> >>> 
> >>> how about passing the device names of PHY in platform data of the
> >>> controller? It should be deterministic as the PHY framework assigns
> >>> its own id and we *don't* want to add any requirement that the ID
> >>> must be assigned manually without using PLATFORM_DEVID_AUTO. We can
> >>> get rid of *phy_init_data* in the v10 patch series.
> 
> OK, so the PHY device name would have a fixed part, passed as
> platform data of the controller and a variable part appended
> by the PHY core, depending on the number of registered PHYs ?
> 
> Then same PHY names would be passed as the PHY provider 

Re: [PATCH v4 2/2] mm: make lru_add_drain_all() selective

2013-08-13 Thread Tejun Heo
Hello, Andrew.

On Tue, Aug 13, 2013 at 03:47:40PM -0700, Andrew Morton wrote:
> > Well, I don't buy that either.  Callback based interface has its
> > issues.
> 
> No it hasn't.  It's a common and simple technique which we all understand.

It sure has its uses but it has ceded some of its former use cases
to better constructs which are easier to use and maintain.  I'm not
saying it's black and white here as the thing is callback based anyway
but was trying to point out general disadvantages of callback based
interface.  If you're saying callback based interface isn't clunkier
compared to constructs which can be embedded in the caller side, this
discussion probably won't be very fruitful.

> It's a relatively small improvement in the lru_add_drain_all() case. 
> Other callsites can gain improvements as well.

Well, if we're talking about minute performance differences, for
non-huge configurations, it'll actually be a small performance
degradation as there will be more instructions and the control will be
jumping back and forth.

> It results in superior runtime code.  At this and potentially other
> callsites.

It's actually inferior in the majority of cases.

> It does buy us things, as I've repeatedly described.  You keep on
> saying things which demonstrably aren't so.  I think I'll give up now.

I just don't think it's something clear cut and it doesn't even matter
for the problem at hand.  Let's please talk about how to solve the
actual problem.

Thanks.

-- 
tejun


Re: [PATCH 00/10] staging: ozwpan: Coding style fixes

2013-08-13 Thread Dan Carpenter
These make me very happy.  Thanks for doing that.

Reviewed-by: Dan Carpenter 

regards,
dan carpenter



Re: [PATCH 10/10] staging: ozwpan: Separate success & failure case for oz_hcd_pd_arrived()

2013-08-13 Thread Dan Carpenter
On Tue, Aug 13, 2013 at 06:29:26PM +0100, Rupesh Gujare wrote:
> From: Dan Carpenter 
> 
> This patch separates success & failure block along with fixing
> following issues:-
> 
> 1. The way oz_hcd_pd_arrived() looks now it's easy to think we free "ep" but
> actually we do this spaghetti thing of setting it to NULL on success.
> 
> 2. It is hard to read it because there are unlocks scattered throughout.
> 
> 3. Currently we set "ep" to NULL on the success path and then test it and or
> free it. In current code you have to scroll to the start of the function
> to read code.
> 
> Original patch was submitted by Dan here :-
> http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2013-August/040113.html
> 
> Signed-off-by: Rupesh Gujare 

Since you gave me the author tag for this then I'll sign off on this
as well.

Signed-off-by: Dan Carpenter 

regards,
dan carpenter



Re: [PATCH] idr: Document ida tree sections

2013-08-13 Thread Kent Overstreet
On Tue, Aug 13, 2013 at 06:44:28PM -0400, Tejun Heo wrote:
> Hello, Kent.
> 
> On Tue, Aug 13, 2013 at 03:27:59PM -0700, Kent Overstreet wrote:
> > It's only naturally a radix tree problem _if_ you require sparseness.
> 
> Well, it's not necessarily about requiring it but more about surviving
> it with some grace when things don't go as expected, which is an
> important characteristic for common library stuff.

The patch I posted should solve the high order allocations stuff, and
sparseness from cyclic allocations was already solved.

> > Otherwise, radix trees require pointer chasing, which we can avoid -
> > which saves us both the cost of chasing pointers (which is significant)
> > and the overhead of storing them.
> 
> Vast majority of which can be avoided with simple caching, right?

Whatever caching optimizations you do with a radix tree version I could
apply to this bitmap tree version, and my bitmap tree code is simpler
and _considerably_ faster than the existing code.
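
To make the "no pointer chasing" argument concrete, here is a small
two-level bitmap allocator written for this note. It is not the posted ida
patch: struct bitmap_tree, bt_alloc() and bt_free() are hypothetical, the
capacity is fixed at 4096 ids, and child positions are found by index
arithmetic instead of stored pointers.

```c
#include <stdint.h>

#define LEAVES 64	/* 64 leaves * 64 bits = 4096 ids */

struct bitmap_tree {
	uint64_t full;			/* bit i set: leaf i has no free id */
	uint64_t leaf[LEAVES];		/* bit j set: id (i * 64 + j) in use */
};

/* Index of the lowest clear bit in w; w must not be all ones. */
int first_zero(uint64_t w)
{
	int bit = 0;

	while (w & 1) {
		w >>= 1;
		bit++;
	}
	return bit;
}

/* Allocate the lowest free id, or return -1 if all 4096 are taken. */
int bt_alloc(struct bitmap_tree *t)
{
	int i, j;

	if (t->full == UINT64_MAX)
		return -1;
	i = first_zero(t->full);
	j = first_zero(t->leaf[i]);
	t->leaf[i] |= UINT64_C(1) << j;
	if (t->leaf[i] == UINT64_MAX)
		t->full |= UINT64_C(1) << i;
	return i * 64 + j;
}

/* Release an id previously returned by bt_alloc(). */
void bt_free(struct bitmap_tree *t, int id)
{
	t->leaf[id / 64] &= ~(UINT64_C(1) << (id % 64));
	t->full &= ~(UINT64_C(1) << (id / 64));
}
```

The trade-off raised in the thread still applies to a layout like this one:
the whole array is allocated up front, so a sparse id space pays for leaves
it never uses.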


Re: [PATCH] rtnetlink: rtnl_bridge_getlink: Call nlmsg_find_attr() with ifinfomsg header

2013-08-13 Thread David Miller
From: Vlad Yasevich 
Date: Mon, 12 Aug 2013 15:57:29 -0400

> On 08/12/2013 12:30 PM, Asbjoern Sloth Toennesen wrote:
>> Fix the iproute2 command `bridge vlan show`, after switching from
>> rtgenmsg to ifinfomsg.
>>
>> Signed-off-by: Asbjoern Sloth Toennesen 
> 
> 
> Thanks..  I've still been using an older iproute version and didn't
> see this.
> 
> 
> Reviewed-by: Vlad Yasevich 

What introduced this regression?


Re: [PATCH v4 2/2] mm: make lru_add_drain_all() selective

2013-08-13 Thread Andrew Morton
On Tue, 13 Aug 2013 18:33:04 -0400 Tejun Heo  wrote:

> Hello, Andrew.
> 
> On Tue, Aug 13, 2013 at 03:18:05PM -0700, Andrew Morton wrote:
> > I don't buy it.  The callback simply determines whether "we need to
> > schedule work on this cpu".  It's utterly simple.  Nobody will have
> > trouble understanding or using such a thing.
> 
> Well, I don't buy that either.  Callback based interface has its
> issues.

No it hasn't.  It's a common and simple technique which we all understand.

>  The difference we're talking about here is pretty minute but
> then again the improvement brought on by the callback is pretty minute
> too.

It's a relatively small improvement in the lru_add_drain_all() case. 
Other callsites can gain improvements as well.

> > It removes one memory allocation and initialisation per call.  It
> > removes an entire for_each_online_cpu() loop.
> 
> But that doesn't solve the original problem at all and while it
> removes the loop, it also adds a separate function.

It results in superior runtime code.  At this and potentially other
callsites.

> > I really don't understand what's going on here.  You're advocating for
> > a weaker kernel interface and for inferior kernel runtime behaviour. 
> > Forcing callers to communicate their needs via a large,
> > dynamically-allocated temporary rather than directly.  And what do we
> > get in return for all this?  Some stuff about callbacks which frankly
> > has me scratching my head.
> 
> Well, it is a fairly heavy path and you're pushing for an optimization
> which won't make any noticeable difference at all.  And, yes, I do
> think we need to stick to simpler APIs wherever possible.  Sure the
> difference is minute here but the addition of test callback doesn't
> buy us anything either, so what's the point?

It does buy us things, as I've repeatedly described.  You keep on
saying things which demonstrably aren't so.  I think I'll give up now.
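
The two interface shapes being argued over can be put side by side in a
userspace toy. This only illustrates the "removes an entire loop" point made
earlier in this message; it says nothing about the workqueue flush cost or
real cpumask allocation, and every name in it is hypothetical.

```c
#include <stdbool.h>

#define NR_CPUS 4

int loops;	/* counts passes over the cpu range, for comparison */
unsigned int pending[NR_CPUS];

bool has_pending(int cpu)
{
	return pending[cpu] != 0;
}

void drain(int cpu)
{
	pending[cpu] = 0;
}

/* Mask style: the caller walks the cpus first, then the helper walks again. */
void drain_with_mask(void)
{
	unsigned int mask = 0;
	int cpu;

	loops++;
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (has_pending(cpu))
			mask |= 1u << cpu;

	loops++;
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			drain(cpu);
}

/* Callback style: one walk, with the predicate consulted inline. */
void drain_with_callback(bool (*cond)(int cpu))
{
	int cpu;

	loops++;
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (cond(cpu))
			drain(cpu);
}
```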




Re: [PATCH] idr: Document ida tree sections

2013-08-13 Thread Tejun Heo
Hello, Kent.

On Tue, Aug 13, 2013 at 03:27:59PM -0700, Kent Overstreet wrote:
> It's only naturally a radix tree problem _if_ you require sparseness.

Well, it's not necessarily about requiring it but more about surviving
it with some grace when things don't go as expected, which is an
important characteristic for common library stuff.

> Otherwise, radix trees require pointer chasing, which we can avoid -
> which saves us both the cost of chasing pointers (which is significant)
> and the overhead of storing them.

Vast majority of which can be avoided with simple caching, right?

Thanks.

-- 
tejun


Re: [PATCH] PM / QoS: Fix workqueue deadlock when using pm_qos_update_request_timeout()

2013-08-13 Thread Rafael J. Wysocki
On Tuesday, August 13, 2013 06:13:25 PM Tejun Heo wrote:
> Hello,
> 
> On Tue, Aug 13, 2013 at 02:12:40PM -0700, Stephen Boyd wrote:
> > @@ -308,7 +319,7 @@ static void pm_qos_work_fn(struct work_struct *work)
> >   struct pm_qos_request,
> >   work);
> >  
> > -   pm_qos_update_request(req, PM_QOS_DEFAULT_VALUE);
> > +   __pm_qos_update_request(req, PM_QOS_DEFAULT_VALUE);
> 
> Maybe a short comment explaining why this is different would be nice?
> Other than that,
> 
>  Reviewed-by: Tejun Heo 

Thanks guys, I'm going to push that as a fix for 3.11-rc6 and stable.

Rafael
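
For context, the deadlock pattern behind the one-line change quoted above
can be modelled in userspace. The sketch assumes, as the subject line
suggests, that the public update function synchronously cancels the timeout
work before updating; struct request, cancel_work_sync(), update_nocancel()
and the other names are hypothetical, and a return code stands in for the
hang.

```c
#include <stdbool.h>

#define DEFAULT_VALUE  (-1)
#define WOULD_DEADLOCK (-2)

struct request {
	int value;
	bool work_running;	/* true while the timeout handler executes */
};

/* Models a synchronous cancel: it cannot finish while the work itself runs. */
int cancel_work_sync(struct request *req)
{
	return req->work_running ? WOULD_DEADLOCK : 0;
}

/* Internal helper: just applies the value, never touches the work item. */
void update_nocancel(struct request *req, int value)
{
	req->value = value;
}

/* Public entry point: cancel any pending timeout, then update. */
int update_request(struct request *req, int value)
{
	int ret = cancel_work_sync(req);

	if (ret)
		return ret;
	update_nocancel(req, value);
	return 0;
}

/* Timeout handler; use_helper selects the fixed or the old call path. */
int timeout_work_fn(struct request *req, bool use_helper)
{
	int ret = 0;

	req->work_running = true;
	if (use_helper)
		update_nocancel(req, DEFAULT_VALUE);
	else
		ret = update_request(req, DEFAULT_VALUE);
	req->work_running = false;
	return ret;
}
```

In this model the old path fails because the handler asks to wait for
itself, while the helper path skips the cancel and simply applies the
default value.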


