Re: [PATCH 09/16] slab: use __GFP_COMP flag for allocating slab pages

2013-08-22 Thread Joonsoo Kim
On Thu, Aug 22, 2013 at 06:00:56PM +, Christoph Lameter wrote:
> On Thu, 22 Aug 2013, Joonsoo Kim wrote:
> 
> > If we use 'struct page' of first page as 'struct slab', there is no
> > advantage not to use __GFP_COMP. So use __GFP_COMP flag for all the cases.
> 
> Ok that brings it in line with SLUB and SLOB.

Yes!

> 
> > @@ -2717,17 +2701,8 @@ static void slab_put_obj(struct kmem_cache *cachep, 
> > struct slab *slabp,
> >  static void slab_map_pages(struct kmem_cache *cache, struct slab *slab,
> >struct page *page)
> >  {
> > -   int nr_pages;
> > -
> > -   nr_pages = 1;
> > -   if (likely(!PageCompound(page)))
> > -   nr_pages <<= cache->gfporder;
> > -
> > -   do {
> > -   page->slab_cache = cache;
> > -   page->slab_page = slab;
> > -   page++;
> > -   } while (--nr_pages);
> > +   page->slab_cache = cache;
> > +   page->slab_page = slab;
> >  }
> 
> And saves some processing.

Yes!

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] misc: Add crossbar driver

2013-08-22 Thread Sricharan R
On Friday 23 August 2013 12:06 PM, Sekhar Nori wrote:
> On Friday 23 August 2013 11:41 AM, Sricharan R wrote:
>> Hi,
>> On Friday 23 August 2013 10:17 AM, Rajendra Nayak wrote:
>>> On Thursday 22 August 2013 05:03 PM, Sricharan R wrote:
  maps crossbar number<->  to interrupt number and
  calls request_irq(int_no, crossbar_handler,..)
>>> So will this mapping happen based on some data passed from DT or
>>> just based on whats available when the device does a request_irq()?
>>>
>>> If its based on whats available then I see an issue when you need
>>> to remap something thats already mapped by default (and not used)
>>> since you run out of all free ones.
>> Yes, when done based on what is available then there is a
>> problem when we run out of free ones because we do not
>> know which one to replace. I was thinking of something like
>> this,
>> 1) DT would give a list of all free ones, and also if some are
>> mapped as default and not used, mark those also as free.
>>
>>  2) While mapping see if it has a default mapping and use it.
>>   otherwise, pick from free list.   
> Since the entire DT is available to you at boot time, you should be able
> to find each node where interrupt-parent = <&crossbar> and then allocate
> one of 0-160 GIC interrupt numbers for that node, no? Where would there
> be a need for default mapping and remapping? From one the mails in the
> thread the crossbar is completely flexible - any of the 320 crossbar
> interrupts can be mapped to any of the 160 GIC interrupts.
>
> Any GIC interrupts left after this boot-time scan can be added to an
> unused list for use with runtime DT fragments (when that support comes).
>  
> Sorry if I misunderstood, but above proposal sounds like maintaining a
> separate free interrupt lines list in DT. That will quickly go out of sync.
 Say, peripheral x uses crossbar 1 and specifies this in DT.
 During boot crossbar 1 gets mapped int 10. So if by default
some other crossbar has its interrupt mapped to 10,
then it should be removed.  Instead clear all crossbar registers
once and mark all as free, then  allocate only during request.
Correct ?. In this the free no need to maintain any list.

Regards,
 Sricharan
 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/3] x86, ACPI, mm: Kill max_low_pfn_mapped.

2013-08-22 Thread Tang Chen
From: Yinghai Lu 

Now we have pfn_mapped[] in , and max_low_pfn_mapped should not be used anymore.

User should use pfn_mapped[] or just 1UL<<(32-PAGE_SHIFT) instead.

Only user is ACPI_INITRD_TABLE_OVERRIDE, and it should not use that,
as later accessing is using early_ioremap(). We could change to use
1U<<(32_PAGE_SHIFT) with it, aka under 4G.

-v2: Leave alone max_low_pfn_mapped in i915 code according to tj.

Suggested-by: H. Peter Anvin 
Signed-off-by: Yinghai Lu 
Signed-off-by: Tang Chen 
Cc: "Rafael J. Wysocki" 
Cc: Jacob Shin 
Cc: Pekka Enberg 
Cc: linux-a...@vger.kernel.org
Tested-by: Thomas Renninger 
Reviewed-by: Tang Chen 
Tested-by: Tang Chen 
---
 arch/x86/include/asm/page_types.h |1 -
 arch/x86/kernel/setup.c   |4 +---
 arch/x86/mm/init.c|4 
 drivers/acpi/osl.c|6 +++---
 4 files changed, 4 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/page_types.h 
b/arch/x86/include/asm/page_types.h
index 54c9787..b012b82 100644
--- a/arch/x86/include/asm/page_types.h
+++ b/arch/x86/include/asm/page_types.h
@@ -43,7 +43,6 @@
 
 extern int devmem_is_allowed(unsigned long pagenr);
 
-extern unsigned long max_low_pfn_mapped;
 extern unsigned long max_pfn_mapped;
 
 static inline phys_addr_t get_max_mapped(void)
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index de33798..f13df7b 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -112,13 +112,11 @@
 #include 
 
 /*
- * max_low_pfn_mapped: highest direct mapped pfn under 4GB
- * max_pfn_mapped: highest direct mapped pfn over 4GB
+ * max_pfn_mapped: highest direct mapped pfn
  *
  * The direct mapping only covers E820_RAM regions, so the ranges and gaps are
  * represented by pfn_mapped
  */
-unsigned long max_low_pfn_mapped;
 unsigned long max_pfn_mapped;
 
 #ifdef CONFIG_DMI
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 2ec29ac..5b2eaca 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -313,10 +313,6 @@ static void add_pfn_range_mapped(unsigned long start_pfn, 
unsigned long end_pfn)
nr_pfn_mapped = clean_sort_range(pfn_mapped, E820_X_MAX);
 
max_pfn_mapped = max(max_pfn_mapped, end_pfn);
-
-   if (start_pfn < (1UL<<(32-PAGE_SHIFT)))
-   max_low_pfn_mapped = max(max_low_pfn_mapped,
-min(end_pfn, 1UL<<(32-PAGE_SHIFT)));
 }
 
 bool pfn_range_is_mapped(unsigned long start_pfn, unsigned long end_pfn)
diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index 6ab2c35..378de0d 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -624,9 +624,9 @@ void __init acpi_initrd_override(void *data, size_t size)
if (table_nr == 0)
return;
 
-   acpi_tables_addr =
-   memblock_find_in_range(0, max_low_pfn_mapped << PAGE_SHIFT,
-  all_tables_size, PAGE_SIZE);
+   /* under 4G at first, then above 4G */
+   acpi_tables_addr = memblock_find_in_range(0, (1ULL<<32) - 1,
+   all_tables_size, PAGE_SIZE);
if (!acpi_tables_addr) {
WARN_ON(1);
return;
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 05/16] slab: remove cachep in struct slab_rcu

2013-08-22 Thread Joonsoo Kim
On Thu, Aug 22, 2013 at 05:53:00PM +, Christoph Lameter wrote:
> On Thu, 22 Aug 2013, Joonsoo Kim wrote:
> 
> > We can get cachep using page in struct slab_rcu, so remove it.
> 
> Ok but this means that we need to touch struct page. Additional cacheline
> in cache footprint.

In following patch, we overload RCU_HEAD to LRU of struct page and
also overload struct slab to struct page. So there is no
additional cacheline footprint at final stage.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 5/8] rcu: eliminate deadlock for rcu read site

2013-08-22 Thread Lai Jiangshan
[PATCH] rcu/rt_mutex: eliminate a kind of deadlock for rcu read site

Current rtmutex's lock->wait_lock doesn't disables softirq nor irq, it will
cause rcu read site deadlock when rcu overlaps with any 
softirq-context/irq-context lock.

@L is a spinlock of softirq or irq context.

CPU1cpu2(rcu boost)
rcu_read_lock() rt_mutext_lock()
  raw_spin_lock(lock->wait_lock)
spin_lock_XX(L)   
rcu_read_unlock() do_softirq()
  rcu_read_unlock_special()
rt_mutext_unlock()
  raw_spin_lock(lock->wait_lock)spin_lock_XX(L)  **DEADLOCK**

This patch fixes this kind of deadlock by removing rt_mutext_unlock() from
rcu_read_unlock(), new rt_mutex_rcu_deboost_unlock() is called instead.
Thus rtmutex's lock->wait_lock will not be called from rcu_read_unlock().

This patch does not eliminate all kinds of rcu-read-site deadlock,
if @L is a scheduler lock, it will be deadlock, we should apply Paul's rule
in this case.(avoid overlapping or preempt_disable()).

rt_mutex_rcu_deboost_unlock() requires the @waiter is queued, so we
can't directly call rt_mutex_lock(&mtx) in the rcu_boost thread,
we split rt_mutex_lock(&mtx) into two steps just like pi-futex.
This result a internal state in rcu_boost thread and cause
rcu_boost thread a bit more complicated.

Thanks
Lai

diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index 5cd0f09..8830874 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -102,7 +102,7 @@ extern struct group_info init_groups;
 
 #ifdef CONFIG_RCU_BOOST
 #define INIT_TASK_RCU_BOOST()  \
-   .rcu_boost_mutex = NULL,
+   .rcu_boost_waiter = NULL,
 #else
 #define INIT_TASK_RCU_BOOST()
 #endif
diff --git a/include/linux/sched.h b/include/linux/sched.h
index e9995eb..1eca99f 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1078,7 +1078,7 @@ struct task_struct {
struct rcu_node *rcu_blocked_node;
 #endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
 #ifdef CONFIG_RCU_BOOST
-   struct rt_mutex *rcu_boost_mutex;
+   struct rt_mutex_waiter *rcu_boost_waiter;
 #endif /* #ifdef CONFIG_RCU_BOOST */
 
 #if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
@@ -1723,7 +1723,7 @@ static inline void rcu_copy_process(struct task_struct *p)
p->rcu_blocked_node = NULL;
 #endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
 #ifdef CONFIG_RCU_BOOST
-   p->rcu_boost_mutex = NULL;
+   p->rcu_boost_waiter = NULL;
 #endif /* #ifdef CONFIG_RCU_BOOST */
INIT_LIST_HEAD(&p->rcu_node_entry);
 }
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 769e12e..d207ddd 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -33,6 +33,7 @@
 #define RCU_KTHREAD_PRIO 1
 
 #ifdef CONFIG_RCU_BOOST
+#include "rtmutex_common.h"
 #define RCU_BOOST_PRIO CONFIG_RCU_BOOST_PRIO
 #else
 #define RCU_BOOST_PRIO RCU_KTHREAD_PRIO
@@ -340,7 +341,7 @@ void rcu_read_unlock_special(struct task_struct *t)
unsigned long flags;
struct list_head *np;
 #ifdef CONFIG_RCU_BOOST
-   struct rt_mutex *rbmp = NULL;
+   struct rt_mutex_waiter *waiter = NULL;
 #endif /* #ifdef CONFIG_RCU_BOOST */
struct rcu_node *rnp;
int special;
@@ -397,10 +398,10 @@ void rcu_read_unlock_special(struct task_struct *t)
 #ifdef CONFIG_RCU_BOOST
if (&t->rcu_node_entry == rnp->boost_tasks)
rnp->boost_tasks = np;
-   /* Snapshot/clear ->rcu_boost_mutex with rcu_node lock held. */
-   if (t->rcu_boost_mutex) {
-   rbmp = t->rcu_boost_mutex;
-   t->rcu_boost_mutex = NULL;
+   /* Snapshot/clear ->rcu_boost_waiter with rcu_node lock held. */
+   if (t->rcu_boost_waiter) {
+   waiter = t->rcu_boost_waiter;
+   t->rcu_boost_waiter = NULL;
}
 #endif /* #ifdef CONFIG_RCU_BOOST */
 
@@ -426,8 +427,8 @@ void rcu_read_unlock_special(struct task_struct *t)
 
 #ifdef CONFIG_RCU_BOOST
/* Unboost if we were boosted. */
-   if (rbmp)
-   rt_mutex_unlock(rbmp);
+   if (waiter)
+   rt_mutex_rcu_deboost_unlock(t, waiter);
 #endif /* #ifdef CONFIG_RCU_BOOST */
 
/*
@@ -1129,9 +1130,6 @@ void exit_rcu(void)
 #endif /* #else #ifdef CONFIG_TREE_PREEMPT_RCU */
 
 #ifdef CONFIG_RCU_BOOST
-
-#include "rtmutex_common.h"
-
 #ifdef CONFIG_RCU_TRACE
 
 static void rcu_initiate_boost_trace(struct rcu_node *rnp)
@@ -1181,14 +1179,15 @@ static int rcu_boost(struct rcu_node *rnp)
 {
unsigned long flags;
struct rt_mutex mtx;
+   struct rt_mutex_waiter rcu_boost_waiter;
struct task_struct *t;
struct list_head *tb;
+   int ret;
 
if (rnp->exp_tasks == NULL && rnp->boost_tasks == NULL)

Re: [PATCH 04/16] slab: remove nodeid in struct slab

2013-08-22 Thread Joonsoo Kim
On Thu, Aug 22, 2013 at 05:51:58PM +, Christoph Lameter wrote:
> On Thu, 22 Aug 2013, Joonsoo Kim wrote:
> 
> > @@ -1099,8 +1098,7 @@ static void drain_alien_cache(struct kmem_cache 
> > *cachep,
> >
> >  static inline int cache_free_alien(struct kmem_cache *cachep, void *objp)
> >  {
> > -   struct slab *slabp = virt_to_slab(objp);
> > -   int nodeid = slabp->nodeid;
> > +   int nodeid = page_to_nid(virt_to_page(objp));
> > struct kmem_cache_node *n;
> > struct array_cache *alien = NULL;
> > int node;
> 
> virt_to_page is a relatively expensive operation. How does this affect
> performance?

Previous code, that is virt_to_slab(), already do virt_to_page().
So this doesn't matter at all.

Thanks.

> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: mailto:"d...@kvack.org";> em...@kvack.org 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 0/3] x86, ACPI, mm: Cleanup for {max|low|max_low}_pfn_mapped.

2013-08-22 Thread Tang Chen
This patch-set does the following:
1. Kill max_low_pfn_mapped as it is useless.
   This patch is from Yinghai.
2. Update min_pfn_mapped and max_pfn_mapped together in add_pfn_range_mapped().
3. Move definition of max_pfn_mapped tp init.c together with min_pfn_mapped.

Tang Chen (2):
  x86, mm: Update min_pfn_mapped in add_pfn_range_mapped().
  x86, mm: Move max_pfn_mapped definition to init.c.

Yinghai Lu (1):
  x86, ACPI, mm: Kill max_low_pfn_mapped.

 arch/x86/include/asm/page_types.h |1 -
 arch/x86/kernel/setup.c   |   10 --
 arch/x86/mm/init.c|   15 ++-
 drivers/acpi/osl.c|6 +++---
 4 files changed, 13 insertions(+), 19 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 3/3] x86, mm: Move max_pfn_mapped definition to init.c.

2013-08-22 Thread Tang Chen
min_pfn_mapped is defined in init.c, we can also define max_pfn_mapped here.

Signed-off-by: Tang Chen 
---
 arch/x86/kernel/setup.c |8 
 arch/x86/mm/init.c  |9 +
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index f13df7b..382e20b 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -111,14 +111,6 @@
 #include 
 #include 
 
-/*
- * max_pfn_mapped: highest direct mapped pfn
- *
- * The direct mapping only covers E820_RAM regions, so the ranges and gaps are
- * represented by pfn_mapped
- */
-unsigned long max_pfn_mapped;
-
 #ifdef CONFIG_DMI
 RESERVE_BRK(dmi_alloc, 65536);
 #endif
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index a97749f..793204b 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -24,6 +24,15 @@ static unsigned long __initdata pgt_buf_start;
 static unsigned long __initdata pgt_buf_end;
 static unsigned long __initdata pgt_buf_top;
 
+/*
+ * max_pfn_mapped: highest direct mapped pfn
+ * min_pfn_mapped: lowest direct mapped pfn
+ *
+ * The direct mapping only covers E820_RAM regions, so the ranges and gaps are
+ * represented by pfn_mapped
+ */
+
+unsigned long max_pfn_mapped;
 static unsigned long min_pfn_mapped;
 
 static bool __initdata can_use_brk_pgt = true;
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] misc: Add crossbar driver

2013-08-22 Thread Sekhar Nori
On Friday 23 August 2013 11:41 AM, Sricharan R wrote:
> Hi,
> On Friday 23 August 2013 10:17 AM, Rajendra Nayak wrote:
>> On Thursday 22 August 2013 05:03 PM, Sricharan R wrote:
>>>  maps crossbar number<->  to interrupt number and
>>>  calls request_irq(int_no, crossbar_handler,..)
>> So will this mapping happen based on some data passed from DT or
>> just based on whats available when the device does a request_irq()?
>>
>> If its based on whats available then I see an issue when you need
>> to remap something thats already mapped by default (and not used)
>> since you run out of all free ones.
> Yes, when done based on what is available then there is a
> problem when we run out of free ones because we do not
> know which one to replace. I was thinking of something like
> this,
> 1) DT would give a list of all free ones, and also if some are
> mapped as default and not used, mark those also as free.
> 
>  2) While mapping see if it has a default mapping and use it.
>   otherwise, pick from free list.   

Since the entire DT is available to you at boot time, you should be able
to find each node where interrupt-parent = <&crossbar> and then allocate
one of 0-160 GIC interrupt numbers for that node, no? Where would there
be a need for default mapping and remapping? From one the mails in the
thread the crossbar is completely flexible - any of the 320 crossbar
interrupts can be mapped to any of the 160 GIC interrupts.

Any GIC interrupts left after this boot-time scan can be added to an
unused list for use with runtime DT fragments (when that support comes).

Sorry if I misunderstood, but above proposal sounds like maintaining a
separate free interrupt lines list in DT. That will quickly go out of sync.

Thanks,
Sekhar
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 02/16] slab: change return type of kmem_getpages() to struct page

2013-08-22 Thread Joonsoo Kim
On Thu, Aug 22, 2013 at 05:49:43PM +, Christoph Lameter wrote:
> On Thu, 22 Aug 2013, Joonsoo Kim wrote:
> 
> > @@ -2042,7 +2042,7 @@ static void slab_destroy_debugcheck(struct kmem_cache 
> > *cachep, struct slab *slab
> >   */
> >  static void slab_destroy(struct kmem_cache *cachep, struct slab *slabp)
> >  {
> > -   void *addr = slabp->s_mem - slabp->colouroff;
> > +   struct page *page = virt_to_head_page(slabp->s_mem);
> >
> > slab_destroy_debugcheck(cachep, slabp);
> > if (unlikely(cachep->flags & SLAB_DESTROY_BY_RCU)) {
> 
> Ok so this removes slab offset management. The use of a struct page
> pointer therefore results in coloring support to be not possible anymore.

No, slab offset management is done by colour_off in struct kmem_cache.
This colouroff in struct slab is just for getting start address of the page
at free time. If we can get start address properly, we can remove it without
any side-effect. This patch implement it.

Thanks.

> 
> I would suggest to have a separate patch for coloring removal before this
> patch. It seems that the support is removed in two different patches now.
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: mailto:"d...@kvack.org";> em...@kvack.org 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/12] pid: rewrite task helper functions avoiding task->pid and task->tgid

2013-08-22 Thread Peter Zijlstra
On Thu, Aug 22, 2013 at 05:43:47PM -0400, Richard Guy Briggs wrote:
> On Thu, Aug 22, 2013 at 10:05:55PM +0200, Peter Zijlstra wrote:
> > On Tue, Aug 20, 2013 at 05:32:03PM -0400, Richard Guy Briggs wrote:
> > > This stops these four task helper functions from using the deprecated and
> > > error-prone task->pid and task->tgid.
> > > 
> > > (informed by ebiederman's ea5a4d01)
> > > Cc: "Eric W. Biederman" 
> > > Signed-off-by: Richard Guy Briggs 
> > > ---
> > >  include/linux/sched.h |8 
> > >  1 files changed, 4 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > > index 8e69807..46e739d 100644
> > > --- a/include/linux/sched.h
> > > +++ b/include/linux/sched.h
> > > @@ -1579,7 +1579,7 @@ static inline int pid_alive(struct task_struct *p)
> > >   */
> > >  static inline int is_global_init(struct task_struct *tsk)
> > >  {
> > > - return tsk->pid == 1;
> > > + return task_pid_nr(tsk) == 1;
> > >  }
> > >  
> > >  extern struct pid *cad_pid;
> > > @@ -1930,7 +1930,7 @@ extern struct task_struct *idle_task(int cpu);
> > >   */
> > >  static inline bool is_idle_task(const struct task_struct *p)
> > >  {
> > > - return p->pid == 0;
> > > + return task_pid(p) == &init_struct_pid;
> > >  }
> > >  extern struct task_struct *curr_task(int cpu);
> > >  extern void set_curr_task(int cpu, struct task_struct *p);
> > 
> > Why would you ever want to do this? It just makes these tests more
> > expensive for no gain what so ff'ing ever.
> 
> Backups are generally considered a good idea, but in this case, I'd
> quote:
>   "A man with one watch knows what time it is. A man with two is
>   never certain."

Except that's not the case, with namespaces there's a clear hierarchy
and the task_struct::pid is the one true value aka. root namespace.

Furthermore idle threads really are special and it doesn't make sense to
address them in any but the root namespace, doubly so because only
kernel space does this.

As for the init thread, that function is called is_global_init() for
crying out loud, what numb nut doesn't get that that's supposed to be
using the root namespace?

Seriously, you namespace guys should stop messing things up and
confusing yourselves and others.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 00/16] slab: overload struct slab over struct page to reduce memory usage

2013-08-22 Thread Joonsoo Kim
On Thu, Aug 22, 2013 at 04:47:25PM +, Christoph Lameter wrote:
> On Thu, 22 Aug 2013, Joonsoo Kim wrote:
> 
> > And this patchset change a management method of free objects of a slab.
> > Current free objects management method of the slab is weird, because
> > it touch random position of the array of kmem_bufctl_t when we try to
> > get free object. See following example.
> 
> The ordering is intentional so that the most cache hot objects are removed
> first.

Yes, I know.

> 
> > To get free objects, we access this array with following pattern.
> > 6 -> 3 -> 7 -> 2 -> 5 -> 4 -> 0 -> 1 -> END
> 
> Because that is the inverse order of the objects being freed.
> 
> The cache hot effect may not be that significant since per cpu and per
> node queues have been aded on top. So maybe we do not be so cache aware
> anymore when actually touching struct slab.

I don't change the ordering, I just change how we store that order to
reduce cache footprint. We can simply implement this order via stack.

Assume indexes of free order are 1 -> 0 -> 4.
Currently, this order is stored in very complex way like below.

struct slab's free = 4
kmem_bufctl_t array: 1 END ACTIVE ACTIVE 0

If we allocate one object, we access slab's free and index 4 of
kmem_bufctl_t array.

struct slab's free = 0
kmem_bufctl_t array: 1 END ACTIVE ACTIVE ACTIVE


And then,

struct slab's free = 1
kmem_bufctl_t array: ACTIVE END ACTIVE ACTIVE ACTIVE


And then,

struct slab's free = END
kmem_bufctl_t array: ACTIVE ACTIVE ACTIVE ACTIVE ACTIVE


Following is newly implementation (stack) in same situation.

struct slab's free = 0
kmem_bufctl_t array: 4 0 1

To get an one object,

struct slab's free = 1
kmem_bufctl_t array: dummy 0 1


And then,

struct slab's free = 2
kmem_bufctl_t array: dummy dummy 1


struct slab's free = 3
kmem_bufctl_t array: dummy dummy dummy


The order of returned object is same as previous algorithm.
However this algorithm sequentially accesses kmem_bufctl_t array,
instead of randomly access. This is an advantage of this patch.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/2] pinctrl: rockchip: Remove non-reachable break statement

2013-08-22 Thread Axel Lin
The break statement after return is non-reachable, remove them.

Signed-off-by: Axel Lin 
---
 drivers/pinctrl/pinctrl-rockchip.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/pinctrl/pinctrl-rockchip.c 
b/drivers/pinctrl/pinctrl-rockchip.c
index 7eb1249..319a64a 100644
--- a/drivers/pinctrl/pinctrl-rockchip.c
+++ b/drivers/pinctrl/pinctrl-rockchip.c
@@ -585,7 +585,6 @@ static int rockchip_pinconf_set(struct pinctrl_dev 
*pctldev, unsigned int pin,
switch (param) {
case PIN_CONFIG_BIAS_DISABLE:
return rockchip_set_pull(bank, pin - bank->pin_base, param);
-   break;
case PIN_CONFIG_BIAS_PULL_UP:
case PIN_CONFIG_BIAS_PULL_DOWN:
case PIN_CONFIG_BIAS_PULL_PIN_DEFAULT:
@@ -596,10 +595,8 @@ static int rockchip_pinconf_set(struct pinctrl_dev 
*pctldev, unsigned int pin,
return -EINVAL;
 
return rockchip_set_pull(bank, pin - bank->pin_base, param);
-   break;
default:
return -ENOTSUPP;
-   break;
}
 
return 0;
@@ -633,7 +630,6 @@ static int rockchip_pinconf_get(struct pinctrl_dev 
*pctldev, unsigned int pin,
break;
default:
return -ENOTSUPP;
-   break;
}
 
return 0;
-- 
1.8.1.2



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/2] pinctrl: rockchip: Remove of_match_ptr macro for DT only driver

2013-08-22 Thread Axel Lin
This is a DT only driver and rockchip_pinctrl_dt_match is always compiled in.
Thus remove of_match_ptr macro.

Signed-off-by: Axel Lin 
---
 drivers/pinctrl/pinctrl-rockchip.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pinctrl/pinctrl-rockchip.c 
b/drivers/pinctrl/pinctrl-rockchip.c
index 64ad0c0..7eb1249 100644
--- a/drivers/pinctrl/pinctrl-rockchip.c
+++ b/drivers/pinctrl/pinctrl-rockchip.c
@@ -1365,7 +1365,7 @@ static struct platform_driver rockchip_pinctrl_driver = {
.driver = {
.name   = "rockchip-pinctrl",
.owner  = THIS_MODULE,
-   .of_match_table = of_match_ptr(rockchip_pinctrl_dt_match),
+   .of_match_table = rockchip_pinctrl_dt_match,
},
 };
 
-- 
1.8.1.2



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] misc: Add crossbar driver

2013-08-22 Thread Rajendra Nayak
On Friday 23 August 2013 11:41 AM, Sricharan R wrote:
> Hi,
> On Friday 23 August 2013 10:17 AM, Rajendra Nayak wrote:
>> On Thursday 22 August 2013 05:03 PM, Sricharan R wrote:
>>>  maps crossbar number<->  to interrupt number and
>>>  calls request_irq(int_no, crossbar_handler,..)
>> So will this mapping happen based on some data passed from DT or
>> just based on whats available when the device does a request_irq()?
>>
>> If its based on whats available then I see an issue when you need
>> to remap something thats already mapped by default (and not used)
>> since you run out of all free ones.
> Yes, when done based on what is available then there is a
> problem when we run out of free ones because we do not
> know which one to replace. I was thinking of something like
> this,
> 1) DT would give a list of all free ones, and also if some are
> mapped as default and not used, mark those also as free.
> 
>  2) While mapping see if it has a default mapping and use it.
>   otherwise, pick from free list.   
> 
>   This should be ok right ?

yeah, sounds ok.

> 
> Regards,
>  Sricharan
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 12/20] mm, hugetlb: remove vma_has_reserves()

2013-08-22 Thread Joonsoo Kim
On Thu, Aug 22, 2013 at 04:34:22PM +0530, Aneesh Kumar K.V wrote:
> Joonsoo Kim  writes:
> 
> > On Thu, Aug 22, 2013 at 02:14:38PM +0530, Aneesh Kumar K.V wrote:
> >> Joonsoo Kim  writes:
> >> 
> >> > vma_has_reserves() can be substituted by using return value of
> >> > vma_needs_reservation(). If chg returned by vma_needs_reservation()
> >> > is 0, it means that vma has reserves. Otherwise, it means that vma don't
> >> > have reserves and need a hugepage outside of reserve pool. This 
> >> > definition
> >> > is perfectly same as vma_has_reserves(), so remove vma_has_reserves().
> >> >
> >> > Signed-off-by: Joonsoo Kim 
> >> 
> >> Reviewed-by: Aneesh Kumar K.V 
> >
> > Thanks.
> >
> >> > @@ -580,8 +547,7 @@ static struct page *dequeue_huge_page_vma(struct 
> >> > hstate *h,
> >> >   * have no page reserves. This check ensures that reservations 
> >> > are
> >> >   * not "stolen". The child may still get SIGKILLed
> >> >   */
> >> > -if (!vma_has_reserves(vma, chg) &&
> >> > -h->free_huge_pages - h->resv_huge_pages == 0)
> >> > +if (chg && h->free_huge_pages - h->resv_huge_pages == 0)
> >> >  return NULL;
> >> >
> >> >  /* If reserves cannot be used, ensure enough pages are in the 
> >> > pool */
> >> > @@ -600,7 +566,7 @@ retry_cpuset:
> >> >  if (page) {
> >> >  if (avoid_reserve)
> >> >  break;
> >> > -if (!vma_has_reserves(vma, chg))
> >> > +if (chg)
> >> >  break;
> >> >
> >> >  SetPagePrivate(page);
> >> 
> >> Can you add a comment above both the place to explain why checking chg
> >> is good enough ?
> >
> > Yes, I can. But it will be changed to use_reserve in patch 13 and it
> > represent it's meaning perfectly. So commeting may be useless.
> 
> That should be ok, because having a comment in this patch helps in
> understanding the patch better, even though you are removing that
> later. 

Okay. I will add it in next spin.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] misc: Add crossbar driver

2013-08-22 Thread Sricharan R
Hi,
On Friday 23 August 2013 10:17 AM, Rajendra Nayak wrote:
> On Thursday 22 August 2013 05:03 PM, Sricharan R wrote:
>>  maps crossbar number<->  to interrupt number and
>>  calls request_irq(int_no, crossbar_handler,..)
> So will this mapping happen based on some data passed from DT or
> just based on whats available when the device does a request_irq()?
>
> If its based on whats available then I see an issue when you need
> to remap something thats already mapped by default (and not used)
> since you run out of all free ones.
Yes, when done based on what is available then there is a
problem when we run out of free ones because we do not
know which one to replace. I was thinking of something like
this,
1) DT would give a list of all free ones, and also if some are
mapped as default and not used, mark those also as free.

 2) While mapping see if it has a default mapping and use it.
  otherwise, pick from free list.   

  This should be ok right ?

Regards,
 Sricharan

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH V2] PCI: exynos: add support for MSI

2013-08-22 Thread Jingoo Han
This patch adds support for Message Signaled Interrupt in the
Exynos PCIe diver using Synopsys designware PCIe core IP.

Signed-off-by: Siva Reddy Kallam 
Signed-off-by: Srikanth T Shivanand 
Signed-off-by: Jingoo Han 
Cc: Pratyush Anand 
Cc: Mohit KUMAR 
---
Changes since v1:
- removed unnecessary exynos_pcie_clear_irq_level()
- updated the bindings documentation
- used new msi_chip infrastructure
- removed ARCH_SUPPORTS_MSI
- replaced #ifdef guards with IS_ENABLED(CONFIG_PCI_MSI)

 .../devicetree/bindings/pci/designware-pcie.txt|2 +
 arch/arm/boot/dts/exynos5440.dtsi  |2 +
 drivers/pci/host/pci-exynos.c  |   47 
 drivers/pci/host/pcie-designware.c |  225 
 drivers/pci/host/pcie-designware.h |4 +
 5 files changed, 280 insertions(+)

diff --git a/Documentation/devicetree/bindings/pci/designware-pcie.txt 
b/Documentation/devicetree/bindings/pci/designware-pcie.txt
index eabcb4b..00bb935 100644
--- a/Documentation/devicetree/bindings/pci/designware-pcie.txt
+++ b/Documentation/devicetree/bindings/pci/designware-pcie.txt
@@ -43,6 +43,7 @@ SoC specific DT Entry:
interrupt-map-mask = <0 0 0 0>;
interrupt-map = <0x0 0 &gic 53>;
num-lanes = <4>;
+   msi-base = <200>;
};
 
pcie@2a {
@@ -63,6 +64,7 @@ SoC specific DT Entry:
interrupt-map-mask = <0 0 0 0>;
interrupt-map = <0x0 0 &gic 56>;
num-lanes = <4>;
+   msi-base = <232>;
};
 
 Board specific DT Entry:
diff --git a/arch/arm/boot/dts/exynos5440.dtsi 
b/arch/arm/boot/dts/exynos5440.dtsi
index 5d6cf49..17549b9 100644
--- a/arch/arm/boot/dts/exynos5440.dtsi
+++ b/arch/arm/boot/dts/exynos5440.dtsi
@@ -276,6 +276,7 @@
interrupt-map-mask = <0 0 0 0>;
interrupt-map = <0x0 0 &gic 53>;
num-lanes = <4>;
+   msi-base = <200>;
};
 
pcie@2a {
@@ -296,5 +297,6 @@
interrupt-map-mask = <0 0 0 0>;
interrupt-map = <0x0 0 &gic 56>;
num-lanes = <4>;
+   msi-base = <232>;
};
 };
diff --git a/drivers/pci/host/pci-exynos.c b/drivers/pci/host/pci-exynos.c
index 012ca8a..aaead2c 100644
--- a/drivers/pci/host/pci-exynos.c
+++ b/drivers/pci/host/pci-exynos.c
@@ -48,6 +48,7 @@ struct exynos_pcie {
 #define PCIE_IRQ_SPECIAL   0x008
 #define PCIE_IRQ_EN_PULSE  0x00c
 #define PCIE_IRQ_EN_LEVEL  0x010
+#define IRQ_MSI_ENABLE (0x1 << 2)
 #define PCIE_IRQ_EN_SPECIAL0x014
 #define PCIE_PWR_RESET 0x018
 #define PCIE_CORE_RESET0x01c
@@ -320,9 +321,38 @@ static irqreturn_t exynos_pcie_irq_handler(int irq, void 
*arg)
return IRQ_HANDLED;
 }
 
+static irqreturn_t exynos_pcie_msi_irq_handler(int irq, void *arg)
+{
+   struct pcie_port *pp = arg;
+
+   /* handle msi irq */
+   dw_handle_msi_irq(pp);
+
+   return IRQ_HANDLED;
+}
+
+static void exynos_pcie_msi_init(struct pcie_port *pp)
+{
+   u32 val;
+   struct exynos_pcie *exynos_pcie = to_exynos_pcie(pp);
+   void __iomem *elbi_base = exynos_pcie->elbi_base;
+
+   dw_pcie_msi_init(pp);
+
+   /* enable MSI interrupt */
+   val = readl(elbi_base + PCIE_IRQ_EN_LEVEL);
+   val |= IRQ_MSI_ENABLE;
+   writel(val, elbi_base + PCIE_IRQ_EN_LEVEL);
+   return;
+}
+
 static void exynos_pcie_enable_interrupts(struct pcie_port *pp)
 {
exynos_pcie_enable_irq_pulse(pp);
+
+   if (IS_ENABLED(CONFIG_PCI_MSI))
+   exynos_pcie_msi_init(pp);
+
return;
 }
 
@@ -408,6 +438,23 @@ static int add_pcie_port(struct pcie_port *pp, struct 
platform_device *pdev)
return ret;
}
 
+   if (IS_ENABLED(CONFIG_PCI_MSI)) {
+   pp->msi_irq = platform_get_irq(pdev, 0);
+
+   if (!pp->msi_irq) {
+   dev_err(&pdev->dev, "failed to get msi irq\n");
+   return -ENODEV;
+   }
+
+   ret = devm_request_irq(&pdev->dev, pp->msi_irq,
+   exynos_pcie_msi_irq_handler,
+   IRQF_SHARED, "exynos-pcie", pp);
+   if (ret) {
+   dev_err(&pdev->dev, "failed to request msi irq\n");
+   return ret;
+   }
+   }
+
pp->root_bus_nr = -1;
pp->ops = &exynos_pcie_host_ops;
 
diff --git a/drivers/pci/host/pcie-designware.c 
b/drivers/pci/host/pcie-designware.c
index 77b0c25..a4fed11 100644
--- a/drivers/pci/host/pcie-designware.c
+++ b/drivers/pci/host/pcie-designware.c
@@ -11,8 +11,10 @@
  * published by the Free Software Foundation.
  */
 
+#include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -62,6 +64,12 @@
 #define PCIE_ATU_FUNC(x)  

Re: [ANNOUNCE] 3.10.9-rt5

2013-08-22 Thread Fernando Lopez-Lezcano

On 08/22/2013 11:21 AM, Sebastian Andrzej Siewior wrote:

Dear RT folks!

I'm pleased to announce the v3.10.9-rt5 patch set.


Thanks!,


Changes since v3.10.9-rt4
- swait fixes from Steven. It fixed the issues with CONFIG_RCU_NOCB_CPU
   where the system suddenly froze and RCU wasn't doing its job anymore
- hwlat improvements by Steven

Known issues:

...
Trying to build I get (in make modules):

ERROR: "__udivdi3" [drivers/misc/hwlat_detector.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2

(find attached the final configuration used for building)
-- Fernando


build.log.bz2
Description: application/bzip


RE: [linux-nfc] [PATCH RFC] nfc: add a driver for pn532 connected on uart

2013-08-22 Thread Rymarkiewicz, WaldemarX
Hi  Lars,

>This adds a driver for the nxp pn532 nfc chip.
>It is not meant for merging. Instead it is meant to show that some
>progress has been made and what the current state is and to help
>testing.
>Although I can do some basic things with this driver I expect it to
>contain lots of bugs. Be aware!
>This driver is heavily based on the pn533 driver and duplicates much
>code. This has do be factored out some time.

I'm not sure if this is expected approach adding new drivers. You duplicates 
most of pn533 code which is not good.

Also, note that pn533 and pn532  are pretty the same chips (with small 
differences) and it would be quite natural to support both with one driver. 
Pn533 already reads chip version on init, so at this point you already know 
with which chip you are dealing with.

I suggest to separate transport layer from the core in pn533 and add support 
for uart and usb separately. This is exactly what I've planned while changing 
pn533 to support acr122 device.

Thanks,
/Waldek
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] fs: use inode_set_user to set uid/gid of inode

2013-08-22 Thread David Miller
From: Rui Xiang 
Date: Fri, 23 Aug 2013 10:48:38 +0800

> Use the new interface to set i_uid/i_gid in inode struct.
> 
> Signed-off-by: Rui Xiang 

For the networking bits:

Acked-by: David S. Miller 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] irqchip: sun4i: Don't write to read-only registers

2013-08-22 Thread Axel Lin
According to the datasheet[1], the Interrupt IRQ Pending Registers are
read-only. The implementation of sun4i_irq_ack() is wrong because it writes
to these read-only registers.

This patch removes the wrong irq_ack callback implementation and all the code
writing to these read-only registers in sun4i_of_init().

[1] 
http://dl.linux-sunxi.org/A10/A10%20User%20Manual%20-%20v1.20%20%282012-04-09%2c%20DECRYPTED%29.pdf

Signed-off-by: Axel Lin 
Acked-by: Maxime Ripard 
---
Hi Thomas,
This patch was sent on https://lkml.org/lkml/2013/7/6/59 with Maxime's Ack.
And re-sent on https://lkml.org/lkml/2013/7/19/229
I change the subject line as the patch does is to avoid writing to read-only
registers.

Axel
 drivers/irqchip/irq-sun4i.c | 18 --
 1 file changed, 18 deletions(-)

diff --git a/drivers/irqchip/irq-sun4i.c b/drivers/irqchip/irq-sun4i.c
index a5438d8..29b75c0a 100644
--- a/drivers/irqchip/irq-sun4i.c
+++ b/drivers/irqchip/irq-sun4i.c
@@ -38,18 +38,6 @@ static struct irq_domain *sun4i_irq_domain;
 
 static asmlinkage void __exception_irq_entry sun4i_handle_irq(struct pt_regs 
*regs);
 
-static void sun4i_irq_ack(struct irq_data *irqd)
-{
-   unsigned int irq = irqd_to_hwirq(irqd);
-   unsigned int irq_off = irq % 32;
-   int reg = irq / 32;
-   u32 val;
-
-   val = readl(sun4i_irq_base + SUN4I_IRQ_PENDING_REG(reg));
-   writel(val | (1 << irq_off),
-  sun4i_irq_base + SUN4I_IRQ_PENDING_REG(reg));
-}
-
 static void sun4i_irq_mask(struct irq_data *irqd)
 {
unsigned int irq = irqd_to_hwirq(irqd);
@@ -76,7 +64,6 @@ static void sun4i_irq_unmask(struct irq_data *irqd)
 
 static struct irq_chip sun4i_irq_chip = {
.name   = "sun4i_irq",
-   .irq_ack= sun4i_irq_ack,
.irq_mask   = sun4i_irq_mask,
.irq_unmask = sun4i_irq_unmask,
 };
@@ -114,11 +101,6 @@ static int __init sun4i_of_init(struct device_node *node,
writel(0, sun4i_irq_base + SUN4I_IRQ_MASK_REG(1));
writel(0, sun4i_irq_base + SUN4I_IRQ_MASK_REG(2));
 
-   /* Clear all the pending interrupts */
-   writel(0x, sun4i_irq_base + SUN4I_IRQ_PENDING_REG(0));
-   writel(0x, sun4i_irq_base + SUN4I_IRQ_PENDING_REG(1));
-   writel(0x, sun4i_irq_base + SUN4I_IRQ_PENDING_REG(2));
-
/* Enable protection mode */
writel(0x01, sun4i_irq_base + SUN4I_IRQ_PROTECTION_REG);
 
-- 
1.8.1.2



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH -next] block: fix error return code in parse_parts()

2013-08-22 Thread Caizhiyong
> -Original Message-
> From: Wei Yongjun [mailto:weiyj...@gmail.com]
> Sent: Friday, August 23, 2013 10:48 AM
> To: ax...@kernel.dk; a...@linux-foundation.org; Caizhiyong; k...@redhat.com;
> m...@sysgo.de; dw...@infradead.org; computersforpe...@gmail.com;
> dedek...@infradead.org
> Cc: yongjun_...@trendmicro.com.cn; linux-kernel@vger.kernel.org
> Subject: [PATCH -next] block: fix error return code in parse_parts()
> 
> From: Wei Yongjun 
> 
> Fix to return -EINVAL in the parts parse error handling case instead
> of 0(may overwrite to 0 by parse_subpart()), as done elsewhere in this
> function.
> 
> Signed-off-by: Wei Yongjun 
> ---
>  block/cmdline-parser.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/block/cmdline-parser.c b/block/cmdline-parser.c
> index 18fb435..cc2637f 100644
> --- a/block/cmdline-parser.c
> +++ b/block/cmdline-parser.c
> @@ -135,6 +135,7 @@ static int parse_parts(struct cmdline_parts **parts, 
> const char
> *bdevdef)
> 
>   if (!newparts->subpart) {
>   pr_warn("cmdline partition has no valid partition.");
> + ret = -EINVAL;


Seems OK to me.

>   goto fail;
>   }
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] PCI: exynos: add support for MSI

2013-08-22 Thread Jingoo Han
On Monday, August 12, 2013 7:57 PM, Thierry Reding wrote:
> On Mon, Aug 12, 2013 at 05:56:47PM +0900, Jingoo Han wrote:
> [...]
> > diff --git a/arch/arm/mach-exynos/Kconfig b/arch/arm/mach-exynos/Kconfig
> > index 855d4a7..9ef1c95 100644
> > --- a/arch/arm/mach-exynos/Kconfig
> > +++ b/arch/arm/mach-exynos/Kconfig
> > @@ -93,6 +93,7 @@ config SOC_EXYNOS5440
> > default y
> > depends on ARCH_EXYNOS5
> > select ARCH_HAS_OPP
> > +   select ARCH_SUPPORTS_MSI
> 
> This symbol goes away in Thomas Petazzoni's MSI patch series which is
> targetted at 3.12, so I don't think you should add that here.

OK, I see.
I will remove ARCH_SUPPORTS_MSI.

[.]

> > +#endif
> > +
> >  static void exynos_pcie_enable_interrupts(struct pcie_port *pp)
> >  {
> > exynos_pcie_enable_irq_pulse(pp);
> > +#ifdef CONFIG_PCI_MSI
> > +   exynos_pcie_msi_init(pp);
> > +#endif
> > return;
> >  }
> 
> Instead of the whole #ifdef business above, can't you just use something
> like this in exynos_pcie_enable_interrupts():
> 
>   if (IS_ENABLED(CONFIG_PCI_MSI))
>   exynos_pcie_msi_init(pp);
> 
> Now you can drop the #ifdef guards and the compiler will throw away all
> the related code automatically if PCI_MSI is not selected because the
> functions are all static and unused. This has the advantage of compiling
> all the code whether or not PCI_MSI is selected or not, therefore
> increasing compile coverage of the driver.

OK, I see.
I will use 'if IS_ENABLED(CONFIG_PCI_MSI))', and remove #ifdef guards.

[.]

> > +
> > +void arch_teardown_msi_irq(unsigned int irq)
> > +{
> > +   clear_irq(irq);
> > +}
> 
> And we've reworked this largely so that drivers no longer provide arch_*
> functions because that prevents multi-platform support. So I think you
> need to port this to the new msi_chip infrastructure that's being
> introduced in 3.12.

OK, I have looked at the new msi_chip infrastructure made by Thomas Petazzoni.
I will use this msi_chip.

I really appreciate your comment. :)
Thank you.

Best regards,
Jingoo Han


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH -next] block: fix error return code in parse_parts()

2013-08-22 Thread Caizhiyong
> From: Wei Yongjun [mailto:weiyj...@gmail.com]
> Sent: Friday, August 23, 2013 10:48 AM
> To: ax...@kernel.dk; a...@linux-foundation.org; Caizhiyong; k...@redhat.com;
> m...@sysgo.de; dw...@infradead.org; computersforpe...@gmail.com;
> dedek...@infradead.org
> Cc: yongjun_...@trendmicro.com.cn; linux-kernel@vger.kernel.org
> Subject: [PATCH -next] block: fix error return code in parse_parts()
> 
> From: Wei Yongjun 
> 
> Fix to return -EINVAL in the parts parse error handling case instead
> of 0(may overwrite to 0 by parse_subpart()), as done elsewhere in this
> function.
> 
> Signed-off-by: Wei Yongjun 
> ---
>  block/cmdline-parser.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/block/cmdline-parser.c b/block/cmdline-parser.c
> index 18fb435..cc2637f 100644
> --- a/block/cmdline-parser.c
> +++ b/block/cmdline-parser.c
> @@ -135,6 +135,7 @@ static int parse_parts(struct cmdline_parts **parts, 
> const char
> *bdevdef)
> 
>   if (!newparts->subpart) {
>   pr_warn("cmdline partition has no valid partition.");
> + ret = -EINVAL;

Seems OK to me.

>   goto fail;
>   }
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] irqchip: gic: Don't complain in gic_get_cpumask() if UP system

2013-08-22 Thread Nicolas Pitre
On Thu, 22 Aug 2013, Stephen Boyd wrote:

> On 08/22, Nicolas Pitre wrote:
> > On Thu, 22 Aug 2013, Stephen Boyd wrote:
> > 
> > > On 07/17, Stephen Boyd wrote:
> > > > On 07/17/13 15:53, Nicolas Pitre wrote:
> > > > > On Wed, 17 Jul 2013, Stephen Boyd wrote:
> > > > >
> > > > >> On 07/17/13 15:34, Nicolas Pitre wrote:
> > > > >>> On Wed, 17 Jul 2013, Stephen Boyd wrote:
> > > > >>>
> > > >  On 07/12/13 05:10, Stephen Boyd wrote:
> > > > > On 07/12, Javi Merino wrote:
> > > > >> I agree, we should drop the check.  It's annoying in 
> > > > >> uniprocessors and
> > > > >> unlikely to be found in the real world unless your gic entry in 
> > > > >> the dt
> > > > >> is wrong.
> > > > >>> And that's a likely outcome in the real world.
> > > > >>>
> > > > > Ok. How about this?
> > > >  Any comments?
> > > > >>> What about this instead:
> > > > >> Unfortunately arm64 doesn't have SMP_ON_UP. 
> > > > > And why does that matter?
> > > > 
> > > > Because the gic driver is compiled on both arm and arm64? I suppose we
> > > > could define is_smp() to 1 on arm64 but its probably better to rely on
> > > > generic kernel things instead of arch specific functions.
> > > > 
> > > > >
> > > > >> It sounds like you preferred the first patch using 
> > > > >> num_possible_cpus()
> > > > > Probably, yes.  I didn't follow the early conversation though.
> > > > 
> > > > This was the first patch:
> > > > 
> > > > ---8<
> > > > 
> > > > diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> > > > index 19ceaa6..589c760 100644
> > > > --- a/drivers/irqchip/irq-gic.c
> > > > +++ b/drivers/irqchip/irq-gic.c
> > > > @@ -368,7 +368,7 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
> > > > break;
> > > > }
> > > >  
> > > > -   if (!mask)
> > > > +   if (!mask && num_possible_cpus() > 1)
> > > > pr_crit("GIC CPU mask not found - kernel will fail to 
> > > > boot.\n");
> > > >  
> > > > return mask;
> > > 
> > > Can one of these two patches be picked up?
> > 
> > Sure.  Just send it to RMK's patch system with my ACK.
> > 
> 
> I'm confused on that. MAINTAINERS says this patch should go
> through Thomas Gleixner's irq/core branch but it looks like only
> arm-soc has been taking patches for the current location.

Blah.  OK then, just send it to Thomas.

Initially this code was written and committed by RMK which is why I 
suggested you send him the fix.


Nicolas
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 02/10] sched: Factor out code to should_we_balance()

2013-08-22 Thread Joonsoo Kim
On Thu, Aug 22, 2013 at 12:42:57PM +0200, Peter Zijlstra wrote:
> > >
> > > +redo:
> > 
> > One behavioral change worth noting here is that in the redo case if a
> > CPU has become idle we'll continue trying to load-balance in the
> > !new-idle case.
> > 
> > This could be unpleasant in the case where a package has a pinned busy
> > core allowing this and a newly idle cpu to start dueling for load.
> > 
> > While more deterministically bad in this case now, it could racily do
> > this before anyway so perhaps not worth worrying about immediately.
> 
> Ah, because the old code would effectively redo the check and find the
> idle cpu and thereby our cpu would no longer be the balance_cpu.
> 
> Indeed. And I don't think this was an intentional change. I'll go put
> the redo back before should_we_balance().

Ah, yes.
It isn't my intention. Please fix it.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] misc: Add crossbar driver

2013-08-22 Thread Rajendra Nayak
On Thursday 22 August 2013 05:03 PM, Sricharan R wrote:
>  maps crossbar number<->  to interrupt number and
>  calls request_irq(int_no, crossbar_handler,..)

So will this mapping happen based on some data passed from DT or
just based on whats available when the device does a request_irq()?

If its based on whats available then I see an issue when you need
to remap something thats already mapped by default (and not used)
since you run out of all free ones.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [BUGFIX] drivers/base: fix show_mem_removable section count

2013-08-22 Thread Greg Kroah-Hartman
On Thu, Aug 22, 2013 at 11:17:50PM -0500, Russ Anderson wrote:
> On Thu, Aug 22, 2013 at 09:10:45PM -0700, Greg Kroah-Hartman wrote:
> > On Thu, Aug 22, 2013 at 09:38:38PM -0500, Russ Anderson wrote:
> > > "cat /sys/devices/system/memory/memory*/removable" crashed the system.
> > 
> > On what kernels?  linux-next or Linus's tree, or 3.10.y?
> 
> Linus 3.11-rc6

So 3.10 is ok?  Trying to figure out where to send the fix to, thanks.

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] irqchip: gic: Don't complain in gic_get_cpumask() if UP system

2013-08-22 Thread Stephen Boyd
On 08/22, Nicolas Pitre wrote:
> On Thu, 22 Aug 2013, Stephen Boyd wrote:
> 
> > On 07/17, Stephen Boyd wrote:
> > > On 07/17/13 15:53, Nicolas Pitre wrote:
> > > > On Wed, 17 Jul 2013, Stephen Boyd wrote:
> > > >
> > > >> On 07/17/13 15:34, Nicolas Pitre wrote:
> > > >>> On Wed, 17 Jul 2013, Stephen Boyd wrote:
> > > >>>
> > >  On 07/12/13 05:10, Stephen Boyd wrote:
> > > > On 07/12, Javi Merino wrote:
> > > >> I agree, we should drop the check.  It's annoying in uniprocessors 
> > > >> and
> > > >> unlikely to be found in the real world unless your gic entry in 
> > > >> the dt
> > > >> is wrong.
> > > >>> And that's a likely outcome in the real world.
> > > >>>
> > > > Ok. How about this?
> > >  Any comments?
> > > >>> What about this instead:
> > > >> Unfortunately arm64 doesn't have SMP_ON_UP. 
> > > > And why does that matter?
> > > 
> > > Because the gic driver is compiled on both arm and arm64? I suppose we
> > > could define is_smp() to 1 on arm64 but its probably better to rely on
> > > generic kernel things instead of arch specific functions.
> > > 
> > > >
> > > >> It sounds like you preferred the first patch using num_possible_cpus()
> > > > Probably, yes.  I didn't follow the early conversation though.
> > > 
> > > This was the first patch:
> > > 
> > > ---8<
> > > 
> > > diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> > > index 19ceaa6..589c760 100644
> > > --- a/drivers/irqchip/irq-gic.c
> > > +++ b/drivers/irqchip/irq-gic.c
> > > @@ -368,7 +368,7 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
> > >   break;
> > >   }
> > >  
> > > - if (!mask)
> > > + if (!mask && num_possible_cpus() > 1)
> > >   pr_crit("GIC CPU mask not found - kernel will fail to boot.\n");
> > >  
> > >   return mask;
> > 
> > Can one of these two patches be picked up?
> 
> Sure.  Just send it to RMK's patch system with my ACK.
> 

I'm confused on that. MAINTAINERS says this patch should go
through Thomas Gleixner's irq/core branch but it looks like only
arm-soc has been taking patches for the current location.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 0/5] clk: dt: bindings for mux, divider & gate clocks

2013-08-22 Thread Stephen Boyd
On 08/21, Mike Turquette wrote:
> 
> I just happened across a to-do list note telling me to respond to this
> email. Better late than never.
> 
[snip]
>
> This is a way to establish initial configuration from the consumer's
> perspective. Similarly something can be done for the clock rate with
> assigned-clock-rate.

Ok. Thanks for the information. Unfortunately it isn't what I
thought it was.

> 
> With all of that said this is consumer-level stuff. We'll definitely
> talk about the clock provider DT bindings at the ARM Summit, which is
> what you discuss above.

I can't wait another 2 months to start discussing the clock
provider DT bindings. We need to discuss it on the list.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Re: [PATCH 0/3] kprobes: add new dma insn slot cache for s390

2013-08-22 Thread Masami Hiramatsu
(2013/08/22 14:52), Heiko Carstens wrote:
> Hi Masami,
> 
>> (2013/08/21 21:01), Heiko Carstens wrote:
>>> The current kpropes insn caches allocate memory areas for insn slots with
>>> module_alloc(). The assumption is that the kernel image and module area
>>> are both within the same +/- 2GB memory area.
>>> This however is not true for s390 where the kernel image resides within
>>> the first 2GB (DMA memory area), but the module area is far away in the
>>> vmalloc area, usually somewhere close below the 4TB area.
>>>
>>> For new pc relative instructions s390 needs insn slots that are within
>>> +/- 2GB of each area. That way we can patch displacements of pc-relative
>>> instructions within the insn slots just like x86 and powerpc.
>>>
>>> The module area works already with the normal insn slot allocator, however
>>> there is currently no way to get insn slots that are within the first 2GB
>>> on s390 (aka DMA area).
>>
>> The reason why we allocate instruction buffers from module area is
>> to execute a piece of code on the buffer, which should be executable.
>> I'm not good for s390, is that allows kernel to execute the code
>> on such DMA buffer?
> 
> Yes, the kernel image itself resides in DMA capable memory and it is all
> executable.
> 
>>> Therefore this patch set introduces a third insn slot cache besides the
>>> normal insn and optinsn slot caches: the dmainsn slot cache. Slots can be
>>> allocated and freed with get_dmainsn_slot() and free_dmainsn_slot().
>>
>> OK, but it seems that your patch introduced unneeded complexity. Perhaps,
>> you just have to introduce 2 weak functions to allocate/release such
>> executable and jump-able buffers, like below,
>>
>> void * __weak arch_allocate_executable_page(void)
>> {
>>  return module_alloc(PAGE_SIZE);
>> }
>>
>> void __weak arch_free_executable_page(void *page)
>> {
>>  module_free(NULL, page);
>> }
>>
>> Thus, all you need to do is implementing dmaalloc() version of above
>> functions on s390. No kconfig, no ifdefs are needed. :)
> 
> Hm, I don't see how that can work, or maybe I just don't get your idea ;)
> Or maybe my intention was not clear? So let me try again:
> 
> If the to be probed instruction resides within the first 2GB of memory
> (aka DMA memory, aka kernel image) the insn slot must be within the first
> 2GB as well, otherwise I can't patch pc-relative instructions.
> 
> On the other hand if the to be probed instruction resides in a module
> (aka part of the vmalloc area), the insn slot must reside within the same
> 2GB area as well.
> 
> Therefore I need to different insn slot caches, where the slots are either
> allocated with __get_free_page(GFP_KERNEL | GFP_DMA) (for the kernel image)
> or module_alloc(PAGE_SIZE) for modules.
> 
> I can't have a single cache which satifies both areas.

Oh, I see.
Indeed, that enough reason to add a new cache... By the way, is there
any way to implement it without new kconfig like DMAPROBE and dma flag?
AFAICS, since such flag is strongly depends on the s390 arch, I don't
like to put it in kernel/kprobes.c.

Perhaps, we can make insn slot more generic, e.g. create new slot type
with passing page allocator.

Thank you,

-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 10/13] tracing/uprobes: Fetch args before reserving a ring buffer

2013-08-22 Thread Masami Hiramatsu
(2013/08/23 8:57), zhangwei(Jovi) wrote:
> On 2013/8/23 0:42, Steven Rostedt wrote:
>> On Fri, 09 Aug 2013 18:56:54 +0900
>> Masami Hiramatsu  wrote:
>>
>>> (2013/08/09 17:45), Namhyung Kim wrote:
 From: Namhyung Kim 

 Fetching from user space should be done in a non-atomic context.  So
 use a temporary buffer and copy its content to the ring buffer
 atomically.

 While at it, use __get_data_size() and store_trace_args() to reduce
 code duplication.
>>>
>>> I just concern using kmalloc() in the event handler. For fetching user
>>> memory which can be swapped out, that is true. But most of the cases,
>>> we can presume that it exists on the physical memory.
>>>
>>
>>
>> What about creating a per cpu buffer when uprobes are registered, and
>> delete them when they are finished? Basically what trace_printk() does
>> if it detects that there are users of trace_printk() in the kernel.
>> Note, it does not deallocate them when finished, as it is never
>> finished until reboot ;-)
>>
>> -- Steve
>>
> I also thought out this approach, but the issue is we cannot fetch user
> memory into per-cpu buffer, because use per-cpu buffer should under
> preempt disabled, and fetching user memory could sleep.

Hm, perhaps, we just need a "hot" buffer pool which can be allocate/free
soon, and whan the pool shortage caller just wait or allocate new page
from "cold" area, this is a.k.a. kmem_cache :)

Anyway, kmem_cache/kmalloc looks so heavy to just allocate temporally
buffers for trace handler (and also, those have tracepoints), so I think
you may just need a memory pool whose has enough number of slots with
a semaphore (which will wait if the all slots are currently used).

Thank you,

-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [BUGFIX] drivers/base: fix show_mem_removable section count

2013-08-22 Thread Russ Anderson
On Thu, Aug 22, 2013 at 09:10:45PM -0700, Greg Kroah-Hartman wrote:
> On Thu, Aug 22, 2013 at 09:38:38PM -0500, Russ Anderson wrote:
> > "cat /sys/devices/system/memory/memory*/removable" crashed the system.
> 
> On what kernels?  linux-next or Linus's tree, or 3.10.y?

Linus 3.11-rc6

-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [BUGFIX] drivers/base: fix show_mem_removable section count

2013-08-22 Thread Greg Kroah-Hartman
On Thu, Aug 22, 2013 at 09:38:38PM -0500, Russ Anderson wrote:
> "cat /sys/devices/system/memory/memory*/removable" crashed the system.

On what kernels?  linux-next or Linus's tree, or 3.10.y?

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] fs: supply inode uid/gid setting interface

2013-08-22 Thread Greg KH
On Fri, Aug 23, 2013 at 10:48:36AM +0800, Rui Xiang wrote:
> This patchset implements an accessor functions to set uid/gid
> in inode struct. Just finish code clean up.

Why?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: unused swap offset / bad page map.

2013-08-22 Thread Dave Jones
On Fri, Aug 23, 2013 at 11:27:29AM +0800, Hillf Danton wrote:
 > On Fri, Aug 23, 2013 at 11:21 AM, Dave Jones  wrote:
 > >
 > > I still see the swap_free messages with this applied.
 > >
 > Decremented?

It actually seems worse, seems I can trigger it even easier now, as if
there's a leak.

Dave

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] kernel/padata.c: share code between CPU_ONLINE and CPU_DOWN_FAILED, same to CPU_DOWN_PREPARE and CPU_UP_CANCELED

2013-08-22 Thread Chen Gang
On 08/22/2013 02:43 PM, Chen Gang wrote:
> Share code between CPU_ONLINE and CPU_DOWN_FAILED, same to
> CPU_DOWN_PREPARE and CPU_UP_CANCELED.
> 
> It will fix 2 bugs:
> 
>   "not check the return value of __padata_remove_cpu() and 
> __padata_add_cpu()".
>   "need add 'break' between CPU_UP_CANCELED and CPU_DOWN_FAILED".
> 

Do we need more details descriptions ?

If so, could Steffen give more expert details information ?

Thanks.

> 
> Signed-off-by: Chen Gang 
> ---
>  kernel/padata.c |   20 
>  1 files changed, 4 insertions(+), 16 deletions(-)
> 
> diff --git a/kernel/padata.c b/kernel/padata.c
> index 072f4ee..2f0037a 100644
> --- a/kernel/padata.c
> +++ b/kernel/padata.c
> @@ -846,6 +846,8 @@ static int padata_cpu_callback(struct notifier_block *nfb,
>   switch (action) {
>   case CPU_ONLINE:
>   case CPU_ONLINE_FROZEN:
> + case CPU_DOWN_FAILED:
> + case CPU_DOWN_FAILED_FROZEN:
>   if (!pinst_has_cpu(pinst, cpu))
>   break;
>   mutex_lock(&pinst->lock);
> @@ -857,6 +859,8 @@ static int padata_cpu_callback(struct notifier_block *nfb,
>  
>   case CPU_DOWN_PREPARE:
>   case CPU_DOWN_PREPARE_FROZEN:
> + case CPU_UP_CANCELED:
> + case CPU_UP_CANCELED_FROZEN:
>   if (!pinst_has_cpu(pinst, cpu))
>   break;
>   mutex_lock(&pinst->lock);
> @@ -865,22 +869,6 @@ static int padata_cpu_callback(struct notifier_block 
> *nfb,
>   if (err)
>   return notifier_from_errno(err);
>   break;
> -
> - case CPU_UP_CANCELED:
> - case CPU_UP_CANCELED_FROZEN:
> - if (!pinst_has_cpu(pinst, cpu))
> - break;
> - mutex_lock(&pinst->lock);
> - __padata_remove_cpu(pinst, cpu);
> - mutex_unlock(&pinst->lock);
> -
> - case CPU_DOWN_FAILED:
> - case CPU_DOWN_FAILED_FROZEN:
> - if (!pinst_has_cpu(pinst, cpu))
> - break;
> - mutex_lock(&pinst->lock);
> - __padata_add_cpu(pinst, cpu);
> - mutex_unlock(&pinst->lock);
>   }
>  
>   return NOTIFY_OK;
> 


-- 
Chen Gang
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 06/10] sched, fair: Make group power more consitent

2013-08-22 Thread Preeti U Murthy
On 08/19/2013 09:31 PM, Peter Zijlstra wrote:


Reviewed-by: Preeti U Murthy 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] iommu: WARN_ON when removing a device with no iommu_group associated

2013-08-22 Thread Alex Williamson
[+cc iommu]

On Fri, 2013-08-23 at 09:55 +0800, Wei Yang wrote:
> When removing a device from the system, iommu_group driver will try to
> disconnect it from its group. While in some cases, one device may not
> associated with any iommu_group. For example, not enough DMA address space.
> 
> In the generic bus notification, it will check dev->iommu_group before calling
> iommu_group_remove_device(). While in some cases, developers may call
> iommu_group_remove_device() in a different code path and without check. For
> those devices with dev->iommu_group set to NULL, kernel will crash.
> 
> This patch gives a warning and return when trying to remove a device from an
> iommu_group with dev->iommu_group set to NULL. This helps to indicate some bad
> behavior and also guard the kernel.
> 
> Signed-off-by: Wei Yang 

Acked-by: Alex Williamson 

> ---
>  drivers/iommu/iommu.c |3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index fbe9ca7..43396f0 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -379,6 +379,9 @@ void iommu_group_remove_device(struct device *dev)
>   struct iommu_group *group = dev->iommu_group;
>   struct iommu_device *tmp_device, *device = NULL;
>  
> + if (WARN_ON(!group))
> + return;
> +
>   /* Pre-notify listeners that a device is being removed. */
>   blocking_notifier_call_chain(&group->notifier,
>IOMMU_GROUP_NOTIFY_DEL_DEVICE, dev);



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/6] mm/hwpoison: fix num_poisoned_pages error statistics for thp

2013-08-22 Thread Naoya Horiguchi
Hi Wanpeng,

On Fri, Aug 23, 2013 at 07:52:40AM +0800, Wanpeng Li wrote:
> Hi Naoya,
> On Thu, Aug 22, 2013 at 12:43:08PM -0400, Naoya Horiguchi wrote:
> >On Thu, Aug 22, 2013 at 05:48:24PM +0800, Wanpeng Li wrote:
> >> There is a race between hwpoison page and unpoison page, memory_failure 
> >> set the page hwpoison and increase num_poisoned_pages without hold page 
> >> lock, and one page count will be accounted against thp for 
> >> num_poisoned_pages.
> >> However, unpoison can occur before memory_failure hold page lock and 
> >> split transparent hugepage, unpoison will decrease num_poisoned_pages 
> >> by 1 << compound_order since memory_failure has not yet split transparent 
> >> hugepage with page lock held. That means we account one page for hwpoison
> >> and 1 << compound_order for unpoison. This patch fix it by decrease one 
> >> account for num_poisoned_pages against no hugetlbfs pages case.
> >> 
> >> Signed-off-by: Wanpeng Li 
> >
> >I think that a thp never becomes hwpoisoned without splitting, so "trying
> >to unpoison thp" never happens (I think that this implicit fact should be
> 
> There is a race window here for hwpoison thp: 

OK, thanks for great explanation (it's worth written in description.)
And I found my previous comment was comletely pointless, sorry :(

>   A   
> B
>   memory_failue 
>   TestSetPageHWPoison(p);
>   if (PageHuge(p))
>   nr_pages = 1 << compound_order(hpage);
>   else 
>   nr_pages = 1;
>   atomic_long_add(nr_pages, &num_poisoned_pages); 
>   
> unpoison_memory
>   
> nr_pages = 1<< 
> compound_trans_order(page;)
> 
>   
> if(TestClearPageHWPoison(p))
>   
> 
> atomic_long_sub(nr_pages, &num_poisoned_pages);
>   lock page 
>   if (!PageHWPoison(p))
>   unlock page and return 
>   hwpoison_user_mappings
>   if (PageTransHuge(hpage))
>   split_huge_page(hpage);

When this race happens, our expectation is that num_poisoned_pages is
increased by 1 because finally thread A succeeds to hwpoison one normal page.
So thread B should fail to unpoison without clearing PageHWPoison nor
decreasing num_poisoned_pages.  My suggestion is inserting a PageTransHuge
check before doing TestClearPageHWPoison like follows:

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 1cb3b7d..f551b72 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1336,6 +1336,16 @@ int unpoison_memory(unsigned long pfn)
return 0;
}
 
+   /*
+* unpoison_memory() can encounter thp only when the thp is being
+* worked by memory_failure() and the page lock is not held yet.
+* In such case, we yield to memory_failure() and make unpoison fail.
+*/
+   if (PageTransHuge(page)) {
+   pr_info("MCE: Memory failure is now running on %#lx\n", pfn);
+   return 0;
+   }
+
nr_pages = 1 << compound_trans_order(page);
 
if (!get_page_unless_zero(page)) {


I think that replacing atomic_long_sub() with atomic_long_dec() still
has a meaning, so you don't have to drop that.

> 
> We increase one page count, however, decrease 1 << compound_trans_order.
> The compound_trans_order you mentioned is used here for thp, that's why 
> I don't drop it in patch 2/6.

I don't think that we have to use compound_trans_order() any more, because
with the above change we don't calculate nr_pages any more for thp.
We can reduce the cost to lock/unlock compound_lock as described in 2/6.

> >commented somewhere or asserted with VM_BUG_ON().)
> 
> I will add the VM_BUG_ON() in unpoison_memory after lock page in next
> version.

Sorry, my previous suggestion didn't make sense.

Thank you!
Naoya Horiguchi

> >And nr_pages in unpoison_memory() can be greater than 1 for hugetlbfs page.
> >So does this patch break counting when unpoisoning free hugetlbfs pages?
> >
> >Thanks,
> >Naoya Horiguchi
> >
> >> ---
> >>  mm/memory-failure.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >> 
> >> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> >> index 5092e06..6bfd51e 100644
> >> --- a/mm/memory-failure.c
> >> +++ b/mm/memory-failure.c
> >> @@ -1350,7 +1350,7 @@ int unpoison_memory(unsigned long pfn)
> 

RE: [PATCH] cpuidle: coupled: fix dead loop corner case

2013-08-22 Thread Neil Zhang

> -Original Message-
> From: Colin Cross [mailto:ccr...@google.com]
> Sent: 2013年8月23日 5:08
> To: Neil Zhang
> Cc: Rafael J. Wysocki; Daniel Lezcano; Linux PM list; lkml
> Subject: Re: [PATCH] cpuidle: coupled: fix dead loop corner case
> 
> On Mon, Aug 19, 2013 at 10:17 PM, Neil Zhang 
> wrote:
> > There is a corener case when no peripheral irqs route to secondary
> > cores.
> > Let's take dual core system for example, the sequence is as following:
> >
> > Core 0  Core1
> > 1. set waiting bit and enter waiting
> loop
> > 2. set waiting bit and poke core1
> > 3. clear poke in irq and enter safe
> state
> > 4. set ready bit and enter ready loop
> >
> > Since there is no peripheral irq route to core 1, so it will stay in
> > safe state forever, and core 0 will dead loop in the following code.
> > while (!cpuidle_coupled_cpus_ready(coupled)) {
> > /* Check if any other cpus bailed out of idle. */
> > if (!cpuidle_coupled_cpus_waiting(coupled))
> > }
> >
> > The solution is don't let secondary core enter safe state when it has
> > already handled the poke interrupt.
> >
> > Signed-off-by: Neil Zhang 
> > Reviewed-by: Fangsuo Wu 
> > ---
> >  drivers/cpuidle/coupled.c |7 +++
> >  1 files changed, 7 insertions(+), 0 deletions(-)
> >
> > diff --git a/drivers/cpuidle/coupled.c b/drivers/cpuidle/coupled.c
> > index 2a297f8..a37c718 100644
> > --- a/drivers/cpuidle/coupled.c
> > +++ b/drivers/cpuidle/coupled.c
> > @@ -119,6 +119,7 @@ struct cpuidle_coupled {
> >  #define CPUIDLE_COUPLED_NOT_IDLE   (-1)
> >
> >  static DEFINE_MUTEX(cpuidle_coupled_lock);
> > +static DEFINE_PER_CPU(bool, poke_sync);
> >  static DEFINE_PER_CPU(struct call_single_data,
> > cpuidle_coupled_poke_cb);
> >
> >  /*
> > @@ -295,6 +296,7 @@ static void cpuidle_coupled_poked(void *info)  {
> > int cpu = (unsigned long)info;
> > cpumask_clear_cpu(cpu, &cpuidle_coupled_poked_mask);
> > +   __this_cpu_write(poke_sync, true);
> >  }
> >
> >  /**
> > @@ -473,6 +475,7 @@ retry:
> >  * allowed for a single cpu.
> >  */
> > while (!cpuidle_coupled_cpus_waiting(coupled)) {
> > +   __this_cpu_write(poke_sync, false);
> > if (cpuidle_coupled_clear_pokes(dev->cpu)) {
> > cpuidle_coupled_set_not_waiting(dev->cpu,
> coupled);
> > goto out;
> > @@ -483,6 +486,10 @@ retry:
> > goto out;
> > }
> >
> > +   if (cpuidle_coupled_cpus_waiting(coupled)
> > +   && __this_cpu_read(poke_sync))
> > +   break;
> > +
> > entered_state = cpuidle_enter_state(dev, drv,
> > dev->safe_state_index);
> > }
> > --
> > 1.7.4.1
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe
> > linux-kernel" in the body of a message to majord...@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> 
> I have a similar patch that avoids adding another check for
> cpuidle_coupled_cpus_waiting, and uses the return value from
> cpuidle_coupled_clear_pokes instead of adding a percpu bool.  I will post it
> shortly.
> 
> Do you have a test case that can reproduce this easily?

It's not easy to reproduce.
We only catch one time till now.

Best Regards,
Neil Zhang


Re: unused swap offset / bad page map.

2013-08-22 Thread Dave Jones
On Thu, Aug 22, 2013 at 11:21:28AM +0800, Hillf Danton wrote:
 > On Thu, Aug 22, 2013 at 4:49 AM, Dave Jones  wrote:
 > >
 > > didn't hit the bug_on, but got a bunch of
 > >
 > > [  424.077993] swap_free: Unused swap offset entry 000187d5
 > > [  439.377194] swap_free: Unused swap offset entry 000187e7
 > > [  441.998411] swap_free: Unused swap offset entry 000187ee
 > > [  446.956551] swap_free: Unused swap offset entry 245f
 > >
 > If page is reused, its swap entry is freed.
 > 
 > reuse_swap_page()
 >   delete_from_swap_cache()
 > swapcache_free()
 >   count = swap_entry_free(p, entry, SWAP_HAS_CACHE);
 > 
 > If count drops to zero, then swap_free() gives warning.
 > 
 > 
 > --- a/mm/memory.c Wed Aug  7 16:29:34 2013
 > +++ b/mm/memory.c Thu Aug 22 10:44:32 2013
 > @@ -3123,6 +3123,7 @@ static int do_swap_page(struct mm_struct
 >   /* It's better to call commit-charge after rmap is established */
 >   mem_cgroup_commit_charge_swapin(page, ptr);
 > 
 > + if (!exclusive)
 >   swap_free(entry);
 >   if (vm_swap_full() || (vma->vm_flags & VM_LOCKED) || PageMlocked(page))
 >   try_to_free_swap(page);
 > --

I still see the swap_free messages with this applied.

Dave

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [dm-devel] [PATCH] dm: allow error target to replace either bio-based and request-based targets

2013-08-22 Thread Jun'ichi Nomura
Hello Mike,

On 08/23/13 09:17, Mike Snitzer wrote:
>> I do like the idea of a single error target that is hybrid (supports
>> both bio-based and request-based) but the DM core would need to be
>> updated to support this.
>>
>> Specifically, we'd need to check if the device (and active table) is
>> already bio-based or request-based and select the appropriate type.  If
>> it is a new device, default to selecting bio-based.
>>
>> There are some wrappers and other logic thoughout DM core that will need
>> auditing too.
> 
> Here is a patch that should work for your needs (I tested it to work
> with 'dmsetup wipe_table' on both request-based and bio-based devices):

How about moving the default handling in dm_table_set_type() outside of
the for-each-target loop, like the modified patch below?

For example, if a table has 2 targets, hybrid and request_based,
and live_md_type is DM_TYPE_NONE, the table should be considered as
request_based, not inconsistent.
Though the end result is same as such a table is rejected by other
constraint anyway, I think it's good to keep the semantics clean
and error messages consistent.

I.e. for the above case, the error message should be
"Request-based dm doesn't support multiple targets yet",
not "Inconsistent table: different target types can't be mixed up".

---
Jun'ichi Nomura, NEC Corporation


diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index f221812..6e683c8 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -860,14 +860,16 @@ EXPORT_SYMBOL(dm_consume_args);
 static int dm_table_set_type(struct dm_table *t)
 {
unsigned i;
-   unsigned bio_based = 0, request_based = 0;
+   unsigned bio_based = 0, request_based = 0, hybrid = 0;
struct dm_target *tgt;
struct dm_dev_internal *dd;
struct list_head *devices;
 
for (i = 0; i < t->num_targets; i++) {
tgt = t->targets + i;
-   if (dm_target_request_based(tgt))
+   if (dm_target_hybrid(tgt))
+   hybrid = 1;
+   else if (dm_target_request_based(tgt))
request_based = 1;
else
bio_based = 1;
@@ -879,6 +881,25 @@ static int dm_table_set_type(struct dm_table *t)
}
}
 
+   if (hybrid && !bio_based && !request_based) {
+   /*
+* The targets can work either way.
+* Determine the type from the live device.
+*/
+   unsigned live_md_type;
+   dm_lock_md_type(t->md);
+   live_md_type = dm_get_md_type(t->md);
+   dm_unlock_md_type(t->md);
+   switch (live_md_type) {
+   case DM_TYPE_REQUEST_BASED:
+   request_based = 1;
+   break;
+   default:
+   bio_based = 1;
+   break;
+   }
+   }
+
if (bio_based) {
/* We must use this table as bio-based */
t->type = DM_TYPE_BIO_BASED;
diff --git a/drivers/md/dm-target.c b/drivers/md/dm-target.c
index 37ba5db..242e3ce 100644
--- a/drivers/md/dm-target.c
+++ b/drivers/md/dm-target.c
@@ -131,12 +131,19 @@ static int io_err_map(struct dm_target *tt, struct bio 
*bio)
return -EIO;
 }
 
+static int io_err_map_rq(struct dm_target *ti, struct request *clone,
+union map_info *map_context)
+{
+   return -EIO;
+}
+
 static struct target_type error_target = {
.name = "error",
-   .version = {1, 1, 0},
+   .version = {1, 2, 0},
.ctr  = io_err_ctr,
.dtr  = io_err_dtr,
.map  = io_err_map,
+   .map_rq = io_err_map_rq,
 };
 
 int __init dm_target_init(void)
diff --git a/drivers/md/dm.h b/drivers/md/dm.h
index 45b97da..8b4c075 100644
--- a/drivers/md/dm.h
+++ b/drivers/md/dm.h
@@ -89,10 +89,21 @@ int dm_setup_md_queue(struct mapped_device *md);
 #define dm_target_is_valid(t) ((t)->table)
 
 /*
+ * To check whether the target type is bio-based or not (request-based).
+ */
+#define dm_target_bio_based(t) ((t)->type->map != NULL)
+
+/*
  * To check whether the target type is request-based or not (bio-based).
  */
 #define dm_target_request_based(t) ((t)->type->map_rq != NULL)
 
+/*
+ * To check whether the target type is a hybrid (capable of being
+ * either request-based or bio-based).
+ */
+#define dm_target_hybrid(t) (dm_target_bio_based(t) && 
dm_target_request_based(t))
+
 /*-
  * A registry of target types.
  *---*/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v7 2/3] DMA: Freescale: Add new 8-channel DMA engine device tree nodes

2013-08-22 Thread Hongbo Zhang

On 08/22/2013 07:16 AM, Stephen Warren wrote:

On 08/21/2013 05:00 PM, Scott Wood wrote:

On Wed, 2013-08-21 at 16:40 -0600, Stephen Warren wrote:

On 07/29/2013 04:49 AM, hongbo.zh...@freescale.com wrote:

+- reg   : 
+- interrupts: 

s/interrupts/specifier/

Do you mean s/interrupt mapping/interrupt specifier/?

And probably s/registers mapping/register specifier/ as well.

Yup.


OK, I will update these descriptions.



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH V5 3/5] POWER/cpuidle: Generic IBM-POWER backend cpuidle driver.

2013-08-22 Thread Wang Dongsheng-B40534

> diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
> index 0e2cd5c..e805dcd 100644
> --- a/drivers/cpuidle/Kconfig
> +++ b/drivers/cpuidle/Kconfig

Maybe drivers/cpuidle/Kconfig.powerpc is better? Like arm.

> +obj-$(CONFIG_CPU_IDLE_IBM_POWER) += cpuidle-ibm-power.o
> diff --git a/drivers/cpuidle/cpuidle-ibm-power.c
> b/drivers/cpuidle/cpuidle-ibm-power.c
> new file mode 100644
> index 000..4ee5a94
> --- /dev/null
> +++ b/drivers/cpuidle/cpuidle-ibm-power.c
> @@ -0,0 +1,304 @@
> +/*
> + *  cpuidle-ibm-power - idle state cpuidle driver.
> + *  Adapted from drivers/idle/intel_idle.c and
> + *  drivers/acpi/processor_idle.c
> + *
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +struct cpuidle_driver power_idle_driver = {
> + .name = "IBM-POWER-Idle",
> + .owner= THIS_MODULE,
> +};
> +
> +#define MAX_IDLE_STATE_COUNT 2
> +
> +static int max_idle_state = MAX_IDLE_STATE_COUNT - 1;

Again, do not use the macro.

> +static struct cpuidle_state *cpuidle_state_table;
> +
> +static inline void idle_loop_prolog(unsigned long *in_purr)
> +{
> + *in_purr = mfspr(SPRN_PURR);
> + /*
> +  * Indicate to the HV that we are idle. Now would be
> +  * a good time to find other work to dispatch.
> +  */
> + get_lppaca()->idle = 1;
> +}
> +
> +static inline void idle_loop_epilog(unsigned long in_purr)
> +{
> + get_lppaca()->wait_state_cycles += mfspr(SPRN_PURR) - in_purr;
> + get_lppaca()->idle = 0;
> +}
> +
> +static int snooze_loop(struct cpuidle_device *dev,
> + struct cpuidle_driver *drv,
> + int index)
> +{
> + unsigned long in_purr;
> +
> + idle_loop_prolog(&in_purr);
> + local_irq_enable();

snooze_loop has already registered in cpuidle framework to handle snooze state.
where disable the irq? Why do "enable" here?

> +/*
> + * States for dedicated partition case.
> + */
> +static struct cpuidle_state dedicated_states[MAX_IDLE_STATE_COUNT] = {
> + { /* Snooze */
> + .name = "snooze",
> + .desc = "snooze",
> + .flags = CPUIDLE_FLAG_TIME_VALID,
> + .exit_latency = 0,
> + .target_residency = 0,
> + .enter = &snooze_loop },
> + { /* CEDE */
> + .name = "CEDE",
> + .desc = "CEDE",
> + .flags = CPUIDLE_FLAG_TIME_VALID,
> + .exit_latency = 10,
> + .target_residency = 100,
> + .enter = &dedicated_cede_loop },
> +};
> +
> +/*
> + * States for shared partition case.
> + */
> +static struct cpuidle_state shared_states[MAX_IDLE_STATE_COUNT] = {
> + { /* Shared Cede */
> + .name = "Shared Cede",
> + .desc = "Shared Cede",
> + .flags = CPUIDLE_FLAG_TIME_VALID,
> + .exit_latency = 0,
> + .target_residency = 0,
> + .enter = &shared_cede_loop },
> +};
> +
> +static void __exit power_processor_idle_exit(void)
> +{
> +
> + unregister_cpu_notifier(&setup_hotplug_notifier);

Remove a blank line.

> + cpuidle_unregister(&power_idle_driver);
> + return;
> +}
> +
> +module_init(power_processor_idle_init);
> +module_exit(power_processor_idle_exit);
> +

Did you have tested the module? If not tested, please don't use the module.

> +MODULE_AUTHOR("Deepthi Dharwar ");
> +MODULE_DESCRIPTION("Cpuidle driver for IBM POWER platforms");
> +MODULE_LICENSE("GPL");
> 



[PATCH -next] dma: cppi41: fix error return code in cppi41_dma_probe()

2013-08-22 Thread Wei Yongjun
From: Wei Yongjun 

Fix to return -EINVAL in the irq parse and map error handling
case instead of 0, as done elsewhere in this function.

Signed-off-by: Wei Yongjun 
---
 drivers/dma/cppi41.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/cppi41.c b/drivers/dma/cppi41.c
index 5dcebca..49ea05a 100644
--- a/drivers/dma/cppi41.c
+++ b/drivers/dma/cppi41.c
@@ -973,8 +973,10 @@ static int cppi41_dma_probe(struct platform_device *pdev)
goto err_chans;
 
irq = irq_of_parse_and_map(pdev->dev.of_node, 0);
-   if (!irq)
+   if (!irq) {
+   ret = -EINVAL;
goto err_irq;
+   }
 
cppi_writel(USBSS_IRQ_PD_COMP, cdd->usbss_mem + USBSS_IRQ_ENABLER);
 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/2] fs: use inode_set_user to set uid/gid of inode

2013-08-22 Thread Rui Xiang
Use the new interface to set i_uid/i_gid in inode struct.

Signed-off-by: Rui Xiang 
---
 arch/ia64/kernel/perfmon.c|  3 +--
 arch/powerpc/platforms/cell/spufs/inode.c |  3 +--
 arch/s390/hypfs/inode.c   |  3 +--
 drivers/infiniband/hw/qib/qib_fs.c|  3 +--
 drivers/usb/gadget/f_fs.c |  3 +--
 drivers/usb/gadget/inode.c|  5 +++--
 fs/9p/vfs_inode.c |  6 ++
 fs/adfs/inode.c   |  3 +--
 fs/affs/inode.c   |  6 ++
 fs/afs/inode.c|  6 ++
 fs/anon_inodes.c  |  3 +--
 fs/autofs4/inode.c|  4 ++--
 fs/befs/linuxvfs.c|  8 
 fs/ceph/caps.c|  5 +++--
 fs/ceph/inode.c   |  8 
 fs/cifs/inode.c   |  6 ++
 fs/configfs/inode.c   |  3 +--
 fs/debugfs/inode.c|  3 +--
 fs/devpts/inode.c |  7 +++
 fs/ext2/ialloc.c  |  3 +--
 fs/ext3/ialloc.c  |  3 +--
 fs/ext4/ialloc.c  |  3 +--
 fs/fat/inode.c|  6 ++
 fs/fuse/control.c |  3 +--
 fs/fuse/inode.c   |  4 ++--
 fs/hfs/inode.c|  6 ++
 fs/hfsplus/inode.c|  3 +--
 fs/hpfs/inode.c   |  3 +--
 fs/hpfs/namei.c   | 12 
 fs/hugetlbfs/inode.c  |  3 +--
 fs/isofs/inode.c  |  3 +--
 fs/isofs/rock.c   |  3 +--
 fs/ncpfs/inode.c  |  3 +--
 fs/nfs/inode.c|  4 ++--
 fs/ntfs/inode.c   | 12 
 fs/ntfs/mft.c |  3 +--
 fs/ntfs/super.c   |  3 +--
 fs/ocfs2/refcounttree.c   |  3 +--
 fs/omfs/inode.c   |  3 +--
 fs/pipe.c |  3 +--
 fs/proc/base.c| 15 +--
 fs/proc/fd.c  |  8 
 fs/proc/inode.c   |  3 +--
 fs/proc/self.c|  3 +--
 fs/stack.c|  3 +--
 fs/sysfs/inode.c  |  3 +--
 fs/xfs/xfs_iops.c |  4 ++--
 ipc/mqueue.c  |  3 +--
 kernel/cgroup.c   |  3 +--
 mm/shmem.c|  3 +--
 net/socket.c  |  3 +--
 51 files changed, 86 insertions(+), 142 deletions(-)

diff --git a/arch/ia64/kernel/perfmon.c b/arch/ia64/kernel/perfmon.c
index 5a9ff1c..73e1e55 100644
--- a/arch/ia64/kernel/perfmon.c
+++ b/arch/ia64/kernel/perfmon.c
@@ -2202,8 +2202,7 @@ pfm_alloc_file(pfm_context_t *ctx)
DPRINT(("new inode ino=%ld @%p\n", inode->i_ino, inode));
 
inode->i_mode = S_IFCHR|S_IRUGO;
-   inode->i_uid  = current_fsuid();
-   inode->i_gid  = current_fsgid();
+   inode_set_user(inode, current_fsuid(), current_fsgid());
 
/*
 * allocate a new dcache entry
diff --git a/arch/powerpc/platforms/cell/spufs/inode.c 
b/arch/powerpc/platforms/cell/spufs/inode.c
index 87ba7cf..4580c9b 100644
--- a/arch/powerpc/platforms/cell/spufs/inode.c
+++ b/arch/powerpc/platforms/cell/spufs/inode.c
@@ -101,8 +101,7 @@ spufs_new_inode(struct super_block *sb, umode_t mode)
 
inode->i_ino = get_next_ino();
inode->i_mode = mode;
-   inode->i_uid = current_fsuid();
-   inode->i_gid = current_fsgid();
+   inode_set_user(inode, current_fsuid(), current_fsgid());
inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
 out:
return inode;
diff --git a/arch/s390/hypfs/inode.c b/arch/s390/hypfs/inode.c
index 7a539f4..742e430 100644
--- a/arch/s390/hypfs/inode.c
+++ b/arch/s390/hypfs/inode.c
@@ -103,8 +103,7 @@ static struct inode *hypfs_make_inode(struct super_block 
*sb, umode_t mode)
struct hypfs_sb_info *hypfs_info = sb->s_fs_info;
ret->i_ino = get_next_ino();
ret->i_mode = mode;
-   ret->i_uid = hypfs_info->uid;
-   ret->i_gid = hypfs_info->gid;
+   inode_set_user(ret, hypfs_info->uid, hypfs_info->gid);
ret->i_atime = ret->i_mtime = ret->i_ctime = CURRENT_TIME;
if (S_ISDIR(mode))
set_nlink(ret, 2);
diff --git a/drivers/infiniband/hw/qib/qib_fs.c 
b/drivers/infiniband/hw/qib/qib_fs.c
index f247fc6e..6683837 100644
--- a/drivers/infiniband/hw/qib/qib_fs.c
+++ b/drivers/infiniband/hw/qib/qib_fs.c
@@ -61,13 +61,12 @@ static int qibfs_mknod(struct inode *dir, struct 

[PATCH 1/2] fs: implement inode uid/gid setting function

2013-08-22 Thread Rui Xiang
Supply a interface inode_set_user  to set uid/gid of inode
structs.

Signed-off-by: Rui Xiang 
---
 fs/inode.c | 7 +++
 include/linux/fs.h | 1 +
 2 files changed, 8 insertions(+)

diff --git a/fs/inode.c b/fs/inode.c
index e315c0a..3f90499 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -343,6 +343,13 @@ void inc_nlink(struct inode *inode)
 }
 EXPORT_SYMBOL(inc_nlink);
 
+void inode_set_user(struct inode *inode, kuid_t uid, kgid_t gid)
+{
+   inode->i_uid = uid;
+   inode->i_gid = gid;
+}
+EXPORT_SYMBOL(inode_set_user);
+
 void address_space_init_once(struct address_space *mapping)
 {
memset(mapping, 0, sizeof(*mapping));
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 729e81b..36ac51b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2619,6 +2619,7 @@ void __inode_sub_bytes(struct inode *inode, loff_t bytes);
 void inode_sub_bytes(struct inode *inode, loff_t bytes);
 loff_t inode_get_bytes(struct inode *inode);
 void inode_set_bytes(struct inode *inode, loff_t bytes);
+void inode_set_user(struct inode *inode, kuid_t uid, kgid_t gid);
 
 extern int vfs_readdir(struct file *, filldir_t, void *);
 extern int iterate_dir(struct file *, struct dir_context *);
-- 
1.8.2.2


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 0/2] fs: supply inode uid/gid setting interface

2013-08-22 Thread Rui Xiang
This patchset implements an accessor functions to set uid/gid
in inode struct. Just finish code clean up.

Rui Xiang (2):
  fs: implement inode uid/gid setting function
  fs: use inode_set_user to set uid/gid of inode

 arch/ia64/kernel/perfmon.c|  3 +--
 arch/powerpc/platforms/cell/spufs/inode.c |  3 +--
 arch/s390/hypfs/inode.c   |  3 +--
 drivers/infiniband/hw/qib/qib_fs.c|  3 +--
 drivers/usb/gadget/f_fs.c |  3 +--
 drivers/usb/gadget/inode.c|  5 +++--
 fs/9p/vfs_inode.c |  6 ++
 fs/adfs/inode.c   |  3 +--
 fs/affs/inode.c   |  6 ++
 fs/afs/inode.c|  6 ++
 fs/anon_inodes.c  |  3 +--
 fs/autofs4/inode.c|  4 ++--
 fs/befs/linuxvfs.c|  8 
 fs/ceph/caps.c|  5 +++--
 fs/ceph/inode.c   |  8 
 fs/cifs/inode.c   |  6 ++
 fs/configfs/inode.c   |  3 +--
 fs/debugfs/inode.c|  3 +--
 fs/devpts/inode.c |  7 +++
 fs/ext2/ialloc.c  |  3 +--
 fs/ext3/ialloc.c  |  3 +--
 fs/ext4/ialloc.c  |  3 +--
 fs/fat/inode.c|  6 ++
 fs/fuse/control.c |  3 +--
 fs/fuse/inode.c   |  4 ++--
 fs/hfs/inode.c|  6 ++
 fs/hfsplus/inode.c|  3 +--
 fs/hpfs/inode.c   |  3 +--
 fs/hpfs/namei.c   | 12 
 fs/hugetlbfs/inode.c  |  3 +--
 fs/inode.c|  7 +++
 fs/isofs/inode.c  |  3 +--
 fs/isofs/rock.c   |  3 +--
 fs/ncpfs/inode.c  |  3 +--
 fs/nfs/inode.c|  4 ++--
 fs/ntfs/inode.c   | 12 
 fs/ntfs/mft.c |  3 +--
 fs/ntfs/super.c   |  3 +--
 fs/ocfs2/refcounttree.c   |  3 +--
 fs/omfs/inode.c   |  3 +--
 fs/pipe.c |  3 +--
 fs/proc/base.c| 15 +--
 fs/proc/fd.c  |  8 
 fs/proc/inode.c   |  3 +--
 fs/proc/self.c|  3 +--
 fs/stack.c|  3 +--
 fs/sysfs/inode.c  |  3 +--
 fs/xfs/xfs_iops.c |  4 ++--
 include/linux/fs.h|  1 +
 ipc/mqueue.c  |  3 +--
 kernel/cgroup.c   |  3 +--
 mm/shmem.c|  3 +--
 net/socket.c  |  3 +--
 53 files changed, 94 insertions(+), 142 deletions(-)

-- 
1.8.2.2


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH -next] block: fix error return code in parse_parts()

2013-08-22 Thread Wei Yongjun
From: Wei Yongjun 

Fix to return -EINVAL in the parts parse error handling case instead
of 0(may overwrite to 0 by parse_subpart()), as done elsewhere in this
function.

Signed-off-by: Wei Yongjun 
---
 block/cmdline-parser.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/cmdline-parser.c b/block/cmdline-parser.c
index 18fb435..cc2637f 100644
--- a/block/cmdline-parser.c
+++ b/block/cmdline-parser.c
@@ -135,6 +135,7 @@ static int parse_parts(struct cmdline_parts **parts, const 
char *bdevdef)
 
if (!newparts->subpart) {
pr_warn("cmdline partition has no valid partition.");
+   ret = -EINVAL;
goto fail;
}
 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/9] target: Make spc_parse_naa_6h_vendor_specific non static

2013-08-22 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

This patch makes spc_parse_naa_6h_vendor_specific() available to
other target code, which is required by EXTENDED_COPY when comparing
the received NAA WWN device identifer for locating the associated
se_device backend.

Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Martin Petersen 
Cc: Chris Mason 
Cc: Roland Dreier 
Cc: Zach Brown 
Cc: James Bottomley 
Cc: Nicholas Bellinger 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/target_core_spc.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/target/target_core_spc.c b/drivers/target/target_core_spc.c
index ed7077a..bd16a93 100644
--- a/drivers/target/target_core_spc.c
+++ b/drivers/target/target_core_spc.c
@@ -126,8 +126,8 @@ spc_emulate_evpd_80(struct se_cmd *cmd, unsigned char *buf)
return 0;
 }
 
-static void spc_parse_naa_6h_vendor_specific(struct se_device *dev,
-   unsigned char *buf)
+void spc_parse_naa_6h_vendor_specific(struct se_device *dev,
+ unsigned char *buf)
 {
unsigned char *p = &dev->t10_wwn.unit_serial[0];
int cnt;
-- 
1.7.2.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 4/9] target: Add global device list for EXTENDED_COPY

2013-08-22 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

EXTENDED_COPY needs to be able to search a global list of devices
based on NAA WWN device identifiers, so add a simple g_device_list
protected by g_device_mutex.

Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Martin Petersen 
Cc: Chris Mason 
Cc: Roland Dreier 
Cc: Zach Brown 
Cc: James Bottomley 
Cc: Nicholas Bellinger 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/target_core_device.c |   13 +
 include/target/target_core_base.h   |1 +
 2 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/drivers/target/target_core_device.c 
b/drivers/target/target_core_device.c
index de89046..458944e 100644
--- a/drivers/target/target_core_device.c
+++ b/drivers/target/target_core_device.c
@@ -47,6 +47,9 @@
 #include "target_core_pr.h"
 #include "target_core_ua.h"
 
+DEFINE_MUTEX(g_device_mutex);
+LIST_HEAD(g_device_list);
+
 static struct se_hba *lun0_hba;
 /* not static, needed by tpg.c */
 struct se_device *g_lun0_dev;
@@ -1406,6 +1409,7 @@ struct se_device *target_alloc_device(struct se_hba *hba, 
const char *name)
INIT_LIST_HEAD(&dev->delayed_cmd_list);
INIT_LIST_HEAD(&dev->state_list);
INIT_LIST_HEAD(&dev->qf_cmd_list);
+   INIT_LIST_HEAD(&dev->g_dev_node);
spin_lock_init(&dev->stats_lock);
spin_lock_init(&dev->execute_task_lock);
spin_lock_init(&dev->delayed_cmd_lock);
@@ -1525,6 +1529,11 @@ int target_configure_device(struct se_device *dev)
spin_lock(&hba->device_lock);
hba->dev_count++;
spin_unlock(&hba->device_lock);
+
+   mutex_lock(&g_device_mutex);
+   list_add_tail(&dev->g_dev_node, &g_device_list);
+   mutex_unlock(&g_device_mutex);
+
return 0;
 
 out_free_alua:
@@ -1543,6 +1552,10 @@ void target_free_device(struct se_device *dev)
if (dev->dev_flags & DF_CONFIGURED) {
destroy_workqueue(dev->tmr_wq);
 
+   mutex_lock(&g_device_mutex);
+   list_del(&dev->g_dev_node);
+   mutex_unlock(&g_device_mutex);
+
spin_lock(&hba->device_lock);
hba->dev_count--;
spin_unlock(&hba->device_lock);
diff --git a/include/target/target_core_base.h 
b/include/target/target_core_base.h
index 0783b2c..6b14f3c 100644
--- a/include/target/target_core_base.h
+++ b/include/target/target_core_base.h
@@ -686,6 +686,7 @@ struct se_device {
struct list_headdelayed_cmd_list;
struct list_headstate_list;
struct list_headqf_cmd_list;
+   struct list_headg_dev_node;
/* Pointer to associated SE HBA */
struct se_hba   *se_hba;
/* T10 Inquiry and VPD WWN Information */
-- 
1.7.2.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 6/9] target: Add support for EXTENDED_COPY copy offload emulation

2013-08-22 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

This patch adds support for EXTENDED_COPY emulation from SPC-3, that
enables full copy offload target support within both a single virtual
backend device, and across multiple virtual backend devices.  It also
functions independent of target fabric, and supports copy offload
across multiple target fabric ports.

This implemenation supports both EXTENDED_COPY PUSH and PULL models
of operation, so the actual CDB may be received on either source or
desination logical unit.

For Target Descriptors, it currently supports the NAA IEEE Registered
Extended designator (type 0xe4), which allows the reference of target
ports to occur independent of fabric type using EVPD 0x83 WWNs.

For Segment Descriptors, it currently supports copy from block to
block (0x02) mode.

It also honors any present SCSI reservations of the destination target
port.  Note that only Supports No List Identifier (SNLID=1) mode is
supported.

Also included is basic RECEIVE_COPY_RESULTS with service action type
OPERATING PARAMETERS (0x03) required for SNLID=1 operation.

Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Martin Petersen 
Cc: Chris Mason 
Cc: Roland Dreier 
Cc: Zach Brown 
Cc: James Bottomley 
Cc: Nicholas Bellinger 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/Makefile|3 +-
 drivers/target/target_core_xcopy.c | 1122 
 drivers/target/target_core_xcopy.h |   62 ++
 include/target/target_core_base.h  |1 +
 4 files changed, 1187 insertions(+), 1 deletions(-)
 create mode 100644 drivers/target/target_core_xcopy.c
 create mode 100644 drivers/target/target_core_xcopy.h

diff --git a/drivers/target/Makefile b/drivers/target/Makefile
index 9fdcb56..85b012d 100644
--- a/drivers/target/Makefile
+++ b/drivers/target/Makefile
@@ -13,7 +13,8 @@ target_core_mod-y := target_core_configfs.o \
   target_core_spc.o \
   target_core_ua.o \
   target_core_rd.o \
-  target_core_stat.o
+  target_core_stat.o \
+  target_core_xcopy.o
 
 obj-$(CONFIG_TARGET_CORE)  += target_core_mod.o
 
diff --git a/drivers/target/target_core_xcopy.c 
b/drivers/target/target_core_xcopy.c
new file mode 100644
index 000..e0fabea
--- /dev/null
+++ b/drivers/target/target_core_xcopy.c
@@ -0,0 +1,1122 @@
+/***
+ * Filename: target_core_xcopy.c
+ *
+ * This file contains support for SPC-4 Extended-Copy offload with generic
+ * TCM backends.
+ *
+ * Copyright (c) 2011-2013 Datera, Inc. All rights reserved.
+ *
+ * Author:
+ * Nicholas A. Bellinger 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ 
**/
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+#include "target_core_pr.h"
+#include "target_core_ua.h"
+#include "target_core_xcopy.h"
+
+/* #define XCOPY_DBG_CTL */
+#ifdef XCOPY_DBG_CTL
+#define XCOPY_CTL(x...) printk(KERN_INFO x)
+#else
+#define XCOPY_CTL(x...)
+#endif
+
+/* #define XCOPY_DBG_IO */
+#ifdef XCOPY_DBG_IO
+#define XCOPY_IO(x...) printk(KERN_INFO x)
+#else
+#define XCOPY_IO(x...)
+#endif
+
+static struct workqueue_struct *xcopy_wq = NULL;
+/*
+ * From target_core_spc.c
+ */
+extern void spc_parse_naa_6h_vendor_specific(struct se_device *, unsigned char 
*);
+/*
+ * From target_core_device.c
+ */
+extern struct mutex g_device_mutex;
+extern struct list_head g_device_list;
+/*
+ * From target_core_configfs.c
+ */
+extern struct configfs_subsystem *target_core_subsystem[];
+
+static int target_xcopy_gen_naa_ieee(struct se_device *dev, unsigned char *buf)
+{
+   int off = 0;
+
+   buf[off++] = (0x6 << 4);
+   buf[off++] = 0x01;
+   buf[off++] = 0x40;
+   buf[off] = (0x5 << 4);
+
+   spc_parse_naa_6h_vendor_specific(dev, &buf[off]);
+   return 0;
+}
+
+static int target_xcopy_locate_se_dev_e4(struct se_cmd *se_cmd, struct 
xcopy_op *xop,
+   bool src)
+{
+   struct se_device *se_dev;
+   struct configfs_subsystem *subsys = target_core_subsystem[0];
+   unsigned char tmp_dev_wwn[XCOPY_NAA_IEEE_REGEX_LEN], *dev_wwn;
+   int rc;
+
+   if (src == true)
+   dev_wwn = &xop->dst_tid_ww

[PATCH 3/9] target: Make helpers non static for EXTENDED_COPY command setup

2013-08-22 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

Both transport_generic_get_mem() and transport_generic_map_mem_to_cmd()
are required by EXTENDED_COPY logic when setting up internally
dispatched command descriptors, so go ahead and make both of these
non static.

Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Martin Petersen 
Cc: Chris Mason 
Cc: Roland Dreier 
Cc: Zach Brown 
Cc: James Bottomley 
Cc: Nicholas Bellinger 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/target_core_transport.c |5 ++---
 include/target/target_core_backend.h   |4 
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/target/target_core_transport.c 
b/drivers/target/target_core_transport.c
index 3009cda..2f9c402 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -67,7 +67,6 @@ struct kmem_cache *t10_alua_tg_pt_gp_mem_cache;
 static void transport_complete_task_attr(struct se_cmd *cmd);
 static void transport_handle_queue_full(struct se_cmd *cmd,
struct se_device *dev);
-static int transport_generic_get_mem(struct se_cmd *cmd);
 static int transport_put_cmd(struct se_cmd *cmd);
 static void target_complete_ok_work(struct work_struct *work);
 
@@ -1254,7 +1253,7 @@ int transport_handle_cdb_direct(
 }
 EXPORT_SYMBOL(transport_handle_cdb_direct);
 
-static sense_reason_t
+sense_reason_t
 transport_generic_map_mem_to_cmd(struct se_cmd *cmd, struct scatterlist *sgl,
u32 sgl_count, struct scatterlist *sgl_bidi, u32 sgl_bidi_count)
 {
@@ -2164,7 +2163,7 @@ out:
return -ENOMEM;
 }
 
-static int
+int
 transport_generic_get_mem(struct se_cmd *cmd)
 {
u32 length = cmd->data_length;
diff --git a/include/target/target_core_backend.h 
b/include/target/target_core_backend.h
index 77f25e0..9f07231 100644
--- a/include/target/target_core_backend.h
+++ b/include/target/target_core_backend.h
@@ -74,6 +74,10 @@ int  transport_set_vpd_ident(struct t10_vpd *, unsigned char 
*);
 /* core helpers also used by command snooping in pscsi */
 void   *transport_kmap_data_sg(struct se_cmd *);
 void   transport_kunmap_data_sg(struct se_cmd *);
+/* core helpers also used by xcopy during internal command setup */
+inttransport_generic_get_mem(struct se_cmd *);
+sense_reason_t transport_generic_map_mem_to_cmd(struct se_cmd *,
+   struct scatterlist *, u32, struct scatterlist *, u32);
 
 void   array_free(void *array, int n);
 
-- 
1.7.2.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/9] target: Make target_core_subsystem defined as non static

2013-08-22 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

This patch makes the top-level target_core_subsystem array available
to other target code, which is required by EXTENDED_COPY to pin the
backend se_device using configfs_depend_item(), in order to ensure
it can't be removed for the duration of a EXTENDED_COPY operation.

Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Martin Petersen 
Cc: Chris Mason 
Cc: Roland Dreier 
Cc: Zach Brown 
Cc: James Bottomley 
Cc: Nicholas Bellinger 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/target_core_configfs.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/target/target_core_configfs.c 
b/drivers/target/target_core_configfs.c
index 24517d4..939ecc5 100644
--- a/drivers/target/target_core_configfs.c
+++ b/drivers/target/target_core_configfs.c
@@ -268,7 +268,7 @@ static struct configfs_subsystem target_core_fabrics = {
},
 };
 
-static struct configfs_subsystem *target_core_subsystem[] = {
+struct configfs_subsystem *target_core_subsystem[] = {
&target_core_fabrics,
NULL,
 };
-- 
1.7.2.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v6 00/10] tracing: trace event triggers

2013-08-22 Thread Steven Rostedt
On Thu, 22 Aug 2013 18:27:16 -0500
Tom Zanussi  wrote:

> Hi,
> 
> This is v6 of the trace event triggers patchset.  This is essentially
> the same as v5, but rebased to trace/for-next, which had a couple of
> new conflicting patches pulled in since I had cut v5.  This version
> just fixes up those conflicts.
> 
> v6:
>  - fixed up the conflicts in trace_events.c related to the actual
>creation of the per-event 'trigger' files.

Thanks Tom!

Just to let you know, I wont be able to take a look at these till
Monday.

-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 7/9] target: Enable EXTENDED_COPY setup in spc_parse_cdb

2013-08-22 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

Setup up the se_cmd->execute_cmd() pointers for EXTENDED_COPY and
RECEIVE_COPY_RESULTS handling within spc_parse_cdb()

Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Martin Petersen 
Cc: Chris Mason 
Cc: Roland Dreier 
Cc: Zach Brown 
Cc: James Bottomley 
Cc: Nicholas Bellinger 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/target_core_spc.c |   10 --
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/target/target_core_spc.c b/drivers/target/target_core_spc.c
index bd16a93..894e83b 100644
--- a/drivers/target/target_core_spc.c
+++ b/drivers/target/target_core_spc.c
@@ -35,7 +35,7 @@
 #include "target_core_alua.h"
 #include "target_core_pr.h"
 #include "target_core_ua.h"
-
+#include "target_core_xcopy.h"
 
 static void spc_fill_alua_data(struct se_port *port, unsigned char *buf)
 {
@@ -1252,8 +1252,14 @@ spc_parse_cdb(struct se_cmd *cmd, unsigned int *size)
*size = (cdb[6] << 24) | (cdb[7] << 16) | (cdb[8] << 8) | 
cdb[9];
break;
case EXTENDED_COPY:
-   case READ_ATTRIBUTE:
+   *size = get_unaligned_be32(&cdb[10]);
+   cmd->execute_cmd = target_do_xcopy;
+   break;
case RECEIVE_COPY_RESULTS:
+   *size = get_unaligned_be32(&cdb[10]);
+   cmd->execute_cmd = target_do_receive_copy_results;
+   break;
+   case READ_ATTRIBUTE:
case WRITE_ATTRIBUTE:
*size = (cdb[10] << 24) | (cdb[11] << 16) |
   (cdb[12] << 8) | cdb[13];
-- 
1.7.2.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 5/9] target: Avoid non-existent tg_pt_gp_mem in target_alua_state_check

2013-08-22 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

This patch adds an check for a non-existent port->sep_alua_tg_pt_gp_mem
within target_alua_state_check(), which is not present for internally
dispatched EXTENDED_COPY WRITE I/O to the destination target port.

Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Martin Petersen 
Cc: Chris Mason 
Cc: Roland Dreier 
Cc: Zach Brown 
Cc: James Bottomley 
Cc: Nicholas Bellinger 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/target_core_alua.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/target/target_core_alua.c 
b/drivers/target/target_core_alua.c
index 5403186..ea928c4 100644
--- a/drivers/target/target_core_alua.c
+++ b/drivers/target/target_core_alua.c
@@ -557,6 +557,9 @@ target_alua_state_check(struct se_cmd *cmd)
 * a ALUA logical unit group.
 */
tg_pt_gp_mem = port->sep_alua_tg_pt_gp_mem;
+   if (!tg_pt_gp_mem)
+   return 0;
+
spin_lock(&tg_pt_gp_mem->tg_pt_gp_mem_lock);
tg_pt_gp = tg_pt_gp_mem->tg_pt_gp;
out_alua_state = atomic_read(&tg_pt_gp->tg_pt_gp_alua_access_state);
-- 
1.7.2.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 9/9] target: Enable global EXTENDED_COPY setup/release

2013-08-22 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

Add calls to target_xcopy_setup_pt() + target_xcopy_release_pt() to
target_core_init_configfs() and target_core_exit_configfs()
respectively.

Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Martin Petersen 
Cc: Chris Mason 
Cc: Roland Dreier 
Cc: Zach Brown 
Cc: James Bottomley 
Cc: Nicholas Bellinger 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/target_core_configfs.c |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/drivers/target/target_core_configfs.c 
b/drivers/target/target_core_configfs.c
index 026e42b..328f425 100644
--- a/drivers/target/target_core_configfs.c
+++ b/drivers/target/target_core_configfs.c
@@ -48,6 +48,7 @@
 #include "target_core_alua.h"
 #include "target_core_pr.h"
 #include "target_core_rd.h"
+#include "target_core_xcopy.h"
 
 extern struct t10_alua_lu_gp *default_lu_gp;
 
@@ -2935,6 +2936,10 @@ static int __init target_core_init_configfs(void)
if (ret < 0)
goto out;
 
+   ret = target_xcopy_setup_pt();
+   if (ret < 0)
+   goto out;
+
return 0;
 
 out:
@@ -3007,6 +3012,7 @@ static void __exit target_core_exit_configfs(void)
 
core_dev_release_virtual_lun0();
rd_module_exit();
+   target_xcopy_release_pt();
release_se_kmem_caches();
 }
 
-- 
1.7.2.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] w1: mxc_w1: remove unnecessary platform_set_drvdata()

2013-08-22 Thread Shawn Guo
On Thu, Aug 22, 2013 at 11:20:58AM +0900, Jingoo Han wrote:
> The driver core clears the driver data to NULL after device_release
> or on probe failure. Thus, it is not needed to manually clear the
> device driver data to NULL.
> 
> Signed-off-by: Jingoo Han 

Acked-by: Shawn Guo 

> ---
>  drivers/w1/masters/mxc_w1.c |2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/w1/masters/mxc_w1.c b/drivers/w1/masters/mxc_w1.c
> index 47e12cf..15c7251 100644
> --- a/drivers/w1/masters/mxc_w1.c
> +++ b/drivers/w1/masters/mxc_w1.c
> @@ -152,8 +152,6 @@ static int mxc_w1_remove(struct platform_device *pdev)
>  
>   clk_disable_unprepare(mdev->clk);
>  
> - platform_set_drvdata(pdev, NULL);
> -
>   return 0;
>  }
>  
> -- 
> 1.7.10.4
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 8/9] target: Add Third Party Copy (3PC) bit in INQUIRY response

2013-08-22 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

This patch adds the Third Party Copy (3PC) bit to signal support
for EXTENDED_COPY within standard inquiry response data.

Also add emulate_3pc device attribute in configfs (enabled by default)
to allow the exposure of this bit to be disabled, if necessary.

Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Martin Petersen 
Cc: Chris Mason 
Cc: Roland Dreier 
Cc: Zach Brown 
Cc: James Bottomley 
Cc: Nicholas Bellinger 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/target_core_configfs.c |4 
 drivers/target/target_core_device.c   |   14 ++
 drivers/target/target_core_internal.h |1 +
 drivers/target/target_core_spc.c  |6 ++
 include/target/target_core_base.h |3 +++
 5 files changed, 28 insertions(+), 0 deletions(-)

diff --git a/drivers/target/target_core_configfs.c 
b/drivers/target/target_core_configfs.c
index 939ecc5..026e42b 100644
--- a/drivers/target/target_core_configfs.c
+++ b/drivers/target/target_core_configfs.c
@@ -639,6 +639,9 @@ SE_DEV_ATTR(emulate_tpws, S_IRUGO | S_IWUSR);
 DEF_DEV_ATTRIB(emulate_caw);
 SE_DEV_ATTR(emulate_caw, S_IRUGO | S_IWUSR);
 
+DEF_DEV_ATTRIB(emulate_3pc);
+SE_DEV_ATTR(emulate_3pc, S_IRUGO | S_IWUSR);
+
 DEF_DEV_ATTRIB(enforce_pr_isids);
 SE_DEV_ATTR(enforce_pr_isids, S_IRUGO | S_IWUSR);
 
@@ -697,6 +700,7 @@ static struct configfs_attribute 
*target_core_dev_attrib_attrs[] = {
&target_core_dev_attrib_emulate_tpu.attr,
&target_core_dev_attrib_emulate_tpws.attr,
&target_core_dev_attrib_emulate_caw.attr,
+   &target_core_dev_attrib_emulate_3pc.attr,
&target_core_dev_attrib_enforce_pr_isids.attr,
&target_core_dev_attrib_is_nonrot.attr,
&target_core_dev_attrib_emulate_rest_reord.attr,
diff --git a/drivers/target/target_core_device.c 
b/drivers/target/target_core_device.c
index 458944e..6f492c7 100644
--- a/drivers/target/target_core_device.c
+++ b/drivers/target/target_core_device.c
@@ -906,6 +906,19 @@ int se_dev_set_emulate_caw(struct se_device *dev, int flag)
return 0;
 }
 
+int se_dev_set_emulate_3pc(struct se_device *dev, int flag)
+{
+   if (flag != 0 && flag != 1) {
+   pr_err("Illegal value %d\n", flag);
+   return -EINVAL;
+   }
+   dev->dev_attrib.emulate_3pc = flag;
+   pr_debug("dev[%p]: SE Device 3rd Party Copy (EXTENDED_COPY): %d\n",
+   dev, flag);
+
+   return 0;
+}
+
 int se_dev_set_enforce_pr_isids(struct se_device *dev, int flag)
 {
if ((flag != 0) && (flag != 1)) {
@@ -1442,6 +1455,7 @@ struct se_device *target_alloc_device(struct se_hba *hba, 
const char *name)
dev->dev_attrib.emulate_tpu = DA_EMULATE_TPU;
dev->dev_attrib.emulate_tpws = DA_EMULATE_TPWS;
dev->dev_attrib.emulate_caw = DA_EMULATE_CAW;
+   dev->dev_attrib.emulate_3pc = DA_EMULATE_3PC;
dev->dev_attrib.enforce_pr_isids = DA_ENFORCE_PR_ISIDS;
dev->dev_attrib.is_nonrot = DA_IS_NONROT;
dev->dev_attrib.emulate_rest_reord = DA_EMULATE_REST_REORD;
diff --git a/drivers/target/target_core_internal.h 
b/drivers/target/target_core_internal.h
index 805ceb4..579128a 100644
--- a/drivers/target/target_core_internal.h
+++ b/drivers/target/target_core_internal.h
@@ -34,6 +34,7 @@ int   se_dev_set_emulate_tas(struct se_device *, int);
 intse_dev_set_emulate_tpu(struct se_device *, int);
 intse_dev_set_emulate_tpws(struct se_device *, int);
 intse_dev_set_emulate_caw(struct se_device *, int);
+intse_dev_set_emulate_3pc(struct se_device *, int);
 intse_dev_set_enforce_pr_isids(struct se_device *, int);
 intse_dev_set_is_nonrot(struct se_device *, int);
 intse_dev_set_emulate_rest_reord(struct se_device *dev, int);
diff --git a/drivers/target/target_core_spc.c b/drivers/target/target_core_spc.c
index 894e83b..566dd27 100644
--- a/drivers/target/target_core_spc.c
+++ b/drivers/target/target_core_spc.c
@@ -95,6 +95,12 @@ spc_emulate_inquiry_std(struct se_cmd *cmd, unsigned char 
*buf)
 */
spc_fill_alua_data(lun->lun_sep, buf);
 
+   /*
+* Set Third-Party Copy (3PC) bit to indicate support for EXTENDED_COPY
+*/
+   if (dev->dev_attrib.emulate_3pc)
+   buf[5] |= 0x8;
+
buf[7] = 0x2; /* CmdQue=1 */
 
snprintf(&buf[8], 8, "LIO-ORG");
diff --git a/include/target/target_core_base.h 
b/include/target/target_core_base.h
index f54a015..ba9ca79 100644
--- a/include/target/target_core_base.h
+++ b/include/target/target_core_base.h
@@ -99,6 +99,8 @@
 #define DA_EMULATE_TPWS0
 /* Emulation for CompareAndWrite (AtomicTestandSet) by default */
 #define DA_EMULATE_CAW 1
+/* Emulation for 3rd Party Copy (ExtendedCopy) by default */
+#define DA_EMULATE_3PC 1
 /* No Emulation for PSCSI by default */
 #define DA_EMULATE_ALUA0
 /* Enforce SCSI Initiator Port TransportID with 'ISID' 

[PATCH 0/9] target: Add support for EXTENDED_COPY (VAAI) offload emulation

2013-08-22 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

Hi folks!

This series adds support to target-core for generic EXTENDED_COPY offload
emulation as defined by SPC-4 using virtual (IBLOCK, FILEIO, RAMDISK)
backends.

EXTENDED_COPY is a VMWare ESX VAAI primative that is used to perform copy
offload, that allows a target to perform local READ + WRITE I/O requests
for bulk data transfers (cloning a virtual machine for example), instead
of requiring these I/Os to actually be sent to/from the requesting SCSI
initiator port.

This implemenation fully supports copy offload between the same device
backend, and across multiple device backends.  It supports copy offload
transparently across multiple target ports of different fabrics, eg:
iSCSI -> FC, FC -> iSER, iSER -> FCoE and so on.

It also supports both PUSH and PULL models of operation, so the actual
EXTENDED_COPY CDB may be received on either source or destination logical
unit.

For Target Descriptors, it currently supports the NAA IEEE Registered
Extended designator (type 0xe4), which allows the reference of target
ports to occur independent of fabric type using EVPD 0x83 WWNs.  For
Segment Descriptors, it currently supports copy from block to block
(0x02) mode.

Here's a quick snippet of the code in action with sg_xcopy performing
copy offload between two IBLOCK and FILEIO backends:

[  644.638215] Processing XCOPY with list_id: 0x00 list_id_usage: 0x10 tdll: 64 
sdll: 28 inline_dl: 0
[  644.648227] XCOPY 0xe4: RELATIVE INITIATOR PORT IDENTIFIER: 0
[  644.654639] XCOPY 0xe4: desig_len: 16
[  644.658722] XCOPY 0xe4: Set xop->src_dev 88045d77 from source 
received xop
[  644.667179] XCOPY 0xe4: RELATIVE INITIATOR PORT IDENTIFIER: 0
[  644.673597] XCOPY 0xe4: desig_len: 16
[  644.677699] XCOPY 0xe4: Setting xop->dst_dev: 88045d771048 from located 
se_dev
[  644.686297] Called configfs_depend_item for subsys: a00f2570 se_dev: 
88045d771048 se_dev->se_dev_group: 88045d7714f8
[  644.699607] XCOPY TGT desc: Source dev: 88045d77 NAA IEEE WWN: 
0x6001405d2e0745b08564acea3ca401e5
[  644.710296] XCOPY TGT desc: Dest dev: 88045d771048 NAA IEEE WWN: 
0x60014056da9d8672d4b437596ab764b3
[  644.720782] XCOPY: Processed 2 target descriptors, length: 64
[  644.727203] XCOPY seg desc 0x02: desc_len: 24 stdi: 0 dtdi: 1, DC: 2
[  644.734304] XCOPY seg desc 0x02: nolb: 1 src_lba: 0 dst_lba: 0
[  644.740819] XCOPY seg desc 0x02: DC=1 w/ dbl: 0
[  644.745881] XCOPY: Processed 1 segment descriptors, length: 28
[  644.752402] target_xcopy_do_work: nolb: 1, max_nolb: 1024 end_lba: 1
[  644.759504] target_xcopy_do_work: Starting src_lba: 0, dst_lba: 0
[  644.766303] target_xcopy_do_work: Calling read src_dev: 88045d77 
src_lba: 0, cur_nolb: 1
[  644.776115] XCOPY: Built READ_16: LBA: 0 Sectors: 1 Length: 512
[  644.782751] Honoring local SRC port from ec_cmd->se_dev: 88045d77
[  644.790335] Honoring local SRC port from ec_cmd->se_lun: 88085a1977e0
[  644.797921] XCOPY-READ: Saved xop->xop_data_sg: 880459d3e3a8, num: 1 for 
READ memory
[  644.807203] target_xcopy_issue_pt_cmd(): SCSI status: 0x00
[  644.81] target_xcopy_do_work: Incremented READ src_lba to 1
[  644.819947] target_xcopy_do_work: Calling write dst_dev: 88045d771048 
dst_lba: 0, cur_nolb: 1
[  644.829854] XCOPY: Built WRITE_16: LBA: 0 Sectors: 1 Length: 512
[  644.836568] Setup emulated se_dev: 88045d771048 from se_dev
[  644.843185] Setup emulated se_dev: 88045d771048 to 
pt_cmd->se_lun->lun_se_dev
[  644.851545] Setup emulated remote DEST xcopy_pt_port: a00f7610 to 
cmd->se_lun->lun_sep for X-COPY data PUSH
[  644.863198] Setup PASSTHROUGH_NOALLOC t_data_sg: 880459d3e3a8 
t_data_nents: 1
[  644.895203] target_xcopy_issue_pt_cmd(): SCSI status: 0x00
[  644.901332] target_xcopy_do_work: Incremented WRITE dst_lba to 1
[  644.908044] Calling configfs_undepend_item for subsys: a00f2570 
remote_dev: 88045d771048 remote_dev->dev_group: 88045d7714f8
[  644.922129] target_xcopy_do_work: Final src_lba: 1, dst_lba: 1
[  644.928646] target_xcopy_do_work: Blocks copied: 1, Bytes Copied: 512
[  644.935840] target_xcopy_do_work: Setting X-COPY GOOD status -> sending 
response

For all intensive purposes this code is completely standalone, and the amount
of changes required to enable it's function within target-core code is small.

Please review as v3.12 material.

Thank you,

--nab

Nicholas Bellinger (9):
  target: Make target_core_subsystem defined as non static
  target: Make spc_parse_naa_6h_vendor_specific non static
  target: Make helpers non static for EXTENDED_COPY command setup
  target: Add global device list for EXTENDED_COPY
  target: Avoid non-existent tg_pt_gp_mem in target_alua_state_check
  target: Add support for EXTENDED_COPY copy offload emulation
  target: Enable EXTENDED_COPY setup in spc_parse_cdb
  target: Add Third Party Copy (3PC) bit in INQUIRY response
  target: Enable global EXTENDED_COPY setup/release

 drivers/target/Makef

$22.5 Million

2013-08-22 Thread LEUNG CHEUNG



Hello,I have a mutual business for us worth $22.5 Million ,contact me for
details,e-mail at
mr.leungwche...@outlook.com

Mr Cheung

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 2/4] bus: mvebu: add missing of_node_put() to fix reference leak

2013-08-22 Thread Jisheng Zhang
Add of_node_put to properly decrement the refcount when we are
done using a given node.

Signed-off-by: Jisheng Zhang 
---
 drivers/bus/mvebu-mbus.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/bus/mvebu-mbus.c b/drivers/bus/mvebu-mbus.c
index 33c6947..20da90f 100644
--- a/drivers/bus/mvebu-mbus.c
+++ b/drivers/bus/mvebu-mbus.c
@@ -837,6 +837,7 @@ int __init mvebu_mbus_init(const char *soc, phys_addr_t 
mbuswins_phys_base,
 {
struct mvebu_mbus_state *mbus = &mbus_state;
const struct of_device_id *of_id;
+   struct device_node *np;
int win;
 
for (of_id = of_mvebu_mbus_ids; of_id->compatible; of_id++)
@@ -860,8 +861,11 @@ int __init mvebu_mbus_init(const char *soc, phys_addr_t 
mbuswins_phys_base,
return -ENOMEM;
}
 
-   if (of_find_compatible_node(NULL, NULL, "marvell,coherency-fabric"))
+   np = of_find_compatible_node(NULL, NULL, "marvell,coherency-fabric");
+   if (np) {
mbus->hw_io_coherency = 1;
+   of_node_put(np);
+   }
 
for (win = 0; win < mbus->soc->num_wins; win++)
mvebu_mbus_disable_window(mbus, win);
-- 
1.8.4.rc3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 3/4] clk: mvebu: add missing iounmap

2013-08-22 Thread Jisheng Zhang
Add missing iounmap to setup error path.

Change-Id: I4371569d14d7026aa9f90d7cd53f669d365fe26a
Signed-off-by: Jisheng Zhang 
---
 drivers/clk/mvebu/clk-cpu.c |  4 +++-
 drivers/clk/mvebu/common.c  | 18 --
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/drivers/clk/mvebu/clk-cpu.c b/drivers/clk/mvebu/clk-cpu.c
index b0fbc07..1466865 100644
--- a/drivers/clk/mvebu/clk-cpu.c
+++ b/drivers/clk/mvebu/clk-cpu.c
@@ -119,7 +119,7 @@ void __init of_cpu_clk_setup(struct device_node *node)
 
cpuclk = kzalloc(ncpus * sizeof(*cpuclk), GFP_KERNEL);
if (WARN_ON(!cpuclk))
-   return;
+   goto cpuclk_out;
 
clks = kzalloc(ncpus * sizeof(*clks), GFP_KERNEL);
if (WARN_ON(!clks))
@@ -170,6 +170,8 @@ bail_out:
kfree(cpuclk[ncpus].clk_name);
 clks_out:
kfree(cpuclk);
+cpuclk_out:
+   iounmap(clock_complex_base);
 }
 
 CLK_OF_DECLARE(armada_xp_cpu_clock, "marvell,armada-xp-cpu-clock",
diff --git a/drivers/clk/mvebu/common.c b/drivers/clk/mvebu/common.c
index adaa4a1..25ceccf 100644
--- a/drivers/clk/mvebu/common.c
+++ b/drivers/clk/mvebu/common.c
@@ -45,8 +45,10 @@ void __init mvebu_coreclk_setup(struct device_node *np,
clk_data.clk_num = 2 + desc->num_ratios;
clk_data.clks = kzalloc(clk_data.clk_num * sizeof(struct clk *),
GFP_KERNEL);
-   if (WARN_ON(!clk_data.clks))
+   if (WARN_ON(!clk_data.clks)) {
+   iounmap(base);
return;
+   }
 
/* Register TCLK */
of_property_read_string_index(np, "clock-output-names", 0,
@@ -134,7 +136,7 @@ void __init mvebu_clk_gating_setup(struct device_node *np,
 
ctrl = kzalloc(sizeof(*ctrl), GFP_KERNEL);
if (WARN_ON(!ctrl))
-   return;
+   goto ctrl_out;
 
spin_lock_init(&ctrl->lock);
 
@@ -145,10 +147,8 @@ void __init mvebu_clk_gating_setup(struct device_node *np,
ctrl->num_gates = n;
ctrl->gates = kzalloc(ctrl->num_gates * sizeof(struct clk *),
  GFP_KERNEL);
-   if (WARN_ON(!ctrl->gates)) {
-   kfree(ctrl);
-   return;
-   }
+   if (WARN_ON(!ctrl->gates))
+   goto gates_out;
 
for (n = 0; n < ctrl->num_gates; n++) {
const char *parent =
@@ -160,4 +160,10 @@ void __init mvebu_clk_gating_setup(struct device_node *np,
}
 
of_clk_add_provider(np, clk_gating_get_src, ctrl);
+
+   return;
+gates_out:
+   kfree(ctrl);
+ctrl_out:
+   iounmap(base);
 }
-- 
1.8.4.rc3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 1/4] arm: mvebu: add missing of_node_put() to fix reference leak

2013-08-22 Thread Jisheng Zhang
Add of_node_put to properly decrement the refcount when we are
done using a given node.

Signed-off-by: Jisheng Zhang 
---
 arch/arm/mach-mvebu/armada-370-xp.c | 1 +
 arch/arm/mach-mvebu/coherency.c | 8 +++-
 arch/arm/mach-mvebu/platsmp.c   | 1 +
 arch/arm/mach-mvebu/pmsu.c  | 1 +
 arch/arm/mach-mvebu/system-controller.c | 1 +
 5 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/arm/mach-mvebu/armada-370-xp.c 
b/arch/arm/mach-mvebu/armada-370-xp.c
index 97cbb80..8a1ae83 100644
--- a/arch/arm/mach-mvebu/armada-370-xp.c
+++ b/arch/arm/mach-mvebu/armada-370-xp.c
@@ -64,6 +64,7 @@ static void __init armada_370_xp_mbus_init(void)
ARMADA_370_XP_MBUS_WINS_SIZE,
of_translate_address(dn, &sdram_wins_offs),
ARMADA_370_XP_SDRAM_WINS_SIZE);
+   of_node_put(dn);
 }
 
 static void __init armada_370_xp_timer_and_clk_init(void)
diff --git a/arch/arm/mach-mvebu/coherency.c b/arch/arm/mach-mvebu/coherency.c
index 4c24303..58adf2f 100644
--- a/arch/arm/mach-mvebu/coherency.c
+++ b/arch/arm/mach-mvebu/coherency.c
@@ -140,6 +140,7 @@ int __init coherency_init(void)
coherency_base = of_iomap(np, 0);
coherency_cpu_base = of_iomap(np, 1);
set_cpu_coherent(cpu_logical_map(smp_processor_id()), 0);
+   of_node_put(np);
}
 
return 0;
@@ -147,9 +148,14 @@ int __init coherency_init(void)
 
 static int __init coherency_late_init(void)
 {
-   if (of_find_matching_node(NULL, of_coherency_table))
+   struct device_node *np;
+
+   np = of_find_matching_node(NULL, of_coherency_table);
+   if (np) {
bus_register_notifier(&platform_bus_type,
  &mvebu_hwcc_platform_nb);
+   of_node_put(np);
+   }
return 0;
 }
 
diff --git a/arch/arm/mach-mvebu/platsmp.c b/arch/arm/mach-mvebu/platsmp.c
index ce81d30..e7edb82 100644
--- a/arch/arm/mach-mvebu/platsmp.c
+++ b/arch/arm/mach-mvebu/platsmp.c
@@ -95,6 +95,7 @@ static void __init armada_xp_smp_init_cpus(void)
panic("No 'cpus' node found\n");
 
ncores = of_get_child_count(np);
+   of_node_put(np);
if (ncores == 0 || ncores > ARMADA_XP_MAX_CPUS)
panic("Invalid number of CPUs in DT\n");
 
diff --git a/arch/arm/mach-mvebu/pmsu.c b/arch/arm/mach-mvebu/pmsu.c
index 3cc4bef..27fc4f0 100644
--- a/arch/arm/mach-mvebu/pmsu.c
+++ b/arch/arm/mach-mvebu/pmsu.c
@@ -67,6 +67,7 @@ int __init armada_370_xp_pmsu_init(void)
pr_info("Initializing Power Management Service Unit\n");
pmsu_mp_base = of_iomap(np, 0);
pmsu_reset_base = of_iomap(np, 1);
+   of_node_put(np);
}
 
return 0;
diff --git a/arch/arm/mach-mvebu/system-controller.c 
b/arch/arm/mach-mvebu/system-controller.c
index f875124..5175083c 100644
--- a/arch/arm/mach-mvebu/system-controller.c
+++ b/arch/arm/mach-mvebu/system-controller.c
@@ -98,6 +98,7 @@ static int __init mvebu_system_controller_init(void)
BUG_ON(!match);
system_controller_base = of_iomap(np, 0);
mvebu_sc = (struct mvebu_system_controller *)match->data;
+   of_node_put(np);
}
 
return 0;
-- 
1.8.4.rc3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] [BUGFIX] drivers/base: fix show_mem_removable section count

2013-08-22 Thread Russ Anderson
"cat /sys/devices/system/memory/memory*/removable" crashed the system.

The problem is that show_mem_removable() is passing a
bad pfn to is_mem_section_removable(), which causes
if (!node_online(page_to_nid(page))) to blow up.
Why is it passing in a bad pfn?

show_mem_removable() will loop sections_per_block times.
sections_per_block is 16, but mem->section_count is 8
for this memory block.  Changing to loop the actual number
of sections (mem->section_count) fixes the problem.
The assumption that all memory blocks will have the same
sections_per_block is not always true.

I suspect other usages of sections_per_block will also
need to be fixed.

Signed-off-by: Russ Anderson 


The failing output:
---
harp5-sys:~ # cat /sys/devices/system/memory/memory*/removable
0
1
1
1
1
1
1
1
1
1
1
1
1
1
[  372.78] BUG: unable to handle kernel paging request at ea00c320
[  372.119230] IP: [] is_pageblock_removable_nolock+0x1/0x90
[  372.127022] PGD 83ffd4067 PUD 37bdfce067 PMD 0
[  372.132109] Oops:  [#1] SMP
[  372.135730] Modules linked in: autofs4 binfmt_misc rdma_ucm rdma_cm iw_cm 
ib_addr ib_srp scsi_transport_srp scsi_tgt ib_ipoib ib_cm ib_uverbs ib_umad 
iw_cxgb3 cxgb3 mdio mlx4_en mlx4_ib ib_sa mlx4_core ib_mthca ib_mad ib_core 
fuse nls_iso8859_1 nls_cp437 vfat fat joydev loop hid_generic usbhid hid 
hwperf(O) numatools(O) dm_mod iTCO_wdt ipv6 iTCO_vendor_support igb i2c_i801 
ioatdma i2c_algo_bit ehci_pci pcspkr lpc_ich i2c_core ehci_hcd ptp sg mfd_core 
dca rtc_cmos pps_core mperf button xhci_hcd sd_mod crc_t10dif usbcore 
usb_common scsi_dh_emc scsi_dh_hp_sw scsi_dh_alua scsi_dh_rdac scsi_dh gru(O) 
xvma(O) xfs crc32c libcrc32c thermal sata_nv processor piix mptsas mptscsih 
scsi_transport_sas mptbase megaraid_sas fan thermal_sys hwmon ext3 jbd ata_piix 
ahci libahci libata scsi_mod
[  372.213536] CPU: 4 PID: 5991 Comm: cat Tainted: G   O 
3.11.0-rc5-rja-uv+ #10
[  372.222173] Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series 
BIOS 01/15/2013
[  372.231391] task: 88081f034580 ti: 880820022000 task.ti: 
880820022000
[  372.239737] RIP: 0010:[]  [] 
is_pageblock_removable_nolock+0x1/0x90
[  372.250229] RSP: 0018:880820023df8  EFLAGS: 00010287
[  372.256151] RAX: 0004 RBX: ea00c320 RCX: 0004
[  372.264111] RDX: ea00c30b RSI: 001c RDI: ea00c320
[  372.272071] RBP: 880820023e38 R08:  R09: 0001
[  372.280030] R10:  R11: 0001 R12: ea00c33c
[  372.287987] R13: 1600 R14: 6db6db6db6db6db7 R15: 0001
[  372.295945] FS:  77fb2700() GS:88083fc8() 
knlGS:
[  372.304970] CS:  0010 DS:  ES:  CR0: 80050033
[  372.311378] CR2: ea00c320 CR3: 00081b954000 CR4: 000407e0
[  372.319335] Stack:
[  372.321575]  880820023e38 81161e94 81d9e940 
0009
[  372.329872]   8817bb97b800 88081e928000 
8817bb97b870
[  372.338167]  880820023e68 813730d1 fffb 
81a97600
[  372.346463] Call Trace:
[  372.349201]  [] ? is_mem_section_removable+0x84/0x110
[  372.356579]  [] show_mem_removable+0x41/0x70
[  372.363094]  [] dev_attr_show+0x2a/0x60
[  372.369122]  [] sysfs_read_file+0xf7/0x1c0
[  372.375441]  [] vfs_read+0xc8/0x130
[  372.381076]  [] SyS_read+0x5d/0xa0
[  372.386624]  [] system_call_fastpath+0x16/0x1b
[  372.393313] Code: 01 00 00 00 e9 3c ff ff ff 90 0f b6 4a 30 44 89 d8 d3 e0 
89 c1 83 e9 01 48 63 c9 49 01 c8 eb 92 66 2e 0f 1f 84 00 00 00 00 00 55 <48> 8b 
0f 49 89 f8 48 89 e5 48 89 ca 48 c1 ea 36 0f a3 15 d8 2f
[  372.415032] RIP  [] is_pageblock_removable_nolock+0x1/0x90
[  372.422905]  RSP 
[  372.426792] CR2: ea00c320
-


---
 drivers/base/memory.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/drivers/base/memory.c
===
--- linux.orig/drivers/base/memory.c2013-08-22 21:16:03.477826999 -0500
+++ linux/drivers/base/memory.c 2013-08-22 21:22:38.885478035 -0500
@@ -140,7 +140,7 @@ static ssize_t show_mem_removable(struct
struct memory_block *mem =
container_of(dev, struct memory_block, dev);
 
-   for (i = 0; i < sections_per_block; i++) {
+   for (i = 0; i < mem->section_count; i++) {
pfn = section_nr_to_pfn(mem->start_section_nr + i);
ret &= is_mem_section_removable(pfn, PAGES_PER_SECTION);
}
-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please

[PATCH v2 0/4] arm: mvebu: fix resource leak

2013-08-22 Thread Jisheng Zhang
These patches try to fix resource leak by adding missing of_node_put(),
iounmap or using devm_ioremap_resource() if available.

v2:
  - use devm_ioremap_resource() as suggested by Ezequiel Garcia
  - use gates_out instead of bail_out as suggested by Mike Turquette

Jisheng Zhang (4):
  arm: mvebu: add missing of_node_put() to fix reference leak
  bus: mvebu: add missing of_node_put() to fix reference leak
  clk: mvebu: add missing iounmap
  pinctrl: mvebu: Convert to use devm_ioremap_resource

 arch/arm/mach-mvebu/armada-370-xp.c |  1 +
 arch/arm/mach-mvebu/coherency.c |  8 +++-
 arch/arm/mach-mvebu/platsmp.c   |  1 +
 arch/arm/mach-mvebu/pmsu.c  |  1 +
 arch/arm/mach-mvebu/system-controller.c |  1 +
 drivers/bus/mvebu-mbus.c|  6 +-
 drivers/clk/mvebu/clk-cpu.c |  4 +++-
 drivers/clk/mvebu/common.c  | 18 --
 drivers/pinctrl/mvebu/pinctrl-mvebu.c   | 11 +--
 9 files changed, 36 insertions(+), 15 deletions(-)

-- 
1.8.4.rc3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 4/4] pinctrl: mvebu: Convert to use devm_ioremap_resource

2013-08-22 Thread Jisheng Zhang
Signed-off-by: Jisheng Zhang 
---
 drivers/pinctrl/mvebu/pinctrl-mvebu.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/pinctrl/mvebu/pinctrl-mvebu.c 
b/drivers/pinctrl/mvebu/pinctrl-mvebu.c
index bb7ddb1..1caa45f 100644
--- a/drivers/pinctrl/mvebu/pinctrl-mvebu.c
+++ b/drivers/pinctrl/mvebu/pinctrl-mvebu.c
@@ -579,7 +579,7 @@ static int mvebu_pinctrl_build_functions(struct 
platform_device *pdev,
 int mvebu_pinctrl_probe(struct platform_device *pdev)
 {
struct mvebu_pinctrl_soc_info *soc = dev_get_platdata(&pdev->dev);
-   struct device_node *np = pdev->dev.of_node;
+   struct resource *res;
struct mvebu_pinctrl *pctl;
void __iomem *base;
struct pinctrl_pin_desc *pdesc;
@@ -591,11 +591,10 @@ int mvebu_pinctrl_probe(struct platform_device *pdev)
return -EINVAL;
}
 
-   base = of_iomap(np, 0);
-   if (!base) {
-   dev_err(&pdev->dev, "unable to get base address\n");
-   return -ENODEV;
-   }
+   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+   base = devm_ioremap_resource(&pdev->dev, res);
+   if (IS_ERR(base))
+   return PTR_ERR(base);
 
pctl = devm_kzalloc(&pdev->dev, sizeof(struct mvebu_pinctrl),
GFP_KERNEL);
-- 
1.8.4.rc3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH -next] drm/rcar-du: fix return value check in rcar_du_lvdsenc_get_resources()

2013-08-22 Thread Wei Yongjun
From: Wei Yongjun 

In case of error, the function devm_ioremap_resource() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check should be
replaced with IS_ERR(). Also remove the dev_err call to avoid redundant
error message.

Signed-off-by: Wei Yongjun 
---
 drivers/gpu/drm/rcar-du/rcar_du_lvdsenc.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/rcar-du/rcar_du_lvdsenc.c 
b/drivers/gpu/drm/rcar-du/rcar_du_lvdsenc.c
index a0f6a17..f59cbc4 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_lvdsenc.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_lvdsenc.c
@@ -151,11 +151,8 @@ static int rcar_du_lvdsenc_get_resources(struct 
rcar_du_lvdsenc *lvds,
}
 
lvds->mmio = devm_ioremap_resource(&pdev->dev, mem);
-   if (lvds->mmio == NULL) {
-   dev_err(&pdev->dev, "failed to remap memory resource for %s\n",
-   name);
-   return -ENOMEM;
-   }
+   if (IS_ERR(lvds->mmio))
+   return PTR_ERR(lvds->mmio);
 
lvds->clock = devm_clk_get(&pdev->dev, name);
if (IS_ERR(lvds->clock)) {

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] iommu: WARN_ON when removing a device with no iommu_group associated

2013-08-22 Thread Wei Yang
When removing a device from the system, iommu_group driver will try to
disconnect it from its group. While in some cases, one device may not
associated with any iommu_group. For example, not enough DMA address space.

In the generic bus notification, it will check dev->iommu_group before calling
iommu_group_remove_device(). While in some cases, developers may call
iommu_group_remove_device() in a different code path and without check. For
those devices with dev->iommu_group set to NULL, kernel will crash.

This patch gives a warning and return when trying to remove a device from an
iommu_group with dev->iommu_group set to NULL. This helps to indicate some bad
behavior and also guard the kernel.

Signed-off-by: Wei Yang 
---
 drivers/iommu/iommu.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index fbe9ca7..43396f0 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -379,6 +379,9 @@ void iommu_group_remove_device(struct device *dev)
struct iommu_group *group = dev->iommu_group;
struct iommu_device *tmp_device, *device = NULL;
 
+   if (WARN_ON(!group))
+   return;
+
/* Pre-notify listeners that a device is being removed. */
blocking_notifier_call_chain(&group->notifier,
 IOMMU_GROUP_NOTIFY_DEL_DEVICE, dev);
-- 
1.7.5.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] powerpc/iommu: check dev->iommu_group before remove a device from iommu_group

2013-08-22 Thread Wei Yang
On Thu, Aug 22, 2013 at 10:17:20AM -0600, Alex Williamson wrote:
>On Thu, 2013-08-22 at 23:41 +0800, Wei Yang wrote:
>> >> 
>> >> Alex,
>> >> 
>> >> Sorry for not including you in the very beginning, which may spend you 
>> >> more
>> >> efforts to track previous mails in this thread.
>> >> 
>> >> Do you think it is reasonable to check the dev->iommu_group in
>> >> iommu_group_remove_device()? Or we can count on the bus notifier to check 
>> >> it?
>> >> 
>> >> Welcome your suggestions~
>> >
>> >I don't really see the point of patch 1/2. iommu_group_remove_device()
>> >is specifically to remove a device from an iommu_group, so why would you
>> >call it on a device that's not part of an iommu_group.  If you want to
>> >avoid testing dev->iommu_group, then implement the .remove_device
>> >callback rather than using the notifier.  Thanks,
>> >
>> 
>> You mean the .remove_device like intel_iommu_remove_device()? 
>> 
>> Hmm... this function didn't check the dev->iommu_group and just call
>> iommu_group_remove_device(). I see this guard is put in 
>> iommu_bus_notifier(), 
>> which will check dev->iommu_group before invoke .remove_device.
>> 
>> Let me explain the case to triger the problem a little. 
>> 
>> On some platform, like powernv, we implement another bus notifier when 
>> devices
>> are added or removed in the system. Like Alexey mentioned, he missed the 
>> check
>> for dev->iommu_group in the notifier before removing it from iommu_group. 
>> This
>> trigger the crash.
>> 
>> So do you think it is reasonable to guard the kernel in
>> iommu_group_remove_device(), or we give the platform developers the
>> responsibility to check the dev->iommu_group before calling it?
>
>I don't see it as we need either patch 1/2 or patch 2/2.  We absolutely
>need some form of patch 2/2.  Patch 1/2 isn't necessarily bad, but it
>facilitates sloppy usage.  The iommu driver shouldn't be calling
>iommu_group_remove_device() on arbitrary devices that may or may not be
>part of an iommu_group.  Perhaps patch 1/2 should be:
>
>if (WARN_ON(!group))
>   return;
>

Agree, this one sounds more reasonable. :-)

Since patch 2/2 is merged by Alexey, I will re-send patch 1/2 alone.

Thanks for your comments ~

>Thanks,
>
>Alex
>
>___
>Linuxppc-dev mailing list
>linuxppc-...@lists.ozlabs.org
>https://lists.ozlabs.org/listinfo/linuxppc-dev

-- 
Richard Yang
Help you, Help me

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] DMA: let filter functions of of_dma_simple_xlate possible check of_node

2013-08-22 Thread Richard Zhao
On Fri, Aug 23, 2013 at 04:18:27AM +0800, Stephen Warren wrote:
> On 08/21/2013 11:19 PM, Richard Zhao wrote:
> > On Fri, Aug 02, 2013 at 10:00:00AM +0800, Richard Zhao wrote:
> >> pass of_phandle_args dma_spec to dma_request_channel in 
> >> of_dma_simple_xlate,
> >> so the filter function could access of_node in of_phandle_args.
> >>
> >> It also remove restriction of #dma-cells has to be one.
> >>
> >> Signed-off-by: Richard Zhao 
> >> ---
> >>  drivers/dma/edma.c |  7 +--
> >>  drivers/dma/of-dma.c   | 10 --
> >>  drivers/dma/omap-dma.c |  6 --
> >>  3 files changed, 13 insertions(+), 10 deletions(-)
> >>
> > 
> > Hi Vinod,
> > 
> > Can you please pick up this change?
> > 
> > Hi Stephen,
> > 
> > Can you please give a ack or reviewed-by etc?
> 
> Hmm. Looking at the patch, I'm not sure it's right.
> 
> This patch simply passes all the specfier args to the filter function,
> and the code to check the equality of the of_node to the filter args is
> still duplicated in each DMA driver. Instead, the DMA core should be
> implementing the equality check, and only even calling the
> driver-specific filter function for devices where the client's phandle
> matches the DMA providing device's of_node handle.

Filter function is called in dmaengine core code, independent of dt.
And the reason why the driver has to write its own filter function is
it has to store slave id there in its own way.

Thanks
Richard
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 3/3] extcon: arizona: Fix up minor coding style to remove unnecessary braces

2013-08-22 Thread Chanwoo Choi
This fixes up braces coding style issue by using checkpatch script.

Cc: Charles Keepax 
Cc: Mark Brown 
Signed-off-by: Chanwoo Choi 
---
 drivers/extcon/extcon-arizona.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/extcon/extcon-arizona.c b/drivers/extcon/extcon-arizona.c
index 08c4590..72fc28e 100644
--- a/drivers/extcon/extcon-arizona.c
+++ b/drivers/extcon/extcon-arizona.c
@@ -564,11 +564,10 @@ static irqreturn_t arizona_hpdet_irq(int irq, void *data)
}
 
ret = arizona_hpdet_read(info);
-   if (ret == -EAGAIN) {
+   if (ret == -EAGAIN)
goto out;
-   } else if (ret < 0) {
+   else if (ret < 0)
goto done;
-   }
reading = ret;
 
/* Reset back to starting range */
@@ -578,11 +577,10 @@ static irqreturn_t arizona_hpdet_irq(int irq, void *data)
   0);
 
ret = arizona_hpdet_do_id(info, &reading, &mic);
-   if (ret == -EAGAIN) {
+   if (ret == -EAGAIN)
goto out;
-   } else if (ret < 0) {
+   else if (ret < 0)
goto done;
-   }
 
/* Report high impedence cables as line outputs */
if (reading >= 5000)
-- 
1.8.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/3] extcon: class: Remove unnecessary extern declaration

2013-08-22 Thread Chanwoo Choi
This patch remove unnecessary extern declaration (extcon_set_state).
checkpatch found this coding style issue.

Signed-off-by: Chanwoo Choi 
Signed-off-by: Myungjoo Ham 
---
 drivers/extcon/extcon-class.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/extcon/extcon-class.c b/drivers/extcon/extcon-class.c
index 7704a3d..b8589cc 100644
--- a/drivers/extcon/extcon-class.c
+++ b/drivers/extcon/extcon-class.c
@@ -129,7 +129,6 @@ static ssize_t state_show(struct device *dev, struct 
device_attribute *attr,
return count;
 }
 
-int extcon_set_state(struct extcon_dev *edev, u32 state);
 static ssize_t state_store(struct device *dev, struct device_attribute *attr,
   const char *buf, size_t count)
 {
-- 
1.8.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/3] extcon: Fix up 80 column coding style issues

2013-08-22 Thread Chanwoo Choi
This patch fix 80 column coding sytle issues by using checkpatch script.

Cc: Charles Keepax 
Cc: Mark Brown 
Signed-off-by: Chanwoo Choi 
Signed-off-by: Myungjoo Ham 
---
 drivers/extcon/extcon-arizona.c  |  25 +---
 drivers/extcon/extcon-class.c|   6 +-
 drivers/extcon/extcon-max77693.c | 129 +--
 drivers/extcon/extcon-max8997.c  |   6 +-
 4 files changed, 94 insertions(+), 72 deletions(-)

diff --git a/drivers/extcon/extcon-arizona.c b/drivers/extcon/extcon-arizona.c
index 2064eac..08c4590 100644
--- a/drivers/extcon/extcon-arizona.c
+++ b/drivers/extcon/extcon-arizona.c
@@ -738,8 +738,8 @@ err:
 static void arizona_micd_timeout_work(struct work_struct *work)
 {
struct arizona_extcon_info *info = container_of(work,
-   struct 
arizona_extcon_info,
-   micd_timeout_work.work);
+   struct arizona_extcon_info,
+   micd_timeout_work.work);
 
mutex_lock(&info->lock);
 
@@ -756,8 +756,8 @@ static void arizona_micd_timeout_work(struct work_struct 
*work)
 static void arizona_micd_detect(struct work_struct *work)
 {
struct arizona_extcon_info *info = container_of(work,
-   struct 
arizona_extcon_info,
-   micd_detect_work.work);
+   struct arizona_extcon_info,
+   micd_detect_work.work);
struct arizona *arizona = info->arizona;
unsigned int val = 0, lvl;
int ret, i, key;
@@ -769,7 +769,8 @@ static void arizona_micd_detect(struct work_struct *work)
for (i = 0; i < 10 && !(val & 0x7fc); i++) {
ret = regmap_read(arizona->regmap, ARIZONA_MIC_DETECT_3, &val);
if (ret != 0) {
-   dev_err(arizona->dev, "Failed to read MICDET: %d\n", 
ret);
+   dev_err(arizona->dev,
+   "Failed to read MICDET: %d\n", ret);
mutex_unlock(&info->lock);
return;
}
@@ -777,7 +778,8 @@ static void arizona_micd_detect(struct work_struct *work)
dev_dbg(arizona->dev, "MICDET: %x\n", val);
 
if (!(val & ARIZONA_MICD_VALID)) {
-   dev_warn(arizona->dev, "Microphone detection state 
invalid\n");
+   dev_warn(arizona->dev,
+"Microphone detection state invalid\n");
mutex_unlock(&info->lock);
return;
}
@@ -925,8 +927,8 @@ static irqreturn_t arizona_micdet(int irq, void *data)
 static void arizona_hpdet_work(struct work_struct *work)
 {
struct arizona_extcon_info *info = container_of(work,
-   struct 
arizona_extcon_info,
-   hpdet_work.work);
+   struct arizona_extcon_info,
+   hpdet_work.work);
 
mutex_lock(&info->lock);
arizona_start_hpdet_acc_id(info);
@@ -973,10 +975,13 @@ static irqreturn_t arizona_jackdet(int irq, void *data)
   &info->hpdet_work,
   msecs_to_jiffies(HPDET_DEBOUNCE));
 
-   if (cancelled_mic)
+   if (cancelled_mic) {
+   int micd_timeout = info->micd_timeout;
+
queue_delayed_work(system_power_efficient_wq,
   &info->micd_timeout_work,
-  
msecs_to_jiffies(info->micd_timeout));
+  msecs_to_jiffies(micd_timeout));
+   }
 
goto out;
}
diff --git a/drivers/extcon/extcon-class.c b/drivers/extcon/extcon-class.c
index 1446152..7704a3d 100644
--- a/drivers/extcon/extcon-class.c
+++ b/drivers/extcon/extcon-class.c
@@ -450,7 +450,8 @@ int extcon_register_interest(struct 
extcon_specific_cable_nb *obj,
if (!obj->edev)
return -ENODEV;
 
-   obj->cable_index = extcon_find_cable_index(obj->edev, 
cable_name);
+   obj->cable_index = extcon_find_cable_index(obj->edev,
+ cable_name);
if (obj->cable_index < 0)
return obj->cable_index;
 
@@ -458,7 +459,8 @@ int extcon_register_interest(struct 
extcon_specific_cable_nb *obj,
 
obj->internal_nb.notifier_call = _call_per_cable;
 
-   return raw_notifier_chain_register(&obj->edev->nh, 
&obj->internal_nb);
+

Re: [PATCH v2] DMA: add help function to check whether dma controller registered

2013-08-22 Thread Richard Zhao
On Fri, Aug 23, 2013 at 04:36:53AM +0800, Stephen Warren wrote:
> On 08/22/2013 12:43 AM, Richard Zhao wrote:
> > DMA client device driver usually needs to know at probe time whether
> > dma controller has been registered to deffer probe. So add a help
> > function of_dma_check_controller.
> > 
> > DMA request channel functions can also used to check it, but they
> > are usually called at open() time.
> 
> This new function is almost identical to the existing
> of_dma_request_slave_channel(). Surely the code should be shared?
ofdma->of_dma_xlate(&dma_spec, ofdma);
The above is called holding of_dma_lock. If I want to abstract the
common lines, there' two options.

Option 1:
static struct of_dma* of_dma_check_controller_locked(np, name)
{
parameter check
get dma-names count and check return value
for loop to get of_dma
return PTR_ERR(err) or of_dma
}

struct dma_chan *of_dma_request_slave_channel(struct device_node *np,
  const char *name)
{
chan = null;
mutex_lock(&of_dma_lock);
of_dma = of_dma_check_controller_locked(np, name)
if(!IS_ERR(of_dma))
chan = ofdma->of_dma_xlate(&dma_spec, ofdma);
mutex_unlock(&of_dma_lock);
return chan;
}

int of_dma_check_controller(struct device *dev, const char *name)
{
mutex_lock(&of_dma_lock);
ofdma = of_dma_check_controller_locked(dev->of_node, name);
mutex_unlock(&of_dma_lock);
if (IS_ERR(ofdma))
return ERR_PTR(ofdma);
else
return 0;
}

Option 2:
static struct of_dma* of_dma_check_controller_getlock(np, name)
{
parameter check
get dma-names count and check return value
for loop to get of_dma, get lock at old place
if failed, unlock.
return PTR_ERR(err) or of_dma
}

struct dma_chan *of_dma_request_slave_channel(struct device_node *np,
  const char *name)
{
}   of_dma = of_dma_check_controller_getlock(np, name)
if(!IS_ERR(of_dma)) {
chan = ofdma->of_dma_xlate(&dma_spec, ofdma);
unlock;
}
return chan;
}

int of_dma_check_controller(struct device *dev, const char *name)
ofdma = of_dma_check_controller_locked(dev->of_node, name);

if (IS_ERR(ofdma)) {
return ERR_PTR(ofdma);
} else {
unlock;
return 0;
}
}

> But that said, I don't see any need for a new function; why can't
> drivers simply call of_dma_request_slave_channel() at probe time;
It'll mislead user. channel supposed to be request at open time.

> from
> what I can see, that function doesn't actually request the channel, but
> rather simply looks it up, just like this one. The only difference is
> that of_dma_xlate() is also called, but that's just doing some data
> transformation, not actually recording channel ownership.
xlate function request the channel if things go well.

Thanks
Richard
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 0/5] Rework mtime and ctime updates on mmaped writes

2013-08-22 Thread Andy Lutomirski
On 08/22/2013 05:03 PM, Andy Lutomirski wrote:
> Writes via mmap currently update mtime and ctime in ->page_mkwrite.

The subject should be [PATCH v4 0.7]...  Sorry for the cut-and-pasteo.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 10/13] tracing/uprobes: Fetch args before reserving a ring buffer

2013-08-22 Thread Steven Rostedt
On Fri, 23 Aug 2013 07:57:15 +0800
"zhangwei(Jovi)"  wrote:


> > 
> > What about creating a per cpu buffer when uprobes are registered, and
> > delete them when they are finished? Basically what trace_printk() does
> > if it detects that there are users of trace_printk() in the kernel.
> > Note, it does not deallocate them when finished, as it is never
> > finished until reboot ;-)
> > 
> > -- Steve
> >
> I also thought out this approach, but the issue is we cannot fetch user
> memory into per-cpu buffer, because use per-cpu buffer should under
> preempt disabled, and fetching user memory could sleep.

Actually, we could create a per_cpu mutex to match the per_cpu buffers.
This is not unlike what we do in -rt.

int cpu;
struct mutex *mutex;
void *buf;


/*
 * Use per cpu buffers for fastest access, but we might migrate
 * So the mutex makes sure we have sole access to it.
 */

cpu = raw_smp_processor_id();
mutex = per_cpu(uprobe_cpu_mutex, cpu);
buf = per_cpu(uprobe_cpu_buffer, cpu);

mutex_lock(mutex);
store_trace_args(..., buf,...);
mutex_unlock(mutex);

-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] vfs: Tighten up linkat(..., AT_EMPTY_PATH)

2013-08-22 Thread Al Viro
On Thu, Aug 22, 2013 at 01:54:15PM -0700, Linus Torvalds wrote:
> On Thu, Aug 22, 2013 at 1:48 PM, Andy Lutomirski  wrote:
> >
> > Sure.  But aren't they always last?
> 
> What do you mean? I'd say that the /proc lookup is always *innermost*.
> Which means that it certainly cannot bail out, since there are many
> levels of nesting outside of it.
> 
> > With the current code structure, trying to enforce some kind of
> > security restriction in the middle of lookup seems really unpleasant.
> 
> If it's conditional (ie "linkat behaves differently from openat"), it
> certainly means that we'd have to pass in that info in annoying ways.

Nope.  All we need to pass is one more LOOKUP_...  Add
if (unlikely(nd->last_type == LAST_BIND)) {
if ((nd->flags & LOOKUP_BLAH) && !may_flink(...)) {
terminate_walk(nd);
return -EINVAL;
}
}
in the beginning of lookup_last() and pass LOOKUP_BLAH in flags when
linkat() calls user_path_at().  That will affect *only* the terminal
symlinks and cost nothing in all normal cases.  The same check can
bloody well go into path_init() - take
if (*name) {
if (!can_lookup(dentry->d_inode)) {
fdput(f);
return -ENOTDIR;
}
}
in there and slap
else {
if ((flags & LOOKUP_BLAH) && !may_flink(...)) {
fdput(f);
return -EINVAL;
}
}
after it.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] dm: allow error target to replace either bio-based and request-based targets

2013-08-22 Thread Joe Jin
On 08/23/13 08:17, Mike Snitzer wrote:
> Here is a patch that should work for your needs (I tested it to work
> with 'dmsetup wipe_table' on both request-based and bio-based devices):

This really what I looking for, thanks!

> 
> From: Mike Snitzer 
> Date: Thu, 22 Aug 2013 18:21:38 -0400
> Subject: [PATCH] dm: allow error target to replace either bio-based and 
> request-based targets
> 
> In may be useful to switch a request-based table to the "error" target.
> Enhance the DM core to allow a single hybrid target to be capable of
> handling either bios or requests.
> 
> Add a request-based (.map_rq) member to the error target_type and train
> dm_table_set_type() to prefer the md's established type (request-based
> or bio-based).  If the md doesn't have an established type default to
> making the hybrid target bio-based.

Signed-off-by: Joe Jin 

Thanks,
Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC 17/17] clk: zynq: remove call to of_clk_init

2013-08-22 Thread Sören Brinkmann
On Thu, Aug 22, 2013 at 05:26:47PM -0700, Sören Brinkmann wrote:
> Hi Sebastian,
> 
> On Tue, Aug 20, 2013 at 04:04:31AM +0200, Sebastian Hesselbarth wrote:
> > With arch/arm calling of_clk_init(NULL) from time_init(), we can now
> > remove it from corresponding drivers/clk code.
> 
> I think that would break Zynq.
> If I see this correctly you call of_clk_init() from common code,
> _before_ the SOC specific time init function is called.
> The problem is, that we have code setting up a global pointer which is
> required by zynq_clk_setup() which is triggered when of_clk_init() is
> called.
> 
> Let me try to illustrate the current call graph:
> 
> time_init()
>   zynq_timer_init()   // this machines init_time()
>   zynq_slcr_init()// setup System Level Control Registers 
> including a global pointer
>   zynq_clock_init()
>   of_clk_init()
>   zynq_clk_setup()   // requires pointer 
> setup in zynq_slcr_init()
>   ...
> 
> IIUC, your series would change this to:
> time_init()
>   of_clk_init()
>   zynq_clk_setup()// SLCR pointer is not setup/NULL
>   ...
>   zynq_timer_init()
>   zynq_slcr_init()// now the pointer becomes valid

I guess we could move zynq_slcr_init() into init_irq(). I'll give that a
shot tomorrow.

Sören


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Finding out who's holding a lock?

2013-08-22 Thread Andy Lutomirski
My program is occasionally seeing slow page faults.  latencytop says
they're slow because they're waiting for read access to mmap_sem, but
latencytop isn't showing any other thread in the process blocking.

Is there any straightforward way to find out who called down_write on
mmap_sem when down_read is slow?

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH -next] f2fs: fix error return code in init_f2fs_fs()

2013-08-22 Thread Wei Yongjun
From: Wei Yongjun 

Fix to return -ENOMEM in the kset create and add error handling
case instead of 0, as done elsewhere in this function.

Introduced by commit b59d0bae6ca30c496f298881616258f9cde0d9c6.
(f2fs: add sysfs support for controlling the gc_thread)

Signed-off-by: Wei Yongjun 
---
 fs/f2fs/super.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 66d1ec1..33a809f 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1013,8 +1013,10 @@ static int __init init_f2fs_fs(void)
if (err)
goto fail;
f2fs_kset = kset_create_and_add("f2fs", NULL, fs_kobj);
-   if (!f2fs_kset)
+   if (!f2fs_kset) {
+   err = -ENOMEM;
goto fail;
+   }
err = register_filesystem(&f2fs_fs_type);
if (err)
goto fail;

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC 17/17] clk: zynq: remove call to of_clk_init

2013-08-22 Thread Sören Brinkmann
Hi Sebastian,

On Tue, Aug 20, 2013 at 04:04:31AM +0200, Sebastian Hesselbarth wrote:
> With arch/arm calling of_clk_init(NULL) from time_init(), we can now
> remove it from corresponding drivers/clk code.

I think that would break Zynq.
If I see this correctly you call of_clk_init() from common code,
_before_ the SOC specific time init function is called.
The problem is, that we have code setting up a global pointer which is
required by zynq_clk_setup() which is triggered when of_clk_init() is
called.

Let me try to illustrate the current call graph:

time_init()
zynq_timer_init()   // this machines init_time()
zynq_slcr_init()// setup System Level Control Registers 
including a global pointer
zynq_clock_init()
of_clk_init()
zynq_clk_setup()   // requires pointer 
setup in zynq_slcr_init()
...

IIUC, your series would change this to:
time_init()
of_clk_init()
zynq_clk_setup()// SLCR pointer is not setup/NULL
...
zynq_timer_init()
zynq_slcr_init()// now the pointer becomes valid

Sören


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] dm: allow error target to replace either bio-based and request-based targets

2013-08-22 Thread Mike Snitzer
On Thu, Aug 22 2013 at  4:19pm -0400,
Mike Snitzer  wrote:

> Hi Joe,
> 
> Unfortunately this isn't going to work properly.  Mikulas suggested a
> new "error-rq" target.
> 
> I do like the idea of a single error target that is hybrid (supports
> both bio-based and request-based) but the DM core would need to be
> updated to support this.
> 
> Specifically, we'd need to check if the device (and active table) is
> already bio-based or request-based and select the appropriate type.  If
> it is a new device, default to selecting bio-based.
> 
> There are some wrappers and other logic thoughout DM core that will need
> auditing too.

Here is a patch that should work for your needs (I tested it to work
with 'dmsetup wipe_table' on both request-based and bio-based devices):

From: Mike Snitzer 
Date: Thu, 22 Aug 2013 18:21:38 -0400
Subject: [PATCH] dm: allow error target to replace either bio-based and 
request-based targets

In may be useful to switch a request-based table to the "error" target.
Enhance the DM core to allow a single hybrid target to be capable of
handling either bios or requests.

Add a request-based (.map_rq) member to the error target_type and train
dm_table_set_type() to prefer the md's established type (request-based
or bio-based).  If the md doesn't have an established type default to
making the hybrid target bio-based.

Cc: Joe Jin 
Cc: Mikulas Patocka 
Signed-off-by: Mike Snitzer 
---
 drivers/md/dm-table.c  |   18 +-
 drivers/md/dm-target.c |9 -
 drivers/md/dm.h|   11 +++
 3 files changed, 36 insertions(+), 2 deletions(-)

Index: linux/drivers/md/dm-table.c
===
--- linux.orig/drivers/md/dm-table.c
+++ linux/drivers/md/dm-table.c
@@ -864,10 +864,26 @@ static int dm_table_set_type(struct dm_t
struct dm_target *tgt;
struct dm_dev_internal *dd;
struct list_head *devices;
+   unsigned live_md_type;
+
+   dm_lock_md_type(t->md);
+   live_md_type = dm_get_md_type(t->md);
+   dm_unlock_md_type(t->md);
 
for (i = 0; i < t->num_targets; i++) {
tgt = t->targets + i;
-   if (dm_target_request_based(tgt))
+   if (dm_target_hybrid(tgt)) {
+   switch (live_md_type) {
+   case DM_TYPE_NONE:
+   case DM_TYPE_BIO_BASED:
+   bio_based = 1;
+   break;
+   case DM_TYPE_REQUEST_BASED:
+   request_based = 1;
+   break;
+   }
+   }
+   else if (dm_target_request_based(tgt))
request_based = 1;
else
bio_based = 1;
Index: linux/drivers/md/dm-target.c
===
--- linux.orig/drivers/md/dm-target.c
+++ linux/drivers/md/dm-target.c
@@ -131,12 +131,19 @@ static int io_err_map(struct dm_target *
return -EIO;
 }
 
+static int io_err_map_rq(struct dm_target *ti, struct request *clone,
+union map_info *map_context)
+{
+   return -EIO;
+}
+
 static struct target_type error_target = {
.name = "error",
-   .version = {1, 1, 0},
+   .version = {1, 2, 0},
.ctr  = io_err_ctr,
.dtr  = io_err_dtr,
.map  = io_err_map,
+   .map_rq = io_err_map_rq,
 };
 
 int __init dm_target_init(void)
Index: linux/drivers/md/dm.h
===
--- linux.orig/drivers/md/dm.h
+++ linux/drivers/md/dm.h
@@ -91,10 +91,21 @@ int dm_setup_md_queue(struct mapped_devi
 #define dm_target_is_valid(t) ((t)->table)
 
 /*
+ * To check whether the target type is bio-based or not (request-based).
+ */
+#define dm_target_bio_based(t) ((t)->type->map != NULL)
+
+/*
  * To check whether the target type is request-based or not (bio-based).
  */
 #define dm_target_request_based(t) ((t)->type->map_rq != NULL)
 
+/*
+ * To check whether the target type is a hybrid (capable of being
+ * either request-based or bio-based).
+ */
+#define dm_target_hybrid(t) (dm_target_bio_based(t) && 
dm_target_request_based(t))
+
 /*-
  * A registry of target types.
  *---*/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v4 2/7] fs: Add inode_update_time_writable

2013-08-22 Thread Andy Lutomirski
This is like file_update_time, except that it acts on a struct inode *
instead of a struct file *.

Signed-off-by: Andy Lutomirski 
---
 fs/inode.c | 64 +-
 include/linux/fs.h |  1 +
 2 files changed, 50 insertions(+), 15 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index d6dfb09..2bbcb19 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1637,6 +1637,34 @@ int file_remove_suid(struct file *file)
 }
 EXPORT_SYMBOL(file_remove_suid);
 
+/*
+ * This does the work that's common to file_update_time and
+ * inode_update_time.
+ */
+static int prepare_update_cmtime(struct inode *inode, struct timespec *now)
+{
+   int sync_it;
+
+   /* First try to exhaust all avenues to not sync */
+   if (IS_NOCMTIME(inode))
+   return 0;
+
+   *now = current_fs_time(inode->i_sb);
+   if (!timespec_equal(&inode->i_mtime, now))
+   sync_it = S_MTIME;
+
+   if (!timespec_equal(&inode->i_ctime, now))
+   sync_it |= S_CTIME;
+
+   if (IS_I_VERSION(inode))
+   sync_it |= S_VERSION;
+
+   if (!sync_it)
+   return 0;
+
+   return sync_it;
+}
+
 /**
  * file_update_time-   update mtime and ctime time
  * @file: file accessed
@@ -1654,23 +1682,9 @@ int file_update_time(struct file *file)
 {
struct inode *inode = file_inode(file);
struct timespec now;
-   int sync_it = 0;
+   int sync_it = prepare_update_cmtime(inode, &now);
int ret;
 
-   /* First try to exhaust all avenues to not sync */
-   if (IS_NOCMTIME(inode))
-   return 0;
-
-   now = current_fs_time(inode->i_sb);
-   if (!timespec_equal(&inode->i_mtime, &now))
-   sync_it = S_MTIME;
-
-   if (!timespec_equal(&inode->i_ctime, &now))
-   sync_it |= S_CTIME;
-
-   if (IS_I_VERSION(inode))
-   sync_it |= S_VERSION;
-
if (!sync_it)
return 0;
 
@@ -1685,6 +1699,26 @@ int file_update_time(struct file *file)
 }
 EXPORT_SYMBOL(file_update_time);
 
+/**
+ * inode_update_time_writable  -   update mtime and ctime time
+ * @inode: inode accessed
+ *
+ * This is like file_update_time, but it assumes the mnt is
+ * writable and not frozen and takes an inode parameter instead.
+ */
+
+int inode_update_time_writable(struct inode *inode)
+{
+   struct timespec now;
+   int sync_it = prepare_update_cmtime(inode, &now);
+
+   if (!sync_it)
+   return 0;
+
+   return update_time(inode, &now, sync_it);
+}
+EXPORT_SYMBOL(inode_update_time_writable);
+
 int inode_needs_sync(struct inode *inode)
 {
if (IS_SYNC(inode))
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 9818747..86cf0a4 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2590,6 +2590,7 @@ extern int inode_newsize_ok(const struct inode *, loff_t 
offset);
 extern void setattr_copy(struct inode *inode, const struct iattr *attr);
 
 extern int file_update_time(struct file *file);
+extern int inode_update_time_writable(struct inode *inode);
 
 extern int generic_show_options(struct seq_file *m, struct dentry *root);
 extern void save_mount_options(struct super_block *sb, char *options);
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v3 0/5] Rework mtime and ctime updates on mmaped writes

2013-08-22 Thread Andy Lutomirski
Writes via mmap currently update mtime and ctime in ->page_mkwrite.
This hurts both throughput and latency.  In workloads that dirty a
large number of mmapped pages, ->page_mkwrite can be hot and
file_update_time is slow and scales poorly.  Updating timestamps can
also sleep, which hurts latency for real-time workloads.

This is also a correctness issue.  SuS says:

The st_ctime and st_mtime fields of a file that is mapped with
MAP_SHARED and PROT_WRITE, will be marked for update at some point
in the interval between a write reference to the mapped region and
the next call to msync() with MS_ASYNC or MS_SYNC for that portion
of the file by any process. If there is no such call, these fields
may be marked for update at any time after a write reference if
the underlying file is modified as a result.

Currently, if the same mmapped page is written twice, the timestamp
may not be update at all after the second write, whereas SuS (and
anything using timestamps to invalidate caches, backup data, etc.)
would expect the timestamp to eventually be updated.

This patchset attempts to fix both issues at once.  It adds a new
address_space flag AS_CMTIME that is set atomically whenever the
system transfers a pte dirty bit to a struct page backed by the
address_space.  This can happen with various locks held and when low
on memory.

Later on, a_ops.update_cmtime_deferred is called to tell the FS to
update cmtime due to a previous mmapped write.

The core changes have no effect on unmodified filesystems.  To opt in,
a filesystem should implement .update_cmtime_deferred (most likely by
using generic_update_cmtime_deferred) and must call either
mapping_flush_cmtime or mapping_test_clear_cmtime in .writepages.
Filesystems should avoid updating timestamps in ->page_mkwrite.

The reason that this is not completely automatic is that filesystems
without backing stores do not really fit in to this model.
Eventually, someone can add support.

I've converted ext4, xfs, and btrfs.  Converting most other
filesystems should be straightforward.

I wrote an xfstest for this.  ext4, xfs, and btrfs pass.  It's here:

https://github.com/amluto/xfstests/commit/5fbb72ac799cc44a9c4c6d3919f00a479202c899

This series is pullable from:

https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/log/?h=mmap_mtime/patch_v4

Changes from v3:
 - The new address space op is now called update_cmtime_deferred.
   Callers take care of protection from fs freezing and checking
   AS_CMTIME.  I fixed a deadlock in the freezer interaction.
 - Block plugs should be handled better.
 - Fixed an infinite loop in msync(MS_ASYNC).
 - Converted xfs and btrfs.
 - Misc minor cleanups.
 - Fixed a corner case: reclaim or migration could have cleaned all
   pages without updating cmtime.

Changes from v2:
 - The core code now interacts with filesystems only through
   address_space ops, so there should be fewer layering issues.
 - MS_ASYNC is handled correctly.

Changes from v1:
 - inode_update_time_writable now locks against the fs freezer.
 - Minor cleanups.
 - Major changelog improvements.

Andy Lutomirski (7):
  mm: Track mappings that have been written via ptes
  fs: Add inode_update_time_writable
  mm: Allow filesystems to defer cmtime updates
  mm: Scan for dirty ptes and update cmtime on MS_ASYNC
  ext4: Defer mmap cmtime updates
  btrfs: Defer mmap cmtime updates
  xfs: Defer mmap cmtime updates

 fs/btrfs/extent_io.c  |  1 +
 fs/btrfs/inode.c  | 32 +-
 fs/buffer.c   |  7 
 fs/ext4/inode.c   | 11 +--
 fs/inode.c| 64 +++-
 fs/xfs/xfs_aops.c |  1 +
 include/linux/fs.h|  9 +
 include/linux/pagemap.h   | 22 +
 include/linux/writeback.h |  1 +
 mm/memory.c   |  7 +++-
 mm/migrate.c  |  2 ++
 mm/mmap.c |  6 +++-
 mm/msync.c| 84 ---
 mm/page-writeback.c   | 53 +-
 mm/rmap.c | 27 +--
 mm/vmscan.c   |  1 +
 16 files changed, 272 insertions(+), 56 deletions(-)

-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v4 1/7] mm: Track mappings that have been written via ptes

2013-08-22 Thread Andy Lutomirski
This will allow the mm code to figure out when a file has been
changed through a writable mmap.  Future changes will use this
information to update the file timestamp after writes.

This is handled in core mm code for two reasons:

1. Performance.  Setting a bit directly is faster than an indirect
   call to a vma op.

2. Simplicity.  The cmtime bit is set with lots of mm locks held.
   Rather than making filesystems add a new vm operation that needs
   to be aware of locking, it's easier to just get it right in core
   code.

Signed-off-by: Andy Lutomirski 
---
 include/linux/pagemap.h | 16 
 mm/memory.c |  7 ++-
 mm/rmap.c   | 27 +--
 3 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index e3dea75..9a461ee 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -25,6 +25,7 @@ enum mapping_flags {
AS_MM_ALL_LOCKS = __GFP_BITS_SHIFT + 2, /* under mm_take_all_locks() */
AS_UNEVICTABLE  = __GFP_BITS_SHIFT + 3, /* e.g., ramdisk, SHM_LOCK */
AS_BALLOON_MAP  = __GFP_BITS_SHIFT + 4, /* balloon page special map */
+   AS_CMTIME   = __GFP_BITS_SHIFT + 5, /* cmtime update deferred */
 };
 
 static inline void mapping_set_error(struct address_space *mapping, int error)
@@ -74,6 +75,21 @@ static inline gfp_t mapping_gfp_mask(struct address_space * 
mapping)
return (__force gfp_t)mapping->flags & __GFP_BITS_MASK;
 }
 
+static inline void mapping_set_cmtime(struct address_space * mapping)
+{
+   set_bit(AS_CMTIME, &mapping->flags);
+}
+
+static inline bool mapping_test_cmtime(struct address_space * mapping)
+{
+   return test_bit(AS_CMTIME, &mapping->flags);
+}
+
+static inline bool mapping_test_clear_cmtime(struct address_space * mapping)
+{
+   return test_and_clear_bit(AS_CMTIME, &mapping->flags);
+}
+
 /*
  * This is non-atomic.  Only to be used before the mapping is activated.
  * Probably needs a barrier...
diff --git a/mm/memory.c b/mm/memory.c
index 4026841..1737a90 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1150,8 +1150,13 @@ again:
if (PageAnon(page))
rss[MM_ANONPAGES]--;
else {
-   if (pte_dirty(ptent))
+   if (pte_dirty(ptent)) {
+   struct address_space *mapping =
+   page_mapping(page);
+   if (mapping)
+   mapping_set_cmtime(mapping);
set_page_dirty(page);
+   }
if (pte_young(ptent) &&
likely(!(vma->vm_flags & VM_SEQ_READ)))
mark_page_accessed(page);
diff --git a/mm/rmap.c b/mm/rmap.c
index b2e29ac..2e3fb27 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -928,6 +928,10 @@ static int page_mkclean_file(struct address_space 
*mapping, struct page *page)
}
}
mutex_unlock(&mapping->i_mmap_mutex);
+
+   if (ret)
+   mapping_set_cmtime(mapping);
+
return ret;
 }
 
@@ -1179,6 +1183,19 @@ out:
 }
 
 /*
+ * Mark a page's mapping for future cmtime update.  It's safe to call this
+ * on any page, but it only has any effect if the page is backed by a mapping
+ * that uses mapping_test_clear_cmtime to handle file time updates.  This means
+ * that there's no need to call this on for non-VM_SHARED vmas.
+ */
+static void page_set_cmtime(struct page *page)
+{
+   struct address_space *mapping = page_mapping(page);
+   if (mapping)
+   mapping_set_cmtime(mapping);
+}
+
+/*
  * Subfunctions of try_to_unmap: try_to_unmap_one called
  * repeatedly from try_to_unmap_ksm, try_to_unmap_anon or try_to_unmap_file.
  */
@@ -1219,8 +1236,11 @@ int try_to_unmap_one(struct page *page, struct 
vm_area_struct *vma,
pteval = ptep_clear_flush(vma, address, pte);
 
/* Move the dirty bit to the physical page now the pte is gone. */
-   if (pte_dirty(pteval))
+   if (pte_dirty(pteval)) {
set_page_dirty(page);
+   if (vma->vm_flags & VM_SHARED)
+   page_set_cmtime(page);
+   }
 
/* Update high watermark before we lower rss */
update_hiwater_rss(mm);
@@ -1413,8 +1433,11 @@ static int try_to_unmap_cluster(unsigned long cursor, 
unsigned int *mapcount,
}
 
/* Move the dirty bit to the physical page now the pte is gone. 
*/
-   if (pte_dirty(pteval))
+   if (pte_dirty(pteval)) {
set_page_dirty(page);
+   if (vma->vm_flags & VM_SHARED)
+   page_set_cmtime(page);
+   }
 
pa

[PATCH v4 3/7] mm: Allow filesystems to defer cmtime updates

2013-08-22 Thread Andy Lutomirski
Filesystems that defer cmtime updates should update cmtime when any
of these events happen after a write via a mapping:

 - The mapping is written back to disk.  This happens from all kinds
   of places, most of which eventually call ->writepages.  (The
   exceptions are vmscan and migration.)

 - munmap is called or the mapping is removed when the process exits

 - msync(MS_ASYNC) is called.  Linux currently does nothing for
   msync(MS_ASYNC), but POSIX says that cmtime should be updated some
   time between an mmaped write and the subsequent msync call.
   MS_SYNC calls ->writepages, but MS_ASYNC needs special handling.

Filesystems are responsible for checking for pending deferred cmtime
updates in .writepages (a helper is provided for this purpose) and
for doing the actual update in .update_cmtime_deferred.

These changes have no effect by themselves; filesystems must opt in
by implementing .update_cmtime_deferred and removing any
file_update_time call in .page_mkwrite.

This patch does not implement the MS_ASYNC case; that's in the next
patch.

Signed-off-by: Andy Lutomirski 
---
 include/linux/fs.h|  8 +++
 include/linux/pagemap.h   |  6 ++
 include/linux/writeback.h |  1 +
 mm/migrate.c  |  2 ++
 mm/mmap.c |  6 +-
 mm/page-writeback.c   | 53 ++-
 mm/vmscan.c   |  1 +
 7 files changed, 75 insertions(+), 2 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 86cf0a4..f6b0f8b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -350,6 +350,14 @@ struct address_space_operations {
/* Write back some dirty pages from this mapping. */
int (*writepages)(struct address_space *, struct writeback_control *);
 
+   /*
+* Called when a deferred cmtime update should be applied.
+* Implementations should update cmtime.  (As an optional
+* optimization, implementaions can call mapping_test_clear_cmtime
+* from writepages as well.)
+*/
+   void (*update_cmtime_deferred)(struct address_space *);
+
/* Set a page dirty.  Return true if this dirtied it */
int (*set_page_dirty)(struct page *page);
 
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 9a461ee..2647a13 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -90,6 +90,12 @@ static inline bool mapping_test_clear_cmtime(struct 
address_space * mapping)
return test_and_clear_bit(AS_CMTIME, &mapping->flags);
 }
 
+/* Use this one in writepages, etc. */
+extern void mapping_flush_cmtime(struct address_space * mapping);
+
+/* Use this one outside writeback. */
+extern void mapping_flush_cmtime_nowb(struct address_space * mapping);
+
 /*
  * This is non-atomic.  Only to be used before the mapping is activated.
  * Probably needs a barrier...
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 4e198ca..efe4970 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -174,6 +174,7 @@ typedef int (*writepage_t)(struct page *page, struct 
writeback_control *wbc,
 
 int generic_writepages(struct address_space *mapping,
   struct writeback_control *wbc);
+void generic_update_cmtime_deferred(struct address_space *mapping);
 void tag_pages_for_writeback(struct address_space *mapping,
 pgoff_t start, pgoff_t end);
 int write_cache_pages(struct address_space *mapping,
diff --git a/mm/migrate.c b/mm/migrate.c
index 6f0c244..e4124e2 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -627,6 +627,8 @@ static int writeout(struct address_space *mapping, struct 
page *page)
/* unlocked. Relock */
lock_page(page);
 
+   mapping_flush_cmtime(mapping);
+
return (rc < 0) ? -EIO : -EAGAIN;
 }
 
diff --git a/mm/mmap.c b/mm/mmap.c
index 1edbaa3..189eb7a 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1,3 +1,4 @@
+
 /*
  * mm/mmap.c
  *
@@ -249,8 +250,11 @@ static struct vm_area_struct *remove_vma(struct 
vm_area_struct *vma)
might_sleep();
if (vma->vm_ops && vma->vm_ops->close)
vma->vm_ops->close(vma);
-   if (vma->vm_file)
+   if (vma->vm_file) {
+   if ((vma->vm_flags & VM_SHARED) && vma->vm_file->f_mapping)
+   mapping_flush_cmtime_nowb(vma->vm_file->f_mapping);
fput(vma->vm_file);
+   }
mpol_put(vma_policy(vma));
kmem_cache_free(vm_area_cachep, vma);
return next;
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 3f0c895..4ec8c02 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1912,12 +1912,30 @@ int generic_writepages(struct address_space *mapping,
 
blk_start_plug(&plug);
ret = write_cache_pages(mapping, wbc, __writepage, mapping);
+   mapping_flush_cmtime(mapping);
blk_finish_plug(&plug);
return ret;
 }
-
 EXPORT_SYMBOL(generic_writepage

[PATCH v4 4/7] mm: Scan for dirty ptes and update cmtime on MS_ASYNC

2013-08-22 Thread Andy Lutomirski
This is probably unimportant but improves POSIX compliance.

Signed-off-by: Andy Lutomirski 
---
 mm/msync.c | 84 ++
 1 file changed, 73 insertions(+), 11 deletions(-)

diff --git a/mm/msync.c b/mm/msync.c
index 632df45..a2ee43c 100644
--- a/mm/msync.c
+++ b/mm/msync.c
@@ -13,13 +13,16 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 /*
  * MS_SYNC syncs the entire file - including mappings.
  *
  * MS_ASYNC does not start I/O (it used to, up to 2.5.67).
  * Nor does it marks the relevant pages dirty (it used to up to 2.6.17).
- * Now it doesn't do anything, since dirty pages are properly tracked.
+ * Now all it does is ensure that file timestamps get updated, since POSIX
+ * requires it.  We track dirty pages correct without MS_ASYNC.
  *
  * The application may now run fsync() to
  * write out the dirty pages and wait on the writeout and check the result.
@@ -28,6 +31,54 @@
  * So by _not_ starting I/O in MS_ASYNC we provide complete flexibility to
  * applications.
  */
+
+static int msync_async_range(struct vm_area_struct *vma,
+ unsigned long *start, unsigned long end)
+{
+   struct mm_struct *mm;
+   int iters = 0;
+
+   while (*start < end && *start < vma->vm_end && iters < 128) {
+   unsigned int page_mask, page_increm;
+
+   /*
+* Require that the pte is writable (because otherwise
+* it can't be dirty, so there's nothing to clean).
+*
+* In theory we could check the pte dirty bit, but this is
+* awkward and barely worth it.
+*/
+   struct page *page = follow_page_mask(vma, *start,
+FOLL_GET | FOLL_WRITE,
+&page_mask);
+
+   if (page && !IS_ERR(page)) {
+   if (lock_page_killable(page) == 0) {
+   page_mkclean(page);
+   unlock_page(page);
+   }
+   put_page(page);
+   }
+
+   if (IS_ERR(page))
+   return PTR_ERR(page);
+
+   page_increm = 1 + (~(*start >> PAGE_SHIFT) & page_mask);
+   *start += page_increm * PAGE_SIZE;
+   cond_resched();
+   iters++;
+   }
+
+   /* XXX: try to do this only once? */
+   mapping_flush_cmtime_nowb(vma->vm_file->f_mapping);
+
+   /* Give mmap_sem writers a chance. */
+   mm = current->mm;
+   up_read(&mm->mmap_sem);
+   down_read(&mm->mmap_sem);
+   return 0;
+}
+
 SYSCALL_DEFINE3(msync, unsigned long, start, size_t, len, int, flags)
 {
unsigned long end;
@@ -77,18 +128,29 @@ SYSCALL_DEFINE3(msync, unsigned long, start, size_t, len, 
int, flags)
goto out_unlock;
}
file = vma->vm_file;
-   start = vma->vm_end;
-   if ((flags & MS_SYNC) && file &&
-   (vma->vm_flags & VM_SHARED)) {
-   get_file(file);
-   up_read(&mm->mmap_sem);
-   error = vfs_fsync(file, 0);
-   fput(file);
-   if (error || start >= end)
-   goto out;
-   down_read(&mm->mmap_sem);
+   if (file && vma->vm_flags & VM_SHARED) {
+   if (flags & MS_SYNC) {
+   start = vma->vm_end;
+   get_file(file);
+   up_read(&mm->mmap_sem);
+   error = vfs_fsync(file, 0);
+   fput(file);
+   if (error || start >= end)
+   goto out;
+   down_read(&mm->mmap_sem);
+   } else if ((vma->vm_flags & VM_WRITE) &&
+  file->f_mapping) {
+   error = msync_async_range(vma, &start, end);
+   if (error || start >= end)
+   goto out_unlock;
+   } else {
+   start = vma->vm_end;
+   if (start >= end)
+   goto out_unlock;
+   }
vma = find_vma(mm, start);
} else {
+   start = vma->vm_end;
if (start >= end) {
error = 0;
goto out_unlock;
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.ke

[PATCH v4 5/7] ext4: Defer mmap cmtime updates

2013-08-22 Thread Andy Lutomirski
Signed-off-by: Andy Lutomirski 
---
 fs/ext4/inode.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index dd32a2e..2cb2961 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2382,8 +2382,11 @@ static int ext4_writepages(struct address_space *mapping,
 * a transaction for special inodes like journal inode on last iput()
 * because that could violate lock ordering on umount
 */
-   if (!mapping->nrpages || !mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
+   if (!mapping->nrpages ||
+   !mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)) {
+   mapping_flush_cmtime(mapping);
return 0;
+   }
 
if (ext4_should_journal_data(inode)) {
struct blk_plug plug;
@@ -2391,6 +2394,7 @@ static int ext4_writepages(struct address_space *mapping,
 
blk_start_plug(&plug);
ret = write_cache_pages(mapping, wbc, __writepage, mapping);
+   mapping_flush_cmtime(mapping);
blk_finish_plug(&plug);
return ret;
}
@@ -2525,6 +2529,7 @@ retry:
if (ret)
break;
}
+   mapping_flush_cmtime(mapping);
blk_finish_plug(&plug);
if (!ret && !cycled) {
cycled = 1;
@@ -3238,6 +3243,7 @@ static const struct address_space_operations ext4_aops = {
.writepages = ext4_writepages,
.write_begin= ext4_write_begin,
.write_end  = ext4_write_end,
+   .update_cmtime_deferred = generic_update_cmtime_deferred,
.bmap   = ext4_bmap,
.invalidatepage = ext4_invalidatepage,
.releasepage= ext4_releasepage,
@@ -3254,6 +3260,7 @@ static const struct address_space_operations 
ext4_journalled_aops = {
.writepages = ext4_writepages,
.write_begin= ext4_write_begin,
.write_end  = ext4_journalled_write_end,
+   .update_cmtime_deferred = generic_update_cmtime_deferred,
.set_page_dirty = ext4_journalled_set_page_dirty,
.bmap   = ext4_bmap,
.invalidatepage = ext4_journalled_invalidatepage,
@@ -3270,6 +3277,7 @@ static const struct address_space_operations ext4_da_aops 
= {
.writepages = ext4_writepages,
.write_begin= ext4_da_write_begin,
.write_end  = ext4_da_write_end,
+   .update_cmtime_deferred = generic_update_cmtime_deferred,
.bmap   = ext4_bmap,
.invalidatepage = ext4_da_invalidatepage,
.releasepage= ext4_releasepage,
@@ -5025,7 +5033,6 @@ int ext4_page_mkwrite(struct vm_area_struct *vma, struct 
vm_fault *vmf)
int retries = 0;
 
sb_start_pagefault(inode->i_sb);
-   file_update_time(vma->vm_file);
/* Delalloc case is easy... */
if (test_opt(inode->i_sb, DELALLOC) &&
!ext4_should_journal_data(inode) &&
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v4 7/7] xfs: Defer mmap cmtime updates

2013-08-22 Thread Andy Lutomirski
This involves a change to block_page_mkwrite.  xfs is the only user.

Signed-off-by: Andy Lutomirski 
---
 fs/buffer.c   | 7 ---
 fs/xfs/xfs_aops.c | 1 +
 2 files changed, 1 insertion(+), 7 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 4d74335..408677c 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2431,13 +2431,6 @@ int block_page_mkwrite(struct vm_area_struct *vma, 
struct vm_fault *vmf,
struct super_block *sb = file_inode(vma->vm_file)->i_sb;
 
sb_start_pagefault(sb);
-
-   /*
-* Update file times before taking page lock. We may end up failing the
-* fault so this update may be superfluous but who really cares...
-*/
-   file_update_time(vma->vm_file);
-
ret = __block_page_mkwrite(vma, vmf, get_block);
sb_end_pagefault(sb);
return block_page_mkwrite_return(ret);
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 596ec71..aa8fbcf 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -1668,6 +1668,7 @@ const struct address_space_operations 
xfs_address_space_operations = {
.readpages  = xfs_vm_readpages,
.writepage  = xfs_vm_writepage,
.writepages = xfs_vm_writepages,
+   .update_cmtime_deferred = generic_update_cmtime_deferred,
.releasepage= xfs_vm_releasepage,
.invalidatepage = xfs_vm_invalidatepage,
.write_begin= xfs_vm_write_begin,
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v4 6/7] btrfs: Defer mmap cmtime updates

2013-08-22 Thread Andy Lutomirski
Signed-off-by: Andy Lutomirski 
---
 fs/btrfs/extent_io.c |  1 +
 fs/btrfs/inode.c | 32 
 2 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index fe443fe..dc2f851 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3756,6 +3756,7 @@ int extent_writepages(struct extent_io_tree *tree,
   __extent_writepage, &epd,
   flush_write_bio);
flush_epd_write_bio(&epd);
+   mapping_flush_cmtime(mapping);
return ret;
 }
 
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 021694c..fc51380 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7499,10 +7499,8 @@ int btrfs_page_mkwrite(struct vm_area_struct *vma, 
struct vm_fault *vmf)
 
sb_start_pagefault(inode->i_sb);
ret  = btrfs_delalloc_reserve_space(inode, PAGE_CACHE_SIZE);
-   if (!ret) {
-   ret = file_update_time(vma->vm_file);
+   if (!ret)
reserved = 1;
-   }
if (ret) {
if (ret == -ENOMEM)
ret = VM_FAULT_OOM;
@@ -8711,22 +8709,24 @@ static struct extent_io_ops btrfs_extent_io_ops = {
  * For now we're avoiding this by dropping bmap.
  */
 static const struct address_space_operations btrfs_aops = {
-   .readpage   = btrfs_readpage,
-   .writepage  = btrfs_writepage,
-   .writepages = btrfs_writepages,
-   .readpages  = btrfs_readpages,
-   .direct_IO  = btrfs_direct_IO,
-   .invalidatepage = btrfs_invalidatepage,
-   .releasepage= btrfs_releasepage,
-   .set_page_dirty = btrfs_set_page_dirty,
-   .error_remove_page = generic_error_remove_page,
+   .readpage   = btrfs_readpage,
+   .writepage  = btrfs_writepage,
+   .writepages = btrfs_writepages,
+   .update_cmtime_deferred = generic_update_cmtime_deferred,
+   .readpages  = btrfs_readpages,
+   .direct_IO  = btrfs_direct_IO,
+   .invalidatepage = btrfs_invalidatepage,
+   .releasepage= btrfs_releasepage,
+   .set_page_dirty = btrfs_set_page_dirty,
+   .error_remove_page  = generic_error_remove_page,
 };
 
 static const struct address_space_operations btrfs_symlink_aops = {
-   .readpage   = btrfs_readpage,
-   .writepage  = btrfs_writepage,
-   .invalidatepage = btrfs_invalidatepage,
-   .releasepage= btrfs_releasepage,
+   .readpage   = btrfs_readpage,
+   .writepage  = btrfs_writepage,
+   .update_cmtime_deferred = generic_update_cmtime_deferred,
+   .invalidatepage = btrfs_invalidatepage,
+   .releasepage= btrfs_releasepage,
 };
 
 static const struct inode_operations btrfs_file_inode_operations = {
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] PCI: avoid NULL deref in alloc_pcie_link_state

2013-08-22 Thread Bjorn Helgaas
On Thu, Aug 08, 2013 at 03:57:07PM +0200, Radim Krčmář wrote:
> PCIe switch can be connected directly to the PCIe root complex in QEMU;
> ASPM does not expect this topology and dereferences NULL pointer when
> initializing.
> 
> Downstream port can be also connected to the root complex without
> upstream one, so code checks for both, otherwise they dereference NULL
> on line drivers/pci/pcie/aspm.c:530 (alloc_pcie_link_state+13):
>   parent = pdev->bus->parent->self->link_state;
> "pdev->bus->parent->self == NULL" if upstream port is connected directly
> to the root bus and "pdev->bus->parent == NULL" in the second case.
> 
> v1 -> v2: (https://lkml.org/lkml/2013/6/19/753)
>  - Initialization is aborted in pcie_aspm_init_link_state, where other
>special cases are being handled
>  - pci_is_root_bus is used
>  - Warning is printed
> 
> Reproducer for "downstream -- root" and "downstream -- upstream -- root"
> (used qemu-kvm 1.5, q35 machine type might be missing on older ones)
> 
>   for parent in pcie.0 upstream; do
>qemu-kvm -m 128 -M q35 -nographic -no-reboot \
>  -device x3130-upstream,bus=pcie.0,id=upstream \
>  -device xio3130-downstream,bus=$parent,id=downstream,chassis=1 \
>  -device virtio-blk-pci,bus=downstream,id=virtio-zero,drive=zero \
>  -drive  file=/dev/zero,id=zero,format=raw \
>  -kernel bzImage -append "console=ttyS0 panic=3" # pcie_aspm=off
>   done
> 
> ASPM in QEMU works if we connect upstream through root port
>   -device ioh3420,bus=pcie.0,id=root.0 \
>   -device x3130-upstream,bus=root.0,id=upstream
> 
> Signed-off-by: Radim Krčmář 
> ---
>  drivers/pci/pcie/aspm.c | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index 403a443..209cd7f 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -570,6 +570,15 @@ void pcie_aspm_init_link_state(struct pci_dev *pdev)
>   pdev->bus->self)
>   return;
>  
> + /* We require at least two ports between downstream and root bus */
> + if (pci_pcie_type(pdev) == PCI_EXP_TYPE_DOWNSTREAM &&
> + (pci_is_root_bus(pdev->bus) ||
> +  pci_is_root_bus(pdev->bus->parent))) {
> + dev_warn(&pdev->dev, "ASPM disabled"
> +  " (connected directly to root bus)\n");
> + return;
> + }

I don't really want to detect invalid topologies piecemeal -- we will
likely find other areas (MPS, AER, link speed management, etc.) that
have similar dependencies.  I'd rather do it generically, maybe with
something like the following patch.

I tried this with the following qemu invocation:

$ /usr/local/bin/qemu-system-x86_64 -M q35 -enable-kvm -m 512   -device 
x3130-upstream,bus=pcie.0,id=upstream   -device 
xio3130-downstream,bus=upstream,id=downstream,chassis=1   -drive 
file=ubuntu.img,if=none,id=mydisk   -device ide-drive,drive=mydisk,bus=ide.0   
-drive file=scratch.img,id=disk1 -device 
virtio-blk-pci,bus=downstream,id=virtio-disk1,drive=disk1 -nographic -kernel 
~/linux/arch/x86/boot/bzImage   -append "console=ttyS0,115200n8 root=/dev/sda1 
ignore_loglevel"

With unmodified v3.11-rc2, I see the NULL pointer dereference, but with
the patch below applied, we just ignore the 00:03.0 device and the kernel
boots fine.

Bjorn

---
 arch/powerpc/kernel/pci_of_scan.c |7 ++-
 arch/sparc/kernel/pci.c   |7 ++-
 drivers/pci/probe.c   |   37 +
 include/linux/pci.h   |2 +-
 4 files changed, 46 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/pci_of_scan.c 
b/arch/powerpc/kernel/pci_of_scan.c
index 6b0ba58..f6ef4dd 100644
--- a/arch/powerpc/kernel/pci_of_scan.c
+++ b/arch/powerpc/kernel/pci_of_scan.c
@@ -143,7 +143,6 @@ struct pci_dev *of_create_pci_dev(struct device_node *node,
dev->devfn = devfn;
dev->multifunction = 0; /* maybe a lie? */
dev->needs_freset = 0;  /* pcie fundamental reset required */
-   set_pcie_port_type(dev);
 
list_for_each_entry(slot, &dev->bus->slots, list)
if (PCI_SLOT(dev->devfn) == slot->number)
@@ -164,6 +163,12 @@ struct pci_dev *of_create_pci_dev(struct device_node *node,
pr_debug("class: 0x%x\n", dev->class);
pr_debug("revision: 0x%x\n", dev->revision);
 
+   if (set_pcie_port_type(dev)) {
+   pci_bus_put(dev->bus);
+   kfree(dev);
+   return NULL;
+   }
+
dev->current_state = PCI_UNKNOWN;   /* unknown power state */
dev->error_state = pci_channel_io_normal;
dev->dma_mask = 0x;
diff --git a/arch/sparc/kernel/pci.c b/arch/sparc/kernel/pci.c
index bc4d3f5..5600849 100644
--- a/arch/sparc/kernel/pci.c
+++ b/arch/sparc/kernel/pci.c
@@ -287,7 +287,6 @@ static struct pci_dev *of_create_pci_dev(struct 
pci_pbm_info *pbm,
dev->dev.of_node = of_node_get(

Re: [PATCH 10/13] tracing/uprobes: Fetch args before reserving a ring buffer

2013-08-22 Thread zhangwei(Jovi)
On 2013/8/23 0:42, Steven Rostedt wrote:
> On Fri, 09 Aug 2013 18:56:54 +0900
> Masami Hiramatsu  wrote:
> 
>> (2013/08/09 17:45), Namhyung Kim wrote:
>>> From: Namhyung Kim 
>>>
>>> Fetching from user space should be done in a non-atomic context.  So
>>> use a temporary buffer and copy its content to the ring buffer
>>> atomically.
>>>
>>> While at it, use __get_data_size() and store_trace_args() to reduce
>>> code duplication.
>>
>> I just concern using kmalloc() in the event handler. For fetching user
>> memory which can be swapped out, that is true. But most of the cases,
>> we can presume that it exists on the physical memory.
>>
> 
> 
> What about creating a per cpu buffer when uprobes are registered, and
> delete them when they are finished? Basically what trace_printk() does
> if it detects that there are users of trace_printk() in the kernel.
> Note, it does not deallocate them when finished, as it is never
> finished until reboot ;-)
> 
> -- Steve
>
I also thought out this approach, but the issue is we cannot fetch user
memory into per-cpu buffer, because use per-cpu buffer should under
preempt disabled, and fetching user memory could sleep.

jovi.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [RFC PATCH v2 04/11] pstore: Add compression support to pstore

2013-08-22 Thread Seiji Aguchi


> -Original Message-
> From: Luck, Tony [mailto:tony.l...@intel.com]
> Sent: Thursday, August 22, 2013 7:17 PM
> To: Seiji Aguchi; Aruna Balakrishnaiah; linuxppc-...@ozlabs.org; 
> linux-kernel@vger.kernel.org; keesc...@chromium.org
> Cc: jkeni...@linux.vnet.ibm.com; ana...@in.ibm.com; b...@kernel.crashing.org; 
> cbouatmai...@gmail.com;
> mah...@linux.vnet.ibm.com; ccr...@android.com
> Subject: RE: [RFC PATCH v2 04/11] pstore: Add compression support to pstore
> 
> <1>[  383.209057] RIP  [] sysrq_handle_crash+0x16/0x20
> <4>[  383.209057]  RSP 
> <4>[  383.209057] CR2: 
> <4>[  383.209057] ---[ end trace 04a1cddad37b4b33 ]---
> <3>[  383.209057] pstore: compression failed for Part 2 returned -5
> <3>[  383.209057] pstore: Capture uncompressed oops/panic report of Part 2
> <3>[  383.209057] pstore: compression failed for Part 5 returned -5
> 
> Interesting.  With ERST backend I didn't see these messages.  Traces in
> pstore recovered files go as far as the line before the "---[ end trace 
> 04a1cddad37b4b33 ]---"
> 
> Why the difference depending on which back end is in use?

I think the difference doesn't depend on the back end.
Rather it depends on the environment.

I tested on a kvm guest with OVMF.

Seiji


> 
> But I agree that we shouldn't have these messages.  They use up space
> in the persistent store that could be better used saving some more lines
> from earlier in the console log.
> 
> -Tony


Re: [PATCH 1/2] tick: broadcast: Deny per-cpu clockevents from being broadcast sources

2013-08-22 Thread Sören Brinkmann
On Thu, Aug 22, 2013 at 10:06:40AM -0700, Stephen Boyd wrote:
> On most ARM systems the per-cpu clockevents are truly per-cpu in
> the sense that they can't be controlled on any other CPU besides
> the CPU that they interrupt. If one of these clockevents were to
> become a broadcast source we will run into a lot of trouble
> because the broadcast source is enabled on the first CPU to go
> into deep idle (if that CPU suffers from FEAT_C3_STOP) and that
> could be a different CPU than what the clockevent is interrupting
> (or even worse the CPU that the clockevent interrupts could be
> offline).
> 
> Theoretically it's possible to support per-cpu clockevents as the
> broadcast source but so far we haven't needed this and supporting
> it is rather complicated. Let's just deny the possibility for now
> until this becomes a reality (let's hope it never does!).
> 
> Reported-by: Sören Brinkmann 
> Signed-off-by: Stephen Boyd 
Tested-by: Sören Brinkmann 

This fixes the issue I reported when enabling the global timer on Zynq.
The global timer is prevented from becoming the broadcast device and my
system boots.

Thanks,
Sören


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v6 03/10] tracing: add 'traceon' and 'traceoff' event trigger commands

2013-08-22 Thread Tom Zanussi
Add 'traceon' and 'traceoff' ftrace_func_command commands.  traceon
and traceoff event triggers are added by the user via these commands
in a similar way and using practically the same syntax as the
analagous 'traceon' and 'traceoff' ftrace function commands, but
instead of writing to the set_ftrace_filter file, the traceon and
traceoff triggers are written to the per-event 'trigger' files:

echo 'traceon' > .../tracing/events/somesys/someevent/trigger
echo 'traceoff' > .../tracing/events/somesys/someevent/trigger

The above command will turn tracing on or off whenever someevent is
hit.

This also adds a 'count' version that limits the number of times the
command will be invoked:

echo 'traceon:N' > .../tracing/events/somesys/someevent/trigger
echo 'traceoff:N' > .../tracing/events/somesys/someevent/trigger

Where N is the number of times the command will be invoked.

The above commands will will turn tracing on or off whenever someevent
is hit, but only N times.

Signed-off-by: Tom Zanussi 
---
 include/linux/ftrace_event.h|   1 +
 kernel/trace/trace_events_trigger.c | 182 
 2 files changed, 183 insertions(+)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 0765d3d..4c8f7c1 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -318,6 +318,7 @@ struct ftrace_event_file {
 
 enum trigger_mode {
TM_NONE = (0),
+   TM_TRACE_ONOFF  = (1 << 0),
 };
 
 extern void destroy_preds(struct ftrace_event_call *call);
diff --git a/kernel/trace/trace_events_trigger.c 
b/kernel/trace/trace_events_trigger.c
index 7a52109..d5a10ed 100644
--- a/kernel/trace/trace_events_trigger.c
+++ b/kernel/trace/trace_events_trigger.c
@@ -564,7 +564,189 @@ event_trigger_callback(struct event_command *cmd_ops,
goto out;
 }
 
+static void
+traceon_trigger(void **_data)
+{
+   struct event_trigger_data **p = (struct event_trigger_data **)_data;
+   struct event_trigger_data *data = *p;
+
+   if (!data)
+   return;
+
+   if (tracing_is_on())
+   return;
+
+   tracing_on();
+}
+
+static void
+traceon_count_trigger(void **_data)
+{
+   struct event_trigger_data **p = (struct event_trigger_data **)_data;
+   struct event_trigger_data *data = *p;
+
+   if (!data)
+   return;
+
+   if (!data->count)
+   return;
+
+   if (data->count != -1)
+   (data->count)--;
+
+   traceon_trigger(_data);
+}
+
+static void
+traceoff_trigger(void **_data)
+{
+   struct event_trigger_data **p = (struct event_trigger_data **)_data;
+   struct event_trigger_data *data = *p;
+
+   if (!data)
+   return;
+
+   if (!tracing_is_on())
+   return;
+
+   tracing_off();
+}
+
+static void
+traceoff_count_trigger(void **_data)
+{
+   struct event_trigger_data **p = (struct event_trigger_data **)_data;
+   struct event_trigger_data *data = *p;
+
+   if (!data)
+   return;
+
+   if (!data->count)
+   return;
+
+   if (data->count != -1)
+   (data->count)--;
+
+   traceoff_trigger(_data);
+}
+
+static int
+traceon_trigger_print(struct seq_file *m, struct event_trigger_ops *ops,
+ void *_data)
+{
+   struct event_trigger_data *data = _data;
+
+   return event_trigger_print("traceon", m, (void *)data->count,
+  data->filter_str);
+}
+
+static int
+traceoff_trigger_print(struct seq_file *m, struct event_trigger_ops *ops,
+  void *_data)
+{
+   struct event_trigger_data *data = _data;
+
+   return event_trigger_print("traceoff", m, (void *)data->count,
+  data->filter_str);
+}
+
+static struct event_trigger_ops traceon_trigger_ops = {
+   .func   = traceon_trigger,
+   .print  = traceon_trigger_print,
+   .init   = event_trigger_init,
+   .free   = event_trigger_free,
+};
+
+static struct event_trigger_ops traceon_count_trigger_ops = {
+   .func   = traceon_count_trigger,
+   .print  = traceon_trigger_print,
+   .init   = event_trigger_init,
+   .free   = event_trigger_free,
+};
+
+static struct event_trigger_ops traceoff_trigger_ops = {
+   .func   = traceoff_trigger,
+   .print  = traceoff_trigger_print,
+   .init   = event_trigger_init,
+   .free   = event_trigger_free,
+};
+
+static struct event_trigger_ops traceoff_count_trigger_ops = {
+   .func   = traceoff_count_trigger,
+   .print  = traceoff_trigger_print,
+   .init   = event_trigger_init,
+   .free   = event_trigger_free,
+};
+
+

  1   2   3   4   5   6   7   >