Re: [PATCH] RT: Fix special-case exception for preempting the local CPU

2007-10-11 Thread Ankita Garg
On Wed, Oct 10, 2007 at 09:22:48AM -0700, mike kravetz wrote:
> On Wed, Oct 10, 2007 at 10:49:35AM -0400, Gregory Haskins wrote:
> > diff --git a/kernel/sched.c b/kernel/sched.c
> > index 3e75c62..b7f7a96 100644
> > --- a/kernel/sched.c
> > +++ b/kernel/sched.c
> > @@ -1869,7 +1869,8 @@ out_activate:
> >  * extra locking in this particular case, because
> >  * we are on the current CPU.)
> >  */
> > -   if (TASK_PREEMPTS_CURR(p, this_rq))
> > +   if (TASK_PREEMPTS_CURR(p, this_rq)
> > +   && cpu_isset(this_cpu, p->cpus_allowed))
> > set_tsk_need_resched(this_rq->curr);
> > else
> > /*
> 
> I wonder if it might be better to explicitly take the rq lock and try to
> put the task on this_rq in this situation?  Rather than waiting for
> schedule to pull it from a remote rq as part of balance_rt_tasks.
> 
> A question that has passed through my head a few times is:  When waking
> a RT task is it better to:
> 1) run on current CPU if possible
> 2) run on CPU task previously ran on
> 
> I think #1 may result in lower latency.  But, if the task has lots of
> cache warmth the lower wakeup latency may be negated by running on a
> 'remote' cpu.

Could we use the task_hot() routine to find out whether the task is cache
hot? If it isn't, we could run it on the current CPU if possible; otherwise,
if possible, on the CPU it last ran on?
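Combining the two ideas in this thread (the cpus_allowed check and a task_hot() test), the wake-up placement heuristic could be sketched as a small userspace model. All names here are hypothetical: `cache_hot` stands in for task_hot(), a plain bitmask stands in for p->cpus_allowed, and pick_wake_cpu() is not a kernel function.

```c
/* Simplified userspace model of the proposed wake-up placement:
 * prefer the waking CPU when the task is not cache hot (lowest
 * wakeup latency), else prefer the CPU it last ran on (cache
 * warmth), always respecting the task's CPU affinity mask. */
struct task {
    unsigned long cpus_allowed; /* bit n set => task may run on CPU n */
    int cache_hot;              /* stand-in for the task_hot() check  */
    int last_cpu;               /* CPU the task last ran on           */
};

static int cpu_allowed(const struct task *p, int cpu)
{
    return (p->cpus_allowed >> cpu) & 1UL;
}

static int pick_wake_cpu(const struct task *p, int this_cpu)
{
    if (!p->cache_hot && cpu_allowed(p, this_cpu))
        return this_cpu;                 /* cold task: wake locally   */
    if (cpu_allowed(p, p->last_cpu))
        return p->last_cpu;              /* warm task: keep its cache */
    if (cpu_allowed(p, this_cpu))
        return this_cpu;                 /* fallback: local wake      */
    return -1;                           /* must be pushed elsewhere  */
}
```

With an unrestricted affinity mask, a cold task wakes on the local CPU while a cache-hot one goes back to its last CPU; a task pinned away from both would have to be pushed to some other runqueue.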


-- 
Regards,
Ankita Garg ([EMAIL PROTECTED])
Linux Technology Center
IBM India Systems & Technology Labs, 
Bangalore, India   
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: realtime preemption performance difference

2007-09-26 Thread Ankita Garg
Hi Jaswinder,

On Wed, Sep 26, 2007 at 12:41:57PM +0530, Jaswinder Singh wrote:
> hello hufey,
> 
> I am not using montavista kernel, I am using standard linux kernel.
> 
> Realtime is known by worst-case latencies. Ingo and team are claiming
> realtime support, so there should be some samples and performance
> numbers.
> 

You can obtain several -rt testcases from the RT wiki:
http://rt.wiki.kernel.org/index.php/Main_Page
There we have testcases/benchmarks for measuring scheduling latencies,
priority preemption, signaling latencies, etc., as well as other important
information on the patchset.

Hope this helps!

> Can someone please share their samples and/or numbers.
> 
> Thank you,
> 
> Jaswinder Singh.
> 
> On 9/25/07, hufey <[EMAIL PROTECTED]> wrote:
> > On 9/24/07, Jaswinder Singh <[EMAIL PROTECTED]> wrote:
> > > Hi all,
> > >
> > > I want to check performance difference by using realtime preemption patch 
> > > :
> > >
> > > http://www.kernel.org/pub/linux/kernel/projects/rt/
> > >
> > > Please let me know from where I can download samples to test realtime
> > > preemption performance difference.
> > >
> >
> > As far as I know, some embedded linux vendors such as MontaVista
> > provide kernel patch to export some interfaces to measure realtime
> > performance.
> >
> > > Can someone please share performance numbers for your hardware:-
> > >
> > > 1. Interrupt latency
> > > 2. Task switching time
> > > 3. hard-realtime scheduling latency
> > >
> > > Thank you,
> > >
> > > Jaswinder Singh.
> > > -
> > > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > > the body of a message to [EMAIL PROTECTED]
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > Please read the FAQ at  http://www.tux.org/lkml/
> > >
> >
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

-- 
Regards,
Ankita Garg ([EMAIL PROTECTED])
Linux Technology Center
IBM India Systems & Technology Labs, 
Bangalore, India   
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: realtime preemption performance difference

2007-09-26 Thread Ankita Garg
Hi Jaswinder,

On Mon, Sep 24, 2007 at 05:18:01PM +0530, Jaswinder Singh wrote:
> Hi all,
> 
> I want to check performance difference by using realtime preemption patch :
> 
> http://www.kernel.org/pub/linux/kernel/projects/rt/
> 
> Please let me know from where I can download samples to test realtime
> preemption performance difference.
> 

You can obtain several -rt testcases from the RT wiki:
http://rt.wiki.kernel.org/index.php/Main_Page
There we have testcases/benchmarks for measuring scheduling latencies,
priority preemption, signaling latencies, etc., as well as other important
information on the patchset.

Hope this helps!

> Can someone please share performance numbers for your hardware:-
> 
> 1. Interrupt latency
> 2. Task switching time
> 3. hard-realtime scheduling latency
> 
> Thank you,
> 
> Jaswinder Singh.
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

-- 
Regards,
Ankita Garg ([EMAIL PROTECTED])
Linux Technology Center
IBM India Systems & Technology Labs, 
Bangalore, India   
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH -rt 6/9] spinlock/rt_lock random cleanups

2007-07-29 Thread Ankita Garg
On Sun, Jul 29, 2007 at 07:45:40PM -0700, Daniel Walker wrote:
> Signed-off-by: Daniel Walker <[EMAIL PROTECTED]>
> 
> ---
>  include/linux/rt_lock.h  |6 --
>  include/linux/spinlock.h |5 +++--
>  2 files changed, 7 insertions(+), 4 deletions(-)
> 
> Index: linux-2.6.22/include/linux/rt_lock.h
> ===
> --- linux-2.6.22.orig/include/linux/rt_lock.h
> +++ linux-2.6.22/include/linux/rt_lock.h
> @@ -128,12 +128,14 @@ struct semaphore name = \
>   */
>  #define DECLARE_MUTEX_LOCKED COMPAT_DECLARE_MUTEX_LOCKED
> 
> -extern void fastcall __sema_init(struct semaphore *sem, int val, char *name, 
> char *file, int line);
> +extern void fastcall
> +__sema_init(struct semaphore *sem, int val, char *name, char *file, int 
> line);
> 
>  #define rt_sema_init(sem, val) \
>   __sema_init(sem, val, #sem, __FILE__, __LINE__)
> 
> -extern void fastcall __init_MUTEX(struct semaphore *sem, char *name, char 
> *file, int line);
> +extern void fastcall
> +__init_MUTEX(struct semaphore *sem, char *name, char *file, int line);
>  #define rt_init_MUTEX(sem) \
>   __init_MUTEX(sem, #sem, __FILE__, __LINE__)
> 
> Index: linux-2.6.22/include/linux/spinlock.h
> ===
> --- linux-2.6.22.orig/include/linux/spinlock.h
> +++ linux-2.6.22/include/linux/spinlock.h
> @@ -126,7 +126,7 @@ extern int __lockfunc generic__raw_read_
> 
>  #ifdef CONFIG_DEBUG_SPINLOCK
>   extern __lockfunc void _raw_spin_lock(raw_spinlock_t *lock);
> -#define _raw_spin_lock_flags(lock, flags) _raw_spin_lock(lock)
> +# define _raw_spin_lock_flags(lock, flags) _raw_spin_lock(lock)

Any reason behind including a space here?

>   extern __lockfunc int _raw_spin_trylock(raw_spinlock_t *lock);
>   extern __lockfunc void _raw_spin_unlock(raw_spinlock_t *lock);
>   extern __lockfunc void _raw_read_lock(raw_rwlock_t *lock);
> @@ -325,7 +325,8 @@ do {  
> \
> 
>  # define _read_trylock(rwl)  rt_read_trylock(rwl)
>  # define _write_trylock(rwl) rt_write_trylock(rwl)
> -#define _write_trylock_irqsave(rwl, flags)  rt_write_trylock_irqsave(rwl, 
> flags)
> +# define _write_trylock_irqsave(rwl, flags) \
> + rt_write_trylock_irqsave(rwl, flags)
> 
>  # define _read_lock(rwl) rt_read_lock(rwl)
>  # define _write_lock(rwl)rt_write_lock(rwl)
> 

-- 
Regards,
Ankita Garg ([EMAIL PROTECTED])
Linux Technology Center
IBM India Systems & Technology Labs, 
Bangalore, India   
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [Question] Hooks for scheduler tracing (CFS)

2007-07-26 Thread Ankita Garg
On Thu, Jul 26, 2007 at 12:22:31PM -0400, Frank Ch. Eigler wrote:
> Hi -
> 
> On Thu, Jul 26, 2007 at 11:02:26AM -0400, Mathieu Desnoyers wrote:
> > [...]
> > > > The problem is also in _stp_print_flush, not *only* in relay code:
> > > > void _stp_print_flush (void)
> > > > ...
> > > > spin_lock(&_stp_print_lock);
> > > > spin_unlock(&_stp_print_lock);
> > > > 
> > > > Those will turn into mutexes with -rt.
> > > 
> > > Indeed,
> 
> (Though actually that bug was fixed some time ago.)
> 
> 
> > > plus systemtap-generated locking code uses rwlocks,
> > > local_irq_save/restore or preempt_disable, in various places.  Could
> > > someone point to a place that spells out what would be more
> > > appropriate way of ensuring atomicity while being compatible with -rt?
> > 
> > AFAIK, for your needs either:
> > [...]
> > - Use per-cpu data with preempt disabling/irq disabling
> 
> As in local_irq_save / preempt_disable?  Yes, already done.
> 
> > - Use the original "real" spin locks/rwlocks (raw_*).
> > [...]
> 
> It was unclear from the OLS paper whether the spin_lock_irq* family of
> functions also had to be moved to the raw forms.

By making the locks of the raw_* type, spin_lock_irq* functions would
then automatically disable hardware interrupts.
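A sketch of what that implies on the -rt side, using the spelling this thread itself mentions (raw_spinlock_t / DEFINE_RAW_SPIN_LOCK); this is an illustrative kernel fragment under those assumptions, not a buildable module:

```c
/* Kernel fragment (2007-era -rt conventions; illustrative only).
 * Because the lock is declared raw, the generic spin_lock_irq* API
 * compiles down to a genuinely spinning, hard-IRQ-disabling lock
 * instead of the sleeping rt_mutex that an ordinary spinlock_t
 * becomes under -rt. */
static DEFINE_RAW_SPIN_LOCK(_stp_print_lock);   /* was: spinlock_t */

static void emit_trace_record(void)             /* hypothetical helper */
{
    unsigned long flags;

    spin_lock_irqsave(&_stp_print_lock, flags); /* hard IRQs really off */
    /* ... copy the record out; keep this short and deterministic ... */
    spin_unlock_irqrestore(&_stp_print_lock, flags);
}
```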

> 
> > You just don't want to sleep in the tracing code. [...]  Since you
> > will likely disable preemption, make sure your tracing code executes
> > in a deterministic time.
> 
> Definitely, that has always been the case.
> 
> > Make sure that the sub-buffer switch code respects that too [...]
> 
> We will review that part of the related code.
> 
> - FChE

-- 
Regards,
Ankita Garg ([EMAIL PROTECTED])
Linux Technology Center
IBM India Systems & Technology Labs, 
Bangalore, India   
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Question] Hooks for scheduler tracing (CFS)

2007-07-26 Thread Ankita Garg
Hi Ingo,

On Thu, Jul 26, 2007 at 01:05:04PM +0200, Ingo Molnar wrote:
> 
> * Ankita Garg <[EMAIL PROTECTED]> wrote:
> 
> > local_irq_save(flags);
> > buf = _stp_chan->buf[smp_processor_id()];
> > if (unlikely(buf->offset + length > _stp_chan->subbuf_size))
> > length = relay_switch_subbuf(buf, length);
> > memcpy(buf->data + buf->offset, data, length);
> > buf->offset += length;
> > local_irq_restore(flags);
> 
> oh, what a fine piece of s^H^H :-/ Who in their right mind calls this 
> from _tracing_ code:
> 
> smp_mb();
> if (waitqueue_active(&buf->read_wait))
> /*
>  * Calling wake_up_interruptible() from here
>  * will deadlock if we happen to be logging
>  * from the scheduler (trying to re-grab
>  * rq->lock), so defer it.
>  */
> __mod_timer(&buf->timer, jiffies + 1);
> 
> and the comment is utter rubbish: __mod_timer() can lock up just as 
> much. Just use an adaptive-polling method to drive the draining of the 
> relay buffer, instead of mucking with timers from within the tracing 
> code. Whoever implemented this has absolutely zero clue i have to say 
> ...
> 
> the smp_mb() is rubbish too.
> 
> could you try the patch below, does it fix the problem?

This patch did not fix my problem. I still get similar traces...

> 
>   Ingo
> 
> ->
> Subject: relay: fix timer madness
> From: Ingo Molnar <[EMAIL PROTECTED]>
> 
> remove timer calls (!!!) from deep within the tracing infrastructure.
> This was totally bogus code that can cause lockups and worse.
> Poll the buffer every 2 jiffies for now.
> 
> Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
> ---
>  kernel/relay.c |   14 +-
>  1 file changed, 5 insertions(+), 9 deletions(-)
> 
> Index: linux-rt-rebase.q/kernel/relay.c
> ===
> --- linux-rt-rebase.q.orig/kernel/relay.c
> +++ linux-rt-rebase.q/kernel/relay.c
> @@ -319,6 +319,10 @@ static void wakeup_readers(unsigned long
>  {
>   struct rchan_buf *buf = (struct rchan_buf *)data;
>   wake_up_interruptible(&buf->read_wait);
> + /*
> +  * Stupid polling for now:
> +  */
> + mod_timer(&buf->timer, jiffies + 1);
>  }
> 
>  /**
> @@ -336,6 +340,7 @@ static void __relay_reset(struct rchan_b
>   init_waitqueue_head(&buf->read_wait);
>   kref_init(&buf->kref);
>   setup_timer(&buf->timer, wakeup_readers, (unsigned long)buf);
> + mod_timer(&buf->timer, jiffies + 1);
>   } else
>   del_timer_sync(&buf->timer);
> 
> @@ -604,15 +609,6 @@ size_t relay_switch_subbuf(struct rchan_
>   buf->subbufs_produced++;
>   buf->dentry->d_inode->i_size += buf->chan->subbuf_size -
>   buf->padding[old_subbuf];
> - smp_mb();
> - if (waitqueue_active(&buf->read_wait))
> - /*
> -  * Calling wake_up_interruptible() from here
> -  * will deadlock if we happen to be logging
> -  * from the scheduler (trying to re-grab
> -  * rq->lock), so defer it.
> -  */
> - __mod_timer(&buf->timer, jiffies + 1);
>   }
> 
>   old = buf->data;
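The polling scheme described above (the producer never wakes readers; a self-rearming timer drains instead) can be sketched with a hypothetical userspace ring buffer; none of these names are the relay API.

```c
#include <stddef.h>

/* Userspace model of timer-driven draining: the producer only writes
 * records and never issues a wakeup, mirroring the patch above where
 * the tracing path no longer touches timers or wait queues; draining
 * happens from a periodic "timer" callback (ring_drain here). */
#define RING_SIZE 64

struct ring {
    int    buf[RING_SIZE];
    size_t head, tail;          /* head: next write, tail: next read */
};

static int ring_write(struct ring *r, int v)
{
    size_t next = (r->head + 1) % RING_SIZE;

    if (next == r->tail)
        return -1;              /* full: drop; tracing must not block */
    r->buf[r->head] = v;
    r->head = next;
    return 0;
}

static size_t ring_drain(struct ring *r)    /* the "timer" callback */
{
    size_t n = 0;

    while (r->tail != r->head) {
        r->tail = (r->tail + 1) % RING_SIZE;
        n++;
    }
    return n;                   /* a real callback would re-arm itself */
}
```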

-- 
Regards,
Ankita Garg ([EMAIL PROTECTED])
Linux Technology Center
IBM India Systems & Technology Labs, 
Bangalore, India   
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Question] Hooks for scheduler tracing (CFS)

2007-07-26 Thread Ankita Garg
On Thu, Jul 26, 2007 at 09:53:53AM +0200, Ingo Molnar wrote:
> 
> * Ankita Garg <[EMAIL PROTECTED]> wrote:
> 
> > > I'd suggest to not put a probe into a preempt-off section - put it 
> > > to the beginning and to the end of schedule() to capture 
> > > context-switches. _stp_print_flush() is in the systemtap-generated 
> > > module, right? Maybe the problem is resolved by changing that 
> > > spinlock to use raw_spinlock_t / DEFINE_RAW_SPIN_LOCK.
> > 
> > Yes, _stp_print_flush is in the systemtap-generated kprobe module. 
> > Placing the probe at the beginning of schedule() also has the same 
> > effect. Will try by changing the spinlock to raw_spinlock_t...
> 
> could you send us that module source ST generates? Perhaps there are 
> preempt_disable() (or local_irq_disable()) calls in it too.

Attaching the generated module as it is huge...

Looks like SystemTap makes use of relayfs for printing the buffer
contents. The _stp_print_flush() routine (line 269 in the attached module) is
implemented in the systemtap library as a call to the following:

static int _stp_relay_write (const void *data, unsigned length)
{
unsigned long flags;
struct rchan_buf *buf;

if (unlikely(length == 0))
return 0;

local_irq_save(flags);
buf = _stp_chan->buf[smp_processor_id()];
if (unlikely(buf->offset + length > _stp_chan->subbuf_size))
length = relay_switch_subbuf(buf, length);
memcpy(buf->data + buf->offset, data, length);
buf->offset += length;
local_irq_restore(flags);

if (unlikely(length == 0))
return -1;

return length;
}

The above does a local_irq_save().

-- 
Regards,
Ankita Garg ([EMAIL PROTECTED])
Linux Technology Center
IBM India Systems & Technology Labs, 
Bangalore, India   
#define TEST_MODE 0

#ifndef MAXNESTING
#define MAXNESTING 10
#endif
#ifndef MAXSTRINGLEN
#define MAXSTRINGLEN 128
#endif
#ifndef MAXACTION
#define MAXACTION 1000
#endif
#ifndef MAXACTION_INTERRUPTIBLE
#define MAXACTION_INTERRUPTIBLE (MAXACTION * 10)
#endif
#ifndef MAXTRYLOCK
#define MAXTRYLOCK MAXACTION
#endif
#ifndef TRYLOCKDELAY
#define TRYLOCKDELAY 100
#endif
#ifndef MAXMAPENTRIES
#define MAXMAPENTRIES 2048
#endif
#ifndef MAXERRORS
#define MAXERRORS 0
#endif
#ifndef MAXSKIPPED
#define MAXSKIPPED 100
#endif
#ifndef MINSTACKSPACE
#define MINSTACKSPACE 1024
#endif
#ifndef STP_OVERLOAD_INTERVAL
#define STP_OVERLOAD_INTERVAL 10LL
#endif
#ifndef STP_OVERLOAD_THRESHOLD
#define STP_OVERLOAD_THRESHOLD 5LL
#endif
#ifndef STP_NO_OVERLOAD
#define STP_OVERLOAD
#endif
#include "runtime.h"
#include "regs.c"
#include "stack.c"
#include "regs-ia64.c"
#include "stat.c"
#include 
#include 
#include 
#include 
#include 
#include 
#include "loc2c-runtime.h" 
#ifndef read_trylock
#define read_trylock(x) ({ read_lock(x); 1; })
#endif
#if defined(CONFIG_MARKERS)
#include 
#endif
typedef char string_t[MAXSTRINGLEN];

#define STAP_SESSION_STARTING 0
#define STAP_SESSION_RUNNING 1
#define STAP_SESSION_ERROR 2
#define STAP_SESSION_STOPPING 3
#define STAP_SESSION_STOPPED 4
atomic_t session_state = ATOMIC_INIT (STAP_SESSION_STARTING);
atomic_t error_count = ATOMIC_INIT (0);
atomic_t skipped_count = ATOMIC_INIT (0);

struct context {
  atomic_t busy;
  const char *probe_point;
  int actionremaining;
  unsigned nesting;
  const char *last_error;
  const char *last_stmt;
  struct pt_regs *regs;
  struct kretprobe_instance *pi;
  va_list mark_va_list;
  #ifdef STP_TIMING
  Stat *statp;
  #endif
  #ifdef STP_OVERLOAD
  cycles_t cycles_base;
  cycles_t cycles_sum;
  #endif
  union {
struct probe_1493_locals {
  string_t argstr;
  union {
struct {
  string_t __tmp0;
  int64_t __tmp1;
  int64_t __tmp2;
  string_t __tmp3;
};
  };
} probe_1493;
struct function_execname_locals {
  string_t __retvalue;
} function_execname;
struct function_pid_locals {
  int64_t __retvalue;
} function_pid;
struct function_tid_locals {
  int64_t __retvalue;
} function_tid;
  } locals [MAXNESTING];
};

void *contexts = NULL; /* alloc_percpu */



static void function_execname (struct context * __restrict__ c);

static void function_pid (struct context * __restrict__ c);

static void function_tid (struct context * __restrict__ c);

void function_execname (struct context* __restrict__ c) {
  struct function_execname_locals *  __restrict__ l =
& c->locals[c->nesting].function_execname;
  (void) l;
  #define CONTEXT c
  #define THIS l
  if (0) goto out;
  l->__retvalue[0] = '\0';
  {
 /* pure */
	strlcpy (THIS->__retvalue, current->comm, MAXSTRINGLEN);

  }
out:
  ;
  #undef CONTEXT
  #undef THIS
}


void function_pid (struct context*

Re: [Question] Hooks for scheduler tracing (CFS)

2007-07-26 Thread Ankita Garg
On Thu, Jul 26, 2007 at 09:35:20AM +0200, Ingo Molnar wrote:
> 
> * Ankita Garg <[EMAIL PROTECTED]> wrote:
> 
> > The probe point did get triggered, and soon after that I had the 
> > following in dmesg, leading to system hang...
> > 
> > BUG: scheduling while atomic: softirq-rcu/3/0x0004/52, CPU#3
> > 
> > Call Trace:
> >  <#DB>  [] __schedule_bug+0x4b/0x4f
> >  [] __sched_text_start+0xcc/0xaaa
> >  [] dump_trace+0x248/0x25d
> >  [] print_traces+0x9/0xb
> >  [] show_trace+0x5c/0x64
> >  [] schedule+0xe4/0x104
> >  [] rt_spin_lock_slowlock+0xfc/0x19e
> >  [] __rt_spin_lock+0x1f/0x21
> >  [] rt_spin_lock+0x9/0xb
> >  []
> > :stap_c1a10b1292b5f87a563f56d89ddfc765_606:_stp_print_flush+0x5f/0xdf
> >  []
> > :stap_c1a10b1292b5f87a563f56d89ddfc765_606:probe_1493+0x1f6/0x257
> >  []
> 
> I'd suggest to not put a probe into a preempt-off section - put it to 
> the beginning and to the end of schedule() to capture context-switches. 
> _stp_print_flush() is in the systemtap-generated module, right? Maybe 
> the problem is resolved by changing that spinlock to use raw_spinlock_t 
> / DEFINE_RAW_SPIN_LOCK.

Yes, _stp_print_flush is in the systemtap-generated kprobe module.
Placing the probe at the beginning of schedule() also has the same
effect. Will try by changing the spinlock to raw_spinlock_t...

> 
> 
>   Ingo

-- 
Regards,
Ankita Garg ([EMAIL PROTECTED])
Linux Technology Center
IBM India Systems & Technology Labs, 
Bangalore, India   
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Question] Hooks for scheduler tracing (CFS)

2007-07-26 Thread Ankita Garg
Hi,

On Mon, Jul 16, 2007 at 12:52:37PM -0700, Arjan van de Ven wrote:
> On Mon, 2007-07-16 at 21:46 +0200, Remy Bohmer wrote:
> > So I was wondering if anybody knows some tool/kernel mechanism which
> > can do this?
> > If not, I will build a kernel extension for it myself (new extension
> > to 'latency_trace' ?)
> 
> systemtap has been able to do such things for me in the past...

I was trying to capture data similar to what Remy mentioned, using Systemtap.
The tapset/systemtap script that I used is:

probe kernel.function("balance_rt_tasks").inline {
printf("%s (pid: %d, tid: %d argstr: %s ) \n", execname(),
pid(), tid(), argstr);
}

The probe point did get triggered, and soon after that I had the
following in dmesg, leading to system hang...

BUG: scheduling while atomic: softirq-rcu/3/0x0004/52, CPU#3

Call Trace:
 <#DB>  [] __schedule_bug+0x4b/0x4f
 [] __sched_text_start+0xcc/0xaaa
 [] dump_trace+0x248/0x25d
 [] print_traces+0x9/0xb
 [] show_trace+0x5c/0x64
 [] schedule+0xe4/0x104
 [] rt_spin_lock_slowlock+0xfc/0x19e
 [] __rt_spin_lock+0x1f/0x21
 [] rt_spin_lock+0x9/0xb
 []
:stap_c1a10b1292b5f87a563f56d89ddfc765_606:_stp_print_flush+0x5f/0xdf
 []
:stap_c1a10b1292b5f87a563f56d89ddfc765_606:probe_1493+0x1f6/0x257
 []
:stap_c1a10b1292b5f87a563f56d89ddfc765_606:enter_kprobe_probe+0x105/0x22a
 [] __sched_text_start+0x1c9/0xaaa
 [] kprobe_handler+0x1b3/0x1f5
 [] kprobe_exceptions_notify+0x3b/0x7f
 [] notifier_call_chain+0x33/0x5b
 [] __raw_notifier_call_chain+0x9/0xb
 [] raw_notifier_call_chain+0xf/0x11
 [] notify_die+0x2e/0x33
 [] do_int3+0x30/0x8d
 [] int3+0x93/0xb0
 [] __sched_text_start+0x1ca/0xaaa
 <>  [] __free_pages+0x18/0x21
 [] free_pages+0x55/0x5a
 [] kmem_freepages+0x112/0x11b
 [] schedule+0xe4/0x104
 [] ksoftirqd+0xbc/0x26f
 [] ksoftirqd+0x0/0x26f
 [] ksoftirqd+0x0/0x26f
 [] kthread+0x49/0x76
 [] child_rip+0xa/0x12
 [] thread_return+0x75/0x1d5
 [] kthread+0x0/0x76
 [] child_rip+0x0/0x12

Looks like printing the data in the tapset resulted in some locking
issues. The above script is just one of the many probe points that I
tried. In all cases, printing data from within the probe point resulted in
the hang (whereas when I do the printing at the time the script is stopped,
everything works just fine!).

Any idea why this could be happening? An -rt issue or systemtap bug??


-- 
Regards,
Ankita Garg ([EMAIL PROTECTED])
Linux Technology Center
IBM India Systems & Technology Labs, 
Bangalore, India   
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Question] Hooks for scheduler tracing (CFS)

2007-07-26 Thread Ankita Garg
On Thu, Jul 26, 2007 at 09:35:20AM +0200, Ingo Molnar wrote:
> 
> * Ankita Garg <[EMAIL PROTECTED]> wrote:
> 
> > The probe point did get triggered, and soon after that I had the 
> > following in dmesg, leading to system hang...
> > 
> > BUG: scheduling while atomic: softirq-rcu/3/0x00000004/52, CPU#3
> > 
> > Call Trace:
> >  <#DB>  [<ffffffff81033555>] __schedule_bug+0x4b/0x4f
> >  [<ffffffff8128b414>] __sched_text_start+0xcc/0xaaa
> >  [<ffffffff8100b574>] dump_trace+0x248/0x25d
> >  [<ffffffff81068334>] print_traces+0x9/0xb
> >  [<ffffffff8100b5e5>] show_trace+0x5c/0x64
> >  [<ffffffff8128c1c2>] schedule+0xe4/0x104
> >  [<ffffffff8128d10c>] rt_spin_lock_slowlock+0xfc/0x19e
> >  [<ffffffff8128d9de>] __rt_spin_lock+0x1f/0x21
> >  [<ffffffff8128d9e9>] rt_spin_lock+0x9/0xb
> >  [<ffffffff88387dcc>]
> > :stap_c1a10b1292b5f87a563f56d89ddfc765_606:_stp_print_flush+0x5f/0xdf
> >  [<ffffffff88389e41>]
> > :stap_c1a10b1292b5f87a563f56d89ddfc765_606:probe_1493+0x1f6/0x257
> >  [<ffffffff8838bdc3>]
> 
> I'd suggest to not put a probe into a preempt-off section - put it to 
> the beginning and to the end of schedule() to capture context-switches. 
> _stp_print_flush() is in the systemtap-generated module, right? Maybe 
> the problem is resolved by changing that spinlock to use raw_spinlock_t 
> / DEFINE_RAW_SPIN_LOCK.

Yes, _stp_print_flush is in the systemtap-generated kprobe module.
Placing the probe at the beginning of schedule() also has the same
effect. Will try by changing the spinlock to raw_spinlock_t...

> 
> 
> 	Ingo

-- 
Regards,
Ankita Garg ([EMAIL PROTECTED])
Linux Technology Center
IBM India Systems & Technology Labs, 
Bangalore, India   
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Question] Hooks for scheduler tracing (CFS)

2007-07-26 Thread Ankita Garg
Hi Ingo,

On Thu, Jul 26, 2007 at 01:05:04PM +0200, Ingo Molnar wrote:
> 
> * Ankita Garg <[EMAIL PROTECTED]> wrote:
> 
> > 	local_irq_save(flags);
> > 	buf = _stp_chan->buf[smp_processor_id()];
> > 	if (unlikely(buf->offset + length > _stp_chan->subbuf_size))
> > 		length = relay_switch_subbuf(buf, length);
> > 	memcpy(buf->data + buf->offset, data, length);
> > 	buf->offset += length;
> > 	local_irq_restore(flags);
> 
> oh, what a fine piece of s^H^H :-/ Who in their right mind calls this 
> from _tracing_ code:
> 
> 	smp_mb();
> 	if (waitqueue_active(&buf->read_wait))
> 		/*
> 		 * Calling wake_up_interruptible() from here
> 		 * will deadlock if we happen to be logging
> 		 * from the scheduler (trying to re-grab
> 		 * rq->lock), so defer it.
> 		 */
> 		__mod_timer(&buf->timer, jiffies + 1);
> 
> and the comment is utter rubbish: __mod_timer() can lock up just as 
> much. Just use an adaptive-polling method to drive the draining of the 
> relay buffer, instead of mucking with timers from within the tracing 
> code. Whoever implemented this has absolutely zero clue i have to say 
> ...
> 
> the smp_mb() is rubbish too.
> 
> could you try the patch below, does it fix the problem?

This patch did not fix my problem. I still get similar traces...

> 
> 	Ingo
> 
> ------------------>
> Subject: relay: fix timer madness
> From: Ingo Molnar <[EMAIL PROTECTED]>
> 
> remove timer calls (!!!) from deep within the tracing infrastructure.
> This was totally bogus code that can cause lockups and worse.
> Poll the buffer every 2 jiffies for now.
> 
> Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
> ---
>  kernel/relay.c |   14 +++++---------
>  1 file changed, 5 insertions(+), 9 deletions(-)
> 
> Index: linux-rt-rebase.q/kernel/relay.c
> ===================================================================
> --- linux-rt-rebase.q.orig/kernel/relay.c
> +++ linux-rt-rebase.q/kernel/relay.c
> @@ -319,6 +319,10 @@ static void wakeup_readers(unsigned long
>  {
>  	struct rchan_buf *buf = (struct rchan_buf *)data;
>  	wake_up_interruptible(&buf->read_wait);
> +	/*
> +	 * Stupid polling for now:
> +	 */
> +	mod_timer(&buf->timer, jiffies + 1);
>  }
>  
>  /**
> @@ -336,6 +340,7 @@ static void __relay_reset(struct rchan_b
>  		init_waitqueue_head(&buf->read_wait);
>  		kref_init(&buf->kref);
>  		setup_timer(&buf->timer, wakeup_readers, (unsigned long)buf);
> +		mod_timer(&buf->timer, jiffies + 1);
>  	} else
>  		del_timer_sync(&buf->timer);
>  
> @@ -604,15 +609,6 @@ size_t relay_switch_subbuf(struct rchan_
>  		buf->subbufs_produced++;
>  		buf->dentry->d_inode->i_size += buf->chan->subbuf_size -
>  			buf->padding[old_subbuf];
> -		smp_mb();
> -		if (waitqueue_active(&buf->read_wait))
> -			/*
> -			 * Calling wake_up_interruptible() from here
> -			 * will deadlock if we happen to be logging
> -			 * from the scheduler (trying to re-grab
> -			 * rq->lock), so defer it.
> -			 */
> -			__mod_timer(&buf->timer, jiffies + 1);
>  	}
>  
>  	old = buf->data;

-- 
Regards,
Ankita Garg ([EMAIL PROTECTED])
Linux Technology Center
IBM India Systems & Technology Labs, 
Bangalore, India   
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Question] Hooks for scheduler tracing (CFS)

2007-07-26 Thread Ankita Garg
On Thu, Jul 26, 2007 at 09:53:53AM +0200, Ingo Molnar wrote:
> 
> * Ankita Garg <[EMAIL PROTECTED]> wrote:
> 
> > > I'd suggest to not put a probe into a preempt-off section - put it 
> > > to the beginning and to the end of schedule() to capture 
> > > context-switches. _stp_print_flush() is in the systemtap-generated 
> > > module, right? Maybe the problem is resolved by changing that 
> > > spinlock to use raw_spinlock_t / DEFINE_RAW_SPIN_LOCK.
> > 
> > Yes, _stp_print_flush is in the systemtap-generated kprobe module. 
> > Placing the probe at the beginning of schedule() also has the same 
> > effect. Will try by changing the spinlock to raw_spinlock_t...
> 
> could you send us that module source ST generates? Perhaps there are 
> preempt_disable() (or local_irq_disable()) calls in it too.

Attaching the generated module as it is huge...

Looks like SystemTap makes use of relayfs for printing the buffer
contents. _stp_print_flush() (line 269 in the attached module), which is
implemented in the systemtap runtime library, ends up calling the following:

static int _stp_relay_write (const void *data, unsigned length)
{
	unsigned long flags;
	struct rchan_buf *buf;

	if (unlikely(length == 0))
		return 0;

	local_irq_save(flags);
	buf = _stp_chan->buf[smp_processor_id()];
	if (unlikely(buf->offset + length > _stp_chan->subbuf_size))
		length = relay_switch_subbuf(buf, length);
	memcpy(buf->data + buf->offset, data, length);
	buf->offset += length;
	local_irq_restore(flags);

	if (unlikely(length == 0))
		return -1;

	return length;
}

The above does a local_irq_save().

-- 
Regards,
Ankita Garg ([EMAIL PROTECTED])
Linux Technology Center
IBM India Systems & Technology Labs, 
Bangalore, India   
#define TEST_MODE 0

#ifndef MAXNESTING
#define MAXNESTING 10
#endif
#ifndef MAXSTRINGLEN
#define MAXSTRINGLEN 128
#endif
#ifndef MAXACTION
#define MAXACTION 1000
#endif
#ifndef MAXACTION_INTERRUPTIBLE
#define MAXACTION_INTERRUPTIBLE (MAXACTION * 10)
#endif
#ifndef MAXTRYLOCK
#define MAXTRYLOCK MAXACTION
#endif
#ifndef TRYLOCKDELAY
#define TRYLOCKDELAY 100
#endif
#ifndef MAXMAPENTRIES
#define MAXMAPENTRIES 2048
#endif
#ifndef MAXERRORS
#define MAXERRORS 0
#endif
#ifndef MAXSKIPPED
#define MAXSKIPPED 100
#endif
#ifndef MINSTACKSPACE
#define MINSTACKSPACE 1024
#endif
#ifndef STP_OVERLOAD_INTERVAL
#define STP_OVERLOAD_INTERVAL 10LL
#endif
#ifndef STP_OVERLOAD_THRESHOLD
#define STP_OVERLOAD_THRESHOLD 5LL
#endif
#ifndef STP_NO_OVERLOAD
#define STP_OVERLOAD
#endif
#include "runtime.h"
#include "regs.c"
#include "stack.c"
#include "regs-ia64.c"
#include "stat.c"
#include <linux/string.h>
#include <linux/timer.h>
#include <linux/delay.h>
#include <linux/profile.h>
#include <linux/random.h>
#include <linux/utsname.h>
#include "loc2c-runtime.h" 
#ifndef read_trylock
#define read_trylock(x) ({ read_lock(x); 1; })
#endif
#if defined(CONFIG_MARKERS)
#include <linux/marker.h>
#endif
typedef char string_t[MAXSTRINGLEN];

#define STAP_SESSION_STARTING 0
#define STAP_SESSION_RUNNING 1
#define STAP_SESSION_ERROR 2
#define STAP_SESSION_STOPPING 3
#define STAP_SESSION_STOPPED 4
atomic_t session_state = ATOMIC_INIT (STAP_SESSION_STARTING);
atomic_t error_count = ATOMIC_INIT (0);
atomic_t skipped_count = ATOMIC_INIT (0);

struct context {
  atomic_t busy;
  const char *probe_point;
  int actionremaining;
  unsigned nesting;
  const char *last_error;
  const char *last_stmt;
  struct pt_regs *regs;
  struct kretprobe_instance *pi;
  va_list mark_va_list;
  #ifdef STP_TIMING
  Stat *statp;
  #endif
  #ifdef STP_OVERLOAD
  cycles_t cycles_base;
  cycles_t cycles_sum;
  #endif
  union {
struct probe_1493_locals {
  string_t argstr;
  union {
struct {
  string_t __tmp0;
  int64_t __tmp1;
  int64_t __tmp2;
  string_t __tmp3;
};
  };
} probe_1493;
struct function_execname_locals {
  string_t __retvalue;
} function_execname;
struct function_pid_locals {
  int64_t __retvalue;
} function_pid;
struct function_tid_locals {
  int64_t __retvalue;
} function_tid;
  } locals [MAXNESTING];
};

void *contexts = NULL; /* alloc_percpu */



static void function_execname (struct context * __restrict__ c);

static void function_pid (struct context * __restrict__ c);

static void function_tid (struct context * __restrict__ c);

void function_execname (struct context* __restrict__ c) {
  struct function_execname_locals *  __restrict__ l =
 &c->locals[c->nesting].function_execname;
  (void) l;
  #define CONTEXT c
  #define THIS l
  if (0) goto out;
  l->__retvalue[0] = '\0';
  {
 /* pure */
	strlcpy (THIS->__retvalue, current->comm, MAXSTRINGLEN);

  }
out:
  ;
  #undef CONTEXT
  #undef THIS
}


void function_pid (struct context* __restrict__ c) {
  struct function_pid_locals *  __restrict__ l =
 &c->locals[c->nesting].function_pid;
  (void) l;
  #define CONTEXT c
  #define THIS l
  if (0) goto

Re: [Question] Hooks for scheduler tracing (CFS)

2007-07-26 Thread Ankita Garg
On Thu, Jul 26, 2007 at 12:22:31PM -0400, Frank Ch. Eigler wrote:
> Hi -
> 
> On Thu, Jul 26, 2007 at 11:02:26AM -0400, Mathieu Desnoyers wrote:
> > [...]
> > > The problem is also in _stp_print_flush, not *only* in relay code:
> > > void _stp_print_flush (void)
> > > ...
> > > spin_lock(&_stp_print_lock);
> > > spin_unlock(&_stp_print_lock);
> > > 
> > > Those will turn into mutexes with -rt.
> > 
> > Indeed,
> 
> (Though actually that bug was fixed some time ago.)
> 
> 
> > > plus systemtap-generated locking code uses rwlocks,
> > > local_irq_save/restore or preempt_disable, in various places.  Could
> > > someone point to a place that spells out what would be more
> > > appropriate way of ensuring atomicity while being compatible with -rt?
> > 
> > AFAIK, for your needs either:
> > [...]
> > - Use per-cpu data with preempt disabling/irq disabling
> 
> As in local_irq_save / preempt_disable?  Yes, already done.
> 
> > - Use the original real spin locks/rwlocks (raw_*).
> > [...]
> 
> It was unclear from the OLS paper whether the spin_lock_irq* family of
> functions also had to be moved to the raw forms.

By making the locks of the raw_* type, spin_lock_irq* functions would
then automatically disable hardware interrupts.

 
> > You just don't want to sleep in the tracing code. [...]  Since you
> > will likely disable preemption, make sure your tracing code executes
> > in a deterministic time.
> 
> Definitely, that has always been the case.
> 
> > Make sure that the sub-buffer switch code respects that too [...]
> 
> We will review that part of the related code.
> 
> - FChE

-- 
Regards,
Ankita Garg ([EMAIL PROTECTED])
Linux Technology Center
IBM India Systems & Technology Labs, 
Bangalore, India   
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PREEMPT_RT] [PATCH] Fix BUG: using smp_processor_id() in preemptible [00000000] code: nfsd/2852

2007-04-18 Thread Ankita Garg
Hi,

While running some tests on 2.6.20-rt8 with DEBUG_PREEMPT on, I hit the 
following BUG:

BUG: using smp_processor_id() in preemptible [00000000] code: nfsd/2852
caller is drain_array+0x25/0x132

Call Trace:
 [<ffffffff8026d828>] dump_trace+0xbd/0x3d8
 [<ffffffff8026db87>] show_trace+0x44/0x6d
 [<ffffffff8026ddc8>] dump_stack+0x13/0x15
 [<ffffffff80355e2e>] debug_smp_processor_id+0xe3/0xf1
 [<ffffffff802e01b5>] drain_array+0x25/0x132
 [<ffffffff802e07ac>] __cache_shrink+0xd5/0x1a6
 [<ffffffff802e0a47>] kmem_cache_destroy+0x6c/0xe3
 [<ffffffff8846c809>] :nfsd:nfsd4_free_slab+0x16/0x21
 [<ffffffff8846c824>] :nfsd:nfsd4_free_slabs+0x10/0x36
 [<ffffffff8846daf9>] :nfsd:nfs4_state_shutdown+0x1a2/0x1ae
 [<ffffffff884570f8>] :nfsd:nfsd_last_thread+0x47/0x76
 [<ffffffff8839463a>] :sunrpc:svc_destroy+0x8d/0xd1
 [<ffffffff88394738>] :sunrpc:svc_exit_thread+0xba/0xc6
 [<ffffffff8845795f>] :nfsd:nfsd+0x2a3/0x2b8
 [<ffffffff802600f8>] child_rip+0xa/0x12

---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:

.. [<ffffffff80355ddb>]  debug_smp_processor_id+0x90/0xf1
.[<ffffffff802e01b5>] ..   ( <= drain_array+0x25/0x132)


This patch fixes the above issue, which arises due to the call to
smp_processor_id() in drain_array() in mm/slab.c. The smp_processor_id()
invocation is redundant here, as the call to slab_spin_lock_irq()
already fills in the value of this_cpu using raw_smp_processor_id().


o Patch to fix BUG in drain_array()

Signed-off-by: Ankita Garg <[EMAIL PROTECTED]>
--
 slab.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Index: linux-2.6.20-rt8/mm/slab.c
===================================================================
--- linux-2.6.20-rt8.orig/mm/slab.c 2007-04-18 18:41:22.0 +0530
+++ linux-2.6.20-rt8/mm/slab.c  2007-04-18 18:42:21.0 +0530
@@ -4121,8 +4121,7 @@
 int drain_array(struct kmem_cache *cachep, struct kmem_list3 *l3,
 struct array_cache *ac, int force, int node)
 {
-   int this_cpu = smp_processor_id();
-   int tofree;
+   int tofree, this_cpu;
 
if (!ac || !ac->avail)
return 0;


Regards,
-- 
Ankita Garg ([EMAIL PROTECTED])
Linux Technology Center
IBM India Systems & Technology Labs, 
Bangalore, India   
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] oom fix: prevent oom from killing a process with children/sibling unkillable

2007-03-15 Thread Ankita Garg

Looking at oom_kill.c, I found that the intention of not killing the selected
process if any of its children/siblings has OOM_DISABLE set is not being met.


Signed-off-by: Ankita Garg <[EMAIL PROTECTED]>

Index: ankita/linux-2.6.20.1/mm/oom_kill.c
===================================================================
--- ankita.orig/linux-2.6.20.1/mm/oom_kill.c2007-02-20 12:04:32.0 
+0530
+++ ankita/linux-2.6.20.1/mm/oom_kill.c 2007-03-15 12:44:50.0 +0530
@@ -320,7 +320,7 @@
 * Don't kill the process if any threads are set to OOM_DISABLE
 */
do_each_thread(g, q) {
-   if (q->mm == mm && p->oomkilladj == OOM_DISABLE)
+   if (q->mm == mm && q->oomkilladj == OOM_DISABLE)
return 1;
} while_each_thread(g, q);
 

Regards,
-- 
Ankita Garg ([EMAIL PROTECTED])
Linux Technology Center
IBM India Systems & Technology Labs, 
Bangalore, India   
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

