Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-17 Thread Lee Revell
On Sat, 2005-02-19 at 15:45 -0500, Lee Revell wrote:
> On Sat, 2005-02-19 at 10:03 +0100, Ingo Molnar wrote:
> > * Ingo Molnar <[EMAIL PROTECTED]> wrote:
> > 
> > > > Testing on an all SCSI 1.3Ghz Athlon XP system, I am seeing very long
> > > > latencies in the journalling code with 2.6.11-rc4-RT-V0.7.39-02.
> > > 
> > > could you send me the full trace?
> > 
> > just in case the system in question is still running - could you also do 
> > a 'verbose' trace via:
> > 
> > echo 1 > /proc/sys/kernel/trace_verbose
> 
> OK, here is a 2912us verbose latency trace with "data=ordered", gzipped.
> dbench 32 or 64 is the easiest way to trigger these.
> 
> I have not tried "data=journal".  As previously stated "data=writeback"
> works perfectly - I ran JACK overnight while stressing the fs and did
> not get one xrun.

Any update on this?  The problem is still apparent in 2.6.11.  It seems
to be a regression from 2.6.10.  And now I've heard 2.6.12-rc1 mentioned
with no motion on this.

Here's the trace again in case you missed it:

http://www.alsa-project.org/~rlrevell/2912us

The "latency regressions" thread was all sub-millisecond stuff which can
be ignored IMHO.  Still interesting because they are regressions after
all, but not a real world problem.

However this one can be several milliseconds.  It's a real problem.

I'd hate to have to ship 2.6.12 with a disclaimer that ext3 with
"data=ordered" is not suitable for the desktop (as it clearly violates
the stated desktop responsiveness goal of 1ms).

Lee

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-16 Thread Lee Revell
On Wed, 2005-03-16 at 02:50 -0500, Steven Rostedt wrote:
> 
> On Tue, 15 Mar 2005, Lee Revell wrote:
> 
> > On Tue, 2005-03-15 at 13:05 -0500, Steven Rostedt wrote:
> > > Damn! The answer was right there in front of my eyes! Here's the cleanest
> > > solution. I forgot about wait_on_bit_lock.  I've converted all the locks
> > > to use this instead.  We probably need to get priority inheritance working
> > > on this too someday, but for now it's better than wasting memory or
> > > getting into deadlocks.
> > >
> >
> > I am still not clear on why this did not hit with earlier kernels +
> > PREEMPT_DESKTOP.  Were the bitlocks introduced recently?  Or was another
> > lock-break patch dropped?
> >
> 
> When did you start seeing this? This code has been there as far back as
> 2.6.7 (the earliest 2.6 kernel I still have lying around) and as far
> back as Ingo's realtime-preempt-2.6.9-mm1-U10. Maybe the tracing didn't
> start picking this up till later, or you were just lucky that no
> contention was happening on that lock.

Sometime after the RT preempt patches were rebased to mainline.

I don't see how there could be contention as I am on a UP.

Lee




Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-16 Thread Andrew Morton
Ingo Molnar <[EMAIL PROTECTED]> wrote:
>
> 
> * Steven Rostedt <[EMAIL PROTECTED]> wrote:
> 
> > Damn! The answer was right there in front of my eyes! Here's the
> > cleanest solution. I forgot about wait_on_bit_lock.  I've converted
> > all the locks to use this instead. [...]
> 
> ah, indeed, this looks really nifty. Andrew?
> 

There's a little lock ranking diagram in jbd.h which tells us that these
locks nest inside j_list_lock and j_state_lock.  So I guess you'll need to
turn those into semaphores.



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-16 Thread Ingo Molnar

* Steven Rostedt <[EMAIL PROTECTED]> wrote:

> Damn! The answer was right there in front of my eyes! Here's the
> cleanest solution. I forgot about wait_on_bit_lock.  I've converted
> all the locks to use this instead. [...]

ah, indeed, this looks really nifty. Andrew?

Ingo


Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-15 Thread Steven Rostedt


On Tue, 15 Mar 2005, Lee Revell wrote:

> On Tue, 2005-03-15 at 13:05 -0500, Steven Rostedt wrote:
> > Damn! The answer was right there in front of my eyes! Here's the cleanest
> > solution. I forgot about wait_on_bit_lock.  I've converted all the locks
> > to use this instead.  We probably need to get priority inheritance working
> > on this too someday, but for now it's better than wasting memory or
> > getting into deadlocks.
> >
>
> I am still not clear on why this did not hit with earlier kernels +
> PREEMPT_DESKTOP.  Were the bitlocks introduced recently?  Or was another
> lock-break patch dropped?
>

When did you start seeing this? This code has been there as far back as
2.6.7 (the earliest 2.6 kernel I still have lying around) and as far
back as Ingo's realtime-preempt-2.6.9-mm1-U10. Maybe the tracing didn't
start picking this up till later, or you were just lucky that no
contention was happening on that lock.

-- Steve



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-15 Thread Steven Rostedt


On Tue, 15 Mar 2005, Steven Rostedt wrote:

>
>
> On Tue, 15 Mar 2005, Ingo Molnar wrote:
> >
> > i'd go for removing bit-spinlocks altogether, in the upstream kernel. It
> > would simplify things, besides making PREEMPT_RT simpler as well. The
> > memory overhead is not a big issue i believe. (8 more bytes per ext3 bh,
> > on x86)
> >
>
> Hi Ingo,
>
> Damn! The answer was right there in front of my eyes! Here's the cleanest
> solution. I forgot about wait_on_bit_lock.  I've converted all the locks
> to use this instead.  We probably need to get priority inheritance working
> on this too someday, but for now it's better than wasting memory or
> getting into deadlocks.
>

One word of caution on these. Without PREEMPT_RT, don't the spinlocks on
SMP behave as normal spinlocks, which means we must not schedule while
holding one? I believe that some of these locks are taken while
spin_locks are held, so this isn't the right solution for anything other
than PREEMPT_RT. I also forgot to add might_sleep to the locking calls.
Here's the patch with the might_sleep added.  What should we do for
non-PREEMPT_RT?  Maybe put the bit_spinlocks back in for that case?

-- Steve
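
The rule in question can be illustrated outside the kernel. What follows
is only a user-space sketch under stated assumptions, not kernel code:
sketch_spin_lock(), sketch_spin_unlock(), sketch_might_sleep() and
sketch_sleeping_lock() are hypothetical names, and the per-thread counter
merely stands in for the kernel's notion of atomic context. It shows the
kind of check that a might_sleep() call performs: flag any path that may
block while a spinlock is held.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for "atomic context": the number of spinlocks
 * the current thread holds. */
static _Thread_local int spin_depth;

/* Number of "may sleep while holding a spinlock" violations seen. */
static _Thread_local int violations;

static void sketch_spin_lock(void)
{
	spin_depth++;		/* a real lock would also spin here */
}

static void sketch_spin_unlock(void)
{
	assert(spin_depth > 0);
	spin_depth--;
}

/* Call at the top of any function that may block. */
static void sketch_might_sleep(const char *where)
{
	if (spin_depth > 0) {
		violations++;
		fprintf(stderr, "%s: may sleep with %d spinlock(s) held\n",
			where, spin_depth);
	}
}

/* Models a lock that sleeps on contention instead of spinning. */
static void sketch_sleeping_lock(void)
{
	sketch_might_sleep("sketch_sleeping_lock");
	/* ... would block here if the lock bit were already set ... */
}

/* Runs both call patterns and returns how many were flagged. */
int sketch_demo(void)
{
	violations = 0;

	sketch_sleeping_lock();		/* fine: no spinlock held */

	sketch_spin_lock();
	sketch_sleeping_lock();		/* flagged: nested inside a spinlock */
	sketch_spin_unlock();

	return violations;
}
```

Only the second call is flagged, which is the situation described above:
a sleeping bit lock taken from a path that already holds a real spinlock.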

diff -ur linux-2.6.11-final-V0.7.40-00.orig/fs/jbd/journal.c linux-2.6.11-final-V0.7.40-00/fs/jbd/journal.c
--- linux-2.6.11-final-V0.7.40-00.orig/fs/jbd/journal.c	2005-03-02 02:37:49.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/fs/jbd/journal.c	2005-03-15 11:58:14.0 -0500
@@ -82,6 +82,17 @@

 static int journal_convert_superblock_v1(journal_t *, journal_superblock_t *);

+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+/*
+ * Used in the locking of the bh_state and bh_journalhead bit locks.
+ */
+int jbd_lock_bh_sleep(void *notused)
+{
+	schedule();
+	return 0;
+}
+#endif
+
 /*
  * Helper function used to manage commit timeouts
  */
diff -ur linux-2.6.11-final-V0.7.40-00.orig/include/linux/jbd.h linux-2.6.11-final-V0.7.40-00/include/linux/jbd.h
--- linux-2.6.11-final-V0.7.40-00.orig/include/linux/jbd.h	2005-03-02 02:38:19.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/include/linux/jbd.h	2005-03-16 02:25:31.881251828 -0500
@@ -324,34 +324,65 @@
 	return bh->b_private;
 }

+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+int jbd_lock_bh_sleep(void *notused);
+#endif
+
 static inline void jbd_lock_bh_state(struct buffer_head *bh)
 {
-	bit_spin_lock(BH_State, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+	might_sleep();
+	wait_on_bit_lock(&bh->b_state,BH_State,jbd_lock_bh_sleep,TASK_UNINTERRUPTIBLE);
+#endif
+	__acquire(bitlock);
 }

 static inline int jbd_trylock_bh_state(struct buffer_head *bh)
 {
-	return bit_spin_trylock(BH_State, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+	if (test_and_set_bit(BH_State, &bh->b_state))
+		return 0;
+#endif
+	__acquire(bitlock);
+	return 1;
 }

 static inline int jbd_is_locked_bh_state(struct buffer_head *bh)
 {
-	return bit_spin_is_locked(BH_State, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+	return test_bit(BH_State, &bh->b_state);
+#else
+	return 1;
+#endif
 }

 static inline void jbd_unlock_bh_state(struct buffer_head *bh)
 {
-	bit_spin_unlock(BH_State, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+	clear_bit(BH_State, &bh->b_state);
+	smp_mb__after_clear_bit();
+	wake_up_bit(&bh->b_state, BH_State);
+#endif
+	__release(bitlock);
 }

 static inline void jbd_lock_bh_journal_head(struct buffer_head *bh)
 {
-	bit_spin_lock(BH_JournalHead, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+	might_sleep();
+	wait_on_bit_lock(&bh->b_state,BH_JournalHead,jbd_lock_bh_sleep,TASK_UNINTERRUPTIBLE);
+#endif
+	__acquire(bitlock);
 }

 static inline void jbd_unlock_bh_journal_head(struct buffer_head *bh)
 {
-	bit_spin_unlock(BH_JournalHead, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+	clear_bit(BH_JournalHead, &bh->b_state);
+	smp_mb__after_clear_bit();
+	wake_up_bit(&bh->b_state, BH_JournalHead);
+#endif
+	__release(bitlock);
 }

 struct jbd_revoke_table_s;
diff -ur linux-2.6.11-final-V0.7.40-00.orig/include/linux/spinlock.h linux-2.6.11-final-V0.7.40-00/include/linux/spinlock.h
--- linux-2.6.11-final-V0.7.40-00.orig/include/linux/spinlock.h	2005-03-14 06:00:54.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/include/linux/spinlock.h	2005-03-15 12:19:11.0 -0500
@@ -774,67 +774,6 @@
 }))


-/*
- *  bit-based spin_lock()
- *
- * Don't use this unless you really need to: spin_lock() and spin_unlock()
- * are significantly faster.
- 

Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-15 Thread Lee Revell
On Tue, 2005-03-15 at 13:05 -0500, Steven Rostedt wrote:
> Damn! The answer was right there in front of my eyes! Here's the cleanest
> solution. I forgot about wait_on_bit_lock.  I've converted all the locks
> to use this instead.  We probably need to get priority inheritance working
> on this too someday, but for now it's better than wasting memory or
> getting into deadlocks.
> 

I am still not clear on why this did not hit with earlier kernels +
PREEMPT_DESKTOP.  Were the bitlocks introduced recently?  Or was another
lock-break patch dropped?

Lee



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-15 Thread Andrew Morton
Steven Rostedt <[EMAIL PROTECTED]> wrote:
>
> The problem here is that it's not ext3 bh's only. They're still the normal
>  buffer head.  The problem arises because the ext3 "journal head" is
>  allocated within these bit spin locks.

Yes, the locks do want to live inside the buffer_head.

Stephen has pointed out that we might want to remove
jbd_lock_bh_journal_head() altogether some time, just use
jbd_lock_bh_state() for that.

In 2.4 these locks are global (or per-superblock).  Making them a global
spinlock would be acceptable for 2-ways and probably larger.
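
For readers following along, the reason a bit lock costs no extra memory
is that it reuses one bit of a word the structure already has. Below is a
minimal user-space sketch using C11 atomics, not the kernel
implementation: the sketch_* names and the two bit numbers are
hypothetical stand-ins for BH_State and BH_JournalHead, and the
test-then-spin loop follows the shape of the bit_spin_lock() under
discussion.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical bit numbers standing in for BH_State and BH_JournalHead. */
enum { SKETCH_STATE = 0, SKETCH_JOURNAL_HEAD = 1 };

/* Spin until the bit is ours: an atomic test-and-set, then plain reads
 * while contended so waiters do not keep writing to the cache line. */
void sketch_bit_spin_lock(int bit, atomic_ulong *word)
{
	unsigned long mask = 1UL << bit;

	while (atomic_fetch_or_explicit(word, mask, memory_order_acquire) & mask)
		while (atomic_load_explicit(word, memory_order_relaxed) & mask)
			;	/* busy-wait; the kernel calls cpu_relax() here */
}

/* Single attempt: returns 1 if the bit was clear and is now held. */
int sketch_bit_spin_trylock(int bit, atomic_ulong *word)
{
	unsigned long mask = 1UL << bit;

	return !(atomic_fetch_or_explicit(word, mask, memory_order_acquire) & mask);
}

/* Release the lock by clearing its bit; other bits are left untouched. */
void sketch_bit_spin_unlock(int bit, atomic_ulong *word)
{
	unsigned long mask = 1UL << bit;
	unsigned long old;

	old = atomic_fetch_and_explicit(word, ~mask, memory_order_release);
	assert(old & mask);	/* unlocking a lock that is not held is a bug */
	(void)old;
}
```

Two independent locks share one word here, which is why moving them out
of the buffer_head, or making them global, is a real trade-off.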



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-15 Thread Steven Rostedt


On Tue, 15 Mar 2005, Ingo Molnar wrote:
>
> i'd go for removing bit-spinlocks altogether, in the upstream kernel. It
> would simplify things, besides making PREEMPT_RT simpler as well. The
> memory overhead is not a big issue i believe. (8 more bytes per ext3 bh,
> on x86)
>

Hi Ingo,

Damn! The answer was right there in front of my eyes! Here's the cleanest
solution. I forgot about wait_on_bit_lock.  I've converted all the locks
to use this instead.  We probably need to get priority inheritance working
on this too someday, but for now it's better than wasting memory or
getting into deadlocks.

-- Steve
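
The idea behind the conversion (sleep until the bit is clear, set it, and
wake any waiters on unlock) can be sketched in user space. This is only
an analogue under stated assumptions, not the kernel code: struct
sketch_bh and the sketch_bit_* functions are hypothetical, and a pthread
mutex plus condition variable stand in for the kernel's bit waitqueue.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical user-space stand-in for a buffer_head's b_state word. */
struct sketch_bh {
	unsigned long state;
	pthread_mutex_t mtx;	/* protects state in this sketch */
	pthread_cond_t waitq;	/* plays the role of the bit waitqueue */
};

#define SKETCH_BH_INIT \
	{ 0, PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER }

/* Sleep until the bit is clear, then set it (cf. wait_on_bit_lock). */
void sketch_bit_lock(struct sketch_bh *bh, int bit)
{
	pthread_mutex_lock(&bh->mtx);
	while (bh->state & (1UL << bit))
		pthread_cond_wait(&bh->waitq, &bh->mtx);	/* sleeps, no spinning */
	bh->state |= 1UL << bit;
	pthread_mutex_unlock(&bh->mtx);
}

/* Try once; return 1 on success, 0 if the bit was already set. */
int sketch_bit_trylock(struct sketch_bh *bh, int bit)
{
	int got;

	pthread_mutex_lock(&bh->mtx);
	got = !(bh->state & (1UL << bit));
	if (got)
		bh->state |= 1UL << bit;
	pthread_mutex_unlock(&bh->mtx);
	return got;
}

/* Clear the bit and wake sleepers (cf. clear_bit plus wake_up_bit). */
void sketch_bit_unlock(struct sketch_bh *bh, int bit)
{
	pthread_mutex_lock(&bh->mtx);
	assert(bh->state & (1UL << bit));
	bh->state &= ~(1UL << bit);
	pthread_cond_broadcast(&bh->waitq);
	pthread_mutex_unlock(&bh->mtx);
}
```

As in the patch, the lock still lives in a bit of the existing state
word; what changes is that a contended locker sleeps instead of spinning.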

diff -ur linux-2.6.11-final-V0.7.40-00.orig/fs/jbd/journal.c linux-2.6.11-final-V0.7.40-00/fs/jbd/journal.c
--- linux-2.6.11-final-V0.7.40-00.orig/fs/jbd/journal.c	2005-03-02 02:37:49.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/fs/jbd/journal.c	2005-03-15 11:58:14.0 -0500
@@ -82,6 +82,17 @@

 static int journal_convert_superblock_v1(journal_t *, journal_superblock_t *);

+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+/*
+ * Used in the locking of the bh_state and bh_journalhead bit locks.
+ */
+int jbd_lock_bh_sleep(void *notused)
+{
+	schedule();
+	return 0;
+}
+#endif
+
 /*
  * Helper function used to manage commit timeouts
  */
diff -ur linux-2.6.11-final-V0.7.40-00.orig/include/linux/jbd.h linux-2.6.11-final-V0.7.40-00/include/linux/jbd.h
--- linux-2.6.11-final-V0.7.40-00.orig/include/linux/jbd.h	2005-03-02 02:38:19.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/include/linux/jbd.h	2005-03-15 11:58:40.0 -0500
@@ -324,34 +324,63 @@
 	return bh->b_private;
 }

+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+int jbd_lock_bh_sleep(void *notused);
+#endif
+
 static inline void jbd_lock_bh_state(struct buffer_head *bh)
 {
-	bit_spin_lock(BH_State, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+	wait_on_bit_lock(&bh->b_state,BH_State,jbd_lock_bh_sleep,TASK_UNINTERRUPTIBLE);
+#endif
+	__acquire(bitlock);
 }

 static inline int jbd_trylock_bh_state(struct buffer_head *bh)
 {
-	return bit_spin_trylock(BH_State, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+	if (test_and_set_bit(BH_State, &bh->b_state))
+		return 0;
+#endif
+	__acquire(bitlock);
+	return 1;
 }

 static inline int jbd_is_locked_bh_state(struct buffer_head *bh)
 {
-	return bit_spin_is_locked(BH_State, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+	return test_bit(BH_State, &bh->b_state);
+#else
+	return 1;
+#endif
 }

 static inline void jbd_unlock_bh_state(struct buffer_head *bh)
 {
-	bit_spin_unlock(BH_State, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+	clear_bit(BH_State, &bh->b_state);
+	smp_mb__after_clear_bit();
+	wake_up_bit(&bh->b_state, BH_State);
+#endif
+	__release(bitlock);
 }

 static inline void jbd_lock_bh_journal_head(struct buffer_head *bh)
 {
-	bit_spin_lock(BH_JournalHead, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+	wait_on_bit_lock(&bh->b_state,BH_JournalHead,jbd_lock_bh_sleep,TASK_UNINTERRUPTIBLE);
+#endif
+	__acquire(bitlock);
 }

 static inline void jbd_unlock_bh_journal_head(struct buffer_head *bh)
 {
-	bit_spin_unlock(BH_JournalHead, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+	clear_bit(BH_JournalHead, &bh->b_state);
+	smp_mb__after_clear_bit();
+	wake_up_bit(&bh->b_state, BH_JournalHead);
+#endif
+	__release(bitlock);
 }

 struct jbd_revoke_table_s;
diff -ur linux-2.6.11-final-V0.7.40-00.orig/include/linux/spinlock.h linux-2.6.11-final-V0.7.40-00/include/linux/spinlock.h
--- linux-2.6.11-final-V0.7.40-00.orig/include/linux/spinlock.h	2005-03-14 06:00:54.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/include/linux/spinlock.h	2005-03-15 12:19:11.032217736 -0500
@@ -774,67 +774,6 @@
 }))


-/*
- *  bit-based spin_lock()
- *
- * Don't use this unless you really need to: spin_lock() and spin_unlock()
- * are significantly faster.
- */
-static inline void bit_spin_lock(int bitnum, unsigned long *addr)
-{
-	/*
-	 * Assuming the lock is uncontended, this never enters
-	 * the body of the outer loop. If it is contended, then
-	 * within the inner loop a non-atomic test is used to
-	 * busywait with less bus contention for a good time to
-	 * attempt to acquire the lock bit.
-	 */
-#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
-	while (test_and_set_bit(bitnum, addr))
-		while (test_bit(bitnum, addr))
-			cpu_relax();
-#endif
-

Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-15 Thread Steven Rostedt


On Tue, 15 Mar 2005, Ingo Molnar wrote:

>
> * Steven Rostedt <[EMAIL PROTECTED]> wrote:
>
>
> > What should we use instead of #ifdef PREEMPT_RT? Or should we just
> > keep it the same for both.  Since this fix is only to fix spinlocks
> > that schedule, I figured that it would be better not to waste the
> > memory of those not using PREEMPT_RT.  Should I use the opposite
> > PREEMPT_DESKTOP?
>
> i'd go for removing bit-spinlocks altogether, in the upstream kernel. It
> would simplify things, besides making PREEMPT_RT simpler as well. The
> memory overhead is not a big issue i believe. (8 more bytes per ext3 bh,
> on x86)
>

The problem here is that it's not ext3 bh's only. They're still the normal
buffer head.  The problem arises because the ext3 "journal head" is
allocated within these bit spin locks. I tried to monkey with putting the
locks in the journal heads and have checks to see when to free them, but
it wasn't that simple. I started having problems with some of the freeing
transactions; I might have assumed too much.

I'll give it one more try to get it into the journal heads, but after
that, (if I fail) I'll let someone who understands the ext3 system better
handle this.

-- Steve



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-15 Thread Ingo Molnar

* Steven Rostedt <[EMAIL PROTECTED]> wrote:

> > good progress - but the global lock may be a scalability worry on
> > upstream though. Would it be possible to just mirror much of the current
> > lock logic, but with spinlocks instead of bitlocks? And there should be
> > no #ifdefs on PREEMPT_RT.
> 
> The first patch I had just converted the bit spinlocks to spinlocks
> but I thought that adding two spinlocks was too much for every buffer
> head, even if it wasn't in the ext3 file system. The journal head
> spinlock is just used to add and remove the journal heads from the
> buffer heads, so I'm not sure how much contention is on them. I only
> have a dual smp system, so I can't test the system on large number of
> CPUs. What do you think, should we sacrifice memory for speed?

there are two bad effects of global spinlocks: 1) contention 2)
cacheline bouncing. It's #2 that would affect this spinlock. While i'm
not sure this would show up in usual benchmarks, we should rather err on
the side of more scalability. Two spinlocks are just two more machine
words on most architectures, so i dont think it matters all that much,
while it removes a major wart - as long as the two extra locks are for
ext3 buffer-heads only.

> What should we use instead of #ifdef PREEMPT_RT? Or should we just
> keep it the same for both.  Since this fix is only to fix spinlocks
> that schedule, I figured that it would be better not to waste the
> memory of those not using PREEMPT_RT.  Should I use the opposite
> PREEMPT_DESKTOP?

i'd go for removing bit-spinlocks altogether, in the upstream kernel. It
would simplify things, besides making PREEMPT_RT simpler as well. The
memory overhead is not a big issue i believe. (8 more bytes per ext3 bh,
on x86)

Ingo
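
The "8 more bytes" figure can be checked with a small sizeof comparison.
This is a sketch under explicit assumptions, not the real buffer_head:
the two structs are cut-down and hypothetical, and a non-debug spinlock
is assumed to occupy a single 32-bit word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a non-debug spinlock occupies one 32-bit word. */
typedef struct { volatile uint32_t slock; } sketch_spinlock_t;

_Static_assert(sizeof(sketch_spinlock_t) == 4, "one 32-bit word per lock");

/* Cut-down, hypothetical buffer head that uses bits of `state` as locks. */
struct sketch_bh_bitlocks {
	unsigned long state;
	void *private_data;
};

/* The same structure with the two bit locks replaced by real locks. */
struct sketch_bh_spinlocks {
	unsigned long state;
	void *private_data;
	sketch_spinlock_t state_lock;
	sketch_spinlock_t journal_head_lock;
};

/* Extra bytes per buffer head paid for the two embedded locks. */
size_t sketch_lock_overhead(void)
{
	return sizeof(struct sketch_bh_spinlocks) -
	       sizeof(struct sketch_bh_bitlocks);
}

/* Extra memory, in bytes, for a given number of cached buffer heads. */
size_t sketch_total_overhead(size_t nr_buffer_heads)
{
	return nr_buffer_heads * sketch_lock_overhead();
}
```

Under those assumptions the cost is 8 bytes per buffer head, so the total
scales with the number of buffer heads in memory, whether or not they
belong to ext3.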


Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-15 Thread Steven Rostedt


On Tue, 15 Mar 2005, Ingo Molnar wrote:

>
> * Steven Rostedt <[EMAIL PROTECTED]> wrote:
>
> > I've realized that my previous patch had too many problems with the
> > way the journaling system works.  So I went back to my first approach
> > but added the journal_head lock as one global lock to keep the buffer
> > head size smaller. I only added the state lock to the buffer head.
> > I've tested this for some time now, and it works well (for the test at
> > least). I'll recompile it with PREEMPT_DESKTOP to see if that works
> > too.
>
> good progress - but the global lock may be a scalability worry on
> upstream though. Would it be possible to just mirror much of the current
> lock logic, but with spinlocks instead of bitlocks? And there should be
> no #ifdefs on PREEMPT_RT.
>

The first patch I had just converted the bit spinlocks to spinlocks, but I
thought that adding two spinlocks was too much for every buffer head, even
if it wasn't in the ext3 file system. The journal head spinlock is just
used to add and remove the journal heads from the buffer heads, so I'm not
sure how much contention is on them. I only have a dual SMP system, so I
can't test on a large number of CPUs. What do you think, should
we sacrifice memory for speed?

What should we use instead of #ifdef PREEMPT_RT? Or should we just keep it
the same for both?  Since this fix is only to fix spinlocks that schedule,
I figured that it would be better not to waste the memory of those not
using PREEMPT_RT.  Should I use the opposite, PREEMPT_DESKTOP?

Thanks,

-- Steve



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-15 Thread Ingo Molnar

* Steven Rostedt <[EMAIL PROTECTED]> wrote:

> I've realized that my previous patch had too many problems with the
> way the journaling system works.  So I went back to my first approach
> but added the journal_head lock as one global lock to keep the buffer
> head size smaller. I only added the state lock to the buffer head.
> I've tested this for some time now, and it works well (for the test at
> least). I'll recompile it with PREEMPT_DESKTOP to see if that works
> too.

good progress - but the global lock may be a scalability worry on
upstream though. Would it be possible to just mirror much of the current
lock logic, but with spinlocks instead of bitlocks? And there should be
no #ifdefs on PREEMPT_RT.

Ingo


Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-15 Thread Steven Rostedt


I've realized that my previous patch had too many problems with the way
the journaling system works.  So I went back to my first approach but
added the journal_head lock as one global lock to keep the buffer head
size smaller. I only added the state lock to the buffer head. I've tested
this for some time now, and it works well (for the test at least). I'll
recompile it with PREEMPT_DESKTOP to see if that works too.


-- Steve
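
The locking layout described above (a state lock in each buffer head, one
global lock for all journal heads) can be sketched in user space as
follows. This is an illustration under stated assumptions rather than the
patch itself: the sketch_* names are hypothetical, C11 atomic_flag
spinlocks stand in for spinlock_t, and the journal-head trylock exists
only so the sharing can be observed.

```c
#include <assert.h>
#include <stdatomic.h>

/* Minimal test-and-set spinlock standing in for spinlock_t. */
typedef struct { atomic_flag held; } sketch_lock_t;
#define SKETCH_LOCK_INIT { ATOMIC_FLAG_INIT }

static void sketch_lock(sketch_lock_t *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		;	/* spin */
}

static int sketch_trylock(sketch_lock_t *l)
{
	return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void sketch_unlock(sketch_lock_t *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Hypothetical buffer head: only the state lock is per buffer. */
struct sketch_bh {
	unsigned long state;
	sketch_lock_t jstate_lock;
};

/* One lock shared by every buffer head (cf. journal_head_lock). */
static sketch_lock_t sketch_journal_head_lock = SKETCH_LOCK_INIT;

void sketch_lock_bh_state(struct sketch_bh *bh)   { sketch_lock(&bh->jstate_lock); }
int sketch_trylock_bh_state(struct sketch_bh *bh) { return sketch_trylock(&bh->jstate_lock); }
void sketch_unlock_bh_state(struct sketch_bh *bh) { sketch_unlock(&bh->jstate_lock); }

/* bh is unused below: every caller serializes on the same global lock. */
void sketch_lock_bh_journal_head(struct sketch_bh *bh)
{
	(void)bh;
	sketch_lock(&sketch_journal_head_lock);
}

int sketch_trylock_bh_journal_head(struct sketch_bh *bh)
{
	(void)bh;
	return sketch_trylock(&sketch_journal_head_lock);
}

void sketch_unlock_bh_journal_head(struct sketch_bh *bh)
{
	(void)bh;
	sketch_unlock(&sketch_journal_head_lock);
}
```

State locks on different buffers never interfere, while any two
journal-head operations contend on the single global lock; that is the
memory-versus-scalability trade-off of this layout.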



diff -ur linux-2.6.11-final-V0.7.40-00.orig/fs/buffer.c 
linux-2.6.11-final-V0.7.40-00/fs/buffer.c
--- linux-2.6.11-final-V0.7.40-00.orig/fs/buffer.c  2005-03-02 
02:38:10.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/fs/buffer.c   2005-03-15 03:41:15.0 
-0500
@@ -3003,6 +3003,9 @@
preempt_disable();
__get_cpu_var(bh_accounting).nr++;
recalc_bh_state();
+#ifdef CONFIG_PREEMPT_RT
+   spin_lock_init(>b_jstate_lock);
+#endif
preempt_enable();
}
return ret;
diff -ur linux-2.6.11-final-V0.7.40-00.orig/fs/jbd/journal.c 
linux-2.6.11-final-V0.7.40-00/fs/jbd/journal.c
--- linux-2.6.11-final-V0.7.40-00.orig/fs/jbd/journal.c 2005-03-02 
02:37:49.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/fs/jbd/journal.c  2005-03-15 
03:49:10.0 -0500
@@ -82,6 +82,8 @@

 static int journal_convert_superblock_v1(journal_t *, journal_superblock_t *);

+spinlock_t journal_head_lock = SPIN_LOCK_UNLOCKED;
+
 /*
  * Helper function used to manage commit timeouts
  */
diff -ur linux-2.6.11-final-V0.7.40-00.orig/include/linux/buffer_head.h 
linux-2.6.11-final-V0.7.40-00/include/linux/buffer_head.h
--- linux-2.6.11-final-V0.7.40-00.orig/include/linux/buffer_head.h  
2005-03-02 02:37:45.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/include/linux/buffer_head.h   2005-03-15 
03:42:22.0 -0500
@@ -62,6 +62,13 @@
bh_end_io_t *b_end_io;  /* I/O completion */
void *b_private;/* reserved for b_end_io */
struct list_head b_assoc_buffers; /* associated with another mapping */
+
+#ifdef CONFIG_PREEMPT_RT
+   /*
+* Fixme: This should be in the journal code.
+*/
+   spinlock_t b_jstate_lock;   /* lock for journal state. */
+#endif
 };

 /*
diff -ur linux-2.6.11-final-V0.7.40-00.orig/include/linux/jbd.h 
linux-2.6.11-final-V0.7.40-00/include/linux/jbd.h
--- linux-2.6.11-final-V0.7.40-00.orig/include/linux/jbd.h  2005-03-02 
02:38:19.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/include/linux/jbd.h   2005-03-15 
03:45:33.0 -0500
@@ -314,6 +314,13 @@
 TAS_BUFFER_FNS(RevokeValid, revokevalid)
 BUFFER_FNS(Freed, freed)

+#ifdef CONFIG_PREEMPT_RT
+extern spinlock_t journal_head_lock;
+#define PICK_SPIN_LOCK(otype,bit,name) spin_##otype(>b_##name##_lock)
+#else
+#define PICK_SPIN_LOCK(otype,bit,name) bit_spin_##otype(bit,bh->b_state);
+#endif
+
 static inline struct buffer_head *jh2bh(struct journal_head *jh)
 {
return jh->b_bh;
@@ -326,24 +333,36 @@

 static inline void jbd_lock_bh_state(struct buffer_head *bh)
 {
-   bit_spin_lock(BH_State, >b_state);
+   PICK_SPIN_LOCK(lock,BH_State,jstate);
 }

 static inline int jbd_trylock_bh_state(struct buffer_head *bh)
 {
-   return bit_spin_trylock(BH_State, >b_state);
+   return PICK_SPIN_LOCK(trylock,BH_State,jstate);
 }

 static inline int jbd_is_locked_bh_state(struct buffer_head *bh)
 {
-   return bit_spin_is_locked(BH_State, &bh->b_state);
+   return PICK_SPIN_LOCK(is_locked,BH_State,jstate);
 }

 static inline void jbd_unlock_bh_state(struct buffer_head *bh)
 {
-   bit_spin_unlock(BH_State, &bh->b_state);
+   PICK_SPIN_LOCK(unlock,BH_State,jstate);
+}
+#undef PICK_SPIN_LOCK
+
+#ifdef CONFIG_PREEMPT_RT
+static inline void jbd_lock_bh_journal_head(struct buffer_head *bh)
+{
+   spin_lock(&journal_head_lock);
 }

+static inline void jbd_unlock_bh_journal_head(struct buffer_head *bh)
+{
+   spin_unlock(&journal_head_lock);
+}
+#else /* !CONFIG_PREEMPT_RT */
 static inline void jbd_lock_bh_journal_head(struct buffer_head *bh)
 {
    bit_spin_lock(BH_JournalHead, &bh->b_state);
@@ -353,6 +372,7 @@
 {
    bit_spin_unlock(BH_JournalHead, &bh->b_state);
 }
+#endif /* CONFIG_PREEMPT_RT */

 struct jbd_revoke_table_s;

diff -ur linux-2.6.11-final-V0.7.40-00.orig/include/linux/spinlock.h 
linux-2.6.11-final-V0.7.40-00/include/linux/spinlock.h
--- linux-2.6.11-final-V0.7.40-00.orig/include/linux/spinlock.h 2005-03-14 
06:00:54.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/include/linux/spinlock.h  2005-03-15 
03:40:31.0 -0500
@@ -774,6 +774,10 @@
 }))


+#ifndef CONFIG_PREEMPT_RT
+
+/* These are just plain evil! */
+
 /*
  *  bit-based spin_lock()
  *
@@ -789,10 +793,15 @@
 * busywait with less bus contention for a good time to
 * attempt to acquire the lock bit.
 */
-#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
-   while (test_and_set_bit(bitnum, addr))
-  

Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-15 Thread Steven Rostedt


I've realized that my previous patch had too many problems with the way
the journaling system works.  So I went back to my first approach but
added the journal_head lock as one global lock to keep the buffer head
size smaller. I only added the state lock to the buffer head. I've tested
this for some time now, and it works well (for the test at least). I'll
recompile it with PREEMPT_DESKTOP to see if that works too.


-- Steve
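
For readers following the locking layout described above, here is a minimal userspace sketch, not the patch itself. It assumes pthread mutexes as stand-ins for kernel spinlocks, and every `_sketch` name, the refcount field, and the attach/detach helpers are hypothetical. It only shows the shape of the design: one small lock per buffer head for journal state, plus a single global lock that serializes attaching and detaching journal heads.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct journal_head_sketch {
	int refcount;
};

struct buffer_head_sketch {
	unsigned long b_state;
	void *b_private;                /* journal head, if one is attached */
	pthread_mutex_t b_jstate_lock;  /* per-buffer lock for journal state */
};

/* One global lock guards attaching and detaching journal heads. */
static pthread_mutex_t journal_head_lock = PTHREAD_MUTEX_INITIALIZER;

static void bh_init_sketch(struct buffer_head_sketch *bh)
{
	bh->b_state = 0;
	bh->b_private = NULL;
	pthread_mutex_init(&bh->b_jstate_lock, NULL);
}

/* Journal state changes only take the lock of the buffer they touch. */
static void bh_set_state_sketch(struct buffer_head_sketch *bh, unsigned long v)
{
	pthread_mutex_lock(&bh->b_jstate_lock);
	bh->b_state = v;
	pthread_mutex_unlock(&bh->b_jstate_lock);
}

/* Attach jh unless a head is already attached; return the attached head. */
static struct journal_head_sketch *
attach_journal_head_sketch(struct buffer_head_sketch *bh,
			   struct journal_head_sketch *jh)
{
	struct journal_head_sketch *cur;

	pthread_mutex_lock(&journal_head_lock);
	if (bh->b_private == NULL)
		bh->b_private = jh;
	cur = bh->b_private;
	cur->refcount++;
	pthread_mutex_unlock(&journal_head_lock);
	return cur;
}

/* Drop one reference; detach on the last one. Returns 1 if detached. */
static int detach_journal_head_sketch(struct buffer_head_sketch *bh)
{
	struct journal_head_sketch *jh;
	int detached = 0;

	pthread_mutex_lock(&journal_head_lock);
	jh = bh->b_private;
	if (jh != NULL && --jh->refcount == 0) {
		bh->b_private = NULL;
		detached = 1;
	}
	pthread_mutex_unlock(&journal_head_lock);
	return detached;
}

int demo_global_journal_head_lock(void)
{
	struct buffer_head_sketch bh;
	struct journal_head_sketch a = { 0 }, b = { 0 };
	int same, first, second;

	bh_init_sketch(&bh);
	bh_set_state_sketch(&bh, 1UL);
	attach_journal_head_sketch(&bh, &a);                /* a attached */
	same = attach_journal_head_sketch(&bh, &b) == &a;   /* b is ignored */
	first = detach_journal_head_sketch(&bh);            /* one ref left */
	second = detach_journal_head_sketch(&bh);           /* last ref gone */
	assert(bh.b_private == NULL);
	pthread_mutex_destroy(&bh.b_jstate_lock);
	return same && first == 0 && second == 1;
}
```

The trade-off is visible in the struct: only one lock is added per buffer head, at the price of every attach and detach in the system going through the same global lock.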



diff -ur linux-2.6.11-final-V0.7.40-00.orig/fs/buffer.c 
linux-2.6.11-final-V0.7.40-00/fs/buffer.c
--- linux-2.6.11-final-V0.7.40-00.orig/fs/buffer.c  2005-03-02 
02:38:10.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/fs/buffer.c   2005-03-15 03:41:15.0 
-0500
@@ -3003,6 +3003,9 @@
preempt_disable();
__get_cpu_var(bh_accounting).nr++;
recalc_bh_state();
+#ifdef CONFIG_PREEMPT_RT
+   spin_lock_init(&ret->b_jstate_lock);
+#endif
preempt_enable();
}
return ret;
diff -ur linux-2.6.11-final-V0.7.40-00.orig/fs/jbd/journal.c 
linux-2.6.11-final-V0.7.40-00/fs/jbd/journal.c
--- linux-2.6.11-final-V0.7.40-00.orig/fs/jbd/journal.c 2005-03-02 
02:37:49.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/fs/jbd/journal.c  2005-03-15 
03:49:10.0 -0500
@@ -82,6 +82,8 @@

 static int journal_convert_superblock_v1(journal_t *, journal_superblock_t *);

+spinlock_t journal_head_lock = SPIN_LOCK_UNLOCKED;
+
 /*
  * Helper function used to manage commit timeouts
  */
diff -ur linux-2.6.11-final-V0.7.40-00.orig/include/linux/buffer_head.h 
linux-2.6.11-final-V0.7.40-00/include/linux/buffer_head.h
--- linux-2.6.11-final-V0.7.40-00.orig/include/linux/buffer_head.h  
2005-03-02 02:37:45.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/include/linux/buffer_head.h   2005-03-15 
03:42:22.0 -0500
@@ -62,6 +62,13 @@
bh_end_io_t *b_end_io;  /* I/O completion */
void *b_private;/* reserved for b_end_io */
struct list_head b_assoc_buffers; /* associated with another mapping */
+
+#ifdef CONFIG_PREEMPT_RT
+   /*
+* Fixme: This should be in the journal code.
+*/
+   spinlock_t b_jstate_lock;   /* lock for journal state. */
+#endif
 };

 /*
diff -ur linux-2.6.11-final-V0.7.40-00.orig/include/linux/jbd.h 
linux-2.6.11-final-V0.7.40-00/include/linux/jbd.h
--- linux-2.6.11-final-V0.7.40-00.orig/include/linux/jbd.h  2005-03-02 
02:38:19.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/include/linux/jbd.h   2005-03-15 
03:45:33.0 -0500
@@ -314,6 +314,13 @@
 TAS_BUFFER_FNS(RevokeValid, revokevalid)
 BUFFER_FNS(Freed, freed)

+#ifdef CONFIG_PREEMPT_RT
+extern spinlock_t journal_head_lock;
+#define PICK_SPIN_LOCK(otype,bit,name) spin_##otype(&bh->b_##name##_lock)
+#else
+#define PICK_SPIN_LOCK(otype,bit,name) bit_spin_##otype(bit,&bh->b_state)
+#endif
+
 static inline struct buffer_head *jh2bh(struct journal_head *jh)
 {
    return jh->b_bh;
@@ -326,24 +333,36 @@

 static inline void jbd_lock_bh_state(struct buffer_head *bh)
 {
-   bit_spin_lock(BH_State, &bh->b_state);
+   PICK_SPIN_LOCK(lock,BH_State,jstate);
 }

 static inline int jbd_trylock_bh_state(struct buffer_head *bh)
 {
-   return bit_spin_trylock(BH_State, &bh->b_state);
+   return PICK_SPIN_LOCK(trylock,BH_State,jstate);
 }

 static inline int jbd_is_locked_bh_state(struct buffer_head *bh)
 {
-   return bit_spin_is_locked(BH_State, &bh->b_state);
+   return PICK_SPIN_LOCK(is_locked,BH_State,jstate);
 }

 static inline void jbd_unlock_bh_state(struct buffer_head *bh)
 {
-   bit_spin_unlock(BH_State, &bh->b_state);
+   PICK_SPIN_LOCK(unlock,BH_State,jstate);
+}
+#undef PICK_SPIN_LOCK
+
+#ifdef CONFIG_PREEMPT_RT
+static inline void jbd_lock_bh_journal_head(struct buffer_head *bh)
+{
+   spin_lock(&journal_head_lock);
 }

+static inline void jbd_unlock_bh_journal_head(struct buffer_head *bh)
+{
+   spin_unlock(&journal_head_lock);
+}
+#else /* !CONFIG_PREEMPT_RT */
 static inline void jbd_lock_bh_journal_head(struct buffer_head *bh)
 {
    bit_spin_lock(BH_JournalHead, &bh->b_state);
@@ -353,6 +372,7 @@
 {
    bit_spin_unlock(BH_JournalHead, &bh->b_state);
 }
+#endif /* CONFIG_PREEMPT_RT */

 struct jbd_revoke_table_s;

diff -ur linux-2.6.11-final-V0.7.40-00.orig/include/linux/spinlock.h 
linux-2.6.11-final-V0.7.40-00/include/linux/spinlock.h
--- linux-2.6.11-final-V0.7.40-00.orig/include/linux/spinlock.h 2005-03-14 
06:00:54.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/include/linux/spinlock.h  2005-03-15 
03:40:31.0 -0500
@@ -774,6 +774,10 @@
 }))


+#ifndef CONFIG_PREEMPT_RT
+
+/* These are just plain evil! */
+
 /*
  *  bit-based spin_lock()
  *
@@ -789,10 +793,15 @@
 * busywait with less bus contention for a good time to
 * attempt to acquire the lock bit.
 */
-#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
-   while 

Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-15 Thread Ingo Molnar

* Steven Rostedt <[EMAIL PROTECTED]> wrote:

> I've realized that my previous patch had too many problems with the
> way the journaling system works.  So I went back to my first approach
> but added the journal_head lock as one global lock to keep the buffer
> head size smaller. I only added the state lock to the buffer head.
> I've tested this for some time now, and it works well (for the test at
> least). I'll recompile it with PREEMPT_DESKTOP to see if that works
> too.

good progress - but the global lock may be a scalability worry on
upstream though. Would it be possible to just mirror much of the current
lock logic, but with spinlocks instead of bitlocks? And there should be
no #ifdefs on PREEMPT_RT.

Ingo


Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-15 Thread Steven Rostedt


On Tue, 15 Mar 2005, Ingo Molnar wrote:


> * Steven Rostedt <[EMAIL PROTECTED]> wrote:
>
> > I've realized that my previous patch had too many problems with the
> > way the journaling system works.  So I went back to my first approach
> > but added the journal_head lock as one global lock to keep the buffer
> > head size smaller. I only added the state lock to the buffer head.
> > I've tested this for some time now, and it works well (for the test at
> > least). I'll recompile it with PREEMPT_DESKTOP to see if that works
> > too.
>
> good progress - but the global lock may be a scalability worry on
> upstream though. Would it be possible to just mirror much of the current
> lock logic, but with spinlocks instead of bitlocks? And there should be
> no #ifdefs on PREEMPT_RT.


The first patch I had just converted the bit spinlocks to spinlocks but I
thought that adding two spinlocks was too much for every buffer head, even
if it wasn't in the ext3 file system. The journal head spinlock is just
used to add and remove the journal heads from the buffer heads, so I'm not
sure how much contention is on them. I only have a dual SMP system, so I
can't test the system on a large number of CPUs. What do you think, should
we sacrifice memory for speed?

What should we use instead of #ifdef PREEMPT_RT? Or should we just keep it
the same for both?  Since this fix is only to fix spinlocks that schedule,
I figured that it would be better not to waste the memory of those not
using PREEMPT_RT.  Should I use the opposite, PREEMPT_DESKTOP?

Thanks,

-- Steve



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-15 Thread Ingo Molnar

* Steven Rostedt <[EMAIL PROTECTED]> wrote:

> > good progress - but the global lock may be a scalability worry on
> > upstream though. Would it be possible to just mirror much of the current
> > lock logic, but with spinlocks instead of bitlocks? And there should be
> > no #ifdefs on PREEMPT_RT.
>
> The first patch I had just converted the bit spinlocks to spinlocks
> but I thought that adding two spinlocks was too much for every buffer
> head, even if it wasn't in the ext3 file system. The journal head
> spinlock is just used to add and remove the journal heads from the
> buffer heads, so I'm not sure how much contention is on them. I only
> have a dual SMP system, so I can't test the system on a large number of
> CPUs. What do you think, should we sacrifice memory for speed?

there are two bad effects of global spinlocks: 1) contention 2)
cacheline bouncing. It's #2 that would affect this spinlock. While i'm
not sure this would show up in usual benchmarks, we should rather err on
the side of more scalability. Two spinlocks are just two more machine
words on most architectures, so i don't think it matters all that much,
while it removes a major wart - as long as the two extra locks are for
ext3 buffer-heads only.

> What should we use instead of #ifdef PREEMPT_RT? Or should we just
> keep it the same for both.  Since this fix is only to fix spinlocks
> that schedule, I figured that it would be better not to waste the
> memory of those not using PREEMPT_RT.  Should I use the opposite
> PREEMPT_DESKTOP?

i'd go for removing bit-spinlocks altogether, in the upstream kernel. It
would simplify things, besides making PREEMPT_RT simpler as well. The
memory overhead is not a big issue i believe. (8 more bytes per ext3 bh,
on x86)

Ingo
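
As background for the discussion above, here is a hedged userspace sketch of what a bit spinlock does. It uses C11 atomics instead of the kernel's test_and_set_bit() and friends, and the `_sketch` names are hypothetical. The lock is a single bit inside a word that also carries other flags, so it costs no extra memory, but there is no lock object with an owner or a wait queue, which is part of why these locks are awkward once ordinary spinlocks are allowed to sleep.

```c
#include <assert.h>
#include <stdatomic.h>

/* Spin until the bit is acquired. The outer loop is the atomic
 * test-and-set; the inner loop only reads, which keeps bus traffic
 * down while the lock is contended. */
static void bit_spin_lock_sketch(int bitnum, atomic_ulong *addr)
{
	unsigned long mask = 1UL << bitnum;

	while (atomic_fetch_or_explicit(addr, mask, memory_order_acquire) & mask) {
		while (atomic_load_explicit(addr, memory_order_relaxed) & mask)
			;	/* busy-wait until the bit looks free */
	}
}

/* Returns 1 if the bit was free and is now held by the caller. */
static int bit_spin_trylock_sketch(int bitnum, atomic_ulong *addr)
{
	unsigned long mask = 1UL << bitnum;

	return !(atomic_fetch_or_explicit(addr, mask, memory_order_acquire) & mask);
}

static void bit_spin_unlock_sketch(int bitnum, atomic_ulong *addr)
{
	atomic_fetch_and_explicit(addr, ~(1UL << bitnum), memory_order_release);
}

int demo_bit_spin(void)
{
	atomic_ulong state = 0;
	int second, third;

	bit_spin_lock_sketch(3, &state);
	second = bit_spin_trylock_sketch(3, &state);	/* already held */
	bit_spin_unlock_sketch(3, &state);
	third = bit_spin_trylock_sketch(3, &state);	/* free again */
	bit_spin_unlock_sketch(3, &state);
	assert(atomic_load(&state) == 0);
	return second == 0 && third == 1;
}
```

Replacing the bit with a real spinlock, as suggested in this message, trades that zero-size lock for one lock word per use.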


Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-15 Thread Steven Rostedt


On Tue, 15 Mar 2005, Ingo Molnar wrote:


> * Steven Rostedt <[EMAIL PROTECTED]> wrote:
>
>
> > What should we use instead of #ifdef PREEMPT_RT? Or should we just
> > keep it the same for both.  Since this fix is only to fix spinlocks
> > that schedule, I figured that it would be better not to waste the
> > memory of those not using PREEMPT_RT.  Should I use the opposite
> > PREEMPT_DESKTOP?
>
> i'd go for removing bit-spinlocks altogether, in the upstream kernel. It
> would simplify things, besides making PREEMPT_RT simpler as well. The
> memory overhead is not a big issue i believe. (8 more bytes per ext3 bh,
> on x86)


The problem here is that it's not ext3 bh's only. They're still the normal
buffer head.  The problem arises because the ext3 journal head is
allocated within these bit spin locks. I tried to monkey with putting the
locks in the journal heads and have checks to see when to free them, but
it wasn't that simple. I started having problems with some of the freeing
transactions; I might have assumed too much.

I'll give it one more try to get it into the journal heads, but after
that, (if I fail) I'll let someone who understands the ext3 system better
handle this.

-- Steve



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-15 Thread Steven Rostedt


On Tue, 15 Mar 2005, Ingo Molnar wrote:

> i'd go for removing bit-spinlocks altogether, in the upstream kernel. It
> would simplify things, besides making PREEMPT_RT simpler as well. The
> memory overhead is not a big issue i believe. (8 more bytes per ext3 bh,
> on x86)


Hi Ingo,

Damn! The answer was right there in front of my eyes! Here's the cleanest
solution. I forgot about wait_on_bit_lock.  I've converted all the locks
to use this instead.  We probably need to get priority inheritance working
on this too someday, but for now it's better than wasting memory or
getting into deadlocks.

-- Steve
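
To illustrate the idea behind this message, here is a rough userspace sketch of a sleeping bit lock. It is only an analogy: the kernel's wait_on_bit_lock() works on atomic bit operations and hashed wait queues, whereas this sketch assumes a single pthread mutex and condition variable as the one wait queue, and all function names here are hypothetical. What it keeps is the behavior that matters for the thread: a contended waiter blocks instead of spinning, and unlock clears the bit and wakes the waiters.

```c
#include <assert.h>
#include <pthread.h>

/* A single wait queue standing in for the kernel's bit wait queues. */
static pthread_mutex_t wq_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t wq_cond = PTHREAD_COND_INITIALIZER;

/* Sleep until the bit is clear, then set it. */
static void sleeping_bit_lock(unsigned long *word, int bit)
{
	unsigned long mask = 1UL << bit;

	pthread_mutex_lock(&wq_mutex);
	while (*word & mask)
		pthread_cond_wait(&wq_cond, &wq_mutex);	/* blocks, no spinning */
	*word |= mask;
	pthread_mutex_unlock(&wq_mutex);
}

/* Returns 1 if the bit was free and is now held by the caller. */
static int sleeping_bit_trylock(unsigned long *word, int bit)
{
	unsigned long mask = 1UL << bit;
	int got;

	pthread_mutex_lock(&wq_mutex);
	got = !(*word & mask);
	*word |= mask;
	pthread_mutex_unlock(&wq_mutex);
	return got;
}

/* Clear the bit and wake everyone waiting on it. */
static void sleeping_bit_unlock(unsigned long *word, int bit)
{
	pthread_mutex_lock(&wq_mutex);
	*word &= ~(1UL << bit);
	pthread_cond_broadcast(&wq_cond);
	pthread_mutex_unlock(&wq_mutex);
}

int demo_sleeping_bit_lock(void)
{
	unsigned long state = 0;
	int contended, free_again;

	sleeping_bit_lock(&state, 2);
	contended = sleeping_bit_trylock(&state, 2);	/* held: fails */
	sleeping_bit_unlock(&state, 2);
	free_again = sleeping_bit_trylock(&state, 2);	/* free: succeeds */
	sleeping_bit_unlock(&state, 2);
	assert(state == 0);
	return contended == 0 && free_again == 1;
}
```

As in the message, nothing here boosts the priority of the lock holder, so a high-priority waiter can still be delayed by a low-priority owner.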

diff -ur linux-2.6.11-final-V0.7.40-00.orig/fs/jbd/journal.c linux-2.6.11-final-V0.7.40-00/fs/jbd/journal.c
--- linux-2.6.11-final-V0.7.40-00.orig/fs/jbd/journal.c 2005-03-02 02:37:49.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/fs/jbd/journal.c  2005-03-15 11:58:14.0 -0500
@@ -82,6 +82,17 @@

 static int journal_convert_superblock_v1(journal_t *, journal_superblock_t *);

+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+/*
+ * Used in the locking of the bh_state and bh_journalhead bit locks.
+ */
+int jbd_lock_bh_sleep(void *notused)
+{
+   schedule();
+   return 0;
+}
+#endif
+
 /*
  * Helper function used to manage commit timeouts
  */
diff -ur linux-2.6.11-final-V0.7.40-00.orig/include/linux/jbd.h linux-2.6.11-final-V0.7.40-00/include/linux/jbd.h
--- linux-2.6.11-final-V0.7.40-00.orig/include/linux/jbd.h  2005-03-02 02:38:19.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/include/linux/jbd.h   2005-03-15 11:58:40.0 -0500
@@ -324,34 +324,63 @@
    return bh->b_private;
 }

+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+int jbd_lock_bh_sleep(void *notused);
+#endif
+
 static inline void jbd_lock_bh_state(struct buffer_head *bh)
 {
-   bit_spin_lock(BH_State, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+   wait_on_bit_lock(&bh->b_state,BH_State,jbd_lock_bh_sleep,TASK_UNINTERRUPTIBLE);
+#endif
+   __acquire(bitlock);
 }

 static inline int jbd_trylock_bh_state(struct buffer_head *bh)
 {
-   return bit_spin_trylock(BH_State, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+   if (test_and_set_bit(BH_State, &bh->b_state))
+       return 0;
+#endif
+   __acquire(bitlock);
+   return 1;
 }

 static inline int jbd_is_locked_bh_state(struct buffer_head *bh)
 {
-   return bit_spin_is_locked(BH_State, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+   return test_bit(BH_State, &bh->b_state);
+#else
+   return 1;
+#endif
 }

 static inline void jbd_unlock_bh_state(struct buffer_head *bh)
 {
-   bit_spin_unlock(BH_State, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+   clear_bit(BH_State, &bh->b_state);
+   smp_mb__after_clear_bit();
+   wake_up_bit(&bh->b_state, BH_State);
+#endif
+   __release(bitlock);
 }

 static inline void jbd_lock_bh_journal_head(struct buffer_head *bh)
 {
-   bit_spin_lock(BH_JournalHead, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+   wait_on_bit_lock(&bh->b_state,BH_JournalHead,jbd_lock_bh_sleep,TASK_UNINTERRUPTIBLE);
+#endif
+   __acquire(bitlock);
 }

 static inline void jbd_unlock_bh_journal_head(struct buffer_head *bh)
 {
-   bit_spin_unlock(BH_JournalHead, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+   clear_bit(BH_JournalHead, &bh->b_state);
+   smp_mb__after_clear_bit();
+   wake_up_bit(&bh->b_state, BH_JournalHead);
+#endif
+   __release(bitlock);
 }

 struct jbd_revoke_table_s;
diff -ur linux-2.6.11-final-V0.7.40-00.orig/include/linux/spinlock.h linux-2.6.11-final-V0.7.40-00/include/linux/spinlock.h
--- linux-2.6.11-final-V0.7.40-00.orig/include/linux/spinlock.h 2005-03-14 06:00:54.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/include/linux/spinlock.h  2005-03-15 12:19:11.032217736 -0500
@@ -774,67 +774,6 @@
 }))


-/*
- *  bit-based spin_lock()
- *
- * Don't use this unless you really need to: spin_lock() and spin_unlock()
- * are significantly faster.
- */
-static inline void bit_spin_lock(int bitnum, unsigned long *addr)
-{
-   /*
-* Assuming the lock is uncontended, this never enters
-* the body of the outer loop. If it is contended, then
-* within the inner loop a non-atomic test is used to
-* busywait with less bus contention for a good time to
-* attempt to acquire the lock bit.
-*/
-#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
-   while (test_and_set_bit(bitnum, addr))
-   while (test_bit(bitnum, addr))
-  

Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-15 Thread Andrew Morton
Steven Rostedt <[EMAIL PROTECTED]> wrote:
>
> The problem here is that it's not ext3 bh's only. They're still the normal
> buffer head.  The problem arises because the ext3 journal head is
> allocated within these bit spin locks.

Yes, the locks do want to live inside the buffer_head.

Stephen has pointed out that we might want to remove
jbd_lock_bh_journal_head() altogether some time, just use
jbd_lock_bh_state() for that.

In 2.4 these locks are global (or per-superblock).  Making them a global
spinlock would be acceptable for 2-ways and probably larger.



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-15 Thread Lee Revell
On Tue, 2005-03-15 at 13:05 -0500, Steven Rostedt wrote:
> Damn! The answer was right there in front of my eyes! Here's the cleanest
> solution. I forgot about wait_on_bit_lock.  I've converted all the locks
> to use this instead.  We probably need to get priority inheritance working
> on this too someday, but for now it's better than wasting memory or
> getting into deadlocks.
>

I am still not clear on why this did not hit with earlier kernels +
PREEMPT_DESKTOP.  Were the bitlocks introduced recently?  Or was another
lock-break patch dropped?

Lee



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-15 Thread Steven Rostedt


On Tue, 15 Mar 2005, Steven Rostedt wrote:



> On Tue, 15 Mar 2005, Ingo Molnar wrote:
>
> > i'd go for removing bit-spinlocks altogether, in the upstream kernel. It
> > would simplify things, besides making PREEMPT_RT simpler as well. The
> > memory overhead is not a big issue i believe. (8 more bytes per ext3 bh,
> > on x86)
>
>
> Hi Ingo,
>
> Damn! The answer was right there in front of my eyes! Here's the cleanest
> solution. I forgot about wait_on_bit_lock.  I've converted all the locks
> to use this instead.  We probably need to get priority inheritance working
> on this too someday, but for now it's better than wasting memory or
> getting into deadlocks.


One bit of caution on these. If we don't have PREEMPT_RT, then don't the
spinlocks on SMP act the same as normal spinlocks, so that we should not
schedule while holding one? I believe that some of these locks are taken
while holding spin_locks, so this isn't the right solution for anything
other than PREEMPT_RT. I also forgot to add might_sleep in the locking
calls. Here's the patch with the might_sleep added.  What should we do for
non-PREEMPT_RT?  Maybe put the bit_spinlocks back in for that case?

-- Steve
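
The concern in this message can be shown with a small userspace sketch of what a might_sleep()-style check is for. This is an assumption-laden toy, not the kernel's implementation: it keeps a hypothetical per-thread count of held spinlocks and merely counts violations, where the real kernel check is a debugging aid that complains when a function that may block is entered from atomic context. All names ending in `_sketch` are made up for illustration.

```c
#include <assert.h>

/* Hypothetical per-thread bookkeeping: how many spinning locks are held. */
static _Thread_local int atomic_depth;
static _Thread_local int sleep_violations;

static void spin_lock_sketch(void)   { atomic_depth++; }
static void spin_unlock_sketch(void) { atomic_depth--; }

/* Record a violation if a blocking call is reached with a spinlock held. */
static void might_sleep_sketch(void)
{
	if (atomic_depth > 0)
		sleep_violations++;
}

/* Stand-in for a sleeping lock such as the converted jbd_lock_bh_state(). */
static void sleeping_lock_sketch(void)
{
	might_sleep_sketch();
	/* a real implementation could block here */
}

int demo_might_sleep(void)
{
	sleep_violations = 0;

	sleeping_lock_sketch();		/* no spinlock held: fine */

	spin_lock_sketch();
	sleeping_lock_sketch();		/* spinlock held: flagged */
	spin_unlock_sketch();

	assert(atomic_depth == 0);
	return sleep_violations;
}
```

On a configuration where spinlocks really spin, the second call is the pattern the message warns about: the sleeping lock may schedule while a spinlock is held.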

diff -ur linux-2.6.11-final-V0.7.40-00.orig/fs/jbd/journal.c linux-2.6.11-final-V0.7.40-00/fs/jbd/journal.c
--- linux-2.6.11-final-V0.7.40-00.orig/fs/jbd/journal.c 2005-03-02 02:37:49.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/fs/jbd/journal.c  2005-03-15 11:58:14.0 -0500
@@ -82,6 +82,17 @@

 static int journal_convert_superblock_v1(journal_t *, journal_superblock_t *);

+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+/*
+ * Used in the locking of the bh_state and bh_journalhead bit locks.
+ */
+int jbd_lock_bh_sleep(void *notused)
+{
+   schedule();
+   return 0;
+}
+#endif
+
 /*
  * Helper function used to manage commit timeouts
  */
diff -ur linux-2.6.11-final-V0.7.40-00.orig/include/linux/jbd.h linux-2.6.11-final-V0.7.40-00/include/linux/jbd.h
--- linux-2.6.11-final-V0.7.40-00.orig/include/linux/jbd.h  2005-03-02 02:38:19.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/include/linux/jbd.h   2005-03-16 02:25:31.881251828 -0500
@@ -324,34 +324,65 @@
    return bh->b_private;
 }

+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+int jbd_lock_bh_sleep(void *notused);
+#endif
+
 static inline void jbd_lock_bh_state(struct buffer_head *bh)
 {
-   bit_spin_lock(BH_State, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+   might_sleep();
+   wait_on_bit_lock(&bh->b_state,BH_State,jbd_lock_bh_sleep,TASK_UNINTERRUPTIBLE);
+#endif
+   __acquire(bitlock);
 }

 static inline int jbd_trylock_bh_state(struct buffer_head *bh)
 {
-   return bit_spin_trylock(BH_State, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+   if (test_and_set_bit(BH_State, &bh->b_state))
+       return 0;
+#endif
+   __acquire(bitlock);
+   return 1;
 }

 static inline int jbd_is_locked_bh_state(struct buffer_head *bh)
 {
-   return bit_spin_is_locked(BH_State, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+   return test_bit(BH_State, &bh->b_state);
+#else
+   return 1;
+#endif
 }

 static inline void jbd_unlock_bh_state(struct buffer_head *bh)
 {
-   bit_spin_unlock(BH_State, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+   clear_bit(BH_State, &bh->b_state);
+   smp_mb__after_clear_bit();
+   wake_up_bit(&bh->b_state, BH_State);
+#endif
+   __release(bitlock);
 }

 static inline void jbd_lock_bh_journal_head(struct buffer_head *bh)
 {
-   bit_spin_lock(BH_JournalHead, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+   might_sleep();
+   wait_on_bit_lock(&bh->b_state,BH_JournalHead,jbd_lock_bh_sleep,TASK_UNINTERRUPTIBLE);
+#endif
+   __acquire(bitlock);
 }

 static inline void jbd_unlock_bh_journal_head(struct buffer_head *bh)
 {
-   bit_spin_unlock(BH_JournalHead, &bh->b_state);
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+   clear_bit(BH_JournalHead, &bh->b_state);
+   smp_mb__after_clear_bit();
+   wake_up_bit(&bh->b_state, BH_JournalHead);
+#endif
+   __release(bitlock);
 }

 struct jbd_revoke_table_s;
diff -ur linux-2.6.11-final-V0.7.40-00.orig/include/linux/spinlock.h linux-2.6.11-final-V0.7.40-00/include/linux/spinlock.h
--- linux-2.6.11-final-V0.7.40-00.orig/include/linux/spinlock.h 2005-03-14 06:00:54.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/include/linux/spinlock.h  2005-03-15 12:19:11.0 -0500
@@ -774,67 +774,6 @@
 }))


-/*
- *  bit-based spin_lock()
- *
- * Don't use this unless you really need to: spin_lock() and spin_unlock()
- * are significantly 

Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-15 Thread Steven Rostedt


On Tue, 15 Mar 2005, Lee Revell wrote:

> On Tue, 2005-03-15 at 13:05 -0500, Steven Rostedt wrote:
> > Damn! The answer was right there in front of my eyes! Here's the cleanest
> > solution. I forgot about wait_on_bit_lock.  I've converted all the locks
> > to use this instead.  We probably need to get priority inheritance working
> > on this too someday, but for now it's better than wasting memory or
> > getting into deadlocks.
> >
>
> I am still not clear on why this did not hit with earlier kernels +
> PREEMPT_DESKTOP.  Were the bitlocks introduced recently?  Or was another
> lock-break patch dropped?


When did you start seeing this? This code has been there as far back as
2.6.7 (the earliest 2.6 kernel I still have lying around) and as far
back as Ingo's realtime-preempt-2.6.9-mm1-U10. Maybe the tracing didn't
start picking this up till later, or you were just lucky that no
contention was happening on that lock.

-- Steve



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-14 Thread Steven Rostedt

Hi Ingo,

I've found something that is very interesting and I can't explain it.


On Mon, 14 Mar 2005, Steven Rostedt wrote:
>
>
> On Mon, 14 Mar 2005, Steven Rostedt wrote:
> >
> > On Mon, 14 Mar 2005, Steven Rostedt wrote:
> > >
> > > I just downloaded -40 and applied my patch, compiled it with
> > > PREEMPT_DESKTOP and data=ordered, ran it and everything seems OK, except
> > > I'm getting the following...
> > >
> > > BUG: Unable to handle kernel NULL pointer dereference at virtual address
> > > 
> > >  printing eip:
> > > c0213438
> > > *pde = 
> >
> > [snip]
> >
> > >

All I did now was to add this patch to your -40-00 kernel:

diff -ur linux-2.6.11-final-V0.7.40-00.orig/include/linux/jbd.h 
linux-2.6.11-final-V0.7.40-00/include/linux/jbd.h
--- linux-2.6.11-final-V0.7.40-00.orig/include/linux/jbd.h  2005-03-02 
02:38:19.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/include/linux/jbd.h   2005-03-14 
13:22:04.0 -0500
@@ -324,6 +324,8 @@
return bh->b_private;
 }

+BUFFER_FNS(JournalHead,journalhead)
+
 static inline void jbd_lock_bh_state(struct buffer_head *bh)
 {
    bit_spin_lock(BH_State, &bh->b_state);



And I get the following output:

BUG: Unable to handle kernel NULL pointer dereference at virtual address

 printing eip:
c0213118
*pde = 
Oops:  [#1]
Modules linked in: ipv6 af_packet tsdev mousedev evdev floppy psmouse
pcspkr snd_intel8x0 snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm
snd_timer snd soundcore snd_page_alloc shpchp pci_hotplug ehci_hcd
intel_agp agpgart uhci_hcd usbcore e100 mii ide_cd cdrom unix
CPU:0
EIP:0060:[]Not tainted VLI
EFLAGS: 00010286   (2.6.11-RT-V0.7.40-00)
EIP is at vt_ioctl+0x18/0x1ab0
eax:    ebx: 5603   ecx: 5603   edx: cee14d80
esi: c0213100   edi: cb4bd000   ebp: cc03bf18   esp: cc03be48
ds: 007b   es: 007b   ss: 0068   preempt: 
Process XFree86 (pid: 4709, threadinfo=cc03a000 task=cf0d5020)
Stack: cf0d5170 cc03a000 cf0d5020 c03448ec cf0d5020 0246 cc03be7c
c0117267
   c03448f4 0006 0001   cc03bebc cf1b81ec
ce820600
   ce94a9b8   cc03bed4 c01704f1 ce94a9b8 0007

Call Trace:
 [] show_stack+0x7f/0xa0 (28)
 [] show_registers+0x165/0x1d0 (56)
 [] die+0xc8/0x150 (64)
 [] do_page_fault+0x356/0x6c4 (216)
 [] error_code+0x2b/0x30 (268)
 [] tty_ioctl+0x34b/0x490 (52)
 [] do_ioctl+0x4f/0x70 (32)
 [] vfs_ioctl+0x62/0x1d0 (40)
 [] sys_ioctl+0x61/0x90 (40)
 [] syscall_call+0x7/0xb (-8124)
Code: ff ff 8d 05 28 4d 34 c0 e8 f6 60 0a 00 e9 3a ff ff ff 90 55 89 e5 57
56 53 81 ec c4 00 00 00 8b 7d 08 8b 5d 10 8b 87 7c 09 00 00 <8b> 30 89 34
24 8b 04 b5 e0 b7 3c c0 89 45 8c e8 a4 6a 00 00 85



I don't know why. BUFFER_FNS is just defined as:

#define BUFFER_FNS(bit, name)   \
static inline void set_buffer_##name(struct buffer_head *bh)\
{   \
set_bit(BH_##bit, &(bh)->b_state);  \
}   \
static inline void clear_buffer_##name(struct buffer_head *bh)  \
{   \
clear_bit(BH_##bit, &(bh)->b_state);\
}   \
static inline int buffer_##name(const struct buffer_head *bh)   \
{   \
return test_bit(BH_##bit, &(bh)->b_state);  \
}

So all it does is make three functions that are never used.

set_buffer_journalhead(...)
clear_buffer_journalhead(...)
buffer_journalhead(...)
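
To make the expansion concrete, here is a small self-contained userspace sketch of what BUFFER_FNS(JournalHead,journalhead) generates. The struct, the enum values, and the non-atomic set_bit/clear_bit/test_bit helpers are simplified stand-ins assumed for illustration; the kernel versions are atomic and the real struct buffer_head has many more fields.

```c
#include <assert.h>

/* Simplified stand-ins; bit numbers here are arbitrary, not the kernel's. */
enum bh_state_bits_sketch { BH_State = 0, BH_JournalHead = 1 };

struct buffer_head {
	unsigned long b_state;
};

/* Non-atomic userspace replacements for the kernel bit operations. */
static inline void set_bit(int nr, unsigned long *addr)
{
	*addr |= 1UL << nr;
}

static inline void clear_bit(int nr, unsigned long *addr)
{
	*addr &= ~(1UL << nr);
}

static inline int test_bit(int nr, const unsigned long *addr)
{
	return (int)((*addr >> nr) & 1UL);
}

/* Same shape as the macro quoted in the message. */
#define BUFFER_FNS(bit, name)						\
static inline void set_buffer_##name(struct buffer_head *bh)		\
{									\
	set_bit(BH_##bit, &(bh)->b_state);				\
}									\
static inline void clear_buffer_##name(struct buffer_head *bh)		\
{									\
	clear_bit(BH_##bit, &(bh)->b_state);				\
}									\
static inline int buffer_##name(const struct buffer_head *bh)		\
{									\
	return test_bit(BH_##bit, &(bh)->b_state);			\
}

/* Expands to set_buffer_journalhead(), clear_buffer_journalhead()
 * and buffer_journalhead(). */
BUFFER_FNS(JournalHead, journalhead)

int demo_buffer_fns(void)
{
	struct buffer_head bh = { 0 };
	int was_set;

	set_buffer_journalhead(&bh);
	was_set = buffer_journalhead(&bh);
	clear_buffer_journalhead(&bh);
	assert(bh.b_state == 0);
	return was_set && !buffer_journalhead(&bh);
}
```

Since the three generated functions are static inline, a translation unit that never calls them should not end up with any code for them, which is what makes the reported oops so surprising.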

Unless some macro uses them, that is. But I don't know why adding that
line causes the bug output that I showed.  If I remove that line, I don't
get that output.  And this is consistent. I've recompiled the kernel several
times, and every time I compile it with this added patch I get that output.
And every time without it, it runs fine.

Oh, please note that this only happens with PREEMPT_DESKTOP, and not with
PREEMPT_RT.

I really think this is a symptom of something else and not the cause of
the bug. What do you think?


-- Steve



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-14 Thread Steven Rostedt


On Mon, 14 Mar 2005, Steven Rostedt wrote:
>
> On Mon, 14 Mar 2005, Steven Rostedt wrote:
> >
> > I just downloaded -40 and applied my patch, compiled it with
> > PREEMPT_DESKTOP and data=ordered, ran it and everything seems OK, except
> > I'm getting the following...
> >
> > BUG: Unable to handle kernel NULL pointer dereference at virtual address
> > 
> >  printing eip:
> > c0213438
> > *pde = 
>
> [snip]
>
> >
> >
> > I'll see if this happens without the patch, and if so, then I'll look into
> > this further.
> >
>
> Well, I took out my patch and this bug didn't happen, so I guess it's my
> fault!  OK, I'll dig into it further.
>

Here's a new patch. All I did was move BUFFER_FNS(JournalHead,journalhead)
to inside the #ifdef CONFIG_PREEMPT_RT and my oops went away !?!  This
really bothers me since it just declares some functions and is not used
with CONFIG_PREEMPT_RT off.  I have no idea what's going on.

Lee, can you see if this still crashes for you?


Thanks,

-- Steve


diff -ur linux-2.6.11-final-V0.7.40-00.orig/fs/jbd/journal.c 
linux-2.6.11-final-V0.7.40-00/fs/jbd/journal.c
--- linux-2.6.11-final-V0.7.40-00.orig/fs/jbd/journal.c 2005-03-02 
02:37:49.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/fs/jbd/journal.c  2005-03-14 
09:46:41.0 -0500
@@ -80,6 +80,10 @@
 EXPORT_SYMBOL(journal_try_to_free_buffers);
 EXPORT_SYMBOL(journal_force_commit);

+#ifdef CONFIG_PREEMPT_RT
+spinlock_t jbd_journal_head_lock = SPIN_LOCK_UNLOCKED;
+#endif
+
 static int journal_convert_superblock_v1(journal_t *, journal_superblock_t *);

 /*
@@ -1727,6 +1731,9 @@
jh = new_jh;
new_jh = NULL;  /* We consumed it */
set_buffer_jbd(bh);
+#ifdef CONFIG_PREEMPT_RT
+   spin_lock_init(>b_state_lock);
+#endif
bh->b_private = jh;
jh->b_bh = bh;
get_bh(bh);
@@ -1767,26 +1774,34 @@
if (jh->b_transaction == NULL &&
jh->b_next_transaction == NULL &&
jh->b_cp_transaction == NULL) {
-   J_ASSERT_BH(bh, buffer_jbd(bh));
-   J_ASSERT_BH(bh, jh2bh(jh) == bh);
-   BUFFER_TRACE(bh, "remove journal_head");
-   if (jh->b_frozen_data) {
-   printk(KERN_WARNING "%s: freeing "
-   "b_frozen_data\n",
-   __FUNCTION__);
-   kfree(jh->b_frozen_data);
-   }
-   if (jh->b_committed_data) {
-   printk(KERN_WARNING "%s: freeing "
-   "b_committed_data\n",
-   __FUNCTION__);
-   kfree(jh->b_committed_data);
+#ifdef CONFIG_PREEMPT_RT
+   if (atomic_read(>b_state_wait_count)) {
+   BUG_ON(buffer_journalhead(bh));
+   set_buffer_journalhead(bh);
+   } else
+#endif
+   {
+   J_ASSERT_BH(bh, buffer_jbd(bh));
+   J_ASSERT_BH(bh, jh2bh(jh) == bh);
+   BUFFER_TRACE(bh, "remove journal_head");
+   if (jh->b_frozen_data) {
+   printk(KERN_WARNING "%s: freeing "
+  "b_frozen_data\n",
+  __FUNCTION__);
+   kfree(jh->b_frozen_data);
+   }
+   if (jh->b_committed_data) {
+   printk(KERN_WARNING "%s: freeing "
+  "b_committed_data\n",
+  __FUNCTION__);
+   kfree(jh->b_committed_data);
+   }
+   bh->b_private = NULL;
+   jh->b_bh = NULL;/* debug, really */
+   clear_buffer_jbd(bh);
+   __brelse(bh);
+   journal_free_journal_head(jh);
}
-   bh->b_private = NULL;
-   jh->b_bh = NULL;/* debug, really */
-   clear_buffer_jbd(bh);
-   __brelse(bh);
-   journal_free_journal_head(jh);
} else {
BUFFER_TRACE(bh, "journal_head was locked");
}
diff -ur linux-2.6.11-final-V0.7.40-00.orig/fs/jbd/transaction.c 
linux-2.6.11-final-V0.7.40-00/fs/jbd/transaction.c
--- 

Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-14 Thread Steven Rostedt


On Mon, 14 Mar 2005, Steven Rostedt wrote:
>
> I just downloaded -40 and applied my patch, compiled it with
> PREEMPT_DESKTOP and data=ordered, ran it and everything seems OK, except
> I'm getting the following...
>
> BUG: Unable to handle kernel NULL pointer dereference at virtual address
> 
>  printing eip:
> c0213438
> *pde = 

[snip]

>
>
> I'll see if this happens without the patch, and if so, then I'll look into
> this further.
>

Well, I took out my patch and this bug didn't happen, so I guess it's my
fault!  OK, I'll dig into it further.

-- Steve



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-14 Thread Steven Rostedt


On Mon, 14 Mar 2005, Steven Rostedt wrote:

>
> > > I'll test this with PREEMPT_DESKTOP and data=ordered also and see how it
> > > goes.
> >
> > Does not seem to work at all with the above settings.  It seemed OK
> > until I started X.  Then every time I launched an xterm it would
> > disappear as soon as I typed anything.  I could not switch consoles to
> > see the Oops.
> >
>
> Hi Lee,
>
> I just compiled PREEMPT_DESKTOP and mounted root (only disk filesystem on
> my test machine) as data=ordered.  I had no problem getting to X, starting
> an xterm and running a make. Actually it was a gnome-term since I didn't
> have xterm. But then I su to root, apt-get xterm, ran xterm, and did a
> make there with no problems.
>
> Did you patch this against 39-02 or -40-X?
>
> I haven't had time to upgrade to 40 yet.  Maybe, I'll work on that today.
>

I just downloaded -40 and applied my patch, compiled it with
PREEMPT_DESKTOP and data=ordered, ran it and everything seems OK, except
I'm getting the following...

BUG: Unable to handle kernel NULL pointer dereference at virtual address

 printing eip:
c0213438
*pde = 
Oops:  [#1]
Modules linked in: ipv6 af_packet tsdev mousedev evdev floppy psmouse
pcspkr snd_intel8x0 snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm
snd_timer snd soundcore snd_page_alloc shpchp pci_hotplug ehci_hcd
intel_agp agpgart uhci_hcd usbcore e100 mii ide_cd cdrom unix
CPU:0
EIP:    0060:[<c0213438>]    Not tainted VLI
EFLAGS: 00010286   (2.6.11-RT-V0.7.40-00)
EIP is at vt_ioctl+0x18/0x1ab0
eax:    ebx: 5603   ecx: 5603   edx: cb6c8780
esi: c0213420   edi: cc956000   ebp: cb613f18   esp: cb613e48
ds: 007b   es: 007b   ss: 0068   preempt: 
Process XFree86 (pid: 4713, threadinfo=cb612000 task=cb5e0a40)
Stack: cb5e0b90 cb612000 cb5e0a40 c034494c cb5e0a40 0246 cb613e7c c0117217
   c0344954 0006 0001   cb613ebc ce0cce24 c13e1800
   cf1279b8   cb613ed4 c01707f1 cf1279b8 0007

Call Trace:
 [<c0103cdf>] show_stack+0x7f/0xa0 (28)
 [<c0103e95>] show_registers+0x165/0x1d0 (56)
 [<c0104088>] die+0xc8/0x150 (64)
 [<c0115376>] do_page_fault+0x356/0x6c4 (216)
 [<c0103973>] error_code+0x2b/0x30 (268)
 [<c020e91b>] tty_ioctl+0x34b/0x490 (52)
 [<c016837f>] do_ioctl+0x4f/0x70 (32)
 [<c0168582>] vfs_ioctl+0x62/0x1d0 (40)
 [<c0168751>] sys_ioctl+0x61/0x90 (40)
 [<c0102ec3>] syscall_call+0x7/0xb (-8124)
Code: ff ff 8d 05 88 4d 34 c0 e8 f6 60 0a 00 e9 3a ff ff ff 90 55 89 e5 57
56 53 81 ec c4 00 00 00 8b 7d 08 8b 5d 10 8b 87 7c 09 00 00 <8b> 30 89 34
24 8b 04 b5 e0 b7 3c c0 89 45 8c e8 a4 6a 00 00 85


I'll see if this happens without the patch, and if so, then I'll look into
this further.

Thanks,

-- Steve



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-14 Thread Steven Rostedt

Hi Ingo,

I've found something that is very interesting and I can't explain it.


On Mon, 14 Mar 2005, Steven Rostedt wrote:

>
> On Mon, 14 Mar 2005, Steven Rostedt wrote:
> >
> > On Mon, 14 Mar 2005, Steven Rostedt wrote:
> > >
> > > I just downloaded -40 and applied my patch, compiled it with
> > > PREEMPT_DESKTOP and data=ordered, ran it and everything seems OK, except
> > > I'm getting the following...
> > >
> > > BUG: Unable to handle kernel NULL pointer dereference at virtual address
> > >
> > >  printing eip:
> > > c0213438
> > > *pde = 
> >
> > [snip]
> >

All I did now was to add this patch to your -40-00 kernel:

diff -ur linux-2.6.11-final-V0.7.40-00.orig/include/linux/jbd.h linux-2.6.11-final-V0.7.40-00/include/linux/jbd.h
--- linux-2.6.11-final-V0.7.40-00.orig/include/linux/jbd.h  2005-03-02 02:38:19.0 -0500
+++ linux-2.6.11-final-V0.7.40-00/include/linux/jbd.h   2005-03-14 13:22:04.0 -0500
@@ -324,6 +324,8 @@
return bh->b_private;
 }

+BUFFER_FNS(JournalHead,journalhead)
+
 static inline void jbd_lock_bh_state(struct buffer_head *bh)
 {
bit_spin_lock(BH_State, &bh->b_state);



And I get the following output:

BUG: Unable to handle kernel NULL pointer dereference at virtual address

 printing eip:
c0213118
*pde = 
Oops:  [#1]
Modules linked in: ipv6 af_packet tsdev mousedev evdev floppy psmouse
pcspkr snd_intel8x0 snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm
snd_timer snd soundcore snd_page_alloc shpchp pci_hotplug ehci_hcd
intel_agp agpgart uhci_hcd usbcore e100 mii ide_cd cdrom unix
CPU:0
EIP:    0060:[<c0213118>]    Not tainted VLI
EFLAGS: 00010286   (2.6.11-RT-V0.7.40-00)
EIP is at vt_ioctl+0x18/0x1ab0
eax:    ebx: 5603   ecx: 5603   edx: cee14d80
esi: c0213100   edi: cb4bd000   ebp: cc03bf18   esp: cc03be48
ds: 007b   es: 007b   ss: 0068   preempt: 
Process XFree86 (pid: 4709, threadinfo=cc03a000 task=cf0d5020)
Stack: cf0d5170 cc03a000 cf0d5020 c03448ec cf0d5020 0246 cc03be7c c0117267
   c03448f4 0006 0001   cc03bebc cf1b81ec ce820600
   ce94a9b8   cc03bed4 c01704f1 ce94a9b8 0007

Call Trace:
 [<c0103cdf>] show_stack+0x7f/0xa0 (28)
 [<c0103e95>] show_registers+0x165/0x1d0 (56)
 [<c0104088>] die+0xc8/0x150 (64)
 [<c01153c6>] do_page_fault+0x356/0x6c4 (216)
 [<c0103973>] error_code+0x2b/0x30 (268)
 [<c020e5fb>] tty_ioctl+0x34b/0x490 (52)
 [<c016807f>] do_ioctl+0x4f/0x70 (32)
 [<c0168282>] vfs_ioctl+0x62/0x1d0 (40)
 [<c0168451>] sys_ioctl+0x61/0x90 (40)
 [<c0102ec3>] syscall_call+0x7/0xb (-8124)
Code: ff ff 8d 05 28 4d 34 c0 e8 f6 60 0a 00 e9 3a ff ff ff 90 55 89 e5 57
56 53 81 ec c4 00 00 00 8b 7d 08 8b 5d 10 8b 87 7c 09 00 00 8b 30 89 34
24 8b 04 b5 e0 b7 3c c0 89 45 8c e8 a4 6a 00 00 85



I don't know why. BUFFER_FNS is just defined as:

#define BUFFER_FNS(bit, name)                                           \
static inline void set_buffer_##name(struct buffer_head *bh)            \
{                                                                       \
        set_bit(BH_##bit, &(bh)->b_state);                              \
}                                                                       \
static inline void clear_buffer_##name(struct buffer_head *bh)          \
{                                                                       \
        clear_bit(BH_##bit, &(bh)->b_state);                            \
}                                                                       \
static inline int buffer_##name(const struct buffer_head *bh)           \
{                                                                       \
        return test_bit(BH_##bit, &(bh)->b_state);                      \
}

So all it does is make three functions that are never used:

set_buffer_journalhead(...)
clear_buffer_journalhead(...)
buffer_journalhead(...)

Unless some macro uses them, I don't know why adding that line causes
the bug output that I showed.  If I remove that line, I don't get that
output.  And this is consistent. I've recompiled the kernel several
times, and every time I compile it with this added patch I get that output.
And every time without it, it runs fine.

Oh, please note that this only happens with PREEMPT_DESKTOP, and not with
PREEMPT_RT.

I really think this is a symptom of something else and not the cause of
the bug. What do you think?


-- Steve



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-13 Thread Steven Rostedt

On Fri, 11 Mar 2005, Lee Revell wrote:

> On Fri, 2005-03-11 at 15:46 -0500, Lee Revell wrote:
> > On Fri, 2005-03-11 at 15:39 -0500, Steven Rostedt wrote:
> > > I'm leaving now for the weekend, so I won't be able to respond to anyone
> > > till Monday.  I'll also run this patch over the weekend while compiling
> > > the kernel in an endless loop
> >
> > I'll test this with PREEMPT_DESKTOP and data=ordered also and see how it
> > goes.
>
> Does not seem to work at all with the above settings.  It seemed OK
> until I started X.  Then every time I launched an xterm it would
> disappear as soon as I typed anything.  I could not switch consoles to
> see the Oops.
>

Hi Lee,

I just compiled PREEMPT_DESKTOP and mounted root (only disk filesystem on
my test machine) as data=ordered.  I had no problem getting to X, starting
an xterm and running a make. Actually it was a gnome-term since I didn't
have xterm. But then I su to root, apt-get xterm, ran xterm, and did a
make there with no problems.

Did you patch this against 39-02 or -40-X?

I haven't had time to upgrade to 40 yet.  Maybe, I'll work on that today.

Maybe your crash has to do with something else.  My test machine has a
serial hookup that I can look at even if the term goes down. I'll see if
40 gives me problems.

-- Steve


Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-11 Thread Lee Revell
On Fri, 2005-03-11 at 15:46 -0500, Lee Revell wrote:
> On Fri, 2005-03-11 at 15:39 -0500, Steven Rostedt wrote:
> > I'm leaving now for the weekend, so I won't be able to respond to anyone
> > till Monday.  I'll also run this patch over the weekend while compiling
> > the kernel in an endless loop
> 
> I'll test this with PREEMPT_DESKTOP and data=ordered also and see how it
> goes.

Does not seem to work at all with the above settings.  It seemed OK
until I started X.  Then every time I launched an xterm it would
disappear as soon as I typed anything.  I could not switch consoles to
see the Oops.

Lee



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-11 Thread Lee Revell
On Fri, 2005-03-11 at 15:39 -0500, Steven Rostedt wrote:
> I'm leaving now for the weekend, so I won't be able to respond to anyone
> till Monday.  I'll also run this patch over the weekend while compiling
> the kernel in an endless loop

I'll test this with PREEMPT_DESKTOP and data=ordered also and see how it
goes.

Lee



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-11 Thread Steven Rostedt


On Fri, 11 Mar 2005, Ingo Molnar wrote:

>
> * Steven Rostedt <[EMAIL PROTECTED]> wrote:
>
> > Here's the patch. It's probably more of an overkill wrt buffer heads,
> > but it seems to be the easiest solution.
>
> isnt there some ext3-private journal structure (journal-bh) linked off
> the bh? If the lock is in that structure then the overhead would only
> affect ext3.
>

OK, here it is (Yuck!).  I was able to use the journal head (private data
of the buffer head) for the state lock.  I just decided to have the
journal head lock be one global lock for all buffer heads, since it is
used to add and remove the journal private data from the buffer head, and
thus can't be stored in the journal private data.

The state lock is now in the journal private data but we must be careful
not to free this data before we unlock it. So here's what I've done.

  static inline void jbd_lock_bh_state(struct buffer_head *bh)
  {
        BUG_ON(!bh->b_private);
        atomic_inc(&bh2jh(bh)->b_state_wait_count);
        spin_lock(&bh2jh(bh)->b_state_lock);
  }

I have a counter of those that want/have the lock, and this informs the
journal_remove_journal_head that it should not free the jh.

  static void __journal_remove_journal_head(struct buffer_head *bh)
  {
        struct journal_head *jh = bh2jh(bh);

        J_ASSERT_JH(jh, jh->b_jcount >= 0);

        get_bh(bh);
        if (jh->b_jcount == 0) {
                if (jh->b_transaction == NULL &&
                    jh->b_next_transaction == NULL &&
                    jh->b_cp_transaction == NULL) {
  #ifdef CONFIG_PREEMPT_RT
                        if (atomic_read(&jh->b_state_wait_count)) {
                                BUG_ON(buffer_journalhead(bh));
                                set_buffer_journalhead(bh);
                        } else
  #endif
                        {


Here the state_wait_count is checked, and if it is > 0, the bit that was
originally used for locking the journal head is set to tell the unlock of
the state lock that the journal head needs to be removed.

  static inline void jbd_unlock_bh_state(struct buffer_head *bh)
  {
        int rmjh = 0;

        BUG_ON(!atomic_read(&bh2jh(bh)->b_state_wait_count));
        atomic_dec(&bh2jh(bh)->b_state_wait_count);

        if (buffer_journalhead(bh)) {
                clear_buffer_journalhead(bh);
                rmjh = 1;
        }

        spin_unlock(&bh2jh(bh)->b_state_lock);

        if (rmjh)
                journal_remove_journal_head(bh);
  }

Now in the unlocking of the state lock, the journal head bit is tested and
if it is set, then the remove journal head function is called.


Maybe this isn't the cleanest solution, but it keeps the overhead on the
buffer heads down, so it's preferred over my last patch.

Once again, this has only been tested with full preemption enabled, but I
tried to keep it from changing the way non PREEMPT_RT works.

I'm leaving now for the weekend, so I won't be able to respond to anyone
till Monday.  I'll also run this patch over the weekend while compiling
the kernel in an endless loop

 while [ 1 ]; do
   make clean; make
 done

With kjournal running FIFO, to see if it survives.

Cheers,


-- Steve

diff -ur linux-2.6.11-rc4-V0.7.39-02.orig/fs/jbd/journal.c linux-2.6.11-rc4-V0.7.39-02/fs/jbd/journal.c
--- linux-2.6.11-rc4-V0.7.39-02.orig/fs/jbd/journal.c   2005-02-12 22:05:29.0 -0500
+++ linux-2.6.11-rc4-V0.7.39-02/fs/jbd/journal.c    2005-03-11 14:54:21.0 -0500
@@ -80,6 +80,10 @@
 EXPORT_SYMBOL(journal_try_to_free_buffers);
 EXPORT_SYMBOL(journal_force_commit);

+#ifdef CONFIG_PREEMPT_RT
+spinlock_t jbd_journal_head_lock = SPIN_LOCK_UNLOCKED;
+#endif
+
 static int journal_convert_superblock_v1(journal_t *, journal_superblock_t *);

 /*
@@ -1727,6 +1731,9 @@
jh = new_jh;
new_jh = NULL;  /* We consumed it */
set_buffer_jbd(bh);
+#ifdef CONFIG_PREEMPT_RT
+   spin_lock_init(&jh->b_state_lock);
+#endif
bh->b_private = jh;
jh->b_bh = bh;
get_bh(bh);
@@ -1767,26 +1774,34 @@
if (jh->b_transaction == NULL &&
jh->b_next_transaction == NULL &&
jh->b_cp_transaction == NULL) {
-   J_ASSERT_BH(bh, buffer_jbd(bh));
-   J_ASSERT_BH(bh, jh2bh(jh) == bh);
-   BUFFER_TRACE(bh, "remove journal_head");
-   if (jh->b_frozen_data) {
-   printk(KERN_WARNING "%s: freeing "
-   "b_frozen_data\n",
-   __FUNCTION__);
-   kfree(jh->b_frozen_data);
-   }
-   if (jh->b_committed_data) {
-   printk(KERN_WARNING "%s: freeing "
-   "b_committed_data\n",
-

Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-11 Thread Steven Rostedt

On Fri, 11 Mar 2005, Ingo Molnar wrote:

>
> * Steven Rostedt <[EMAIL PROTECTED]> wrote:
>
> > Here's the patch. It's probably more of an overkill wrt buffer heads,
> > but it seems to be the easiest solution.
>
> isnt there some ext3-private journal structure (journal-bh) linked off
> the bh? If the lock is in that structure then the overhead would only
> affect ext3.
>

Yes, there is, and I was trying to use it before you mentioned trying this
(which works for now).  The locks are called before and after the private
pointer of the bh is set and removed.  The journal_head lock, I was going
to make global, and the state lock would go on this structure. I would
have to do some hack in journal.c to flag the state lock when it was
removing the journal head so that it didn't do the remove there, but did
it after the state lock was released. But this still had a few crashes.

The journal_head lock was used to lock when to add or remove the private
data from the bh, so you can see why this structure can't be used for this
purpose. But the state lock seemed to be ok for this. I need to know more
about the journaling system.

I'll look into doing this too, but this fix should do for now.

-- Steve



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-11 Thread Ingo Molnar

* Steven Rostedt <[EMAIL PROTECTED]> wrote:

> Here's the patch. It's probably more of an overkill wrt buffer heads,
> but it seems to be the easiest solution.

isnt there some ext3-private journal structure (journal-bh) linked off 
the bh? If the lock is in that structure then the overhead would only 
affect ext3.

Ingo


Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-11 Thread K.R. Foley
Steven Rostedt wrote:
> > +#ifdef CONFIG_PREEMPT_RT
> > +#define PICK_SPIN_LOCK(otype,bit,name) spin_##otype(&bh->b_##name##_lock)
> > +#else
> > +#define PICK_SPIN_LOCK(otype,bit,name) bit_spin_##otype(bit,bh->b_state);
> > +#endif
> > +
>
> Oops, extra semicolon on the non RT side.
> I'll try again.
> -- Steve

Haven't tried it yet, but does apply cleanly to 2.6.11-final-V0.7.40-00.

kr


Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-11 Thread Steven Rostedt

>
> +#ifdef CONFIG_PREEMPT_RT
> +#define PICK_SPIN_LOCK(otype,bit,name) spin_##otype(&bh->b_##name##_lock)
> +#else
> +#define PICK_SPIN_LOCK(otype,bit,name) bit_spin_##otype(bit,bh->b_state);
> +#endif
> +

Oops, extra semicolon on the non RT side.


I'll try again.

-- Steve

diff -ur linux-2.6.11-rc4-V0.7.39-02.orig/fs/buffer.c linux-2.6.11-rc4-V0.7.39-02/fs/buffer.c
--- linux-2.6.11-rc4-V0.7.39-02.orig/fs/buffer.c    2005-02-12 22:06:54.0 -0500
+++ linux-2.6.11-rc4-V0.7.39-02/fs/buffer.c 2005-03-11 07:48:04.0 -0500
@@ -3002,6 +3002,10 @@
preempt_disable();
__get_cpu_var(bh_accounting).nr++;
recalc_bh_state();
+#ifdef CONFIG_PREEMPT_RT
+   spin_lock_init(&ret->b_jstate_lock);
+   spin_lock_init(&ret->b_jhead_lock);
+#endif
preempt_enable();
}
return ret;
diff -ur linux-2.6.11-rc4-V0.7.39-02.orig/include/linux/buffer_head.h linux-2.6.11-rc4-V0.7.39-02/include/linux/buffer_head.h
--- linux-2.6.11-rc4-V0.7.39-02.orig/include/linux/buffer_head.h    2005-02-12 22:05:10.0 -0500
+++ linux-2.6.11-rc4-V0.7.39-02/include/linux/buffer_head.h 2005-03-11 07:59:44.0 -0500
@@ -62,6 +62,14 @@
bh_end_io_t *b_end_io;  /* I/O completion */
void *b_private;/* reserved for b_end_io */
struct list_head b_assoc_buffers; /* associated with another mapping */
+
+#ifdef CONFIG_PREEMPT_RT
+   /*
+* Fixme: This should be in the journal code.
+*/
+   spinlock_t b_jstate_lock;   /* lock for journal state. */
+   spinlock_t b_jhead_lock;/* lock for journal head. */
+#endif
 };

 /*
diff -ur linux-2.6.11-rc4-V0.7.39-02.orig/include/linux/jbd.h linux-2.6.11-rc4-V0.7.39-02/include/linux/jbd.h
--- linux-2.6.11-rc4-V0.7.39-02.orig/include/linux/jbd.h    2005-02-12 22:07:18.0 -0500
+++ linux-2.6.11-rc4-V0.7.39-02/include/linux/jbd.h 2005-03-11 07:57:47.0 -0500
@@ -314,6 +314,12 @@
 TAS_BUFFER_FNS(RevokeValid, revokevalid)
 BUFFER_FNS(Freed, freed)

+#ifdef CONFIG_PREEMPT_RT
+#define PICK_SPIN_LOCK(otype,bit,name) spin_##otype(&bh->b_##name##_lock)
+#else
+#define PICK_SPIN_LOCK(otype,bit,name) bit_spin_##otype(bit,bh->b_state)
+#endif
+
 static inline struct buffer_head *jh2bh(struct journal_head *jh)
 {
return jh->b_bh;
@@ -326,33 +332,34 @@

 static inline void jbd_lock_bh_state(struct buffer_head *bh)
 {
-   bit_spin_lock(BH_State, &bh->b_state);
+   PICK_SPIN_LOCK(lock,BH_State,jstate);
 }

 static inline int jbd_trylock_bh_state(struct buffer_head *bh)
 {
-   return bit_spin_trylock(BH_State, &bh->b_state);
+   return PICK_SPIN_LOCK(trylock,BH_State,jstate);
 }

 static inline int jbd_is_locked_bh_state(struct buffer_head *bh)
 {
-   return bit_spin_is_locked(BH_State, &bh->b_state);
+   return PICK_SPIN_LOCK(is_locked,BH_State,jstate);
 }

 static inline void jbd_unlock_bh_state(struct buffer_head *bh)
 {
-   bit_spin_unlock(BH_State, &bh->b_state);
+   PICK_SPIN_LOCK(unlock,BH_State,jstate);
 }

 static inline void jbd_lock_bh_journal_head(struct buffer_head *bh)
 {
-   bit_spin_lock(BH_JournalHead, &bh->b_state);
+   PICK_SPIN_LOCK(lock,BH_JournalHead,jhead);
 }

 static inline void jbd_unlock_bh_journal_head(struct buffer_head *bh)
 {
-   bit_spin_unlock(BH_JournalHead, &bh->b_state);
+   PICK_SPIN_LOCK(unlock,BH_JournalHead,jhead);
 }
+#undef PICK_SPIN_LOCK

 struct jbd_revoke_table_s;

diff -ur linux-2.6.11-rc4-V0.7.39-02.orig/include/linux/spinlock.h linux-2.6.11-rc4-V0.7.39-02/include/linux/spinlock.h
--- linux-2.6.11-rc4-V0.7.39-02.orig/include/linux/spinlock.h   2005-03-10 08:47:25.0 -0500
+++ linux-2.6.11-rc4-V0.7.39-02/include/linux/spinlock.h    2005-03-11 09:06:26.254317378 -0500
@@ -774,6 +774,10 @@
 }))


+#ifndef CONFIG_PREEMPT_RT
+
+/* These are just plain evil! */
+
 /*
  *  bit-based spin_lock()
  *
@@ -789,10 +793,15 @@
 * busywait with less bus contention for a good time to
 * attempt to acquire the lock bit.
 */
-#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
-   while (test_and_set_bit(bitnum, addr))
-   while (test_bit(bitnum, addr))
+   preempt_disable();
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
+   while (test_and_set_bit(bitnum, addr)) {
+   while (test_bit(bitnum, addr)) {
+   preempt_enable();
cpu_relax();
+   preempt_disable();
+   }
+   }
 #endif
__acquire(bitlock);
 }
@@ -802,9 +811,12 @@
  */
 static inline int bit_spin_trylock(int bitnum, unsigned long *addr)
 {
-#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || 
defined(CONFIG_PREEMPT)
-   if (test_and_set_bit(bitnum, addr))
+   preempt_disable();
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
+   if 

Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-11 Thread Steven Rostedt


Here's the patch. It's probably more of an overkill wrt buffer heads, but
it seems to be the easiest solution.

I also put back some of the changes you made for the
bit_spin_locks, so that they act the same as the vanilla kernel if
PREEMPT_RT is not defined.  Now I only tested this with PREEMPT_RT
configured so I hope others can test it with it off. If I get time I'll do
that as well.

I patched this against linux-2.6.11-rc4-V0.7.39-02, so I hope it goes
easily into .40.

Lee,

 Could you see what the latencies are with kjournal with this patch
applied?

Thanks,

 -- Steve


diff -ur linux-2.6.11-rc4-V0.7.39-02.orig/fs/buffer.c 
linux-2.6.11-rc4-V0.7.39-02/fs/buffer.c
--- linux-2.6.11-rc4-V0.7.39-02.orig/fs/buffer.c2005-02-12 
22:06:54.0 -0500
+++ linux-2.6.11-rc4-V0.7.39-02/fs/buffer.c 2005-03-11 07:48:04.0 
-0500
@@ -3002,6 +3002,10 @@
preempt_disable();
__get_cpu_var(bh_accounting).nr++;
recalc_bh_state();
+#ifdef CONFIG_PREEMPT_RT
+   spin_lock_init(&ret->b_jstate_lock);
+   spin_lock_init(&ret->b_jhead_lock);
+#endif
preempt_enable();
}
return ret;
diff -ur linux-2.6.11-rc4-V0.7.39-02.orig/include/linux/buffer_head.h 
linux-2.6.11-rc4-V0.7.39-02/include/linux/buffer_head.h
--- linux-2.6.11-rc4-V0.7.39-02.orig/include/linux/buffer_head.h
2005-02-12 22:05:10.0 -0500
+++ linux-2.6.11-rc4-V0.7.39-02/include/linux/buffer_head.h 2005-03-11 
07:59:44.0 -0500
@@ -62,6 +62,14 @@
bh_end_io_t *b_end_io;  /* I/O completion */
void *b_private;/* reserved for b_end_io */
struct list_head b_assoc_buffers; /* associated with another mapping */
+
+#ifdef CONFIG_PREEMPT_RT
+   /*
+* Fixme: This should be in the journal code.
+*/
+   spinlock_t b_jstate_lock;   /* lock for journal state. */
+   spinlock_t b_jhead_lock;/* lock for journal head. */
+#endif
 };

 /*
diff -ur linux-2.6.11-rc4-V0.7.39-02.orig/include/linux/jbd.h 
linux-2.6.11-rc4-V0.7.39-02/include/linux/jbd.h
--- linux-2.6.11-rc4-V0.7.39-02.orig/include/linux/jbd.h2005-02-12 
22:07:18.0 -0500
+++ linux-2.6.11-rc4-V0.7.39-02/include/linux/jbd.h 2005-03-11 
07:57:47.0 -0500
@@ -314,6 +314,12 @@
 TAS_BUFFER_FNS(RevokeValid, revokevalid)
 BUFFER_FNS(Freed, freed)

+#ifdef CONFIG_PREEMPT_RT
+#define PICK_SPIN_LOCK(otype,bit,name) spin_##otype(&bh->b_##name##_lock)
+#else
+#define PICK_SPIN_LOCK(otype,bit,name) bit_spin_##otype(bit,&bh->b_state);
+#endif
+
 static inline struct buffer_head *jh2bh(struct journal_head *jh)
 {
return jh->b_bh;
@@ -326,33 +332,34 @@

 static inline void jbd_lock_bh_state(struct buffer_head *bh)
 {
-   bit_spin_lock(BH_State, &bh->b_state);
+   PICK_SPIN_LOCK(lock,BH_State,jstate);
 }

 static inline int jbd_trylock_bh_state(struct buffer_head *bh)
 {
-   return bit_spin_trylock(BH_State, &bh->b_state);
+   return PICK_SPIN_LOCK(trylock,BH_State,jstate);
 }

 static inline int jbd_is_locked_bh_state(struct buffer_head *bh)
 {
-   return bit_spin_is_locked(BH_State, &bh->b_state);
+   return PICK_SPIN_LOCK(is_locked,BH_State,jstate);
 }

 static inline void jbd_unlock_bh_state(struct buffer_head *bh)
 {
-   bit_spin_unlock(BH_State, &bh->b_state);
+   PICK_SPIN_LOCK(unlock,BH_State,jstate);
 }

 static inline void jbd_lock_bh_journal_head(struct buffer_head *bh)
 {
-   bit_spin_lock(BH_JournalHead, &bh->b_state);
+   PICK_SPIN_LOCK(lock,BH_JournalHead,jhead);
 }

 static inline void jbd_unlock_bh_journal_head(struct buffer_head *bh)
 {
-   bit_spin_unlock(BH_JournalHead, &bh->b_state);
+   PICK_SPIN_LOCK(unlock,BH_JournalHead,jhead);
 }
+#undef PICK_SPIN_LOCK

 struct jbd_revoke_table_s;

diff -ur linux-2.6.11-rc4-V0.7.39-02.orig/include/linux/spinlock.h 
linux-2.6.11-rc4-V0.7.39-02/include/linux/spinlock.h
--- linux-2.6.11-rc4-V0.7.39-02.orig/include/linux/spinlock.h   2005-03-10 
08:47:25.0 -0500
+++ linux-2.6.11-rc4-V0.7.39-02/include/linux/spinlock.h2005-03-11 
09:06:26.254317378 -0500
@@ -774,6 +774,10 @@
 }))


+#ifndef CONFIG_PREEMPT_RT
+
+/* These are just plain evil! */
+
 /*
  *  bit-based spin_lock()
  *
@@ -789,10 +793,15 @@
 * busywait with less bus contention for a good time to
 * attempt to acquire the lock bit.
 */
-#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || 
defined(CONFIG_PREEMPT)
-   while (test_and_set_bit(bitnum, addr))
-   while (test_bit(bitnum, addr))
+   preempt_disable();
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
+   while (test_and_set_bit(bitnum, addr)) {
+   while (test_bit(bitnum, addr)) {
+   preempt_enable();
cpu_relax();
+   preempt_disable();
+   }
+   }
 #endif
__acquire(bitlock);
 }
@@ -802,9 +811,12 @@
  */
 

Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-11 Thread Steven Rostedt


On Fri, 11 Mar 2005, Andrew Morton wrote:

> Steven Rostedt <[EMAIL PROTECTED]> wrote:
> >  No, I'll try that now. I just didn't want to modify the buffer head struct
> >  just for journaling.  But if it is the quickest and easiest fix, then I'll
> >  submit it and we can change it later.
>
> You'll need two spinlocks.  jbd_lock_bh_state() and 
> jbd_lock_bh_journal_head().
>

Yep, already did that. Now I need to reboot the new kernel and give it a
try.

-- Steve



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-11 Thread Andrew Morton
Steven Rostedt <[EMAIL PROTECTED]> wrote:
>
> > did you try the canonical way of putting a spinlock into every
>  > buffer_head?
>  >
> 
>  No, I'll try that now. I just didn't want to modify the buffer head struct
>  just for journaling.  But if it is the quickest and easiest fix, then I'll
>  submit it and we can change it later.

You'll need two spinlocks.  jbd_lock_bh_state() and jbd_lock_bh_journal_head().


Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-11 Thread Ingo Molnar

* Steven Rostedt <[EMAIL PROTECTED]> wrote:

> > > Doing a quick search on the kernel, it looks like only kjournald uses
> > > the bit_spin_locks. I'll start converting them to spinlocks. The use
> > > seems to be more of a hack, since it is using bits in the state field
> > > for locking, and these bits aren't used for anything else.
> >
> > yeah. bit-spinlocks are really a hack.
> 
> And this really sucks too!  I've been looking into a fix for this and
> have yet to get something stable.  As you probably already know, you
> can't just put back the preempt_disable since your spinlocks now
> schedule. So I've been looking into finding a way to get rid of these.
> 
> I've tried making two global spinlocks, one for the state bit and one
> for the journal head bit use.  But this deadlocks with j_state_lock.
> The journal head lock seems to be ok to be global, but the state lock
> needs to have one for every buffer head.  I'm now hacking away to do
> this without touching the actual buffer head. But I'm not sure what
> some of the side effects this is having.  I'll keep you posted when I
> get something working.  I'm now having a crash course in how kjournal
> and friends work.

did you try the canonical way of putting a spinlock into every
buffer_head?

Ingo


Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-11 Thread Steven Rostedt

On Fri, 11 Mar 2005, Ingo Molnar wrote:

>
> * Steven Rostedt <[EMAIL PROTECTED]> wrote:
>
> > > The short term fix is probably to put back the preempt_disables, the long
> > > term is to get rid of these stupid bit_spin_lock busy loops.
> >
> > Doing a quick search on the kernel, it looks like only kjournald uses
> > the bit_spin_locks. I'll start converting them to spinlocks. The use
> > seems to be more of a hack, since it is using bits in the state field
> > for locking, and these bits aren't used for anything else.
>
> yeah. bit-spinlocks are really a hack.
>
>   Ingo
>

And this really sucks too!  I've been looking into a fix for this and have
yet to get something stable.  As you probably already know, you can't just
put back the preempt_disable since your spinlocks now schedule. So I've
been looking into finding a way to get rid of these.

I've tried making two global spinlocks, one for the state bit and one for
the journal head bit use.  But this deadlocks with j_state_lock. The
journal head lock seems to be ok to be global, but the state lock needs to
have one for every buffer head.  I'm now hacking away to do this without
touching the actual buffer head. But I'm not sure what some of the
side effects this is having.  I'll keep you posted when I get something
working.  I'm now having a crash course in how kjournal and friends work.

-- Steve



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-11 Thread Ingo Molnar

* Steven Rostedt <[EMAIL PROTECTED]> wrote:

> > The short term fix is probably to put back the preempt_disables, the long
> > term is to get rid of these stupid bit_spin_lock busy loops.
> 
> Doing a quick search on the kernel, it looks like only kjournald uses
> the bit_spin_locks. I'll start converting them to spinlocks. The use
> seems to be more of a hack, since it is using bits in the state field
> for locking, and these bits aren't used for anything else.

yeah. bit-spinlocks are really a hack.

Ingo
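
For readers who have not met the construct being discussed, here is a minimal userspace sketch of a bit spinlock, assuming C11 atomics in place of the kernel's test_and_set_bit() helpers. The names bit_trylock, bit_lock, bit_is_locked and bit_unlock are hypothetical and are not the kernel API. The sketch only shows how a single bit of a shared state word doubles as a lock, which leaves no room in the lock itself for owner or waiter information.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace analogue; not the kernel's bit_spin_lock() API. */

/* Try once to take the lock bit; true means we set it and now own it. */
static inline bool bit_trylock(atomic_ulong *word, unsigned int bit)
{
	unsigned long mask = 1UL << bit;

	/* fetch_or returns the old value: we own the lock only if the bit was clear. */
	return !(atomic_fetch_or_explicit(word, mask, memory_order_acquire) & mask);
}

/* Spin until the lock bit is ours. */
static inline void bit_lock(atomic_ulong *word, unsigned int bit)
{
	unsigned long mask = 1UL << bit;

	while (!bit_trylock(word, bit)) {
		/* Wait with plain loads so we do not keep issuing atomic writes. */
		while (atomic_load_explicit(word, memory_order_relaxed) & mask)
			;
	}
}

/* Report whether the lock bit is currently set. */
static inline bool bit_is_locked(atomic_ulong *word, unsigned int bit)
{
	return (atomic_load_explicit(word, memory_order_relaxed) >> bit) & 1UL;
}

/* Clear the lock bit; the other bits of the word are left untouched. */
static inline void bit_unlock(atomic_ulong *word, unsigned int bit)
{
	unsigned long mask = 1UL << bit;
	unsigned long old = atomic_fetch_and_explicit(word, ~mask, memory_order_release);

	assert(old & mask);	/* unlocking a lock that was not held is a bug */
	(void)old;
}
```

A waiter in bit_lock() can only busy-wait, so the holder must not be descheduled while the bit is set. That matches the constraint raised in this thread once ordinary spinlocks are allowed to schedule.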


Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-11 Thread Steven Rostedt


> +#ifdef CONFIG_PREEMPT_RT
> +#define PICK_SPIN_LOCK(otype,bit,name) spin_##otype(&bh->b_##name##_lock)
> +#else
> +#define PICK_SPIN_LOCK(otype,bit,name) bit_spin_##otype(bit,&bh->b_state);
> +#endif
> +

Oops, extra semicolon on the non RT side.


I'll try again.
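
The PICK_SPIN_LOCK macro relies on ## token pasting to choose between two families of lock functions at compile time. A minimal, self-contained sketch of that pattern follows. Everything prefixed demo_ or DEMO_ (struct demo_bh, DEMO_PICK, DEMO_PER_OBJECT_LOCKS and the two trylock helpers) is hypothetical and only stands in for the kernel names. Neither variant of the macro ends in a semicolon, so the caller supplies it and the macro stays usable as an expression.

```c
#include <assert.h>

/* Hypothetical stand-in for struct buffer_head; only the ## pattern mirrors the patch. */
struct demo_bh {
	unsigned long b_state;	/* the bit-lock flavour keeps its lock bits here */
	int b_jstate_lock;	/* the per-object flavour uses dedicated fields */
	int b_jhead_lock;
};

/* Per-object flavour: a toy lock that is just a flag (not thread-safe). */
static inline int spin_demo_trylock(int *lock)
{
	if (*lock)
		return 0;
	*lock = 1;
	return 1;
}

/* Bit flavour: the lock is one bit inside b_state (not thread-safe). */
static inline int bit_spin_demo_trylock(int bit, unsigned long *addr)
{
	unsigned long mask = 1UL << bit;

	if (*addr & mask)
		return 0;
	*addr |= mask;
	return 1;
}

/* otype is pasted into the function name, name into the field name. */
#ifdef DEMO_PER_OBJECT_LOCKS
#define DEMO_PICK(otype, bit, name) spin_demo_##otype(&bh->b_##name##_lock)
#else
#define DEMO_PICK(otype, bit, name) bit_spin_demo_##otype(bit, &bh->b_state)
#endif

/* The macro expects a variable called bh in scope, as the patch's macro does. */
static inline int demo_trylock_state(struct demo_bh *bh)
{
	return DEMO_PICK(trylock, 3, jstate);
}
```

Building with -DDEMO_PER_OBJECT_LOCKS switches demo_trylock_state() to the dedicated field without touching its body, which is the property the patch uses to leave non-RT builds on the bit locks.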

-- Steve

diff -ur linux-2.6.11-rc4-V0.7.39-02.orig/fs/buffer.c 
linux-2.6.11-rc4-V0.7.39-02/fs/buffer.c
--- linux-2.6.11-rc4-V0.7.39-02.orig/fs/buffer.c2005-02-12 
22:06:54.0 -0500
+++ linux-2.6.11-rc4-V0.7.39-02/fs/buffer.c 2005-03-11 07:48:04.0 
-0500
@@ -3002,6 +3002,10 @@
preempt_disable();
__get_cpu_var(bh_accounting).nr++;
recalc_bh_state();
+#ifdef CONFIG_PREEMPT_RT
+   spin_lock_init(&ret->b_jstate_lock);
+   spin_lock_init(&ret->b_jhead_lock);
+#endif
preempt_enable();
}
return ret;
diff -ur linux-2.6.11-rc4-V0.7.39-02.orig/include/linux/buffer_head.h 
linux-2.6.11-rc4-V0.7.39-02/include/linux/buffer_head.h
--- linux-2.6.11-rc4-V0.7.39-02.orig/include/linux/buffer_head.h
2005-02-12 22:05:10.0 -0500
+++ linux-2.6.11-rc4-V0.7.39-02/include/linux/buffer_head.h 2005-03-11 
07:59:44.0 -0500
@@ -62,6 +62,14 @@
bh_end_io_t *b_end_io;  /* I/O completion */
void *b_private;/* reserved for b_end_io */
struct list_head b_assoc_buffers; /* associated with another mapping */
+
+#ifdef CONFIG_PREEMPT_RT
+   /*
+* Fixme: This should be in the journal code.
+*/
+   spinlock_t b_jstate_lock;   /* lock for journal state. */
+   spinlock_t b_jhead_lock;/* lock for journal head. */
+#endif
 };

 /*
diff -ur linux-2.6.11-rc4-V0.7.39-02.orig/include/linux/jbd.h 
linux-2.6.11-rc4-V0.7.39-02/include/linux/jbd.h
--- linux-2.6.11-rc4-V0.7.39-02.orig/include/linux/jbd.h2005-02-12 
22:07:18.0 -0500
+++ linux-2.6.11-rc4-V0.7.39-02/include/linux/jbd.h 2005-03-11 
07:57:47.0 -0500
@@ -314,6 +314,12 @@
 TAS_BUFFER_FNS(RevokeValid, revokevalid)
 BUFFER_FNS(Freed, freed)

+#ifdef CONFIG_PREEMPT_RT
+#define PICK_SPIN_LOCK(otype,bit,name) spin_##otype(&bh->b_##name##_lock)
+#else
+#define PICK_SPIN_LOCK(otype,bit,name) bit_spin_##otype(bit,&bh->b_state)
+#endif
+
 static inline struct buffer_head *jh2bh(struct journal_head *jh)
 {
return jh->b_bh;
@@ -326,33 +332,34 @@

 static inline void jbd_lock_bh_state(struct buffer_head *bh)
 {
-   bit_spin_lock(BH_State, &bh->b_state);
+   PICK_SPIN_LOCK(lock,BH_State,jstate);
 }

 static inline int jbd_trylock_bh_state(struct buffer_head *bh)
 {
-   return bit_spin_trylock(BH_State, &bh->b_state);
+   return PICK_SPIN_LOCK(trylock,BH_State,jstate);
 }

 static inline int jbd_is_locked_bh_state(struct buffer_head *bh)
 {
-   return bit_spin_is_locked(BH_State, &bh->b_state);
+   return PICK_SPIN_LOCK(is_locked,BH_State,jstate);
 }

 static inline void jbd_unlock_bh_state(struct buffer_head *bh)
 {
-   bit_spin_unlock(BH_State, &bh->b_state);
+   PICK_SPIN_LOCK(unlock,BH_State,jstate);
 }

 static inline void jbd_lock_bh_journal_head(struct buffer_head *bh)
 {
-   bit_spin_lock(BH_JournalHead, &bh->b_state);
+   PICK_SPIN_LOCK(lock,BH_JournalHead,jhead);
 }

 static inline void jbd_unlock_bh_journal_head(struct buffer_head *bh)
 {
-   bit_spin_unlock(BH_JournalHead, &bh->b_state);
+   PICK_SPIN_LOCK(unlock,BH_JournalHead,jhead);
 }
+#undef PICK_SPIN_LOCK

 struct jbd_revoke_table_s;

diff -ur linux-2.6.11-rc4-V0.7.39-02.orig/include/linux/spinlock.h 
linux-2.6.11-rc4-V0.7.39-02/include/linux/spinlock.h
--- linux-2.6.11-rc4-V0.7.39-02.orig/include/linux/spinlock.h   2005-03-10 
08:47:25.0 -0500
+++ linux-2.6.11-rc4-V0.7.39-02/include/linux/spinlock.h2005-03-11 
09:06:26.254317378 -0500
@@ -774,6 +774,10 @@
 }))


+#ifndef CONFIG_PREEMPT_RT
+
+/* These are just plain evil! */
+
 /*
  *  bit-based spin_lock()
  *
@@ -789,10 +793,15 @@
 * busywait with less bus contention for a good time to
 * attempt to acquire the lock bit.
 */
-#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || 
defined(CONFIG_PREEMPT)
-   while (test_and_set_bit(bitnum, addr))
-   while (test_bit(bitnum, addr))
+   preempt_disable();
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
+   while (test_and_set_bit(bitnum, addr)) {
+   while (test_bit(bitnum, addr)) {
+   preempt_enable();
cpu_relax();
+   preempt_disable();
+   }
+   }
 #endif
__acquire(bitlock);
 }
@@ -802,9 +811,12 @@
  */
 static inline int bit_spin_trylock(int bitnum, unsigned long *addr)
 {
-#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || 
defined(CONFIG_PREEMPT)
-   if (test_and_set_bit(bitnum, addr))
+   preempt_disable();
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
+   if 

Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-11 Thread K.R. Foley
Steven Rostedt wrote:
> > +#ifdef CONFIG_PREEMPT_RT
> > +#define PICK_SPIN_LOCK(otype,bit,name) spin_##otype(&bh->b_##name##_lock)
> > +#else
> > +#define PICK_SPIN_LOCK(otype,bit,name) bit_spin_##otype(bit,&bh->b_state);
> > +#endif
> > +
>
> Oops, extra semicolon on the non RT side.
>
> I'll try again.
>
> -- Steve

Haven't tried it yet, but does apply cleanly to 2.6.11-final-V0.7.40-00.

kr


Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-11 Thread Ingo Molnar

* Steven Rostedt <[EMAIL PROTECTED]> wrote:

> Here's the patch. It's probably more of an overkill wrt buffer heads,
> but it seems to be the easiest solution.

isn't there some ext3-private journal structure (journal-bh) linked off 
the bh? If the lock is in that structure then the overhead would only 
affect ext3.

Ingo


Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-11 Thread Steven Rostedt

On Fri, 11 Mar 2005, Ingo Molnar wrote:


> * Steven Rostedt <[EMAIL PROTECTED]> wrote:
>
> > Here's the patch. It's probably more of an overkill wrt buffer heads,
> > but it seems to be the easiest solution.
>
> isn't there some ext3-private journal structure (journal-bh) linked off
> the bh? If the lock is in that structure then the overhead would only
> affect ext3.


Yes, there is, and I was trying to use it before you mentioned trying this
(which works for now).  The locks are called before and after the private
pointer of the bh is set and removed.  The journal_head lock, I was going
to make global, and the state lock would go on this structure. I would
have to do some hack in journal.c to flag the state lock when it was
removing the journal head so that it didn't do the remove there, but did
it after the state lock was released. But this still had a few crashes.

The journal_head lock was used to lock when to add or remove the private
data from the bh, so you can see why this structure can't be used for this
purpose. But the state lock seemed to be ok for this. I need to know more
about the journaling system.

I'll look into doing this too, but this fix should do for now.

-- Steve



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-11 Thread Steven Rostedt


On Fri, 11 Mar 2005, Ingo Molnar wrote:


> * Steven Rostedt <[EMAIL PROTECTED]> wrote:
>
> > Here's the patch. It's probably more of an overkill wrt buffer heads,
> > but it seems to be the easiest solution.
>
> isn't there some ext3-private journal structure (journal-bh) linked off
> the bh? If the lock is in that structure then the overhead would only
> affect ext3.


OK, here it is (Yuck!).  I was able to use the journal head (private data
of the buffer head) for the state lock.  I just decided to have the
journal head lock be one global lock for all buffer heads, since it is
used to add and remove the journal private data from the buffer head, and
thus can't be stored in the journal private data.

The state lock is now in the journal private data but we must be careful
not to free this data before we unlock it. So here's what I've done.

  static inline void jbd_lock_bh_state(struct buffer_head *bh)
  {
	BUG_ON(!bh->b_private);
	atomic_inc(&bh2jh(bh)->b_state_wait_count);
	spin_lock(&bh2jh(bh)->b_state_lock);
  }

I have a counter of those that want/have the lock, and this informs the
journal_remove_journal_head that it should not free the jh.

  static void __journal_remove_journal_head(struct buffer_head *bh)
  {
	struct journal_head *jh = bh2jh(bh);

	J_ASSERT_JH(jh, jh->b_jcount >= 0);

	get_bh(bh);
	if (jh->b_jcount == 0) {
		if (jh->b_transaction == NULL &&
				jh->b_next_transaction == NULL &&
				jh->b_cp_transaction == NULL) {
  #ifdef CONFIG_PREEMPT_RT
			if (atomic_read(&jh->b_state_wait_count)) {
				BUG_ON(buffer_journalhead(bh));
				set_buffer_journalhead(bh);
			} else
  #endif
			{


Here the state_wait_count is checked, and if it is > 0, the bit that was
originally used for locking the journal head is set to tell the unlock
path of the state lock that the journal head still needs to be removed.

  static inline void jbd_unlock_bh_state(struct buffer_head *bh)
  {
	int rmjh = 0;

	BUG_ON(!atomic_read(&bh2jh(bh)->b_state_wait_count));
	atomic_dec(&bh2jh(bh)->b_state_wait_count);

	if (buffer_journalhead(bh)) {
		clear_buffer_journalhead(bh);
		rmjh = 1;
	}

	spin_unlock(&bh2jh(bh)->b_state_lock);

	if (rmjh)
		journal_remove_journal_head(bh);
  }

Now in the unlocking of the state lock, the journal head bit is tested and
if it is set, then the remove journal head function is called.
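
The control flow of this deferred removal can be sketched in plain C. This is a single-threaded illustration only, assuming hypothetical demo_ names and a bool standing in for the real spinlock; it is not thread-safe and is not the jbd code. It shows just the hand-off: removal is postponed while anyone wants or holds the state lock, and the unlock path completes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for journal_head and buffer_head. */
struct demo_jh {
	int wait_count;		/* callers that want or hold the state lock */
	bool locked;		/* stands in for the real spinlock */
};

struct demo_bh {
	struct demo_jh *priv;	/* plays the role of bh->b_private */
	bool remove_pending;	/* plays the role of the reused BH_JournalHead bit */
};

/* Announce interest first, then take the lock that lives in the private data. */
static inline void demo_lock_state(struct demo_bh *bh)
{
	assert(bh->priv != NULL);
	bh->priv->wait_count++;
	assert(!bh->priv->locked);	/* single-threaded demo: never contended */
	bh->priv->locked = true;
}

/* Free the private data now, or defer if the lock is wanted or held.
 * Returns true if the data was freed immediately. */
static inline bool demo_remove_head(struct demo_bh *bh)
{
	if (bh->priv->wait_count > 0) {
		assert(!bh->remove_pending);
		bh->remove_pending = true;	/* leave the free to the unlock path */
		return false;
	}
	free(bh->priv);
	bh->priv = NULL;
	return true;
}

/* Drop the lock, then carry out any removal that was deferred meanwhile. */
static inline void demo_unlock_state(struct demo_bh *bh)
{
	bool deferred = false;

	assert(bh->priv->wait_count > 0);
	bh->priv->wait_count--;
	if (bh->remove_pending) {
		bh->remove_pending = false;
		deferred = true;
	}
	bh->priv->locked = false;	/* must happen before priv can be freed */

	if (deferred)
		demo_remove_head(bh);
}
```

The ordering in demo_unlock_state() is the point of the scheme: the lock lives inside the object being freed, so the free can only happen after the lock has been released.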


Maybe this isn't the cleanest solution, but it keeps the overhead on the
buffer heads down, so it's preferred over my last patch.

Once again, this has only been tested with full preemption enabled, but I
tried to keep it from changing the way non PREEMPT_RT works.

I'm leaving now for the weekend, so I won't be able to respond to anyone
till Monday.  I'll also run this patch over the weekend while compiling
the kernel in an endless loop

 while [ 1 ]; do
   make clean; make
 done

With kjournal running FIFO, to see if it survives.

Cheers,


-- Steve

diff -ur linux-2.6.11-rc4-V0.7.39-02.orig/fs/jbd/journal.c 
linux-2.6.11-rc4-V0.7.39-02/fs/jbd/journal.c
--- linux-2.6.11-rc4-V0.7.39-02.orig/fs/jbd/journal.c   2005-02-12 
22:05:29.0 -0500
+++ linux-2.6.11-rc4-V0.7.39-02/fs/jbd/journal.c2005-03-11 
14:54:21.0 -0500
@@ -80,6 +80,10 @@
 EXPORT_SYMBOL(journal_try_to_free_buffers);
 EXPORT_SYMBOL(journal_force_commit);

+#ifdef CONFIG_PREEMPT_RT
+spinlock_t jbd_journal_head_lock = SPIN_LOCK_UNLOCKED;
+#endif
+
 static int journal_convert_superblock_v1(journal_t *, journal_superblock_t *);

 /*
@@ -1727,6 +1731,9 @@
jh = new_jh;
new_jh = NULL;  /* We consumed it */
set_buffer_jbd(bh);
+#ifdef CONFIG_PREEMPT_RT
+   spin_lock_init(&jh->b_state_lock);
+#endif
bh->b_private = jh;
jh->b_bh = bh;
get_bh(bh);
@@ -1767,26 +1774,34 @@
if (jh->b_transaction == NULL &&
jh->b_next_transaction == NULL &&
jh->b_cp_transaction == NULL) {
-   J_ASSERT_BH(bh, buffer_jbd(bh));
-   J_ASSERT_BH(bh, jh2bh(jh) == bh);
-   BUFFER_TRACE(bh, "remove journal_head");
-   if (jh->b_frozen_data) {
-   printk(KERN_WARNING "%s: freeing "
-   "b_frozen_data\n",
-   __FUNCTION__);
-   kfree(jh->b_frozen_data);
-   }
-   if (jh->b_committed_data) {
-   printk(KERN_WARNING "%s: freeing "
-   "b_committed_data\n",
-

Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-11 Thread Lee Revell
On Fri, 2005-03-11 at 15:39 -0500, Steven Rostedt wrote:
> I'm leaving now for the weekend, so I won't be able to respond to anyone
> till Monday.  I'll also run this patch over the weekend while compiling
> the kernel in an endless loop

I'll test this with PREEMPT_DESKTOP and data=ordered also and see how it
goes.

Lee



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-11 Thread Lee Revell
On Fri, 2005-03-11 at 15:46 -0500, Lee Revell wrote:
> On Fri, 2005-03-11 at 15:39 -0500, Steven Rostedt wrote:
> > I'm leaving now for the weekend, so I won't be able to respond to anyone
> > till Monday.  I'll also run this patch over the weekend while compiling
> > the kernel in an endless loop
> 
> I'll test this with PREEMPT_DESKTOP and data=ordered also and see how it
> goes.

Does not seem to work at all with the above settings.  It seemed OK
until I started X.  Then every time I launched an xterm it would
disappear as soon as I typed anything.  I could not switch consoles to
see the Oops.

Lee



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-10 Thread Steven Rostedt

On Thu, 10 Mar 2005, Steven Rostedt wrote:

> The short term fix is probably to put back the preempt_disables, the long
> term is to get rid of these stupid bit_spin_lock busy loops.
>

Doing a quick search on the kernel, it looks like only kjournald uses the
bit_spin_locks. I'll start converting them to spinlocks. The use seems to
be more of a hack, since it is using bits in the state field for locking,
and these bits aren't used for anything else.

-- Steve


Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-03-10 Thread Steven Rostedt

Hi Ingo,

I noticed a problem with the bit_spin_locks that would probably explain the
kjournald latency problems. I'm working on a custom kernel based on yours
and I needed to temporarily remove the scheduler_tick from
update_process_times to implement some special scheduling needs.  This
caused kjournald to go into an infinite loop.

Here's your bit_spin_lock:

static inline void bit_spin_lock(int bitnum, unsigned long *addr)
{
	/*
	 * Assuming the lock is uncontended, this never enters
	 * the body of the outer loop. If it is contended, then
	 * within the inner loop a non-atomic test is used to
	 * busywait with less bus contention for a good time to
	 * attempt to acquire the lock bit.
	 */
#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
	while (test_and_set_bit(bitnum, addr))
		while (test_bit(bitnum, addr))
			cpu_relax();
#endif
	__acquire(bitlock);
}


You removed the preempt disable and added the CONFIG_PREEMPT. What happens
if a lower priority process gets the bit lock and gets preempted by a
higher priority process that then tries to get this lock? It spins until
its quota runs out.  This is what is happening to kjournald. A lower
priority process gets the bit lock and kjournald preempts it, causing
kjournald to spin until its quota is up, which finally lets the other
process release the lock.  Now, luckily, in your kernel kjournald is not
realtime FIFO. If it were, you would then have a deadlock; try it. I just set
kjournald (using your kernel) to FIFO prio 42 (prio 58 inside the kernel),
and with a non-rt task, I did a build of the kernel.  After a minute or
two, all processes under the priority of kjournald were starved out of the
CPU, and kjournald was spinning.  Make sure your kjournald has a lower
priority than your interrupt threads.
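
To make the failure mode concrete, here is a minimal userspace sketch of the
same test-and-test-and-set bit lock, written with C11 atomics instead of the
kernel's test_and_set_bit()/test_bit(). The bit_try_lock, bit_lock and
bit_unlock names are made up for this illustration. The point is that the
waiter in bit_lock() only ever spins: it never sleeps and never lends its
priority to the holder, so on one CPU a FIFO waiter that outranks the holder
spins forever.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* One bit of a shared word is the lock, as with BH_State/BH_JournalHead. */
static inline bool bit_try_lock(int bitnum, atomic_ulong *addr)
{
	unsigned long mask = 1UL << bitnum;

	/* fetch_or returns the old word; we own the lock if the bit was clear. */
	return !(atomic_fetch_or_explicit(addr, mask, memory_order_acquire) & mask);
}

static inline void bit_lock(int bitnum, atomic_ulong *addr)
{
	unsigned long mask = 1UL << bitnum;

	while (!bit_try_lock(bitnum, addr))
		/* Plain load while waiting, like the inner test_bit() loop. */
		while (atomic_load_explicit(addr, memory_order_relaxed) & mask)
			;	/* busy-wait: the holder only runs if we get descheduled */
}

static inline void bit_unlock(int bitnum, atomic_ulong *addr)
{
	unsigned long mask = 1UL << bitnum;
	unsigned long old = atomic_fetch_and_explicit(addr, ~mask,
						      memory_order_release);

	assert(old & mask);	/* unlocking a lock that is not held is a bug */
	(void)old;
}
```

A sleeping lock (a PREEMPT_RT spinlock_t, or a pthread mutex with priority
inheritance in userspace) avoids this because the waiter blocks and the
holder gets to run.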

The culprit is jbd_lock_bh_state and jbd_lock_bh_journal_head which call
bit_spin_lock.

Example of long latency: (or deadlock)

journal_refile_buffer
   --> spin_lock(&journal->j_list_lock);
   --> journal_remove_journal_head(bh);
       --> jbd_lock_bh_journal_head(bh);
           --> bit_spin_lock(BH_JournalHead, &bh->b_state);

The short term fix is probably to put back the preempt_disables, the long
term is to get rid of these stupid bit_spin_lock busy loops.

-- Steve


On Sat, 19 Feb 2005, Lee Revell wrote:

> On Fri, 2005-02-04 at 11:03 +0100, Ingo Molnar wrote:
> >   http://redhat.com/~mingo/realtime-preempt/
> >
>
> Testing on an all SCSI 1.3Ghz Athlon XP system, I am seeing very long
> latencies in the journalling code with 2.6.11-rc4-RT-V0.7.39-02.
>
> preemption latency trace v1.1.4 on 2.6.11-rc4-RT-V0.7.39-02
> 
>  latency: 713 µs, #3455/3455, CPU#0 | (M:preempt VP:0, KP:1, SP:1 HP:1 #P:1)
> -
> | task: ksoftirqd/0-2 (uid:0 nice:-10 policy:0 rt_prio:0)
> -
>
>  _--=> CPU#
> / _-=> irqs-off
>| / _=> need-resched
>|| / _---=> hardirq/softirq
>||| / _--=> preempt-depth
> /
>| delay
>cmd pid | time  |   caller
>   \   /|   \   |   /
> kjournal-2478  0dn.40µs!: <756f6a6b> (<6c616e72>)
> kjournal-2478  0dn.40µs : __trace_start_sched_wakeup (try_to_wake_up)
> kjournal-2478  0dn.30µs : preempt_schedule (try_to_wake_up)
> kjournal-2478  0dn.30µs : try_to_wake_up <<...>-2> (69 73):
> kjournal-2478  0dn.20µs : preempt_schedule (try_to_wake_up)
> kjournal-2478  0dn.20µs : wake_up_process (do_softirq)
> kjournal-2478  0dn.11µs < (1)
>
> The repeating pattern is 8 of these:
>
> kjournal-2478  0.n.11µs : inverted_lock (journal_commit_transaction)
> kjournal-2478  0.n.11µs : __journal_unfile_buffer 
> (journal_commit_transaction)
> kjournal-2478  0.n.11µs : journal_remove_journal_head 
> (journal_commit_transaction)
> kjournal-2478  0.n.11µs : __journal_remove_journal_head 
> (journal_remove_journal_head)
> kjournal-2478  0.n.11µs : __brelse (__journal_remove_journal_head)
> kjournal-2478  0.n.11µs : journal_free_journal_head 
> (journal_remove_journal_head)
> kjournal-2478  0.n.12µs : kmem_cache_free (journal_free_journal_head)
>
> and one of these:
>
> kjournal-2478  0dn.19µs : cache_flusharray (kmem_cache_free)
> kjournal-2478  0dn.29µs : free_block (cache_flusharray)
> kjournal-2478  0dn.1   11µs : preempt_schedule (cache_flusharray)
> kjournal-2478  0dn.1   11µs : memmove (cache_flusharray)
> kjournal-2478  0dn.1   11µs : memcpy (memmove)
>
> etc.  Finally:
>
> kjournal-2478  0dn.1  704µs : cache_flusharray (kmem_cache_free)
> kjournal-2478  0dn.2  704µs+: free_block (cache_flusharray)
> kjournal-2478  0dn.1  707µs : preempt_schedule (cache_flusharray)
> kjournal-2478  0dn.1  707µs : memmove (cache_flusharray)
> 

Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-02-22 Thread Lee Revell
On Sat, 2005-02-19 at 10:03 +0100, Ingo Molnar wrote:
> * Ingo Molnar <[EMAIL PROTECTED]> wrote:
> 
> > > Testing on an all SCSI 1.3Ghz Athlon XP system, I am seeing very long
> > > latencies in the journalling code with 2.6.11-rc4-RT-V0.7.39-02.
> > 
> > could you send me the full trace?
> 

On my other machine this 333us trace is the longest latency reported in
the first few minutes with PREEMPT_DESKTOP.  It seems to be a regression
from earlier versions.  If I read the trace right copy_pte_range is the
problem.

Lee

preemption latency trace v1.1.4 on 2.6.11-rc4-RT-V0.7.39-02

 latency: 333 µs, #63/63, CPU#0 | (M:preempt VP:0, KP:1, SP:1 HP:1 #P:1)
-
| task: XFree86-2593 (uid:0 nice:0 policy:0 rt_prio:0)
-

 _--=> CPU#
/ _-=> irqs-off
   | / _=> need-resched
   || / _---=> hardirq/softirq 
   ||| / _--=> preempt-depth   
    /  
   | delay 
   cmd pid | time  |   caller  
  \   /|   \   |   /   
(T1/#0) dpkg  4362 0 5 0006  [380181315825] 0.000ms (+3550398.796ms): <676b7064> (<00746500>)
(T1/#2) dpkg  4362 0 5 0006 0002 [380181316227] 0.000ms (+0.000ms): __trace_start_sched_wakeup+0x96/0xc0 <c012cbe6> (try_to_wake_up+0x81/0x150 <c010f911>)
(T1/#3) dpkg  4362 0 5 0004 0003 [380181316766] 0.001ms (+0.001ms): wake_up_state+0x1e/0x30 <c010fa5e> (signal_wake_up+0x2d/0x30 <c011f7bd>)
(T1/#4) dpkg  4362 0 5  0004 [380181317637] 0.003ms (+0.000ms): __wake_up+0xe/0x70 <c011059e> (mousedev_event+0xd8/0x140 <c0223ac8>)
(T1/#5) dpkg  4362 0 5 0001 0005 [380181318080] 0.003ms (+0.001ms): __wake_up_common+0xb/0x70 <c011052b> (__wake_up+0x3b/0x70 <c01105cb>)
(T1/#6) dpkg  4362 0 5  0006 [380181318983] 0.005ms (+0.002ms): usb_submit_urb+0xe/0x2c0 <dcabaefe> (hid_irq_in+0x4e/0xe0 <dca7335e>)
(T1/#7) dpkg  4362 0 5  0007 [380181320688] 0.008ms (+0.001ms): hcd_submit_urb+0xe/0x200 <dcaba57e> (usb_submit_urb+0x1c6/0x2c0 <dcabb0b6>)
(T1/#8) dpkg  4362 0 5 0001 0008 [380181321463] 0.009ms (+0.000ms): usb_get_dev+0x9/0x30 <dcab5939> (hcd_submit_urb+0x1a9/0x200 <dcaba719>)
(T1/#9) dpkg  4362 0 5 0001 0009 [380181321943] 0.010ms (+0.000ms): get_device+0x8/0x30 <c02012d8> (usb_get_dev+0x19/0x30 <dcab5949>)
(T1/#10) dpkg  4362 0 5 0001 000a [380181322283] 0.010ms (+0.000ms): kobject_get+0x9/0x30 <c01d7869> (get_device+0x1a/0x30 <c02012ea>)
(T1/#11) dpkg  4362 0 5 0001 000b [380181322691] 0.011ms (+0.001ms): kref_get+0x9/0x60 <c01d8339> (kobject_get+0x19/0x30 <c01d7879>)
(T1/#12) dpkg  4362 0 5  000c [380181323295] 0.012ms (+0.000ms): usb_get_urb+0x9/0x20 <dcabaed9> (hcd_submit_urb+0xc6/0x200 <dcaba636>)
(T1/#13) dpkg  4362 0 5  000d [380181323566] 0.012ms (+0.001ms): kref_get+0x9/0x60 <c01d8339> (usb_get_urb+0x16/0x20 <dcabaee6>)
(T1/#14) dpkg  4362 0 5  000e [380181324216] 0.013ms (+0.000ms): uhci_urb_enqueue+0xe/0x290 <dca6bf4e> (hcd_submit_urb+0x123/0x200 <dcaba693>)
(T1/#15) dpkg  4362 0 5 0001 000f [380181324743] 0.014ms (+0.000ms): uhci_find_urb_ep+0xe/0xb0 <dca6be9e> (uhci_urb_enqueue+0x7a/0x290 <dca6bfba>)
(T1/#16) dpkg  4362 0 5 0001 0010 [380181325251] 0.015ms (+0.000ms): uhci_alloc_urb_priv+0xb/0x80 <dca6aebb> (uhci_urb_enqueue+0x87/0x290 <dca6bfc7>)
(T1/#17) dpkg  4362 0 5 0001 0011 [380181325582] 0.016ms (+0.001ms): kmem_cache_alloc+0xb/0x70 <c013dc6b> (uhci_alloc_urb_priv+0x1c/0x80 <dca6aecc>)
(T1/#18) dpkg  4362 0 5 0001 0012 [380181326332] 0.017ms (+0.000ms): usb_check_bandwidth+0xc/0x140 <dcaba2fc> (uhci_urb_enqueue+0x200/0x290 <dca6c140>)
(T1/#19) dpkg  4362 0 5 0001 0013 [380181326926] 0.018ms (+0.001ms): usb_calc_bus_time+0x9/0x270 <dcaba089> (usb_check_bandwidth+0x6b/0x140 <dcaba35b>)
(T1/#20) dpkg  4362 0 5 0001 0014 [380181327893] 0.020ms (+0.001ms): uhci_submit_common+0xe/0x380 <dca6b77e> (uhci_urb_enqueue+0x239/0x290 <dca6c179>)
(T1/#21) dpkg  4362 0 5 0001 0015 [380181328984] 0.021ms (+0.001ms): uhci_alloc_td+0xb/0x80 <dca6a5bb> (uhci_submit_common+0xf0/0x380 <dca6b860>)
(T1/#22) dpkg  4362 0 5 0001 0016 [380181329685] 0.023ms (+0.002ms): dma_pool_alloc+0xe/0x1a0 <c02051fe> (uhci_alloc_td+0x20/0x80 <dca6a5d0>)
(T1/#23) dpkg  4362 0 5 0001 0017 [380181331207] 0.025ms (+0.000ms): usb_get_dev+0x9/0x30 <dcab5939> (uhci_alloc_td+0x69/0x80 <dca6a619>)
(T1/#24) dpkg  4362 0 5 0001 0018 [380181331544] 0.026ms (+0.000ms): get_device+0x8/0x30 <c02012d8> (usb_get_dev+0x19/0x30 <dcab5949>)
(T1/#25) dpkg  4362 0 5 0001 0019 [380181331882] 0.026ms (+0.000ms): kobject_get+0x9/0x30 <c01d7869> (get_device+0x1a/0x30 <c02012ea>)
(T1/#26) dpkg  4362 0 5 0001 001a [380181332215] 0.027ms (+0.000ms): 

Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-02-19 Thread Lee Revell
On Sat, 2005-02-19 at 15:45 -0500, Lee Revell wrote:
> I have not tried "data=journal".  As previously stated "data=writeback"
> works perfectly - I ran JACK overnight while stressing the fs and did
> not get one xrun.

"data=journal" has the same good performance as "data=writeback".  Only
the ordered data mode is affected.

Lee



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-02-19 Thread Ingo Molnar

* Lee Revell <[EMAIL PROTECTED]> wrote:

> On Fri, 2005-02-04 at 11:03 +0100, Ingo Molnar wrote:
> >   http://redhat.com/~mingo/realtime-preempt/
> > 
> 
> Testing on an all SCSI 1.3Ghz Athlon XP system, I am seeing very long
> latencies in the journalling code with 2.6.11-rc4-RT-V0.7.39-02.

could you send me the full trace?

Ingo


Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-02-19 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> > Testing on an all SCSI 1.3Ghz Athlon XP system, I am seeing very long
> > latencies in the journalling code with 2.6.11-rc4-RT-V0.7.39-02.
> 
> could you send me the full trace?

just in case the system in question is still running - could you also do 
a 'verbose' trace via:

echo 1 > /proc/sys/kernel/trace_verbose

and then copying /proc/latency_trace again? (so that we can see the
precise function call offsets - journal_commit_transaction() is a long
function.)

Ingo


Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-02-18 Thread Lee Revell
On Sat, 2005-02-19 at 00:08 -0500, Lee Revell wrote:
> On Fri, 2005-02-04 at 11:03 +0100, Ingo Molnar wrote:
> >   http://redhat.com/~mingo/realtime-preempt/
> > 
> 
> Testing on an all SCSI 1.3Ghz Athlon XP system, I am seeing very long
> latencies in the journalling code with 2.6.11-rc4-RT-V0.7.39-02.

If I mount all filesystems with 'data=writeback', it works perfectly.  I
can run 'dbench 64', JACK with Hydrogen at 32 frames and have been
unable to produce a single xrun.  The maximum wakeup latency I have seen
is 139us.  With 'data=ordered', just launching a web browser can produce
an xrun.

Lee



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-02-18 Thread Lee Revell
On Fri, 2005-02-04 at 11:03 +0100, Ingo Molnar wrote:
>   http://redhat.com/~mingo/realtime-preempt/
> 

Testing on an all SCSI 1.3Ghz Athlon XP system, I am seeing very long
latencies in the journalling code with 2.6.11-rc4-RT-V0.7.39-02.

preemption latency trace v1.1.4 on 2.6.11-rc4-RT-V0.7.39-02

 latency: 713 µs, #3455/3455, CPU#0 | (M:preempt VP:0, KP:1, SP:1 HP:1 #P:1)
-
| task: ksoftirqd/0-2 (uid:0 nice:-10 policy:0 rt_prio:0)
-

 _--=> CPU#
/ _-=> irqs-off
   | / _=> need-resched
   || / _---=> hardirq/softirq 
   ||| / _--=> preempt-depth   
    /
   | delay
   cmd pid | time  |   caller  
  \   /|   \   |   /
kjournal-2478  0dn.40µs!: <756f6a6b> (<6c616e72>)
kjournal-2478  0dn.40µs : __trace_start_sched_wakeup (try_to_wake_up)
kjournal-2478  0dn.30µs : preempt_schedule (try_to_wake_up)
kjournal-2478  0dn.30µs : try_to_wake_up <<...>-2> (69 73): 
kjournal-2478  0dn.20µs : preempt_schedule (try_to_wake_up)
kjournal-2478  0dn.20µs : wake_up_process (do_softirq)
kjournal-2478  0dn.11µs < (1)

The repeating pattern is 8 of these:

kjournal-2478  0.n.11µs : inverted_lock (journal_commit_transaction)
kjournal-2478  0.n.11µs : __journal_unfile_buffer 
(journal_commit_transaction)
kjournal-2478  0.n.11µs : journal_remove_journal_head 
(journal_commit_transaction)
kjournal-2478  0.n.11µs : __journal_remove_journal_head 
(journal_remove_journal_head)
kjournal-2478  0.n.11µs : __brelse (__journal_remove_journal_head)
kjournal-2478  0.n.11µs : journal_free_journal_head 
(journal_remove_journal_head)
kjournal-2478  0.n.12µs : kmem_cache_free (journal_free_journal_head)

and one of these:

kjournal-2478  0dn.19µs : cache_flusharray (kmem_cache_free)
kjournal-2478  0dn.29µs : free_block (cache_flusharray)
kjournal-2478  0dn.1   11µs : preempt_schedule (cache_flusharray)
kjournal-2478  0dn.1   11µs : memmove (cache_flusharray)
kjournal-2478  0dn.1   11µs : memcpy (memmove)

etc.  Finally:

kjournal-2478  0dn.1  704µs : cache_flusharray (kmem_cache_free)
kjournal-2478  0dn.2  704µs+: free_block (cache_flusharray)
kjournal-2478  0dn.1  707µs : preempt_schedule (cache_flusharray)
kjournal-2478  0dn.1  707µs : memmove (cache_flusharray)
kjournal-2478  0dn.1  707µs : memcpy (memmove)
kjournal-2478  0.n.1  708µs : inverted_lock (journal_commit_transaction)
kjournal-2478  0.n.1  708µs : __journal_unfile_buffer 
(journal_commit_transaction)
kjournal-2478  0.n.1  709µs : journal_remove_journal_head 
(journal_commit_transaction)
kjournal-2478  0.n.1  709µs : __journal_remove_journal_head 
(journal_remove_journal_head)
kjournal-2478  0.n.1  709µs : __brelse (__journal_remove_journal_head)
kjournal-2478  0.n.1  709µs : journal_free_journal_head 
(journal_remove_journal_head)
kjournal-2478  0.n.1  709µs : kmem_cache_free (journal_free_journal_head)
kjournal-2478  0.n..  710µs : preempt_schedule (journal_commit_transaction)
kjournal-2478  0dn..  710µs : __schedule (preempt_schedule)
kjournal-2478  0dn..  710µs : profile_hit (__schedule)
kjournal-2478  0dn.1  710µs : sched_clock (__schedule)
kjournal-2478  0dn.2  711µs : dequeue_task (__schedule)
kjournal-2478  0dn.2  711µs : recalc_task_prio (__schedule)
kjournal-2478  0dn.2  711µs : effective_prio (recalc_task_prio)
kjournal-2478  0dn.2  711µs : enqueue_task (__schedule)
   <...>-2 0d..2  712µs : __switch_to (__schedule)
   <...>-2 0d..2  712µs : __schedule <kjournal-2478> (73 69):
   <...>-2 0d..2  712µs : finish_task_switch (__schedule)
   <...>-2 0d..1  712µs : trace_stop_sched_switched (finish_task_switch)
   <...>-2 0d..1  712µs : trace_stop_sched_switched <<...>-2> (69 0):
   <...>-2 0d..1  713µs : trace_stop_sched_switched (finish_task_switch)

Lee



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-02-18 Thread Lee Revell
On Sat, 2005-02-19 at 00:08 -0500, Lee Revell wrote:
> On Fri, 2005-02-04 at 11:03 +0100, Ingo Molnar wrote:
> >   http://redhat.com/~mingo/realtime-preempt/
> > 
> 
> Testing on an all SCSI 1.3Ghz Athlon XP system, I am seeing very long
> latencies in the journalling code with 2.6.11-rc4-RT-V0.7.39-02.

If I mount all filesystems with 'data=writeback', it works perfectly.  I
can run 'dbench 64', JACK with Hydrogen at 32 frames and have been
unable to produce a single xrun.  The maximum wakeup latency I have seen
is 139us.  With 'data=ordered', just launching a web browser can produce
an xrun.

Lee



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-02-13 Thread Steven Rostedt
On Sun, 2005-02-13 at 13:59 +0100, Ingo Molnar wrote:

> yeah - it's "M" already in fs/proc/array.c, but i missed the sched.c
> case.
> 

You also missed the kernel/rt.c case :-)

-- Steve


Index: kernel/rt.c
===================================================================
--- kernel/rt.c (revision 75)
+++ kernel/rt.c (working copy)
@@ -207,6 +207,7 @@
 {
 	switch (p->state) {
 	case TASK_RUNNING:		printk("R"); break;
+	case TASK_RUNNING_MUTEX:	printk("M"); break;
 	case TASK_INTERRUPTIBLE:	printk("s"); break;
 	case TASK_UNINTERRUPTIBLE:	printk("D"); break;
 	case TASK_STOPPED:		printk("T"); break;


This is still from the 38-06.





Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-02-13 Thread Ingo Molnar

* Steven Rostedt <[EMAIL PROTECTED]> wrote:

> Ingo,
> 
> Here's a trivial patch to help others from freaking out when they see
> on a show_trace that most of their processes are TASK_UNINTERRUPTIBLE. 

thanks, applied it to -39-00.

> - static const char *stat_nam[] = { "R", "S", "D", "T", "t", "Z", "X" };
> + static const char *stat_nam[] = { "R", "M", "S", "D", "T", "t", "Z", 
> "X" };

> I figure that "M" would be a good fit for TASK_RUNNING_MUTEX.

yeah - it's "M" already in fs/proc/array.c, but i missed the sched.c
case.

Ingo


Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-02-11 Thread Steven Rostedt
Ingo,

Here's a trivial patch to help others from freaking out when they see on
a show_trace that most of their processes are TASK_UNINTERRUPTIBLE. 

Index: kernel/sched.c
===================================================================
--- kernel/sched.c  (revision 75)
+++ kernel/sched.c  (working copy)
@@ -4489,7 +4489,7 @@
 	task_t *relative;
 	unsigned state;
 	unsigned long free = 0;
-	static const char *stat_nam[] = { "R", "S", "D", "T", "t", "Z", "X" };
+	static const char *stat_nam[] = { "R", "M", "S", "D", "T", "t", "Z", "X" };
 
 	printk("%-13.13s [%p]", p->comm, p);
 	state = p->state ? __ffs(p->state) + 1 : 0;


I figure that "M" would be a good fit for TASK_RUNNING_MUTEX.

-- Steve
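
A note on why the position of "M" in the table matters, written as a
user-space sketch rather than kernel code. The context line in the patch
above computes the table index as p->state ? __ffs(p->state) + 1 : 0, so
each letter has to sit at the slot that matches the bit position of its
state. The bit values below are assumptions picked for illustration (the
thread does not quote the -RT definitions), __builtin_ctzl is a GCC/Clang
builtin standing in for the kernel's __ffs, and every demo_/DEMO_ name is
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* ASSUMED state bits, for illustration only. The real values live in the
 * -RT tree's include/linux/sched.h and may differ. */
enum {
    DEMO_TASK_RUNNING         = 0,
    DEMO_TASK_RUNNING_MUTEX   = 1,  /* assumed to take the lowest bit */
    DEMO_TASK_INTERRUPTIBLE   = 2,
    DEMO_TASK_UNINTERRUPTIBLE = 4,
    DEMO_TASK_STOPPED         = 8
};

/* The two tables from the patch: before and after adding "M". */
const char *const demo_old_stat_nam[] = { "R", "S", "D", "T", "t", "Z", "X" };
const char *const demo_new_stat_nam[] = { "R", "M", "S", "D", "T", "t", "Z", "X" };

/* Same arithmetic as the quoted sched.c line: 0 for "running", otherwise
 * (zero-based index of the lowest set bit) + 1. */
unsigned demo_state_index(unsigned long state)
{
    return state ? (unsigned)__builtin_ctzl(state) + 1 : 0;
}

/* Look up the letter for a state in a table of `len` entries. */
const char *demo_state_name(const char *const *table, size_t len,
                            unsigned long state)
{
    unsigned idx = demo_state_index(state);

    assert(idx < len);  /* guard against a table shorter than the bitmask */
    return table[idx];
}
```

Under these assumed values an interruptible sleeper indexes "D" in the old
seven-entry table and "S" in the new eight-entry one, which would be
consistent with the misleading output the patch is meant to fix.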




Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-02-11 Thread Ingo Molnar

* Sven Dietrich <[EMAIL PROTECTED]> wrote:

> > this patch only changes xtime_lock back and forth - it does 
> > in no way impact the 'threadedness' of the timer IRQ. (it 
> > does not move the timer IRQ into an interrupt thread.)
> > 
> > nor do we really want to make it configurable - it's 
> > non-threaded right now and we'll see what effect this has on 
> > the worst-case latencies. 
> 
> It's clear that there are all sorts of issues with process accounting
> and other race conditions associated with running the timer in a
> thread.
> 
> The timer IRQ does have a noticeable impact especially on the slower
> CPUs. In this domain, precise process time accounting may not be all
> that important, as long as the scheduler does not get confused, and
> that lone NODELAY IRQ doesn't get delayed (as much).

well, i saved the delta when i removed threaded timer IRQs, find the
patch below, apply it with -R to -RT-V0.7.37-00 to get threaded irqs
back on x86.

Right now i dont plan to reintroduce threaded timer IRQs because it
causes architecture merging problems (e.g. on x64 and MIPS) and also
caused artifacts. So the complexity vs. latency benefit is not all that
clear, especially at this stage. Also note that there were unsolved
problems wrt. time handling in the threaded setup.

(we can try it again later on. But if we do so it will have to be an
all-or-nothing item - #ifdef hell and behavioral divergence is to be
avoided.)

Ingo

--- linux.old/Makefile  
+++ linux.new/Makefile  
@@ -1,7 +1,7 @@
 VERSION = 2
 PATCHLEVEL = 6
 SUBLEVEL = 11
-EXTRAVERSION =-rc2-RT-V0.7.36-06
+EXTRAVERSION =-rc2-RT-V0.7.37-00
 NAME=Woozy Numbat
 
 # *DOCUMENTATION*
--- linux.old/arch/i386/kernel/irq.c
+++ linux.new/arch/i386/kernel/irq.c
@@ -70,8 +70,6 @@ fastcall notrace unsigned int do_IRQ(str
}
}
 #endif
-   if (unlikely(!irq))
-   direct_timer_interrupt(regs);
 
 #ifdef CONFIG_4KSTACKS
 
--- linux.old/arch/i386/kernel/time.c   
+++ linux.new/arch/i386/kernel/time.c   
@@ -82,7 +82,7 @@ unsigned long cpu_khz;/* Detected as we
 
 extern unsigned long wall_jiffies;
 
-DEFINE_SPINLOCK(rtc_lock);
+DEFINE_RAW_SPINLOCK(rtc_lock);
 
 #include <asm/i8253.h>
 
@@ -217,19 +217,6 @@ unsigned long notrace profile_pc(struct 
 EXPORT_SYMBOL(profile_pc);
 #endif
 
-#ifdef CONFIG_PREEMPT_HARDIRQS
-
-/*
- * If the timer is redirected then this is the minimal
- * interrupt-context processing we have to do:
- */
-void direct_timer_interrupt(struct pt_regs *regs)
-{
-   do_timer_interrupt_hook(regs);
-}
-
-#endif
-
 /*
  * timer_interrupt() needs to keep up the real-time clock,
  * as well as call the "do_timer()" routine every clocktick
@@ -254,9 +241,7 @@ static inline void do_timer_interrupt(in
}
 #endif
 
-#ifndef CONFIG_PREEMPT_HARDIRQS
do_timer_interrupt_hook(regs);
-#endif
 
/*
 * If we have an externally synchronized Linux clock, then update
@@ -313,7 +298,6 @@ irqreturn_t timer_interrupt(int irq, voi
write_seqlock(&xtime_lock);
 
cur_timer->mark_offset();
-   do_timer(regs);
  
do_timer_interrupt(irq, NULL, regs);
 
--- linux.old/arch/i386/mach-default/setup.c
+++ linux.new/arch/i386/mach-default/setup.c
@@ -71,7 +71,7 @@ void __init trap_init_hook(void)
 {
 }
 
-static struct irqaction irq0  = { timer_interrupt, SA_INTERRUPT, 
CPU_MASK_NONE, "timer", NULL, NULL};
+static struct irqaction irq0  = { timer_interrupt, SA_INTERRUPT | SA_NODELAY, 
CPU_MASK_NONE, "timer", NULL, NULL};
 
 /**
  * time_init_hook - do any specific initialisations for the system timer.
--- linux.old/drivers/char/rtc.c
+++ linux.new/drivers/char/rtc.c
@@ -380,6 +380,8 @@ static inline void rtc_close_event(void)
 
 irqreturn_t rtc_interrupt(int irq, void *dev_id, struct pt_regs *regs)
 {
+   int mod;
+
/*
 *  Can be an alarm interrupt, update complete interrupt,
 *  or a periodic interrupt. We store the status in the
@@ -401,10 +403,13 @@ irqreturn_t rtc_interrupt(int irq, void 
rtc_irq_data |= (CMOS_READ(RTC_INTR_FLAGS) & 0xF0);
}
 
+   mod = 0;
if (rtc_status & RTC_TIMER_ON)
-   mod_timer(&rtc_irq_timer, jiffies + HZ/rtc_freq + 2*HZ/100);
+   mod = 1;
 
spin_unlock (&rtc_lock);
+   if (mod)
+   mod_timer(&rtc_irq_timer, jiffies + HZ/rtc_freq + 2*HZ/100);
 
/* Now do the rest of the actions */
spin_lock(&rtc_task_lock);
@@ -569,8 +574,8 @@ static int rtc_do_ioctl(unsigned int cmd
if (rtc_status & RTC_TIMER_ON) {
spin_lock_irq (&rtc_lock);
rtc_status &= ~RTC_TIMER_ON;
-   del_timer(&rtc_irq_timer);
spin_unlock_irq (&rtc_lock);
+   del_timer(&rtc_irq_timer);
}
return 0;
}
@@ -588,9 +593,9 @@ static int rtc_do_ioctl(unsigned int cmd
if (!(rtc_status & 

RE: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-02-11 Thread Sven Dietrich


Ingo wrote:

> 
> * Sven Dietrich <[EMAIL PROTECTED]> wrote:
> 
> > This patch adds a config option to allow you to select whether timer
> > IRQ runs in thread or not.
> 
> this patch only changes xtime_lock back and forth - it does 
> in no way impact the 'threadedness' of the timer IRQ. (it 
> does not move the timer IRQ into an interrupt thread.)
> 
> nor do we really want to make it configurable - it's 
> non-threaded right now and we'll see what effect this has on 
> the worst-case latencies. 
> 
>   Ingo
> 

It's clear that there are all sorts of issues
with process accounting and other race conditions
associated with running the timer in a thread.

The timer IRQ does have a noticeable impact
especially on the slower CPUs. In this domain,
precise process time accounting may not be 
all that important, as long as the scheduler
does not get confused, and that lone NODELAY
IRQ doesn't get delayed (as much).

It would be nice if some of the process 
accounting could be pipelined or deferred,
but I don't have those answers right now.

Sven



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-02-11 Thread Ingo Molnar

* Sven Dietrich <[EMAIL PROTECTED]> wrote:

> No, this is not in arm. Here is the patch.
> 
> Index: linux-2.6.10/include/asm-i386/spinlock.h

what version do you have? The current released patch is
2.6.11-rc3-V0.7.38-10.

Ingo


RE: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-02-11 Thread Sven Dietrich

No, this is not in arm. Here is the patch.

Index: linux-2.6.10/include/asm-i386/spinlock.h
===================================================================
--- linux-2.6.10.orig/include/asm-i386/spinlock.h  2005-02-11 09:25:39.224240321 +
+++ linux-2.6.10/include/asm-i386/spinlock.h   2005-02-11 09:25:58.006812173 +
@@ -30,7 +30,7 @@
 
 #define __raw_spin_is_locked(x)	(*(volatile signed char *)(&(x)->lock) <= 0)
 #define __raw_spin_unlock_wait(x) \
-	do { barrier(); } while(__spin_is_locked(x))
+	do { barrier(); } while(__raw_spin_is_locked(x))
 
 #define spin_lock_string \
 	"\n1:\t" \




> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Ingo Molnar
> Sent: Friday, February 11, 2005 12:34 AM
> To: George Anzinger
> Cc: William Weston; linux-kernel@vger.kernel.org
> Subject: Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01
> 
> 
> 
> * George Anzinger  wrote:
> 
> > Possibly from:
> > define __raw_spin_is_locked(x)  (*(volatile signed char 
> *)(&(x)->lock) <= 0)
> > #define __raw_spin_unlock_wait(x) \
> > do { barrier(); } while(__spin_is_locked(x))
> > in asm/spinlock.h
> > 
> > should that be __raw_spin_is_locked(x) instead?
> 
> yeah. Is this in the ARM patch? I havent applied the ARM 
> patch yet, waiting to see Thomas Gleixner's generic-hardirq 
> based one. (which is more compelling from an architectural 
> and long-term maintainance POV - but also more work to 
> address all of RMK's concerns.)
> 
>   Ingo
> -
> To unsubscribe from this list: send the line "unsubscribe 
> linux-kernel" in the body of a message to 
> [EMAIL PROTECTED] More majordomo info at  
> http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 



Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-02-11 Thread Ingo Molnar

* George Anzinger <george@mvista.com> wrote:

> Possibly from:
> define __raw_spin_is_locked(x)(*(volatile signed char *)(&(x)->lock) 
> <= 0)
> #define __raw_spin_unlock_wait(x) \
>   do { barrier(); } while(__spin_is_locked(x))
> in asm/spinlock.h
> 
> should that be __raw_spin_is_locked(x) instead?

yeah. Is this in the ARM patch? I havent applied the ARM patch yet,
waiting to see Thomas Gleixner's generic-hardirq based one. (which is
more compelling from an architectural and long-term maintainance POV -
but also more work to address all of RMK's concerns.)

Ingo
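
The one-identifier fix discussed above can be sketched in user space as
follows. This is only an illustration under stated assumptions: the lock
type is a simplified stand-in in which 1 means unlocked and zero or a
negative value means held (inferred from the quoted "<= 0" test), the
barrier uses GCC/Clang inline-asm syntax, and all demo_ names are
hypothetical rather than kernel identifiers.

```c
#include <assert.h>

/* Simplified stand-in for the i386 raw spinlock.
 * ASSUMPTION: 1 == unlocked, <= 0 == held. */
typedef struct { volatile signed char lock; } demo_raw_spinlock_t;

/* Mirrors the quoted __raw_spin_is_locked() test. */
#define demo_raw_spin_is_locked(x) \
    (*(volatile signed char *)(&(x)->lock) <= 0)

/* Compiler-only barrier, in the spirit of the kernel's barrier(). */
#define demo_barrier() __asm__ __volatile__("" ::: "memory")

/* Corrected form: the wait loop polls the raw helper that matches the raw
 * lock type, instead of a non-raw __spin_is_locked(). */
#define demo_raw_spin_unlock_wait(x) \
    do { demo_barrier(); } while (demo_raw_spin_is_locked(x))

/* Spin until the lock is observed free, then report its state (1 == free).
 * In a single thread this only returns if the lock is already unlocked. */
int demo_wait_until_free(demo_raw_spinlock_t *l)
{
    assert(l);
    demo_raw_spin_unlock_wait(l);
    return !demo_raw_spin_is_locked(l);
}
```

The point of the sketch is only that the wait macro and the is-locked
helper must agree on the lock type they inspect; presumably the non-raw
helper no longer operates on this structure once the RT patch redefines
spinlock_t.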


Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-02-11 Thread Ingo Molnar

* Sven Dietrich <[EMAIL PROTECTED]> wrote:

> This patch adds a config option to allow you to select whether timer
> IRQ runs in thread or not.

this patch only changes xtime_lock back and forth - it does in no way
impact the 'threadedness' of the timer IRQ. (it does not move the timer
IRQ into an interrupt thread.)

nor do we really want to make it configurable - it's non-threaded right
now and we'll see what effect this has on the worst-case latencies. 

Ingo


Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-02-10 Thread George Anzinger
Sven Dietrich wrote:
> Hi George,
> 
> you may want to use this for reference.
> 
> This patch adds a config option to allow you to select whether timer IRQ runs 
> in thread or not.
> 
> I'm not totally happy with the #ifdefs, but it may make switching back and forth
> easier.

Thanks, but...

You are addressing a different problem than I.  I want to code the VST patch to 
work in a system with or without the RT patch (it is easy to work with the RT 
option on or off).  The problem is setting up the spin locks it needs.  My 
solution assumes that RAW_SPIN_LOCK_UNLOCKED will not be defined unless the RT 
patch is applied.

As to your patch, in most archs the timer interrupt does accounting which 
requires input on just who was interrupted on the interrupt.  This is lost when 
threading the timer IRQ.  I think it was problems of this sort that caused Ingo 
to back away...

George
PS
By the way, your mailer (Microsoft Outlook) set up your attachment in such a 
way that my mailer would not inline it.  You might want to look into this.

> Sven
> 
> > -Original Message-
> > From: [EMAIL PROTECTED] 
> > [mailto:[EMAIL PROTECTED] On Behalf Of 
> > George Anzinger
> > Sent: Thursday, February 10, 2005 12:21 PM
> > To: Ingo Molnar
> > Cc: William Weston; linux-kernel@vger.kernel.org
> > Subject: Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01
> > 
> > If I want to write a patch that will work with or without the 
> > RT patch applied 
> > is the following enough?
> > 
> > #ifndef RAW_SPIN_LOCK_UNLOCKED
> > typedef raw_spinlock_t spinlock_t
> > #define RAW_SPIN_LOCK_UNLOCKED SPIN_LOCK_UNLOCKED
> > #endif
> > -- 
> > George Anzinger   george@mvista.com
> > High-res-timers:  http://sourceforge.net/projects/high-res-timers/
> > -
> > To unsubscribe from this list: send the line "unsubscribe 
> > linux-kernel" in the body of a message to 
> > [EMAIL PROTECTED] More majordomo info at  
> > http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/

--
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/
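
A minimal sketch of the compatibility shim asked about in the quoted
question, assuming (as the message above does) that RAW_SPIN_LOCK_UNLOCKED
is defined only when the RT patch is applied. C's typedef takes the
existing type first and the new name second and needs a terminating
semicolon, so the sketch writes "typedef spinlock_t raw_spinlock_t;",
which is presumably what was intended. The spinlock_t and
SPIN_LOCK_UNLOCKED definitions here are user-space stand-ins so the
fragment compiles on its own, and demo_vst_lock is a hypothetical name.

```c
#include <assert.h>

/* User-space stand-ins; in a kernel tree these come from
 * <linux/spinlock.h>. ASSUMPTION: 1 marks an unlocked lock. */
typedef struct { volatile int slock; } spinlock_t;
#define SPIN_LOCK_UNLOCKED { 1 }

/* The shim itself: it only takes effect when the RT patch has not already
 * provided the raw lock type and its initializer. */
#ifndef RAW_SPIN_LOCK_UNLOCKED
typedef spinlock_t raw_spinlock_t;
#define RAW_SPIN_LOCK_UNLOCKED SPIN_LOCK_UNLOCKED
#endif

/* Hypothetical lock owned by the out-of-tree patch. */
raw_spinlock_t demo_vst_lock = RAW_SPIN_LOCK_UNLOCKED;

/* Sanity check that the alias and initializer behave like spinlock_t. */
void demo_check_shim(void)
{
    assert(sizeof(raw_spinlock_t) == sizeof(spinlock_t));
    assert(demo_vst_lock.slock == 1);
}
```

Keying the #ifndef on the initializer macro rather than on a config symbol
keeps the patch independent of how the RT tree names its options, at the
cost of breaking if that macro is ever defined without the type.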


RE: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01

2005-02-10 Thread Sven Dietrich

Hi George,

you may want to use this for reference.

This patch adds a config option to allow you to select whether timer IRQ runs 
in thread or not.

I'm not totally happy with the #ifdefs, but it may make switching back and forth 
easier.

Sven


> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> George Anzinger
> Sent: Thursday, February 10, 2005 12:21 PM
> To: Ingo Molnar
> Cc: William Weston; linux-kernel@vger.kernel.org
> Subject: Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01
> 
> 
> If I want to write a patch that will work with or without the 
> RT patch applied 
> is the following enough?
> 
> #ifndef RAW_SPIN_LOCK_UNLOCKED
> typedef raw_spinlock_t spinlock_t
> #define RAW_SPIN_LOCK_UNLOCKED SPIN_LOCK_UNLOCKED
> #endif
> 
> 
> -- 
> George Anzinger   george@mvista.com
> High-res-timers:  http://sourceforge.net/projects/high-res-timers/
> 
> -
> To unsubscribe from this list: send the line "unsubscribe 
> linux-kernel" in the body of a message to 
> [EMAIL PROTECTED] More majordomo info at  
> http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 


common_timer_irqthread.patch
Description: Binary data

