Re: [Xenomai-core] [PATCH 1/4] Refactor generic system.h

2008-03-01 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
> > Jan Kiszka wrote:
> > > In order to allow further optimizations of xnlock, I started with
> > > refactoring the related system.h. This improves the readability
> > > significantly, IMHO. It also happens to reduce the text size of
> > > __xnlock_get a bit by avoiding redundant rthal_processor_id read-outs.
> > >
> > > Another quirk I happened to remove: xnlock debugging depends on
> > > XENO_OPT_DEBUG_NUCLEUS, but needlessly we used to pick the debug version
> > > of xnlock_t already with XENO_OPT_DEBUG.
> >
> > There are a lot of whitespace changes in this patch, which makes it
> > hard to read.
>
> Well, this patch is mostly about whitespace and formatting fixes (among
> which I count ifdef reduction as well). But I can split it up if
> desired.
>
> > Anyway, there are a few things I do not like in this patch:
> > - macros which make reference to symbols defined elsewhere
>
> You mean XNLOCK_DBG_PREPARE_ACQUIRE vs. XNLOCK_DBG_SPINNING/ACQUIRED?
> Granted, not nice, but so far the most compact approach I found. My goal
> was to keep the lock implementations as pure as possible (you can easily
> ignore the debug stuff now when reading xnlock_get/put).
>
> > - function arguments as macros; I find the #ifdef with the different
> >   function prototypes more readable, since the code can be read without
> >   having to look at a different place.
>
> I'm open to learning a third way to achieve what we need. I'm just
> convinced that the old way was far worse.
>
> When looking for a better suggestion, please consider that the number
> of variants increases with my ticket lock. That's why I tried to stuff
> things into macros. Hmm, maybe we should simply get rid of the
> file/line/function stuff completely and switch to IP + ksyms. What do
> you think?

I do not want to leave this at a dead end. IMO, your approach makes
xnlock_get readable in the non-debugging case at the expense of its
readability in the debugging case. I would rather see the two
implementations under a single ifdef. Granted, there will be some code
duplication, but not that much, and this allows us to move
the debugging version out of line while keeping the non-debugging case
inline.
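
To illustrate the layout meant here, a minimal user-space sketch follows.
All names (sk_lock_t, sk_lock_get, SK_DEBUG_LOCK) are made up for
illustration and are not the nucleus API, and the locking is reduced to a
single-threaded owner check with no spinning or atomics. Only the shape
matters: one #ifdef, two complete implementations, the debug one as a
regular function and the non-debug one inline.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the nucleus types.
 * Build with -DSK_DEBUG_LOCK to select the debug variant. */
#ifdef SK_DEBUG_LOCK

typedef struct {
	int owner;		/* -1 when free, else the owning CPU */
	const char *file;	/* acquisition site, debug only */
	const char *function;
	unsigned line;
} sk_lock_t;

#define SK_LOCK_INIT	{ -1, NULL, NULL, 0 }

/* Debug variant: in a real tree this body would sit in a .c file. */
int sk_lock_get_debug(sk_lock_t *lock, int cpu, const char *file,
		      unsigned line, const char *function)
{
	if (lock->owner == cpu)
		return 1;	/* recursive acquisition */
	lock->owner = cpu;	/* sketch only: no spinning */
	lock->file = file;
	lock->line = line;
	lock->function = function;
	return 0;
}

#define sk_lock_get(lock, cpu) \
	sk_lock_get_debug(lock, cpu, __FILE__, __LINE__, __func__)

#else /* !SK_DEBUG_LOCK */

typedef struct {
	int owner;		/* -1 when free, else the owning CPU */
} sk_lock_t;

#define SK_LOCK_INIT	{ -1 }

/* Non-debug variant: small enough to stay inline. */
static inline int sk_lock_get(sk_lock_t *lock, int cpu)
{
	if (lock->owner == cpu)
		return 1;	/* recursive acquisition */
	lock->owner = cpu;
	return 0;
}

#endif /* !SK_DEBUG_LOCK */

/* Release is identical in both variants in this sketch. */
static inline void sk_lock_put(sk_lock_t *lock)
{
	lock->owner = -1;
}

/* Exercises whichever variant was compiled in. */
static inline void sk_lock_selfcheck(void)
{
	sk_lock_t lock = SK_LOCK_INIT;

	assert(sk_lock_get(&lock, 0) == 0);	/* first acquisition */
	assert(sk_lock_get(&lock, 0) == 1);	/* same CPU again */
	sk_lock_put(&lock);
	assert(lock.owner == -1);
}
```

The duplicated part is the owner check and assignment; everything
debug-specific stays inside the first branch of the #ifdef.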

-- 


Gilles Chanteperdrix.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [PATCH 1/4] Refactor generic system.h

2008-03-01 Thread Jan Kiszka

Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
> > Gilles Chanteperdrix wrote:
> > > Jan Kiszka wrote:
> > > > In order to allow further optimizations of xnlock, I started with
> > > > refactoring the related system.h. This improves the readability
> > > > significantly, IMHO. It also happens to reduce the text size of
> > > > __xnlock_get a bit by avoiding redundant rthal_processor_id read-outs.
> > > >
> > > > Another quirk I happened to remove: xnlock debugging depends on
> > > > XENO_OPT_DEBUG_NUCLEUS, but needlessly we used to pick the debug version
> > > > of xnlock_t already with XENO_OPT_DEBUG.
> > >
> > > There are a lot of whitespace changes in this patch, which makes it
> > > hard to read.
> >
> > Well, this patch is mostly about whitespace and formatting fixes (among
> > which I count ifdef reduction as well). But I can split it up if
> > desired.
> >
> > > Anyway, there are a few things I do not like in this patch:
> > > - macros which make reference to symbols defined elsewhere
> >
> > You mean XNLOCK_DBG_PREPARE_ACQUIRE vs. XNLOCK_DBG_SPINNING/ACQUIRED?
> > Granted, not nice, but so far the most compact approach I found. My goal
> > was to keep the lock implementations as pure as possible (you can easily
> > ignore the debug stuff now when reading xnlock_get/put).
> >
> > > - function arguments as macros; I find the #ifdef with the different
> > >   function prototypes more readable, since the code can be read without
> > >   having to look at a different place.
> >
> > I'm open to learning a third way to achieve what we need. I'm just
> > convinced that the old way was far worse.
> >
> > When looking for a better suggestion, please consider that the number
> > of variants increases with my ticket lock. That's why I tried to stuff
> > things into macros. Hmm, maybe we should simply get rid of the
> > file/line/function stuff completely and switch to IP + ksyms. What do
> > you think?
>
> I do not want to leave this at a dead end. IMO, your approach makes
> xnlock_get readable in the non-debugging case at the expense of its
> readability in the debugging case. I would rather see the two
> implementations under a single ifdef. Granted, there will be some code
> duplication, but not that much, and this allows us to move
> the debugging version out of line while keeping the non-debugging case
> inline.


Don't panic. I'm sitting on a new version of this patch series, only 
running a final benchmark to estimate the gain. This unfortunately takes 
a lot of time.


BTW, the series will also add lock debugging for UP, and it beautifies 
the refactoring a bit more, addressing your concerns.


Jan





[Xenomai-core] [PATCH 1/4] Refactor generic system.h

2008-02-23 Thread Jan Kiszka
In order to allow further optimizations of xnlock, I started with
refactoring the related system.h. This improves the readability
significantly, IMHO. It also happens to reduce the text size of
__xnlock_get a bit by avoiding redundant rthal_processor_id read-outs.

Another quirk I happened to remove: xnlock debugging depends on
XENO_OPT_DEBUG_NUCLEUS, but needlessly we used to pick the debug version
of xnlock_t already with XENO_OPT_DEBUG.

Jan
---
 include/asm-generic/system.h |  412 +--
 1 file changed, 204 insertions(+), 208 deletions(-)

Index: b/include/asm-generic/system.h
===
--- a/include/asm-generic/system.h
+++ b/include/asm-generic/system.h
@@ -47,18 +47,22 @@
 #define CONFIG_XENO_OPT_DEBUG_NUCLEUS 0
 #endif
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
 /* Time base export */
 #define xnarch_declare_tbase(base)		do { } while(0)
 
 /* Tracer interface */
 #define xnarch_trace_max_begin(v)		rthal_trace_max_begin(v)
-#define xnarch_trace_max_end(v)		rthal_trace_max_end(v)
+#define xnarch_trace_max_end(v)			rthal_trace_max_end(v)
 #define xnarch_trace_max_reset()		rthal_trace_max_reset()
 #define xnarch_trace_user_start()		rthal_trace_user_start()
 #define xnarch_trace_user_stop(v)		rthal_trace_user_stop(v)
-#define xnarch_trace_user_freeze(v, once) 	rthal_trace_user_freeze(v, once)
+#define xnarch_trace_user_freeze(v, once)	rthal_trace_user_freeze(v, once)
 #define xnarch_trace_special(id, v)		rthal_trace_special(id, v)
-#define xnarch_trace_special_u64(id, v)	rthal_trace_special_u64(id, v)
+#define xnarch_trace_special_u64(id, v)		rthal_trace_special_u64(id, v)
 #define xnarch_trace_pid(pid, prio)		rthal_trace_pid(pid, prio)
 #define xnarch_trace_panic_freeze()		rthal_trace_panic_freeze()
 #define xnarch_trace_panic_dump()		rthal_trace_panic_dump()
@@ -81,26 +85,32 @@ typedef unsigned long spl_t;
 #define spltest()   rthal_local_irq_test()
 #define splget(x)   rthal_local_irq_flags(x)
 
-#if defined(CONFIG_SMP) && defined(CONFIG_XENO_OPT_DEBUG)
+static inline unsigned xnarch_current_cpu(void)
+{
+	return rthal_processor_id();
+}
+
+#if defined(CONFIG_SMP) && XENO_DEBUG(NUCLEUS)
+
 typedef struct {
 
-unsigned long long spin_time;
-unsigned long long lock_time;
-const char *file;
-const char *function;
-unsigned line;
+	unsigned long long spin_time;
+	unsigned long long lock_time;
+	const char *file;
+	const char *function;
+	unsigned line;
 
 } xnlockinfo_t;
 
 typedef struct {
 
-atomic_t owner;
-const char *file;
-const char *function;
-unsigned line;
-int cpu;
-unsigned long long spin_time;
-unsigned long long lock_date;
+	atomic_t owner;
+	const char *file;
+	const char *function;
+	unsigned line;
+	int cpu;
+	unsigned long long spin_time;
+	unsigned long long lock_date;
 
 } xnlock_t;
 
@@ -114,70 +124,137 @@ typedef struct {
 	0LL,	\
 }
 
-#else /* !(CONFIG_SMP && CONFIG_XENO_OPT_DEBUG) */
+#define XNLOCK_DBG_CONTEXT		, __FILE__, __LINE__, __FUNCTION__
+#define XNLOCK_DBG_CONTEXT_ARGS \
+	, const char *file, int line, const char *function
+#define XNLOCK_DBG_PASS_CONTEXT		, file, line, function
+
+#define XNLOCK_DBG_PREPARE_ACQUIRE()	\
+	unsigned long long __lock_date = rthal_rdtsc();			\
+	unsigned __spin_limit = 300
+
+#define XNLOCK_DBG_SPINNING()		\
+	do {				\
+		if (__spin_limit-- == 0) {	\
+			rthal_emergency_console();			\
+			printk(KERN_ERR	\
+			   "Xenomai: stuck on nucleus lock %p\n"	\
+			   "waiter = %s:%u (%s(), CPU #%d)\n"	\
+			   "owner  = %s:%u (%s(), CPU #%d)\n",	\
+			   lock, file, line, function, cpu, lock->file, \
+			   lock->line, lock->function, lock->cpu);	\
+			show_stack(NULL, NULL);	\
+			for (;;)	\
+				cpu_relax();	\
+		}			\
+	} while (0)
+
+#define XNLOCK_DBG_ACQUIRED()		\
+	do {				\
+		lock->spin_time = rthal_rdtsc() - __lock_date;		\
+		lock->lock_date = __lock_date;	\
+		lock->file = file;	\
+		lock->function = function;	\
+		lock->line = line;	\
+		lock->cpu = cpu;	\
+	} while (0)
+
+static inline void xnlock_dbg_release(xnlock_t *lock)
+{
+	extern xnlockinfo_t xnlock_stats[];
+	unsigned long long lock_time = rthal_rdtsc() - lock->lock_date;
+	xnlockinfo_t *stats = &xnlock_stats[xnarch_current_cpu()];
+
+	if (lock_time > stats->lock_time) {
+		stats->lock_time = lock_time;
+		stats->spin_time = lock->spin_time;
+		stats->file = lock->file;
+		stats->function = lock->function;
+		stats->line = lock->line;
+	}
+}
+
+static inline void xnlock_dbg_invalid_release(xnlock_t *lock)
+{
+	rthal_emergency_console();
+	printk(KERN_ERR "Xenomai: unlocking unlocked nucleus lock %p\n"
+			"owner  = %s:%u (%s(), CPU #%d)\n",
+	   lock, lock->file, lock->line, lock->function, lock->cpu);
+	show_stack(NULL,NULL);
+	for (;;)
+		cpu_relax();
+}
+
+#else /* !(CONFIG_SMP && XENO_DEBUG(NUCLEUS)) */

Re: [Xenomai-core] [PATCH 1/4] Refactor generic system.h

2008-02-23 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
> In order to allow further optimizations of xnlock, I started with
> refactoring the related system.h. This improves the readability
> significantly, IMHO. It also happens to reduce the text size of
> __xnlock_get a bit by avoiding redundant rthal_processor_id read-outs.
>
> Another quirk I happened to remove: xnlock debugging depends on
> XENO_OPT_DEBUG_NUCLEUS, but needlessly we used to pick the debug version
> of xnlock_t already with XENO_OPT_DEBUG.

There are a lot of whitespace changes in this patch, which makes it
hard to read.

Anyway, there are a few things I do not like in this patch:
- macros which make reference to symbols defined elsewhere
- function arguments as macros; I find the #ifdef with the different
  function prototypes more readable, since the code can be read without
  having to look at a different place.

Something that could be interesting would be the ability to enable
spinlock debugging in UP, which would enable real debugging of xnlocks
in this case. I made an attempt at doing this on ARM some time ago; it
generated a kernel that would lock up at boot. But I think this is
something we should sort out.

-- 


Gilles Chanteperdrix.



Re: [Xenomai-core] [PATCH 1/4] Refactor generic system.h

2008-02-23 Thread Jan Kiszka
Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
> > In order to allow further optimizations of xnlock, I started with
> > refactoring the related system.h. This improves the readability
> > significantly, IMHO. It also happens to reduce the text size of
> > __xnlock_get a bit by avoiding redundant rthal_processor_id read-outs.
> >
> > Another quirk I happened to remove: xnlock debugging depends on
> > XENO_OPT_DEBUG_NUCLEUS, but needlessly we used to pick the debug version
> > of xnlock_t already with XENO_OPT_DEBUG.
>
> There are a lot of whitespace changes in this patch, which makes it
> hard to read.

Well, this patch is mostly about whitespace and formatting fixes (among
which I count ifdef reduction as well). But I can split it up if
desired.

> Anyway, there are a few things I do not like in this patch:
> - macros which make reference to symbols defined elsewhere

You mean XNLOCK_DBG_PREPARE_ACQUIRE vs. XNLOCK_DBG_SPINNING/ACQUIRED?
Granted, not nice, but so far the most compact approach I found. My goal
was to keep the lock implementations as pure as possible (you can easily
ignore the debug stuff now when reading xnlock_get/put).
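
A stripped-down user-space sketch of that idea, for reference. The dm_*
and DM_* names are invented for this example and the lock is a plain
single-threaded owner field, so this is not the code from the patch. The
debug macros expand to nothing unless DM_DEBUG_LOCK is defined, which
keeps the lock function body free of #ifdefs; the price is that
DM_DBG_ACQUIRED() silently relies on the 'file' and 'line' names of the
enclosing function.

```c
#include <assert.h>

/* Hypothetical debug hooks; build with -DDM_DEBUG_LOCK to enable them. */
#ifdef DM_DEBUG_LOCK
#define DM_DBG_FIELDS		const char *file; int line;
#define DM_DBG_CONTEXT		, __FILE__, __LINE__
#define DM_DBG_CONTEXT_ARGS	, const char *file, int line
/* Uses 'file' and 'line' from the enclosing function's argument list. */
#define DM_DBG_ACQUIRED(lock)			\
	do {					\
		(lock)->file = file;		\
		(lock)->line = line;		\
	} while (0)
#else /* !DM_DEBUG_LOCK */
#define DM_DBG_FIELDS
#define DM_DBG_CONTEXT
#define DM_DBG_CONTEXT_ARGS
#define DM_DBG_ACQUIRED(lock)	do { } while (0)
#endif /* !DM_DEBUG_LOCK */

typedef struct {
	int owner;		/* -1 when free, else the owning CPU */
	DM_DBG_FIELDS
} dm_lock_t;

/* One function body for both builds; debug parts hide behind macros. */
static inline int dm_lock_get_inner(dm_lock_t *lock, int cpu
				    DM_DBG_CONTEXT_ARGS)
{
	if (lock->owner == cpu)
		return 1;	/* recursive acquisition */
	lock->owner = cpu;	/* sketch only: no spinning, no atomics */
	DM_DBG_ACQUIRED(lock);
	return 0;
}

#define dm_lock_get(lock, cpu) \
	dm_lock_get_inner(lock, cpu DM_DBG_CONTEXT)

/* Exercises whichever variant was compiled in. */
static inline void dm_lock_selfcheck(void)
{
	dm_lock_t lock = { -1 };

	assert(dm_lock_get(&lock, 1) == 0);	/* first acquisition */
	assert(dm_lock_get(&lock, 1) == 1);	/* same CPU again */
}
```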

> - function arguments as macros; I find the #ifdef with the different
>   function prototypes more readable, since the code can be read without
>   having to look at a different place.

I'm open to learning a third way to achieve what we need. I'm just
convinced that the old way was far worse.

When looking for a better suggestion, please consider that the number
of variants increases with my ticket lock. That's why I tried to stuff
things into macros. Hmm, maybe we should simply get rid of the
file/line/function stuff completely and switch to IP + ksyms. What do
you think?

> Something that could be interesting would be the ability to enable
> spinlock debugging in UP, which would enable real debugging of xnlocks
> in this case. I made an attempt at doing this on ARM some time ago; it
> generated a kernel that would lock up at boot. But I think this is
> something we should sort out.

Yeah, sounds good and should be feasible. Will check.

Jan





Re: [Xenomai-core] [PATCH 1/4] Refactor generic system.h

2008-02-23 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
> > Jan Kiszka wrote:
> > > In order to allow further optimizations of xnlock, I started with
> > > refactoring the related system.h. This improves the readability
> > > significantly, IMHO. It also happens to reduce the text size of
> > > __xnlock_get a bit by avoiding redundant rthal_processor_id read-outs.
> > >
> > > Another quirk I happened to remove: xnlock debugging depends on
> > > XENO_OPT_DEBUG_NUCLEUS, but needlessly we used to pick the debug version
> > > of xnlock_t already with XENO_OPT_DEBUG.
> >
> > There are a lot of whitespace changes in this patch, which makes it
> > hard to read.
>
> Well, this patch is mostly about whitespace and formatting fixes (among
> which I count ifdef reduction as well). But I can split it up if
> desired.
>
> > Anyway, there are a few things I do not like in this patch:
> > - macros which make reference to symbols defined elsewhere
>
> You mean XNLOCK_DBG_PREPARE_ACQUIRE vs. XNLOCK_DBG_SPINNING/ACQUIRED?
> Granted, not nice, but so far the most compact approach I found. My goal
> was to keep the lock implementations as pure as possible (you can easily
> ignore the debug stuff now when reading xnlock_get/put).
>
> > - function arguments as macros; I find the #ifdef with the different
> >   function prototypes more readable, since the code can be read without
> >   having to look at a different place.
>
> I'm open to learning a third way to achieve what we need. I'm just
> convinced that the old way was far worse.

I do not see a third approach. Maybe passing all the arguments to
the function, and counting on the optimizer to remove useless arguments
when inlining?
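
Roughly like the user-space sketch below. The oa_* names and the
OA_DEBUG_LOCK switch are invented for illustration; this is not nucleus
code, and the lock is a plain single-threaded owner field. The prototype
is the same in both builds. In the non-debug build the file/line
arguments are unused, so an optimizing compiler is expected to drop them
once the call is inlined, although nothing guarantees that.

```c
#include <assert.h>

/* Hypothetical switch; build with -DOA_DEBUG_LOCK=1 to keep the context. */
#ifndef OA_DEBUG_LOCK
#define OA_DEBUG_LOCK 0
#endif

typedef struct {
	int owner;		/* -1 when free, else the owning CPU */
#if OA_DEBUG_LOCK
	const char *file;	/* acquisition site, debug only */
	unsigned line;
#endif
} oa_lock_t;

/* Single prototype: the context is always passed, but only stored
 * when debugging is enabled. */
static inline int oa_lock_get_ctx(oa_lock_t *lock, int cpu,
				  const char *file, unsigned line)
{
	if (lock->owner == cpu)
		return 1;	/* recursive acquisition */
	lock->owner = cpu;	/* sketch only: no spinning, no atomics */
#if OA_DEBUG_LOCK
	lock->file = file;
	lock->line = line;
#else
	(void)file;		/* unused: left for the optimizer to drop */
	(void)line;
#endif
	return 0;
}

#define oa_lock_get(lock, cpu) \
	oa_lock_get_ctx(lock, cpu, __FILE__, __LINE__)

/* Exercises whichever variant was compiled in. */
static inline void oa_lock_selfcheck(void)
{
	oa_lock_t lock = { -1 };

	assert(oa_lock_get(&lock, 0) == 0);	/* first acquisition */
	assert(oa_lock_get(&lock, 0) == 1);	/* same CPU again */
}
```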

  
> When looking for a better suggestion, please consider that the number
> of variants increases with my ticket lock. That's why I tried to stuff
> things into macros. Hmm, maybe we should simply get rid of the
> file/line/function stuff completely and switch to IP + ksyms. What do
> you think?

I find the file/line approach more precise. print_symbol gives you an
offset, which you have to disassemble to understand, and
it does not see inline functions.
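
A small user-space sketch of the two kinds of information being
compared. The lk_* names are invented for this example, and
__builtin_return_address() and the noinline attribute are GCC/Clang
extensions, so this assumes one of those compilers. The macro records the
source position of the line where it is expanded, while the helper only
records a raw code address that still has to be resolved to symbol plus
offset afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record of where a lock was taken. */
struct lk_site {
	const char *file;	/* from __FILE__ at the expansion site */
	const char *function;	/* from __func__ at the expansion site */
	unsigned line;		/* from __LINE__ at the expansion site */
	void *ip;		/* raw return address into the caller */
};

/* Kept out of line so that a return address into the caller exists. */
static __attribute__((noinline)) void lk_record_ip(struct lk_site *s)
{
	s->ip = __builtin_return_address(0);
}

/* Must be a macro: file, line and function are those of the user. */
#define LK_RECORD_SITE(s)			\
	do {					\
		(s)->file = __FILE__;		\
		(s)->function = __func__;	\
		(s)->line = __LINE__;		\
		lk_record_ip(s);		\
	} while (0)

/* Formats both views; the ip part is just a number without a symbol
 * table lookup. Returns the snprintf() result. */
static inline int lk_describe(const struct lk_site *s, char *buf, size_t len)
{
	return snprintf(buf, len, "%s:%u (%s(), ip=%p)",
			s->file, s->line, s->function, s->ip);
}

/* Records the current site and checks that both views were filled in. */
static inline void lk_selfcheck(void)
{
	struct lk_site s = { 0 };
	char buf[256];

	LK_RECORD_SITE(&s);
	assert(s.file != NULL && s.line > 0);
	assert(s.ip != NULL);
	assert(lk_describe(&s, buf, sizeof(buf)) > 0);
}
```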

-- 


Gilles Chanteperdrix.
