Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-04-05 Thread Jan Kiszka

Tomas Kalibera wrote:

> Hi,
>
> I've tried a more defensive kernel setup plus your patch (no. 6). The lockup
> is still there. It happens after a realtime task is started, though I
> was unable to track down exactly when: it does not crash in a debugger,
> does not crash with strace, breaks SysRq, and printing log messages
> seems to be delayed (despite flushing). I tried changing the application
> code (like using more default flags when creating a task, etc.), but I
> did not find a workaround.
>
> I've put the kernel on the web again, including the config (the one that
> contains xenomaidp6). Maybe it will help to track down the bug...
> Maybe not.


Jumping in late on this, I didn't find any (user space) test case for the 
observed bug in this thread. Can you provide something? The simpler, the 
better. It may even contain bugs itself; it just has to trigger the 
kernel oops reliably.


Then I saw in your .config that your kernel is optimized for AMD K6. In 
order to prepare for reproducing the bug elsewhere, could you check that the 
more generic CONFIG_M586TSC makes no difference? Also, if you happen to have a 
second, different box (wrt. CPU type and speed) at hand, it would be nice 
to know whether the issue is present there as well. But the latter is also 
something we can try once a test case is available. My preferred target 
will be QEMU, because that one can quite nicely be debugged even if the 
box is hopelessly locked up.


Thanks,
Jan



___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-04-04 Thread Tomas Kalibera

Hi,

I've tried a more defensive kernel setup plus your patch (no. 6). The lockup 
is still there. It happens after a realtime task is started, though I 
was unable to track down exactly when: it does not crash in a debugger, 
does not crash with strace, breaks SysRq, and printing log messages 
seems to be delayed (despite flushing). I tried changing the application 
code (like using more default flags when creating a task, etc.), but I 
did not find a workaround.

I've put the kernel on the web again, including the config (the one that 
contains xenomaidp6). Maybe it will help to track down the bug... 
Maybe not.

Do you have any ideas how to work around it?

Thanks
Tomas





Tomas Kalibera wrote:
 Gilles Chanteperdrix wrote:
   
   Of course, we are looking for all bugs. But please tell me: do you get
   the lock-up even before fork is called ? If not, could you verify that
   at least some Xenomai programs run correctly, for instance latency ?
   Looking at the code, I think I found a bug, but I doubt it could cause a
   lockup. The definition of VM_PINNED in include/linux/mm.h collides with
   another bit used by Linux, so this definition should be changed from:
   #define VM_PINNED 0x0800
   to:
   #define VM_PINNED 0x1000

 Here comes a 6th patch for this bug, (patch 6 includes patch 5).

   
 
 I've tested the 6th patch, the lockup is still there.
 As far as I can observe, it behaves exactly like with the 5th patch.

 Tomas






Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-04-03 Thread Gilles Chanteperdrix
On Thu, Apr 3, 2008 at 1:46 AM, Tomas Kalibera [EMAIL PROTECTED] wrote:
 Gilles Chanteperdrix wrote:

  Of course, we are looking for all bugs. But please tell me: do you get
  the lock-up even before fork is called ? If not, could you verify that
  at least some Xenomai programs run correctly, for instance latency ?
 
 
  The lock up with patch 5 happens before fork is called, but after a
 real-time task is started by the program. I don't know better now - I'd have
 to add more logging.  If I run in strace, the lock-up does not happen.

  Thinking about that, it can be a bug in my program. If I understand the
 concept of Xenomai correctly, I can just write a real-time task that would
 starve the Linux kernel indefinitely, correct ? My program definitely does
 have bugs. So I'll do more debugging.

Yes, you can starve Linux, but after 4 seconds the Xenomai watchdog
should trigger. You can also starve Linux with a vanilla Linux
application running with the SCHED_FIFO scheduling policy, but in this
case, it is the Linux soft lockup detector which should trigger. I see
that you have both options enabled, so the lockup is probably
another bug.
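
For readers who want to see the second case concretely, here is a minimal sketch (not from the thread) of how a plain Linux program requests SCHED_FIFO. The helper names become_fifo and spin are made up for this example. The spin is deliberately bounded so that it cannot lock up a machine; an unbounded loop at that point is what would starve every lower-priority task on that CPU. The call normally needs root privileges and is expected to fail otherwise.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <string.h>

/* Switch the calling process to SCHED_FIFO at the given priority.
 * Returns 0 on success, or the errno value on failure (usually EPERM
 * when the caller lacks the required privileges). */
int become_fifo(int priority)
{
	struct sched_param sp;

	assert(priority > 0);	/* SCHED_FIFO priorities start at 1 on Linux */
	memset(&sp, 0, sizeof(sp));
	sp.sched_priority = priority;
	if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1)
		return errno;
	return 0;
}

/* Burn CPU for a bounded number of iterations and return the count.
 * Under SCHED_FIFO nothing of lower priority runs on this CPU until
 * the loop ends; without the bound this would be a starvation loop. */
unsigned long spin(unsigned long iterations)
{
	volatile unsigned long n = 0;

	while (n < iterations)
		n++;
	return n;
}
```

A caller would typically use become_fifo(sched_get_priority_min(SCHED_FIFO)) and check the return value before spinning.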

-- 
 Gilles



Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-04-03 Thread Tomas Kalibera
Gilles Chanteperdrix wrote:
   Of course, we are looking for all bugs. But please tell me: do you get
   the lock-up even before fork is called ? If not, could you verify that
   at least some Xenomai programs run correctly, for instance latency ?
   Looking at the code, I think I found a bug, but I doubt it could cause a
   lockup. The definition of VM_PINNED in include/linux/mm.h collides with
   another bit used by Linux, so this definition should be changed from:
   #define VM_PINNED 0x0800
   to:
   #define VM_PINNED 0x1000

 Here comes a 6th patch for this bug, (patch 6 includes patch 5).

   
I've tested the 6th patch, the lockup is still there.
As far as I can observe, it behaves exactly like with the 5th patch.

Tomas






Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-04-02 Thread Gilles Chanteperdrix
On Wed, Apr 2, 2008 at 5:02 PM, Tomas Kalibera [EMAIL PROTECTED] wrote:

  OK, no change with this patch compared to the previous situation. The
 system boots, but hangs without a stacktrace when I run my Xenomai task.
 SysRq is blocked, now even SysRq-kill did not work, only SysRq-boot did.

Are you sure you did not keep the stuff in highmem_32.c ?


-- 
 Gilles



Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-04-02 Thread Tomas Kalibera

Hmm, I checked again and did not find a mistake in the experiment 
(neither using the old binary nor old sources). I'm doing a clean build 
from scratch again, so that we can be absolutely sure. I can then run 
memtest on the machine...

Tomas

Gilles Chanteperdrix wrote:
 On Wed, Apr 2, 2008 at 5:02 PM, Tomas Kalibera [EMAIL PROTECTED] wrote:
   
  OK, no change with this patch compared to the previous situation. The
 system boots, but hangs without a stacktrace when I run my Xenomai task.
 SysRq is blocked, now even SysRq-kill did not work, only SysRq-boot did.
 

 Are you sure you did not keep the stuff in highmem_32.c ?


   




Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-04-02 Thread Tomas Kalibera

Hi Gilles,

I've recompiled the kernel again from scratch and got the same lock up. 
Fix 5 does not help... If you want to inspect the exact kernel I used, 
it's again at http://www.cs.purdue.edu/~tkaliber/crash, the one with 
p5 in its name.

Tomas

Gilles Chanteperdrix wrote:
 On Wed, Apr 2, 2008 at 5:02 PM, Tomas Kalibera [EMAIL PROTECTED] wrote:
   
  OK, no change with this patch compared to the previous situation. The
 system boots, but hangs without a stacktrace when I run my Xenomai task.
 SysRq is blocked, now even SysRq-kill did not work, only SysRq-boot did.
 

 Are you sure you did not keep the stuff in highmem_32.c ?


   




Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-04-02 Thread Gilles Chanteperdrix
Tomas Kalibera wrote:
  
  Hi Gilles,
  
  I've recompiled the kernel again from scratch and got the same lock up. 
  Fix 5 does not help... If you want to inspect the exact kernel I used, 
  it's again at http://www.cs.purdue.edu/~tkaliber/crash, the one with 
  p5 in its name.
  
  Tomas

But... do you get the lock-up without the patch ?

-- 


Gilles.



Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-04-02 Thread Bill Gatliff
Tomas Kalibera wrote:
 Hi Gilles,
 
 I've recompiled the kernel again from scratch and got the same lock up. 
 Fix 5 does not help... If you want to inspect the exact kernel I used, 
 it's again at http://www.cs.purdue.edu/~tkaliber/crash, the one with 
 p5 in its name.

You aren't using gcc-4.2 or later, are you?  I've had problems with 
those for building and/or running kernels.  On non-x86 targets, mind 
you, but maybe there's a connection...


b.g.
-- 
Bill Gatliff
[EMAIL PROTECTED]



Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-04-02 Thread Gilles Chanteperdrix
Tomas Kalibera wrote:
  
  Hi Gilles,
  
  I've recompiled the kernel again from scratch and got the same lock up. 
  Fix 5 does not help... If you want to inspect the exact kernel I used, 
  it's again at http://www.cs.purdue.edu/~tkaliber/crash, the one with 
  p5 in its name.

permission denied to download kernel configuration.

-- 


Gilles.



Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-04-02 Thread Tomas Kalibera
Gilles Chanteperdrix wrote:
 Tomas Kalibera wrote:
   
   Hi Gilles,
   
   I've recompiled the kernel again from scratch and got the same lock up. 
   Fix 5 does not help... If you want to inspect the exact kernel I used, 
   it's again at http://www.cs.purdue.edu/~tkaliber/crash, the one with 
   p5 in its name.

 permission denied to download kernel configuration.

   
Sorry. Permissions fixed.
T





Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-04-02 Thread Gilles Chanteperdrix
Tomas Kalibera wrote:
  Gilles Chanteperdrix wrote:
   Tomas Kalibera wrote:
 
 Hi Gilles,
 
 I've recompiled the kernel again from scratch and got the same lock up. 
 Fix 5 does not help... If you want to inspect the exact kernel I used, 
 it's again at http://www.cs.purdue.edu/~tkaliber/crash, the one with 
 p5 in its name.
 
 Tomas
  
   But... do you get the lock-up without the patch ?
  
 
  No. Or, more precisely, not the same one. With this patch (5), the 
  system locks up as soon as the application starts. It does not print 
  any stack trace.
  Without the patch, the system gets into an unusable state when the 
  application calls clone/fork, and it does produce a stack trace (the ones I 
  was sending you before). It seems to be more alive (processes start, but 
  crash because of the garbled preempt_count).
  
  The crashes are perfectly repeatable on the system I have. So, the 
  crashes make no sense to you, right? I can indeed go the 
  defensive path and try an older kernel or something, but if there is a 
  Xenomai bug, it would be nice to find it... The same goes for a kernel bug, 
  indeed.

Of course, we are looking for all bugs. But please tell me: do you get
the lock-up even before fork is called ? If not, could you verify that
at least some Xenomai programs run correctly, for instance latency ?
Looking at the code, I think I found a bug, but I doubt it could cause a
lockup. The definition of VM_PINNED in include/linux/mm.h collides with
another bit used by Linux, so this definition should be changed from:
#define VM_PINNED 0x0800
to:
#define VM_PINNED 0x1000
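
As an aside for readers, this kind of collision can be checked mechanically. The following is only a sketch: the table entries and their values are placeholders invented for the example, not the real flags from include/linux/mm.h, and find_collision is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* One named flag bit. The entries below are illustrative placeholders,
 * not the actual VM_* definitions from the kernel headers. */
struct flag_def {
	const char *name;
	unsigned long bit;
};

static const struct flag_def example_flags[] = {
	{ "EX_FLAG_A", 0x0100 },
	{ "EX_FLAG_B", 0x0400 },
	{ "EX_FLAG_C", 0x0800 },
};
#define EXAMPLE_NFLAGS (sizeof(example_flags) / sizeof(example_flags[0]))

/* Return the index of the first table entry sharing a bit with
 * `candidate`, or -1 if the candidate value is free. */
int find_collision(const struct flag_def *tab, size_t n, unsigned long candidate)
{
	size_t i;

	assert(tab != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (tab[i].bit & candidate)
			return (int)i;
	return -1;
}
```

With this placeholder table, a new flag defined as 0x0800 is reported as colliding with the third entry, while 0x1000 is free. Against the real header one would fill the table from the VM_* definitions of the exact kernel version in use.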

I will now try, if possible, to reproduce the bug on a x86 box of mine
and will keep you informed.

-- 


Gilles.



Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-04-02 Thread Tomas Kalibera
Gilles Chanteperdrix wrote:
 Of course, we are looking for all bugs. But please tell me: do you get
 the lock-up even before fork is called ? If not, could you verify that
 at least some Xenomai programs run correctly, for instance latency ?
   
The lock up with patch 5 happens before fork is called, but after a 
real-time task is started by the program. I don't know better now - I'd 
have to add more logging.  If I run in strace, the lock-up does not happen.

Thinking about that, it can be a bug in my program. If I understand the 
concept of Xenomai correctly, I can just write a real-time task that 
would starve the Linux kernel indefinitely, correct ? My program 
definitely does have bugs. So I'll do more debugging.

The lock-up does NOT happen with latency. But the bug in the kernel 
without patch 5 (the one that led to a stack trace after the call to fork) 
did not appear in latency, either.
 Looking at the code, I think I found a bug, but I doubt it could cause a
 lockup. The definition of VM_PINNED in include/linux/mm.h collides with
 another bit used by Linux, so this definition should be changed from:
 #define VM_PINNED 0x0800
 to:
 #define VM_PINNED 0x1000

 I will now try, if possible, to reproduce the bug on a x86 box of mine
 and will keep you informed.
   
Thanks! I'll indeed build a kernel with patch 6, test again, and test my 
application.

Tomas




Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-04-01 Thread Gilles Chanteperdrix
On Tue, Apr 1, 2008 at 7:52 AM, Gilles Chanteperdrix
[EMAIL PROTECTED] wrote:
 Tomas Kalibera wrote:
   
I added a missing underscore and re-tried, and none of the debug
messages was printed. I added another one to make sure that there is not
a problem with getting printk messages to the serial console. The
resulting highmem_32.c and the output is attached.
   
T

  The interesting part of the output is the printk which occurs right
  before the first bug, what happens afterwards is of little use. Do you
  get any output before the first bug ?

Besides the kmap_atomic call in cow_user_page, copy_pte_range makes
other kmap_atomic calls, which use KM_PTE0 and KM_PTE1 as the type
value. So, we should track these types as well in highmem_32.c.

-- 
 Gilles



Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-03-31 Thread Gilles Chanteperdrix
Tomas Kalibera wrote:
  
  Crashed on the very same line as before
  Tomas

Ok. Let us look for unbalanced kmap_atomics then. Try this patch instead.
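
The call-site capture technique used for this kind of debugging can be sketched in plain user-space C: rename the real function, give it file and line parameters, and hide the extra arguments behind a macro with the original name. The names traced_map_at, traced_map, traced_unmap and last_map_site are hypothetical and exist only in this example.

```c
#include <assert.h>
#include <stddef.h>

struct call_site {
	const char *file;
	unsigned line;
};

/* Where the most recent, still active mapping was requested. */
static struct call_site last_map_site;

/* The "real" function, extended with the caller's location. */
void *traced_map_at(void *page, const char *file, unsigned line)
{
	assert(file != NULL);
	last_map_site.file = file;
	last_map_site.line = line;
	return page;	/* stand-in for the mapped address */
}

/* Callers keep writing traced_map(page); the preprocessor fills in
 * __FILE__ and __LINE__ at each call site. */
#define traced_map(page) traced_map_at((page), __FILE__, __LINE__)

/* Clearing the record on unmap makes a stale entry stand out. */
void traced_unmap(void)
{
	last_map_site.file = NULL;
	last_map_site.line = 0;
}
```

Because the macro expands where it is used, every existing caller reports its own location without being edited, at the cost of a full rebuild of everything that includes the header.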

-- 


Gilles.
diff --git a/arch/x86/mm/highmem_32.c b/arch/x86/mm/highmem_32.c
index 1c3bf95..a78494e 100644
--- a/arch/x86/mm/highmem_32.c
+++ b/arch/x86/mm/highmem_32.c
@@ -1,6 +1,11 @@
 #include <linux/highmem.h>
 #include <linux/module.h>
 
+static struct {
+   const char *file;
+   unsigned line;
+} last_km_user0 [NR_CPUS];
+
 void *kmap(struct page *page)
 {
might_sleep();
@@ -26,7 +31,8 @@ void kunmap(struct page *page)
  * However when holding an atomic kmap is is not legal to sleep, so atomic
  * kmaps are appropriate for short, tight code paths only.
  */
-void *kmap_atomic_prot(struct page *page, enum km_type type, pgprot_t prot)
+void *_kmap_atomic_prot(struct page *page, enum km_type type,
+   pgprot_t prot, const char *file, unsigned line)
 {
enum fixed_addresses idx;
unsigned long vaddr;
@@ -39,7 +45,17 @@ void *kmap_atomic_prot(struct page *page, enum km_type type, pgprot_t prot)
 
idx = type + KM_TYPE_NR*smp_processor_id();
vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
-   BUG_ON(!pte_none(*(kmap_pte-idx)));
+   if (!pte_none(*(kmap_pte-idx))) {
+   if (type == KM_USER0)
+   printk("KM_USER0 already mapped at %s:%d\n",
+  last_km_user0[smp_processor_id()].file,
+  last_km_user0[smp_processor_id()].line);
+   BUG();
+   } else if (type == KM_USER0) {
+   last_km_user0[smp_processor_id()].file = file;
+   last_km_user0[smp_processor_id()].line = line;
+   }
+
set_pte(kmap_pte-idx, mk_pte(page, prot));
arch_flush_lazy_mmu_mode();
 
@@ -70,6 +86,10 @@ void kunmap_atomic(void *kvaddr, enum km_type type)
BUG_ON(vaddr >= (unsigned long)high_memory);
 #endif
}
+   if (type == KM_USER0) {
+   last_km_user0[smp_processor_id()].file = NULL;
+   last_km_user0[smp_processor_id()].line = 0;
+   }
 
arch_flush_lazy_mmu_mode();
pagefault_enable();
@@ -78,7 +98,8 @@ void kunmap_atomic(void *kvaddr, enum km_type type)
 /* This is the same as kmap_atomic() but can map memory that doesn't
  * have a struct page associated with it.
  */
-void *kmap_atomic_pfn(unsigned long pfn, enum km_type type)
+void *_kmap_atomic_pfn(unsigned long pfn, enum km_type type,
+  const char *file, unsigned line)
 {
enum fixed_addresses idx;
unsigned long vaddr;
@@ -87,6 +108,16 @@ void *kmap_atomic_pfn(unsigned long pfn, enum km_type type)
 
idx = type + KM_TYPE_NR*smp_processor_id();
vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
+   if (!pte_none(*(kmap_pte-idx))) {
+   if (type == KM_USER0)
+   printk("KM_USER0 already mapped at %s:%d\n",
+  last_km_user0[smp_processor_id()].file,
+  last_km_user0[smp_processor_id()].line);
+   BUG();
+   } else if (type == KM_USER0) {
+   last_km_user0[smp_processor_id()].file = file;
+   last_km_user0[smp_processor_id()].line = line;
+   }
set_pte(kmap_pte-idx, pfn_pte(pfn, kmap_prot));
arch_flush_lazy_mmu_mode();
 
diff --git a/include/asm-x86/highmem.h b/include/asm-x86/highmem.h
index 13cdcd6..57b89f7 100644
--- a/include/asm-x86/highmem.h
+++ b/include/asm-x86/highmem.h
@@ -68,10 +68,16 @@ extern void FASTCALL(kunmap_high(struct page *page));
 
 void *kmap(struct page *page);
 void kunmap(struct page *page);
-void *kmap_atomic_prot(struct page *page, enum km_type type, pgprot_t prot);
+void *_kmap_atomic_prot(struct page *page, enum km_type type,
+   pgprot_t prot, const char *file, unsigned line);
+#define kmap_atomic_prot(page, type, prot) \
+   _kmap_atomic_prot(page, type, prot, __FILE__, __LINE__)
 void *kmap_atomic(struct page *page, enum km_type type);
 void kunmap_atomic(void *kvaddr, enum km_type type);
-void *kmap_atomic_pfn(unsigned long pfn, enum km_type type);
+void *_kmap_atomic_pfn(unsigned long pfn, enum km_type type,
+  const char *file, unsigned line);
+#define kmap_atomic_pfn(pfn, type) \
+   _kmap_atomic_pfn(pfn, type, __FILE__, __LINE__)
 struct page *kmap_atomic_to_page(void *ptr);
 
 #ifndef CONFIG_PARAVIRT


Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-03-31 Thread Gilles Chanteperdrix
Gilles Chanteperdrix wrote:
  Tomas Kalibera wrote:

Crashed on the very same line as before
Tomas
  
  Ok. Let us look for unbalanced kmap_atomics then. Try this patch instead.

Just when I hit the reply button, I realize that I forgot something. So,
try this one instead.

-- 


Gilles.
diff --git a/arch/x86/mm/highmem_32.c b/arch/x86/mm/highmem_32.c
index 1c3bf95..97a5242 100644
--- a/arch/x86/mm/highmem_32.c
+++ b/arch/x86/mm/highmem_32.c
@@ -1,6 +1,11 @@
 #include <linux/highmem.h>
 #include <linux/module.h>
 
+static struct {
+   const char *file;
+   unsigned line;
+} last_km_user0 [NR_CPUS];
+
 void *kmap(struct page *page)
 {
might_sleep();
@@ -26,7 +31,8 @@ void kunmap(struct page *page)
  * However when holding an atomic kmap is is not legal to sleep, so atomic
  * kmaps are appropriate for short, tight code paths only.
  */
-void *kmap_atomic_prot(struct page *page, enum km_type type, pgprot_t prot)
+void *_kmap_atomic_prot(struct page *page, enum km_type type,
+   pgprot_t prot, const char *file, unsigned line)
 {
enum fixed_addresses idx;
unsigned long vaddr;
@@ -39,16 +45,27 @@ void *kmap_atomic_prot(struct page *page, enum km_type type, pgprot_t prot)
 
idx = type + KM_TYPE_NR*smp_processor_id();
vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
-   BUG_ON(!pte_none(*(kmap_pte-idx)));
+   if (!pte_none(*(kmap_pte-idx))) {
+   if (type == KM_USER0)
+   printk("KM_USER0 already mapped at %s:%d\n",
+  last_km_user0[smp_processor_id()].file,
+  last_km_user0[smp_processor_id()].line);
+   BUG();
+   } else if (type == KM_USER0) {
+   last_km_user0[smp_processor_id()].file = file;
+   last_km_user0[smp_processor_id()].line = line;
+   }
+
set_pte(kmap_pte-idx, mk_pte(page, prot));
arch_flush_lazy_mmu_mode();
 
return (void *)vaddr;
 }
 
-void *kmap_atomic(struct page *page, enum km_type type)
+void *_kmap_atomic(struct page *page, enum km_type type,
+  const char *file, unsigned line)
 {
-   return kmap_atomic_prot(page, type, kmap_prot);
+   return _kmap_atomic_prot(page, type, kmap_prot, file, line);
 }
 
 void kunmap_atomic(void *kvaddr, enum km_type type)
@@ -70,6 +87,10 @@ void kunmap_atomic(void *kvaddr, enum km_type type)
BUG_ON(vaddr >= (unsigned long)high_memory);
 #endif
}
+   if (type == KM_USER0) {
+   last_km_user0[smp_processor_id()].file = NULL;
+   last_km_user0[smp_processor_id()].line = 0;
+   }
 
arch_flush_lazy_mmu_mode();
pagefault_enable();
@@ -78,7 +99,8 @@ void kunmap_atomic(void *kvaddr, enum km_type type)
 /* This is the same as kmap_atomic() but can map memory that doesn't
  * have a struct page associated with it.
  */
-void *kmap_atomic_pfn(unsigned long pfn, enum km_type type)
+void *_kmap_atomic_pfn(unsigned long pfn, enum km_type type,
+  const char *file, unsigned line)
 {
enum fixed_addresses idx;
unsigned long vaddr;
@@ -87,6 +109,16 @@ void *kmap_atomic_pfn(unsigned long pfn, enum km_type type)
 
idx = type + KM_TYPE_NR*smp_processor_id();
vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
+   if (!pte_none(*(kmap_pte-idx))) {
+   if (type == KM_USER0)
+   printk("KM_USER0 already mapped at %s:%d\n",
+  last_km_user0[smp_processor_id()].file,
+  last_km_user0[smp_processor_id()].line);
+   BUG();
+   } else if (type == KM_USER0) {
+   last_km_user0[smp_processor_id()].file = file;
+   last_km_user0[smp_processor_id()].line = line;
+   }
set_pte(kmap_pte-idx, pfn_pte(pfn, kmap_prot));
arch_flush_lazy_mmu_mode();
 
diff --git a/include/asm-x86/highmem.h b/include/asm-x86/highmem.h
index 13cdcd6..db09f27 100644
--- a/include/asm-x86/highmem.h
+++ b/include/asm-x86/highmem.h
@@ -68,10 +68,19 @@ extern void FASTCALL(kunmap_high(struct page *page));
 
 void *kmap(struct page *page);
 void kunmap(struct page *page);
-void *kmap_atomic_prot(struct page *page, enum km_type type, pgprot_t prot);
-void *kmap_atomic(struct page *page, enum km_type type);
+void *_kmap_atomic_prot(struct page *page, enum km_type type,
+   pgprot_t prot, const char *file, unsigned line);
+#define kmap_atomic_prot(page, type, prot) \
+   _kmap_atomic_prot(page, type, prot, __FILE__, __LINE__)
+void *_kmap_atomic(struct page *page, enum km_type type,
+  const char *file, unsigned line);
+#define kmap_atomic(page, type) \
+   _kmap_atomic(page, type, __FILE__, __LINE__)
 void kunmap_atomic(void *kvaddr, enum km_type type);
-void *kmap_atomic_pfn(unsigned long pfn, enum km_type type);
+void 

Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-03-31 Thread Tomas Kalibera


I added a missing underscore and re-tried, and none of the debug 
messages was printed. I added another one to make sure that there is not 
a problem with getting printk messages to the serial console. The 
resulting highmem_32.c and the output are attached.


T


Gilles Chanteperdrix wrote:

Gilles Chanteperdrix wrote:
  Tomas Kalibera wrote:

Crashed on the very same line as before

Tomas
  
  Ok. Let us look for unbalanced kmap_atomics then. Try this patch instead.


Just when I hit the reply button, I realize that I forgot something. So,
try this one instead.

  


#include <linux/highmem.h>
#include <linux/module.h>

static struct {
	const char *file;
	unsigned line;
} last_km_user0 [NR_CPUS];

void *kmap(struct page *page)
{
	might_sleep();
	if (!PageHighMem(page))
		return page_address(page);
	return kmap_high(page);
}

void kunmap(struct page *page)
{
	if (in_interrupt())
		BUG();
	if (!PageHighMem(page))
		return;
	kunmap_high(page);
}

/*
 * kmap_atomic/kunmap_atomic is significantly faster than kmap/kunmap because
 * no global lock is needed and because the kmap code must perform a global TLB
 * invalidation when the kmap pool wraps.
 *
 * However when holding an atomic kmap it is not legal to sleep, so atomic
 * kmaps are appropriate for short, tight code paths only.
 */
void *_kmap_atomic_prot(struct page *page, enum km_type type,
			pgprot_t prot, const char *file, unsigned line)
{
	enum fixed_addresses idx;
	unsigned long vaddr;

	/* even !CONFIG_PREEMPT needs this, for in_atomic in do_page_fault */
	pagefault_disable();

	if (!PageHighMem(page))
		return page_address(page);

	idx = type + KM_TYPE_NR*smp_processor_id();
	vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
	if (!pte_none(*(kmap_pte-idx))) {
		if (type == KM_USER0) {
			printk("KM_USER0 already mapped at %s:%d\n",
			   last_km_user0[smp_processor_id()].file,
			   last_km_user0[smp_processor_id()].line);
		} else {
			printk("type is NOT KM_USER0\n");
		}
		BUG();
	} else if (type == KM_USER0) {
		last_km_user0[smp_processor_id()].file = file;
		last_km_user0[smp_processor_id()].line = line;
	}

	set_pte(kmap_pte-idx, mk_pte(page, prot));
	arch_flush_lazy_mmu_mode();

	return (void *)vaddr;
}

void *_kmap_atomic(struct page *page, enum km_type type,
		   const char *file, unsigned line)
{
	return _kmap_atomic_prot(page, type, kmap_prot, file, line);
}

void kunmap_atomic(void *kvaddr, enum km_type type)
{
	unsigned long vaddr = (unsigned long) kvaddr & PAGE_MASK;
	enum fixed_addresses idx = type + KM_TYPE_NR*smp_processor_id();

	/*
	 * Force other mappings to Oops if they'll try to access this pte
	 * without first remap it.  Keeping stale mappings around is a bad idea
	 * also, in case the page changes cacheability attributes or becomes
	 * a protected page in a hypervisor.
	 */
	if (vaddr == __fix_to_virt(FIX_KMAP_BEGIN+idx))
		kpte_clear_flush(kmap_pte-idx, vaddr);
	else {
#ifdef CONFIG_DEBUG_HIGHMEM
		BUG_ON(vaddr < PAGE_OFFSET);
		BUG_ON(vaddr >= (unsigned long)high_memory);
#endif
	}
	if (type == KM_USER0) {
		last_km_user0[smp_processor_id()].file = NULL;
		last_km_user0[smp_processor_id()].line = 0;
	}

	arch_flush_lazy_mmu_mode();
	pagefault_enable();
}

/* This is the same as kmap_atomic() but can map memory that doesn't
 * have a struct page associated with it.
 */
void *_kmap_atomic_pfn(unsigned long pfn, enum km_type type,
		   const char *file, unsigned line)
{
	enum fixed_addresses idx;
	unsigned long vaddr;

	pagefault_disable();

	idx = type + KM_TYPE_NR*smp_processor_id();
	vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
	if (!pte_none(*(kmap_pte-idx))) {
		if (type == KM_USER0)
			printk("KM_USER0 already mapped at %s:%d\n",
			   last_km_user0[smp_processor_id()].file,
			   last_km_user0[smp_processor_id()].line);
		BUG();
	} else if (type == KM_USER0) {
		last_km_user0[smp_processor_id()].file = file;
		last_km_user0[smp_processor_id()].line = line;
	}
	set_pte(kmap_pte-idx, pfn_pte(pfn, kmap_prot));
	arch_flush_lazy_mmu_mode();

	return (void*) vaddr;
}

struct page *kmap_atomic_to_page(void *ptr)
{
	unsigned long idx, vaddr = (unsigned long)ptr;
	pte_t *pte;

	if (vaddr < FIXADDR_START)
		return virt_to_page(ptr);

	idx = virt_to_fix(vaddr);
	pte = kmap_pte - (idx - FIX_KMAP_BEGIN);
	return pte_page(*pte);
}

EXPORT_SYMBOL(kmap);
EXPORT_SYMBOL(kunmap);
EXPORT_SYMBOL(_kmap_atomic);
EXPORT_SYMBOL(kunmap_atomic);
EXPORT_SYMBOL(kmap_atomic_to_page);
[  255.285392] [ cut here ]
[  255.289992] kernel BUG at arch/x86/mm/highmem_32.c:56!
[  255.295107] invalid opcode:  [#1] PREEMPT SMP 
[  255.299901] Modules linked in: rfcomm l2cap bluetooth ppdev sbp2 ipv6 parport_pc lp parport pcspkr iTCO_wdt iTCO_vendor_se
[  255.327057] 
[  255.328538] Pid: 4986, comm: ovmtask Not tainted (2.6.24.3xenomaip3 #2)
[  255.335123] EIP: 0060:[<c011a966>] EFLAGS: 00010286 CPU: 0
[  255.340588] EIP is at _kmap_atomic_prot+0xa6/0x120
[  255.345356] EAX: 0027 EBX: c2b27520 ECX:  

Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-03-31 Thread Gilles Chanteperdrix
Tomas Kalibera wrote:
  
  I added a missing underscore and re-tried, and none of the debug 
  messages was printed. I added another one to make sure that there is not 
  a problem with getting printk messages to the serial console. The 
  resulting highmem_32.c and the output is attached.
  
  T

The interesting part of the output is the printk which occurs right
before the first bug, what happens afterwards is of little use. Do you
get any output before the first bug ?

-- 


Gilles.



Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-03-30 Thread Gilles Chanteperdrix
Tomas Kalibera wrote:
  Hi Gilles,
  
  thanks for looking at it. Your analysis is correct, I don't indeed have 
  CONFIG_PREEMPT_RT kernel, but only CONFIG_PREEMPT, sorry for the confusion.
  
  I've put the kernel config, sources, and binary on the web, so that you 
  can be sure you're really looking on the kernel that is crashing, 
  http://www.cs.purdue.edu/homes/tkaliber/crash

After looking at the sources, it appears that kmap_atomic disables
preemption and kunmap_atomic reenables it. In short, the bug should
never happen. What could happen is that the preemption count is garbled,
or that a call to kmap_atomic is not paired with a kunmap_atomic.
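
To make the second hypothesis concrete, here is a small user-space model (a sketch, not kernel code) of how one unpaired map leaves the count off by one. The counter preempt_count_model and the model_* helpers are invented for this example; they only imitate the increment and decrement described above.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the per-task preemption counter. */
static int preempt_count_model;

/* Mapping raises the count, as the atomic kmap does. */
static void *model_kmap_atomic(void *page)
{
	preempt_count_model++;
	return page;
}

/* Unmapping lowers it again. */
static void model_kunmap_atomic(void *vaddr)
{
	(void)vaddr;
	assert(preempt_count_model > 0);
	preempt_count_model--;
}

int model_in_atomic(void)
{
	return preempt_count_model != 0;
}

/* Every map is paired with an unmap: net change is 0. */
int balanced_path(void *page)
{
	int before = preempt_count_model;
	void *vaddr = model_kmap_atomic(page);

	model_kunmap_atomic(vaddr);
	return preempt_count_model - before;
}

/* With `fail` set, an early return skips the unmap: net change is 1,
 * and the slot also stays occupied for the next caller. */
int leaky_path(void *page, int fail)
{
	int before = preempt_count_model;
	void *vaddr = model_kmap_atomic(page);

	if (fail)
		return preempt_count_model - before;
	model_kunmap_atomic(vaddr);
	return preempt_count_model - before;
}
```

In this model a single leaked mapping leaves the counter at 1 after the function returns, and a later map of the same slot would find it still in use.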

To check if the problem comes from the preemption count, could you apply
the following patch ?

-- 


Gilles.
diff --git a/arch/x86/mm/highmem_32.c b/arch/x86/mm/highmem_32.c
index 1c3bf95..4bb9fc6 100644
--- a/arch/x86/mm/highmem_32.c
+++ b/arch/x86/mm/highmem_32.c
@@ -34,6 +34,7 @@ void *kmap_atomic_prot(struct page *page, enum km_type type, pgprot_t prot)
/* even !CONFIG_PREEMPT needs this, for in_atomic in do_page_fault */
pagefault_disable();
 
+   BUG_ON(type == KM_USER0 && !in_atomic());
if (!PageHighMem(page))
return page_address(page);
 
@@ -85,6 +86,7 @@ void *kmap_atomic_pfn(unsigned long pfn, enum km_type type)
 
pagefault_disable();
 
+   BUG_ON(type == KM_USER0 && !in_atomic());
idx = type + KM_TYPE_NR*smp_processor_id();
vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
set_pte(kmap_pte-idx, pfn_pte(pfn, kmap_prot));


Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-03-30 Thread Tomas Kalibera

Crashed on the very same line as before
Tomas

[  189.558776] [ cut here ]
[  189.563377] kernel BUG at arch/x86/mm/highmem_32.c:43!
[  189.568491] invalid opcode:  [#1] PREEMPT SMP 
[  189.573285] Modules linked in: rfcomm l2cap bluetooth ppdev sbp2 parport_pc lp parport sr_mod cdrom pcspkr iTCO_wdt iTCO_vendor_support shpchp pci_hotplug ipv6 evdev ext3 jbd mbcache sg sd_mod ata_piix usbhid hid floppy ata_generic ahci ohci1394 libata scsi_mod ieee1394 ehci_hcd tg3 uhci_hcd usbcore fuse
[  189.600440] 
[  189.601924] Pid: 4960, comm: ovmtask Not tainted (2.6.24.3xenomaip1 #1)
[  189.608508] EIP: 0060:[<c011a908>] EFLAGS: 00010286 CPU: 0
[  189.613971] EIP is at kmap_atomic_prot+0xb8/0xc0
[  189.618566] EAX: d91a8163 EBX: c2b23500 ECX: f000 EDX: c044fecc
[  189.624804] ESI: 0007 EDI: 0163 EBP: 08003875 ESP: df673ea0
[  189.631043]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[  189.636416] Process ovmtask (pid: 4960, ti=df672000 task=df4d29e0 
task.ti=df672000)0
[  189.643865] I-pipe domain Linux
[  189.647257] Stack: fffb2000  c2b2350c c01a96aa fffb7000 fffb6000 
df66d278 dfb7a580 
[  189.655648]dfb7ae40 df846084 df9b7084 08615000 0840 08615000 
f7c3d740 c2b23520 
[  189.664039]  c2b2350c c2be8aac fffb3000 08614fff 
  
[  189.672430] Call Trace:
[  189.675045]  [c01a96aa] copy_page_range+0x13a/0x560
[  189.680086]  [c01224ef] copy_process+0x8df/0x1250
[  189.684951]  [c012309c] do_fork+0x4c/0x200
[  189.689211]  [c01022d2] sys_clone+0x32/0x40
[  189.693556]  [c01043a1] sysenter_past_esp+0x6e/0x72
[  189.698595]  ===
[  189.702150] Code: 0c c1 fb 05 29 c1 c1 e3 0c 89 c8 09 fb 89 1a 5b 5e 5f c3 89 e0 25 00 e0 ff ff f7 40 14 ff ff ff ef 0f 85 69 ff ff ff 0f 0b eb fe 0f 0b eb fe 8d 74 26 00 8b 0d f4 b1 45 c0 e9 35 ff ff ff 90 8d
[  189.721467] EIP: [c011a908] kmap_atomic_prot+0xb8/0xc0 SS:ESP 0068:df673ea0
[  189.728669] ---[ end trace 7363976c5f0598cc ]---
[  189.733269] note: ovmtask[4960] exited with preempt_count 1



Gilles Chanteperdrix wrote:
> Tomas Kalibera wrote:
>> Hi Gilles,
>>
>> thanks for looking at it. Your analysis is correct: I indeed do not have a
>> CONFIG_PREEMPT_RT kernel, only CONFIG_PREEMPT. Sorry for the confusion.
>>
>> I've put the kernel config, sources, and binary on the web, so that you
>> can be sure you're really looking at the kernel that is crashing:
>> http://www.cs.purdue.edu/homes/tkaliber/crash
>
> After looking at the sources, it appears that kmap_atomic disables
> preemption and kunmap_atomic reenables it. In short, the bug should
> never happen. What could happen is that the preemption count is garbled,
> or that a call to kmap_atomic is not paired with a kunmap_atomic.
>
> To check if the problem comes from the preemption count, could you apply
> the following patch?




Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-03-29 Thread Gilles Chanteperdrix
Tomas Kalibera wrote:
> Hi Gilles,
>
> thanks for looking at it. Your analysis is correct: I indeed do not have a
> CONFIG_PREEMPT_RT kernel, only CONFIG_PREEMPT. Sorry for the confusion.
>
> I've put the kernel config, sources, and binary on the web, so that you
> can be sure you're really looking at the kernel that is crashing:
> http://www.cs.purdue.edu/homes/tkaliber/crash

It looks like do_wp_page, the caller of cow_user_page, calls it with the
spinlock released. So nothing prevents a reschedule from happening at that
point and switching to a real-time process, which can then call fork. Now,
I wonder what prevents do_wp_page itself from being called under the same
conditions...

-- 


Gilles.



Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-03-28 Thread Gilles Chanteperdrix
Tomas Kalibera wrote:
>
> Hi,
>
> I'm getting kernel crashes with my native-skin user-space Xenomai
> application. It looks like the crash happens after clone/fork. I'm using
> kernel 2.6.24.3, SMP, RT_PREEMPT (settings like 2.6.22-14-rt from
> Ubuntu 7.10). Xenomai 2.4.2.
>
> The thread causing the crash is a Xenomai task, running most of the time
> in the Linux domain. The application is very large; extracting a short
> example that triggers the bug is unfortunately not realistic.
>
> The crash happens when running on real hardware (x86_64 with a 32-bit
> kernel and applications). The system is unusable after it happens and can
> only be rebooted; the dump is from the serial console.
> In VMware on another x86_64 machine, it does not crash.
>
> Is anyone getting a similar error? Any ideas where to look for the problem?

Looking at the kernel code, it seems that only one page may be mapped at
a time with kmap_atomic using KM_USER0. So what probably happens is that
for invocations of cow_user_page other than the one taking place in
fork, a lock of some kind prevents concurrent invocation of
cow_user_page. In our use of cow_user_page, we probably do not hold that
lock. Looking at the code, I see that copy_pte_range holds a spinlock,
which should disable preemption on a classical kernel. But who knows
what happens with RT_PREEMPT enabled...
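
To make the "only one page at a time" point concrete, here is a small user-space sketch of the slot arithmetic. It assumes the index formula visible in the debugging patch from this thread (idx = type + KM_TYPE_NR*smp_processor_id()). Everything else is hypothetical: the model_* functions, the two-entry enum, and the NR_CPUS value are for illustration only. In particular, the model reports a collision with -1, whereas the real kmap_atomic does not return an error.

```c
#include <assert.h>

/* Simplified: the kernel defines many more mapping types. */
enum km_type { KM_USER0, KM_USER1, KM_TYPE_NR };

#define NR_CPUS 2   /* assumed CPU count for the model */

/* One slot per (type, cpu) pair; 0 means free, otherwise a page id. */
static int slot_owner[KM_TYPE_NR * NR_CPUS];

/* Same arithmetic as the idx computation in the patch. */
static int slot_index(enum km_type type, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return type + KM_TYPE_NR * cpu;
}

/* Returns 0 on success, -1 if the slot is already taken (model only). */
static int model_kmap_atomic(int page_id, enum km_type type, int cpu)
{
    int idx = slot_index(type, cpu);

    if (slot_owner[idx] != 0)
        return -1;      /* two users of the same type on the same CPU */
    slot_owner[idx] = page_id;
    return 0;
}

/* Frees the slot so the next user of this type on this CPU can map. */
static void model_kunmap_atomic(enum km_type type, int cpu)
{
    slot_owner[slot_index(type, cpu)] = 0;
}
```

In this model, two KM_USER0 mappings on the same CPU collide unless the first is unmapped before the second starts, while a different type or a different CPU gets its own slot. That is why nothing else using KM_USER0 may run on that CPU between the map and the unmap.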

-- 


Gilles.



Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-03-28 Thread Gilles Chanteperdrix
Gilles Chanteperdrix wrote:
> Tomas Kalibera wrote:
>>
>> Hi,
>>
>> I'm getting kernel crashes with my native-skin user-space Xenomai
>> application. It looks like the crash happens after clone/fork. I'm using
>> kernel 2.6.24.3, SMP, RT_PREEMPT (settings like 2.6.22-14-rt from
>> Ubuntu 7.10). Xenomai 2.4.2.
>>
>> The thread causing the crash is a Xenomai task, running most of the time
>> in the Linux domain. The application is very large; extracting a short
>> example that triggers the bug is unfortunately not realistic.
>>
>> The crash happens when running on real hardware (x86_64 with a 32-bit
>> kernel and applications). The system is unusable after it happens and can
>> only be rebooted; the dump is from the serial console.
>> In VMware on another x86_64 machine, it does not crash.
>>
>> Is anyone getting a similar error? Any ideas where to look for the problem?
>
> Looking at the kernel code, it seems that only one page may be mapped at
> a time with kmap_atomic using KM_USER0. So what probably happens is that
> for invocations of cow_user_page other than the one taking place in
> fork, a lock of some kind prevents concurrent invocation of
> cow_user_page. In our use of cow_user_page, we probably do not hold that
> lock. Looking at the code, I see that copy_pte_range holds a spinlock,
> which should disable preemption on a classical kernel. But who knows
> what happens with RT_PREEMPT enabled...

There is something strange... Normally, when compiling with
CONFIG_PREEMPT_RT, kmap_atomic_prot is replaced with kmap and the real
kmap_atomic_prot is renamed __kmap_atomic_prot. Since cow_user_page uses
kmap_atomic_prot, kmap is in fact called, and the BUG_ON condition in
kmap_atomic_prot should never occur.
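
The redirection described above can be illustrated with a tiny compile-time sketch. This is an assumption-laden model, not the actual -rt patch: MODEL_PREEMPT_RT stands in for CONFIG_PREEMPT_RT, and all model_* functions and the enum are hypothetical names. It only shows why a caller written against kmap_atomic_prot would never reach the atomic implementation, and hence its BUG_ON, once the name is redirected.

```c
#include <assert.h>

/* Which implementation a caller ended up in (model only). */
enum model_path { MODEL_PATH_KMAP, MODEL_PATH_ATOMIC };

/* Stand-in for kmap(): the sleeping, non-atomic mapping path. */
enum model_path model_kmap(void) { return MODEL_PATH_KMAP; }

/* Stand-in for the renamed "real" atomic implementation with the BUG_ON. */
enum model_path model_real_kmap_atomic_prot(void) { return MODEL_PATH_ATOMIC; }

/* Define MODEL_PREEMPT_RT at build time to model an -rt kernel. */
#ifdef MODEL_PREEMPT_RT
# define model_kmap_atomic_prot() model_kmap()
#else
# define model_kmap_atomic_prot() model_real_kmap_atomic_prot()
#endif

/* The caller is written once and follows whichever mapping is configured. */
enum model_path model_cow_user_page(void)
{
    return model_kmap_atomic_prot();
}

/* Self-check: confirms which path this build selected. */
void model_self_check(void)
{
#ifdef MODEL_PREEMPT_RT
    assert(model_cow_user_page() == MODEL_PATH_KMAP);
#else
    assert(model_cow_user_page() == MODEL_PATH_ATOMIC);
#endif
}
```

Built without MODEL_PREEMPT_RT, the caller lands in the atomic path. A BUG in kmap_atomic_prot reached from cow_user_page therefore suggests that the crashing kernel was not built with the -rt option.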

-- 


Gilles.



Re: [Xenomai-core] Kernel crash with Xenomai (caused by fork?)

2008-03-28 Thread Tomas Kalibera
Hi Gilles,

thanks for looking at it. Your analysis is correct: I indeed do not have a
CONFIG_PREEMPT_RT kernel, only CONFIG_PREEMPT. Sorry for the confusion.

I've put the kernel config, sources, and binary on the web, so that you
can be sure you're really looking at the kernel that is crashing:
http://www.cs.purdue.edu/homes/tkaliber/crash

Thanks,

Tomas

Gilles Chanteperdrix wrote:
> Gilles Chanteperdrix wrote:
>> Tomas Kalibera wrote:
>>>
>>> Hi,
>>>
>>> I'm getting kernel crashes with my native-skin user-space Xenomai
>>> application. It looks like the crash happens after clone/fork. I'm using
>>> kernel 2.6.24.3, SMP, RT_PREEMPT (settings like 2.6.22-14-rt from
>>> Ubuntu 7.10). Xenomai 2.4.2.
>>>
>>> The thread causing the crash is a Xenomai task, running most of the time
>>> in the Linux domain. The application is very large; extracting a short
>>> example that triggers the bug is unfortunately not realistic.
>>>
>>> The crash happens when running on real hardware (x86_64 with a 32-bit
>>> kernel and applications). The system is unusable after it happens and can
>>> only be rebooted; the dump is from the serial console.
>>> In VMware on another x86_64 machine, it does not crash.
>>>
>>> Is anyone getting a similar error? Any ideas where to look for the problem?
>>
>> Looking at the kernel code, it seems that only one page may be mapped at
>> a time with kmap_atomic using KM_USER0. So what probably happens is that
>> for invocations of cow_user_page other than the one taking place in
>> fork, a lock of some kind prevents concurrent invocation of
>> cow_user_page. In our use of cow_user_page, we probably do not hold that
>> lock. Looking at the code, I see that copy_pte_range holds a spinlock,
>> which should disable preemption on a classical kernel. But who knows
>> what happens with RT_PREEMPT enabled...
>
> There is something strange... Normally, when compiling with
> CONFIG_PREEMPT_RT, kmap_atomic_prot is replaced with kmap and the real
> kmap_atomic_prot is renamed __kmap_atomic_prot. Since cow_user_page uses
> kmap_atomic_prot, kmap is in fact called, and the BUG_ON condition in
> kmap_atomic_prot should never occur.

