Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : Add regression test for mprotect on pinned memory
On 2012-04-02 16:35, Gilles Chanteperdrix wrote:

On 04/02/2012 04:09 PM, GIT version control wrote:

Module: xenomai-jki
Branch: for-upstream
Commit: 410e90d085d21dc913f8724efafe6ae75bd3c952
URL: http://git.xenomai.org/?p=xenomai-jki.git;a=commit;h=410e90d085d21dc913f8724efafe6ae75bd3c952
Author: Jan Kiszka jan.kis...@siemens.com
Date: Fri Mar 30 18:06:27 2012 +0200

Add regression test for mprotect on pinned memory

This tests both the original issue of mprotect reintroducing COW pages to Xenomai processes and the recently fixed zero page corruption.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com

+static void check_inner(const char *fn, int line, const char *msg,
+                        int status, int expected)
+{
+	if (status == expected)
+		return;
+
+	rt_task_set_mode(T_WARNSW, 0, NULL);
+	rt_print_flush_buffers();
(...)
+static void check_value_inner(const char *fn, int line, const char *msg,
+                              int value, int expected)
+{
+	if (value == expected)
+		return;
+
+	rt_task_set_mode(T_WARNSW, 0, NULL);
+	rt_print_flush_buffers();
(...)
+void sigdebug_handler(int sig, siginfo_t *si, void *context)
+{
+	unsigned int reason = si->si_value.sival_int;
+
+	rt_print_flush_buffers();
(...)
+
+	rt_task_set_mode(T_WARNSW, 0, NULL);
+	rt_print_flush_buffers();

Maybe you could use posix skin's printf instead of putting calls to rt_print_flush_buffers all over the place? I did not mean for this call to be exported, I only added it for internal use by the posix skin.

Could be done, likely together with a complete switch to posix. I could also start to use the check_* wrappers that I just discovered.

BTW, the native version lacks that flush unless it's used in a native+posix context. I will write a fix.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
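[Editor's sketch] The scenario named in the commit message is compact enough to illustrate in isolation. The snippet below is not the actual test from commit 410e90d0, just a hedged sketch of the mprotect-on-pinned-memory pattern it describes: pin a page, dirty it, flip its protection, and verify the contents survive. The real test additionally watches for SIGDEBUG notifications (see the sigdebug_handler hunk above), which is omitted here.

/* Illustration only: not the test from commit 410e90d0. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	size_t sz = (size_t)sysconf(_SC_PAGESIZE);
	char *buf = mmap(NULL, sz, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (buf == MAP_FAILED || mlock(buf, sz))	/* pin the page */
		return EXIT_FAILURE;

	memset(buf, 0xa5, sz);		/* dirty it while pinned */

	/* Flipping protections must not silently replace the pinned
	 * page with a COW or zero page. */
	if (mprotect(buf, sz, PROT_READ) ||
	    mprotect(buf, sz, PROT_READ | PROT_WRITE))
		return EXIT_FAILURE;

	buf[0] = 0x5a;			/* write again after mprotect */
	puts(buf[0] == 0x5a && buf[1] == (char)0xa5 ? "ok" : "corrupted");
	return EXIT_SUCCESS;
}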
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : Add regression test for mprotect on pinned memory
On 04/02/2012 04:09 PM, GIT version control wrote:

Module: xenomai-jki
Branch: for-upstream
Commit: 410e90d085d21dc913f8724efafe6ae75bd3c952
URL: http://git.xenomai.org/?p=xenomai-jki.git;a=commit;h=410e90d085d21dc913f8724efafe6ae75bd3c952
Author: Jan Kiszka jan.kis...@siemens.com
Date: Fri Mar 30 18:06:27 2012 +0200

Add regression test for mprotect on pinned memory

This tests both the original issue of mprotect reintroducing COW pages to Xenomai processes and the recently fixed zero page corruption.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com

+static void check_inner(const char *fn, int line, const char *msg,
+                        int status, int expected)
+{
+	if (status == expected)
+		return;
+
+	rt_task_set_mode(T_WARNSW, 0, NULL);
+	rt_print_flush_buffers();
(...)
+static void check_value_inner(const char *fn, int line, const char *msg,
+                              int value, int expected)
+{
+	if (value == expected)
+		return;
+
+	rt_task_set_mode(T_WARNSW, 0, NULL);
+	rt_print_flush_buffers();
(...)
+void sigdebug_handler(int sig, siginfo_t *si, void *context)
+{
+	unsigned int reason = si->si_value.sival_int;
+
+	rt_print_flush_buffers();
(...)
+
+	rt_task_set_mode(T_WARNSW, 0, NULL);
+	rt_print_flush_buffers();

Maybe you could use posix skin's printf instead of putting calls to rt_print_flush_buffers all over the place? I did not mean for this call to be exported, I only added it for internal use by the posix skin.

--
Gilles.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
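[Editor's sketch] The check_*() helpers quoted above follow the usual call-site-capturing assertion pattern. A self-contained sketch of that pattern is shown below; the macro name and plain fprintf() reporting are illustrative stand-ins, while the real test first leaves primary mode and flushes the rt_printf buffers, as the quoted hunks show.

#include <stdio.h>

/* Illustrative stand-in for the test's check_*() helpers. */
static void check_inner(const char *fn, int line, const char *msg,
			int status, int expected)
{
	if (status == expected)
		return;

	/* The quoted test would call rt_task_set_mode(T_WARNSW, 0, NULL)
	 * and rt_print_flush_buffers() here before reporting. */
	fprintf(stderr, "FAILED %s:%d: %s = %d, expected %d\n",
		fn, line, msg, status, expected);
}

/* The macro records the call site so failure reports point at the
 * offending line. */
#define check(msg, status, expected) \
	check_inner(__func__, __LINE__, (msg), (status), (expected))

int main(void)
{
	check("mlock", 0, 0);		/* passes silently */
	check("mprotect", -1, 0);	/* reports the mismatch */
	return 0;
}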
Re: [Xenomai-core] xenomai-forge: round-robin scheduling in pSOS skin
On 03/08/2012 03:30 PM, Ronny Meeus wrote:

Hello,

I'm using the xenomai-forge pSOS skin (Mercury). My application is running on a P4040 (Freescale PPC with 4 cores). Some code snippets are put in this mail but the complete testcode is also attached.

I have a test task that just consumes the CPU:

int run_test = 1;

static void perform_work(u_long counter, u_long b, u_long c, u_long d)
{
	int i;

	while (run_test) {
		for (i = 0; i < 10; i++);
		(*(unsigned long *)counter)++;
	}
	while (1)
		tm_wkafter(1000);
}

If I create 2 instances of this task with the T_TSLICE option set:

t_create(WORK, 10, 0, 0, 0, &tid);
t_start(tid, T_TSLICE, perform_work, args);

I see that only 1 task is consuming CPU:

# taskset 1 ./roundrobin.exe &
#
.543| [main] SCHED_RT priorities = [1 .. 99]
.656| [main] SCHED_RT.99 reserved for IRQ emulation
.692| [main] SCHED_RT.98 reserved for scheduler-lock emulation
0 - 6602
1 - 0

If I adapt the code so that I call the threadobj_start_rr function in my init, I see that the load is equally distributed over the 2 threads:

# taskset 1 ./roundrobin.exe &
#
.557| [main] SCHED_RT priorities = [1 .. 99]
.672| [main] SCHED_RT.99 reserved for IRQ emulation
.708| [main] SCHED_RT.98 reserved for scheduler-lock emulation
0 - 3290
1 - 3291

Here are the questions:

- Why is the threadobj_start_rr function not called from the context of the init of the psos layer?

Because threadobj_start_rr() was originally designed to activate round-robin for all threads (some RTOS like VxWorks expose that kind of API), not on a per-thread basis. This is not what pSOS wants. The round-robin API is in a state of flux for mercury, only the cobalt one is stable. This is why RR is not yet activated even though T_TSLICE is recognized.

- Why is the round-robin implemented in this way? If the tasks were mapped onto SCHED_RR instead of SCHED_FIFO, the Linux scheduler would take care of this.

Nope. We need per-thread RR intervals, to manage multiple priority groups concurrently, and we also want to define that interval as we see fit for proper RTOS emulation. POSIX does not define anything like sched_set_rr_interval(), and the linux kernel applies a default fixed interval to all threads from the SCHED_RR class (100ms IIRC). So we have to emulate SCHED_RR over SCHED_FIFO plus a per-thread virtual timer.

On the other hand, once the threadobj_start_rr function is called from my init, and I create the tasks in T_NOTSLICE mode, the time-slicing is still done.

Because you called threadobj_start_rr().

Thanks.

---
Ronny

--
Philippe.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
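[Editor's sketch] The emulation Philippe describes, SCHED_FIFO plus a per-thread virtual timer, can be sketched in plain POSIX. This is only an illustration of the idea, not how mercury implements threadobj_start_rr(): the function name, SIGURG choice and quantum are invented, and strict per-thread signal delivery would actually need the Linux-specific SIGEV_THREAD_ID extension. Link with -lrt on older glibc.

#include <sched.h>
#include <signal.h>
#include <string.h>
#include <time.h>

static void rr_tick(int sig)
{
	(void)sig;
	sched_yield();	/* move to the tail of our own priority level */
}

/* Call this from the thread that should be time-sliced. */
static int start_rr(long quantum_ns)
{
	struct sigevent sev;
	struct itimerspec its;
	timer_t tid;

	signal(SIGURG, rr_tick);

	/* CLOCK_THREAD_CPUTIME_ID counts CPU time consumed by this
	 * thread only, so the timer models a consumed quantum rather
	 * than wall time. */
	memset(&sev, 0, sizeof(sev));
	sev.sigev_notify = SIGEV_SIGNAL;
	sev.sigev_signo = SIGURG;
	if (timer_create(CLOCK_THREAD_CPUTIME_ID, &sev, &tid))
		return -1;

	memset(&its, 0, sizeof(its));
	its.it_value.tv_nsec = quantum_ns;
	its.it_interval.tv_nsec = quantum_ns;	/* rearm every quantum */
	return timer_settime(tid, 0, &its, NULL);
}

An emulator would arm such a timer per time-sliced task, which is why the interval can differ per thread, unlike the kernel's fixed SCHED_RR quantum.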
Re: [Xenomai-core] xenomai-forge: round-robin scheduling in pSOS skin
On Thu, Mar 8, 2012 at 3:30 PM, Ronny Meeus ronny.me...@gmail.com wrote:

Hello,

I'm using the xenomai-forge pSOS skin (Mercury). My application is running on a P4040 (Freescale PPC with 4 cores). Some code snippets are put in this mail but the complete testcode is also attached.

I have a test task that just consumes the CPU:

int run_test = 1;

static void perform_work(u_long counter, u_long b, u_long c, u_long d)
{
	int i;

	while (run_test) {
		for (i = 0; i < 10; i++);
		(*(unsigned long *)counter)++;
	}
	while (1)
		tm_wkafter(1000);
}

If I create 2 instances of this task with the T_TSLICE option set:

t_create(WORK, 10, 0, 0, 0, &tid);
t_start(tid, T_TSLICE, perform_work, args);

I see that only 1 task is consuming CPU:

# taskset 1 ./roundrobin.exe &
#
.543| [main] SCHED_RT priorities = [1 .. 99]
.656| [main] SCHED_RT.99 reserved for IRQ emulation
.692| [main] SCHED_RT.98 reserved for scheduler-lock emulation
0 - 6602
1 - 0

If I adapt the code so that I call the threadobj_start_rr function in my init, I see that the load is equally distributed over the 2 threads:

# taskset 1 ./roundrobin.exe &
#
.557| [main] SCHED_RT priorities = [1 .. 99]
.672| [main] SCHED_RT.99 reserved for IRQ emulation
.708| [main] SCHED_RT.98 reserved for scheduler-lock emulation
0 - 3290
1 - 3291

Here are the questions:

- Why is the threadobj_start_rr function not called from the context of the init of the psos layer?
- Why is the round-robin implemented in this way? If the tasks were mapped onto SCHED_RR instead of SCHED_FIFO, the Linux scheduler would take care of this.

On the other hand, once the threadobj_start_rr function is called from my init, and I create the tasks in T_NOTSLICE mode, the time-slicing is still done.

Thanks.

---
Ronny

Any comments on this?

---
Ronny

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] xenomai-forge: round-robin scheduling in pSOS skin
On 03/15/2012 08:49 PM, Ronny Meeus wrote:

On Thu, Mar 8, 2012 at 3:30 PM, Ronny Meeus ronny.me...@gmail.com wrote:

Hello,

I'm using the xenomai-forge pSOS skin (Mercury). My application is running on a P4040 (Freescale PPC with 4 cores). Some code snippets are put in this mail but the complete testcode is also attached.

I have a test task that just consumes the CPU:

int run_test = 1;

static void perform_work(u_long counter, u_long b, u_long c, u_long d)
{
	int i;

	while (run_test) {
		for (i = 0; i < 10; i++);
		(*(unsigned long *)counter)++;
	}
	while (1)
		tm_wkafter(1000);
}

If I create 2 instances of this task with the T_TSLICE option set:

t_create(WORK, 10, 0, 0, 0, &tid);
t_start(tid, T_TSLICE, perform_work, args);

I see that only 1 task is consuming CPU:

# taskset 1 ./roundrobin.exe &
#
.543| [main] SCHED_RT priorities = [1 .. 99]
.656| [main] SCHED_RT.99 reserved for IRQ emulation
.692| [main] SCHED_RT.98 reserved for scheduler-lock emulation
0 - 6602
1 - 0

If I adapt the code so that I call the threadobj_start_rr function in my init, I see that the load is equally distributed over the 2 threads:

# taskset 1 ./roundrobin.exe &
#
.557| [main] SCHED_RT priorities = [1 .. 99]
.672| [main] SCHED_RT.99 reserved for IRQ emulation
.708| [main] SCHED_RT.98 reserved for scheduler-lock emulation
0 - 3290
1 - 3291

Here are the questions:

- Why is the threadobj_start_rr function not called from the context of the init of the psos layer?
- Why is the round-robin implemented in this way? If the tasks were mapped onto SCHED_RR instead of SCHED_FIFO, the Linux scheduler would take care of this.

On the other hand, once the threadobj_start_rr function is called from my init, and I create the tasks in T_NOTSLICE mode, the time-slicing is still done.

Thanks.

---
Ronny

Any comments on this?

I am afraid you will have to wait for Philippe to have time to answer you. I am a bit ignorant about the psos API, but most importantly completely ignorant of the mercury core. I am working on forge, but mostly with cobalt.

--
Gilles.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] Xenomai 2.6.0 in Debian
On 11/06/2011 11:11 PM, Roland Stigge wrote: Hi, thanks for Xenomai 2.6.0! I'm attaching a patch that's helpful for the integration of Xenomai in Debian (and FHS compliant systems in general), moving the architecture dependent test programs from /usr/share to /usr/lib. Applied, thanks. -- Gilles. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-help] Xenomai 2.6.0-rc4
On Wed, 2011-09-28 at 20:34 +0200, Gilles Chanteperdrix wrote:

Hi,

here is the 4th release candidate for Xenomai 2.6.0:
http://download.gna.org/xenomai/testing/xenomai-2.6.0-rc4.tar.bz2

Novelties since -rc3 include:

- a fix for the long names issue on psos+
- a fix for the build issue of mscan on mpc52xx (please Wolfgang, have a look at the patch, to see if you like it:)
  http://git.xenomai.org/?p=xenomai-head.git;a=commitdiff;h=d22fd231db7eb0af8e77ec570efb89e578e13781;hp=4a2188f049e96fc59aa7c4a7a9d058075f3d79e8
- a new version of the I-pipe patch for linux 3.0 on ppc. People running 2.13-02/powerpc over linux 3.0.4 should definitely upgrade to 2.13-03, or apply this:
  http://git.denx.de/?p=ipipe-2.6.git;a=commit;h=7c28eb2dea86366bf721663bb8d28ce89cf2806c

This should be the last release candidate.

Regards.

--
Philippe.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-help] Xenomai 2.6.0-rc1
On Mon, Sep 5, 2011 at 7:31 PM, Gilles Chanteperdrix gilles.chanteperd...@xenomai.org wrote:

On 09/05/2011 07:14 PM, Henri Roosen wrote:

Hi Gilles,

Unfortunately I didn't find the time to test this release yet. I'm just wondering if there is a fix for this problem in the 2.6.0 release: https://mail.gna.org/public/xenomai-core/2011-05/msg00028.html

This one is fixed, a bit differently, since we fixed the ppd handling so that the ppd is valid up to the end of a process.

We have been using the auto-relax patches on top of 2.5.6 for a long time now. We found issues with it regarding auto-relax tasks that were not being auto-relaxed anymore. Philippe made patches for that, see https://mail.gna.org/public/xenomai-help/2011-03/msg00161.html.

Philippe's patches for rt_task_send/receive/reply should have been merged too.

However, locally I reverted those two patches because these introduced a memory leak in xnheap; I could only do rt_task_create()/rt_task_delete() 1024 times ;-). I thought that was the discussion of https://mail.gna.org/public/xenomai-core/2011-05/msg00028.html at that time and I don't recall a proper fix for it was provided. But I might have missed it...

This looks related to the ppd issue as well, in which case, it should have been fixed too. It would be nice if you could test the release and tell us whether you still have these issues.

I've now tested the issues on the Xenomai 2.6.0-rc1 release. Both issues no longer occur.

Thanks,
Henri

--
Gilles.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-help] Xenomai 2.6.0-rc1
On 09/06/2011 11:15 PM, Philippe Gerum wrote:

On Tue, 2011-09-06 at 20:19 +0200, Gilles Chanteperdrix wrote:

On 09/06/2011 05:10 PM, Philippe Gerum wrote:

On Tue, 2011-09-06 at 16:53 +0200, Philippe Gerum wrote:

On Tue, 2011-09-06 at 16:53 +0200, Philippe Gerum wrote:

On Tue, 2011-09-06 at 16:19 +0200, Gilles Chanteperdrix wrote:

On 09/06/2011 03:27 PM, Philippe Gerum wrote:

On Tue, 2011-09-06 at 13:31 +0200, Gilles Chanteperdrix wrote:

On 09/04/2011 10:52 PM, Gilles Chanteperdrix wrote:

Hi,

The first release candidate for the 2.6.0 version may be downloaded here:
http://download.gna.org/xenomai/testing/xenomai-2.6.0-rc1.tar.bz2

Hi,

currently 2.6.0-rc1 fails to build on 2.4 kernel, with errors related to vfile support. Do we really want to still support 2.4 kernels?

That would not be a massive loss, but removing linux 2.4 support is more than a few hunks here and there, so this may not be the right thing to do ATM. Besides, it would be better not to leave the few linux 2.4 users out there without upgrade path to xenomai 2.6, since this will be the last maintained version from the Xenomai 2.x architecture. That stuff does not compile likely because the Config.in bits are not up to date, blame it on me. I'll make this build over linux 2.4 and commit the result today.

No problem, I was not looking for someone to blame... Since you are at it, I have problems compiling the nios2 kernel too, but I am not sure I got the proper configuration file.

HEAD builds fine based on the attached .config.

Btw we now only support the MMU version (2.6.35.2) of this kernel over Xenomai 2.6. Reference tree is available there:

url = git://sopc.et.ntust.edu.tw/git/linux-2.6.git
branch = nios2mmu

nommu support is discontinued for nios2 - people who depend on it should stick with Xenomai 2.5.x.

Ok, still not building, maybe the commit number mentioned in the README is not up-to-date?

The commit # is correct, but I suspect that your kernel tree does not have the files normally created by the SOPC builder anymore; these can't (may not, actually) be included in the pipeline patch. In short, your tree might be missing the bits corresponding to the fpga design you build for, so basic symbols like HRCLOCK* and HRTIMER* are undefined.

I'm building for a cyclone 3c25 from the NEEK kit, with SOPC files available from arch/nios2/boards/neek. Any valuable files in there on your side? (typically, include/asm/custom_fpga.h should contain definitions for our real-time clocks and timers)

I created a file arch/nios2/hardware.mk, which contains:

SYSPTF = /path/to/std_1s10.ptf
CPU = cpu
EXEMEM = sdram

Then I run the kernel compilation as for any other platform. Is that not sufficient? Perhaps my .ptf file is outdated?

--
Gilles.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-help] Xenomai 2.6.0-rc1
On 09/04/2011 10:52 PM, Gilles Chanteperdrix wrote: Hi, The first release candidate for the 2.6.0 version may be downloaded here: http://download.gna.org/xenomai/testing/xenomai-2.6.0-rc1.tar.bz2 Hi, currently 2.6.0-rc1 fails to build on 2.4 kernel, with errors related to vfile support. Do we really want to still support 2.4 kernels? Regards. -- Gilles. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-help] Xenomai 2.6.0-rc1
Hi, On 09/06/2011 01:31 PM, Gilles Chanteperdrix wrote: currently 2.6.0-rc1 fails to build on 2.4 kernel, with errors related to vfile support. Do we really want to still support 2.4 kernels? No worries here from the Debian (and derivatives) perspective. bye, Roland ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-help] Xenomai 2.6.0-rc1
On Tue, 2011-09-06 at 13:31 +0200, Gilles Chanteperdrix wrote: On 09/04/2011 10:52 PM, Gilles Chanteperdrix wrote: Hi, The first release candidate for the 2.6.0 version may be downloaded here: http://download.gna.org/xenomai/testing/xenomai-2.6.0-rc1.tar.bz2 Hi, currently 2.6.0-rc1 fails to build on 2.4 kernel, with errors related to vfile support. Do we really want to still support 2.4 kernels? That would not be a massive loss, but removing linux 2.4 support is more than a few hunks here and there, so this may not be the right thing to do ATM. Besides, it would be better not to leave the few linux 2.4 users out there without upgrade path to xenomai 2.6, since this will be the last maintained version from the Xenomai 2.x architecture. That stuff does not compile likely because the Config.in bits are not up to date, blame it on me. I'll make this build over linux 2.4 and commit the result today. -- Philippe. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-help] Xenomai 2.6.0-rc1
On 09/06/2011 03:27 PM, Philippe Gerum wrote: On Tue, 2011-09-06 at 13:31 +0200, Gilles Chanteperdrix wrote: On 09/04/2011 10:52 PM, Gilles Chanteperdrix wrote: Hi, The first release candidate for the 2.6.0 version may be downloaded here: http://download.gna.org/xenomai/testing/xenomai-2.6.0-rc1.tar.bz2 Hi, currently 2.6.0-rc1 fails to build on 2.4 kernel, with errors related to vfile support. Do we really want to still support 2.4 kernels? That would not be a massive loss, but removing linux 2.4 support is more than a few hunks here and there, so this may not be the right thing to do ATM. Besides, it would be better not to leave the few linux 2.4 users out there without upgrade path to xenomai 2.6, since this will be the last maintained version from the Xenomai 2.x architecture. That stuff does not compile likely because the Config.in bits are not up to date, blame it on me. I'll make this build over linux 2.4 and commit the result today. No problem, I was not looking for someone to blame... Since you are at it, I have problems compiling the nios2 kernel too, but I am not sure I got the proper configuration file. -- Gilles. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-help] Xenomai 2.6.0-rc1
On Tue, 2011-09-06 at 16:19 +0200, Gilles Chanteperdrix wrote: On 09/06/2011 03:27 PM, Philippe Gerum wrote: On Tue, 2011-09-06 at 13:31 +0200, Gilles Chanteperdrix wrote: On 09/04/2011 10:52 PM, Gilles Chanteperdrix wrote: Hi, The first release candidate for the 2.6.0 version may be downloaded here: http://download.gna.org/xenomai/testing/xenomai-2.6.0-rc1.tar.bz2 Hi, currently 2.6.0-rc1 fails to build on 2.4 kernel, with errors related to vfile support. Do we really want to still support 2.4 kernels? That would not be a massive loss, but removing linux 2.4 support is more than a few hunks here and there, so this may not be the right thing to do ATM. Besides, it would be better not to leave the few linux 2.4 users out there without upgrade path to xenomai 2.6, since this will be the last maintained version from the Xenomai 2.x architecture. That stuff does not compile likely because the Config.in bits are not up to date, blame it on me. I'll make this build over linux 2.4 and commit the result today. No problem, I was not looking for someone to blame... Since you are at it, I have problems compiling the nios2 kernel too, but I am not sure I got the proper configuration file. Ok, I'll check this. -- Philippe. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-help] Xenomai 2.6.0-rc1
On Tue, 2011-09-06 at 16:19 +0200, Gilles Chanteperdrix wrote: On 09/06/2011 03:27 PM, Philippe Gerum wrote: On Tue, 2011-09-06 at 13:31 +0200, Gilles Chanteperdrix wrote: On 09/04/2011 10:52 PM, Gilles Chanteperdrix wrote: Hi, The first release candidate for the 2.6.0 version may be downloaded here: http://download.gna.org/xenomai/testing/xenomai-2.6.0-rc1.tar.bz2 Hi, currently 2.6.0-rc1 fails to build on 2.4 kernel, with errors related to vfile support. Do we really want to still support 2.4 kernels? That would not be a massive loss, but removing linux 2.4 support is more than a few hunks here and there, so this may not be the right thing to do ATM. Besides, it would be better not to leave the few linux 2.4 users out there without upgrade path to xenomai 2.6, since this will be the last maintained version from the Xenomai 2.x architecture. That stuff does not compile likely because the Config.in bits are not up to date, blame it on me. I'll make this build over linux 2.4 and commit the result today. No problem, I was not looking for someone to blame... Since you are at it, I have problems compiling the nios2 kernel too, but I am not sure I got the proper configuration file. HEAD builds fine based on the attached .config. -- Philippe. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-help] Xenomai 2.6.0-rc1
On Tue, 2011-09-06 at 16:53 +0200, Philippe Gerum wrote:

On Tue, 2011-09-06 at 16:19 +0200, Gilles Chanteperdrix wrote:

On 09/06/2011 03:27 PM, Philippe Gerum wrote:

On Tue, 2011-09-06 at 13:31 +0200, Gilles Chanteperdrix wrote:

On 09/04/2011 10:52 PM, Gilles Chanteperdrix wrote:

Hi,

The first release candidate for the 2.6.0 version may be downloaded here:
http://download.gna.org/xenomai/testing/xenomai-2.6.0-rc1.tar.bz2

Hi,

currently 2.6.0-rc1 fails to build on 2.4 kernel, with errors related to vfile support. Do we really want to still support 2.4 kernels?

That would not be a massive loss, but removing linux 2.4 support is more than a few hunks here and there, so this may not be the right thing to do ATM. Besides, it would be better not to leave the few linux 2.4 users out there without upgrade path to xenomai 2.6, since this will be the last maintained version from the Xenomai 2.x architecture. That stuff does not compile likely because the Config.in bits are not up to date, blame it on me. I'll make this build over linux 2.4 and commit the result today.

No problem, I was not looking for someone to blame... Since you are at it, I have problems compiling the nios2 kernel too, but I am not sure I got the proper configuration file.

HEAD builds fine based on the attached .config.

Mmmfff...

--
Philippe.

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.35
# Tue Sep 6 16:49:25 2011
#

#
# Linux/NiosII Configuration
#
CONFIG_NIOS2=y
CONFIG_MMU=y
# CONFIG_FPU is not set
# CONFIG_SWAP is not set
CONFIG_RWSEM_GENERIC_SPINLOCK=y

#
# NiosII board configuration
#
# CONFIG_3C120 is not set
CONFIG_NEEK=y
CONFIG_NIOS2_CUSTOM_FPGA=y
# CONFIG_NIOS2_NEEK_OCM is not set

#
# NiosII specific compiler options
#
CONFIG_NIOS2_HW_MUL_SUPPORT=y
# CONFIG_NIOS2_HW_MULX_SUPPORT is not set
# CONFIG_NIOS2_HW_DIV_SUPPORT is not set
# CONFIG_OF is not set
CONFIG_ALIGNMENT_TRAP=y
CONFIG_RAMKERNEL=y

#
# Boot options
#
CONFIG_CMDLINE=
CONFIG_PASS_CMDLINE=y
CONFIG_BOOT_LINK_OFFSET=0x0100

#
# Platform driver options
#
# CONFIG_AVALON_DMA is not set

#
# Additional NiosII Device Drivers
#
# CONFIG_PCI_ALTPCI is not set
# CONFIG_ALTERA_REMOTE_UPDATE is not set
# CONFIG_PIO_DEVICES is not set
# CONFIG_NIOS2_GPIO is not set
# CONFIG_ALTERA_PIO_GPIO is not set
CONFIG_UID16=y
CONFIG_GENERIC_CSUM=y
CONFIG_GENERIC_FIND_NEXT_BIT=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_HARDIRQS_NO__DO_IRQ=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_NO_IOPORT=y
CONFIG_ZONE_DMA=y
CONFIG_BINFMT_ELF=y
# CONFIG_NOT_COHERENT_CACHE is not set
CONFIG_HZ=100
# CONFIG_TRACE_IRQFLAGS_SUPPORT is not set
CONFIG_IPIPE=y
CONFIG_IPIPE_DOMAINS=4
CONFIG_IPIPE_DELAYED_ATOMICSW=y
# CONFIG_IPIPE_UNMASKED_CONTEXT_SWITCH is not set
CONFIG_IPIPE_HAVE_PREEMPTIBLE_SWITCH=y
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_FLATMEM_MANUAL=y
# CONFIG_DISCONTIGMEM_MANUAL is not set
# CONFIG_SPARSEMEM_MANUAL is not set
CONFIG_FLATMEM=y
CONFIG_FLAT_NODE_MEM_MAP=y
CONFIG_PAGEFLAGS_EXTENDED=y
CONFIG_SPLIT_PTLOCK_CPUS=4
# CONFIG_PHYS_ADDR_T_64BIT is not set
CONFIG_ZONE_DMA_FLAG=1
CONFIG_BOUNCE=y
CONFIG_VIRT_TO_BUS=y
# CONFIG_KSM is not set
CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
CONFIG_DEFCONFIG_LIST=/lib/modules/$UNAME_RELEASE/.config
CONFIG_CONSTRUCTORS=y

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=
CONFIG_LOCALVERSION=
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
# CONFIG_POSIX_MQUEUE is not set
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
# CONFIG_TASKSTATS is not set
# CONFIG_AUDIT is not set

#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
# CONFIG_TREE_PREEMPT_RCU is not set
# CONFIG_TINY_RCU is not set
# CONFIG_RCU_TRACE is not set
CONFIG_RCU_FANOUT=32
# CONFIG_RCU_FANOUT_EXACT is not set
# CONFIG_TREE_RCU_TRACE is not set
# CONFIG_IKCONFIG is not set
CONFIG_LOG_BUF_SHIFT=14
# CONFIG_SYSFS_DEPRECATED_V2 is not set
# CONFIG_RELAY is not set
# CONFIG_NAMESPACES is not set
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=
CONFIG_RD_GZIP=y
# CONFIG_RD_BZIP2 is not set
# CONFIG_RD_LZMA is not set
# CONFIG_RD_LZO is not set
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SYSCTL=y
CONFIG_EMBEDDED=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
# CONFIG_ELF_CORE is not set
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
# CONFIG_EPOLL is not set
# CONFIG_SIGNALFD is not set
# CONFIG_TIMERFD is not set
# CONFIG_EVENTFD is not set
# CONFIG_SHMEM is not set
CONFIG_AIO=y

#
# Kernel Performance Events And Counters
#
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_COMPAT_BRK=y
CONFIG_SLAB=y
# CONFIG_SLUB is not set
# CONFIG_SLOB is not set
# CONFIG_PROFILING is not
Re: [Xenomai-core] [Xenomai-help] Xenomai 2.6.0-rc1
On Tue, 2011-09-06 at 16:53 +0200, Philippe Gerum wrote: On Tue, 2011-09-06 at 16:53 +0200, Philippe Gerum wrote: On Tue, 2011-09-06 at 16:19 +0200, Gilles Chanteperdrix wrote: On 09/06/2011 03:27 PM, Philippe Gerum wrote: On Tue, 2011-09-06 at 13:31 +0200, Gilles Chanteperdrix wrote: On 09/04/2011 10:52 PM, Gilles Chanteperdrix wrote: Hi, The first release candidate for the 2.6.0 version may be downloaded here: http://download.gna.org/xenomai/testing/xenomai-2.6.0-rc1.tar.bz2 Hi, currently 2.6.0-rc1 fails to build on 2.4 kernel, with errors related to vfile support. Do we really want to still support 2.4 kernels? That would not be a massive loss, but removing linux 2.4 support is more than a few hunks here and there, so this may not be the right thing to do ATM. Besides, it would be better not to leave the few linux 2.4 users out there without upgrade path to xenomai 2.6, since this will be the last maintained version from the Xenomai 2.x architecture. That stuff does not compile likely because the Config.in bits are not up to date, blame it on me. I'll make this build over linux 2.4 and commit the result today. No problem, I was not looking for someone to blame... Since you are at it, I have problems compiling the nios2 kernel too, but I am not sure I got the proper configuration file. HEAD builds fine based on the attached .config. Btw we now only support the MMU version (2.6.35.2) of this kernel over Xenomai 2.6. Reference tree is available there: url = git://sopc.et.ntust.edu.tw/git/linux-2.6.git branch = nios2mmu nommu support is discontinued for nios2 - people who depend on it should stick with Xenomai 2.5.x. -- Philippe. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-help] Xenomai 2.6.0-rc1
On 09/06/2011 05:10 PM, Philippe Gerum wrote: On Tue, 2011-09-06 at 16:53 +0200, Philippe Gerum wrote: On Tue, 2011-09-06 at 16:53 +0200, Philippe Gerum wrote: On Tue, 2011-09-06 at 16:19 +0200, Gilles Chanteperdrix wrote: On 09/06/2011 03:27 PM, Philippe Gerum wrote: On Tue, 2011-09-06 at 13:31 +0200, Gilles Chanteperdrix wrote: On 09/04/2011 10:52 PM, Gilles Chanteperdrix wrote: Hi, The first release candidate for the 2.6.0 version may be downloaded here: http://download.gna.org/xenomai/testing/xenomai-2.6.0-rc1.tar.bz2 Hi, currently 2.6.0-rc1 fails to build on 2.4 kernel, with errors related to vfile support. Do we really want to still support 2.4 kernels? That would not be a massive loss, but removing linux 2.4 support is more than a few hunks here and there, so this may not be the right thing to do ATM. Besides, it would be better not to leave the few linux 2.4 users out there without upgrade path to xenomai 2.6, since this will be the last maintained version from the Xenomai 2.x architecture. That stuff does not compile likely because the Config.in bits are not up to date, blame it on me. I'll make this build over linux 2.4 and commit the result today. No problem, I was not looking for someone to blame... Since you are at it, I have problems compiling the nios2 kernel too, but I am not sure I got the proper configuration file. HEAD builds fine based on the attached .config. Btw we now only support the MMU version (2.6.35.2) of this kernel over Xenomai 2.6. Reference tree is available there: url = git://sopc.et.ntust.edu.tw/git/linux-2.6.git branch = nios2mmu nommu support is discontinued for nios2 - people who depend on it should stick with Xenomai 2.5.x. Ok, still not building, maybe the commit number mentioned in the README is not up-to-date? -- Gilles. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-help] Xenomai 2.6.0-rc1
On 09/06/2011 08:19 PM, Gilles Chanteperdrix wrote: On 09/06/2011 05:10 PM, Philippe Gerum wrote: On Tue, 2011-09-06 at 16:53 +0200, Philippe Gerum wrote: On Tue, 2011-09-06 at 16:53 +0200, Philippe Gerum wrote: On Tue, 2011-09-06 at 16:19 +0200, Gilles Chanteperdrix wrote: On 09/06/2011 03:27 PM, Philippe Gerum wrote: On Tue, 2011-09-06 at 13:31 +0200, Gilles Chanteperdrix wrote: On 09/04/2011 10:52 PM, Gilles Chanteperdrix wrote: Hi, The first release candidate for the 2.6.0 version may be downloaded here: http://download.gna.org/xenomai/testing/xenomai-2.6.0-rc1.tar.bz2 Hi, currently 2.6.0-rc1 fails to build on 2.4 kernel, with errors related to vfile support. Do we really want to still support 2.4 kernels? That would not be a massive loss, but removing linux 2.4 support is more than a few hunks here and there, so this may not be the right thing to do ATM. Besides, it would be better not to leave the few linux 2.4 users out there without upgrade path to xenomai 2.6, since this will be the last maintained version from the Xenomai 2.x architecture. That stuff does not compile likely because the Config.in bits are not up to date, blame it on me. I'll make this build over linux 2.4 and commit the result today. No problem, I was not looking for someone to blame... Since you are at it, I have problems compiling the nios2 kernel too, but I am not sure I got the proper configuration file. HEAD builds fine based on the attached .config. Btw we now only support the MMU version (2.6.35.2) of this kernel over Xenomai 2.6. Reference tree is available there: url = git://sopc.et.ntust.edu.tw/git/linux-2.6.git branch = nios2mmu nommu support is discontinued for nios2 - people who depend on it should stick with Xenomai 2.5.x. Ok, still not building, maybe the commit number mentioned in the README is not up-to-date? More build failures for kernel 3.0 and ppc... http://sisyphus.hd.free.fr/~gilles/bx/index.html#powerpc -- Gilles. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-help] Xenomai 2.6.0-rc1
On Tue, 2011-09-06 at 21:42 +0200, Gilles Chanteperdrix wrote: On 09/06/2011 08:19 PM, Gilles Chanteperdrix wrote: On 09/06/2011 05:10 PM, Philippe Gerum wrote: On Tue, 2011-09-06 at 16:53 +0200, Philippe Gerum wrote: On Tue, 2011-09-06 at 16:53 +0200, Philippe Gerum wrote: On Tue, 2011-09-06 at 16:19 +0200, Gilles Chanteperdrix wrote: On 09/06/2011 03:27 PM, Philippe Gerum wrote: On Tue, 2011-09-06 at 13:31 +0200, Gilles Chanteperdrix wrote: On 09/04/2011 10:52 PM, Gilles Chanteperdrix wrote: Hi, The first release candidate for the 2.6.0 version may be downloaded here: http://download.gna.org/xenomai/testing/xenomai-2.6.0-rc1.tar.bz2 Hi, currently 2.6.0-rc1 fails to build on 2.4 kernel, with errors related to vfile support. Do we really want to still support 2.4 kernels? That would not be a massive loss, but removing linux 2.4 support is more than a few hunks here and there, so this may not be the right thing to do ATM. Besides, it would be better not to leave the few linux 2.4 users out there without upgrade path to xenomai 2.6, since this will be the last maintained version from the Xenomai 2.x architecture. That stuff does not compile likely because the Config.in bits are not up to date, blame it on me. I'll make this build over linux 2.4 and commit the result today. No problem, I was not looking for someone to blame... Since you are at it, I have problems compiling the nios2 kernel too, but I am not sure I got the proper configuration file. HEAD builds fine based on the attached .config. Btw we now only support the MMU version (2.6.35.2) of this kernel over Xenomai 2.6. Reference tree is available there: url = git://sopc.et.ntust.edu.tw/git/linux-2.6.git branch = nios2mmu nommu support is discontinued for nios2 - people who depend on it should stick with Xenomai 2.5.x. Ok, still not building, maybe the commit number mentioned in the README is not up-to-date? More build failures for kernel 3.0 and ppc... http://sisyphus.hd.free.fr/~gilles/bx/index.html#powerpc I've fixed most of these, however the platform driver interface changed once again circa 2.6.39, and AFAICT, picking the right approach to cope with this never ending mess for the mscan driver requires some thoughts from educated people. Since I don't qualify for the job, I'm shamelessly passing the buck to Wolfgang: http://sisyphus.hd.free.fr/~gilles/bx/lite5200/3.0.4-ppc_6xx-gcc-4.2.2/log.html#1 PS: I guess this fix can wait until 2.6.0 final, this is not critical for -rc2. -- Philippe. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-help] Xenomai 2.6.0-rc1
On Tue, 2011-09-06 at 20:19 +0200, Gilles Chanteperdrix wrote:

On 09/06/2011 05:10 PM, Philippe Gerum wrote:

On Tue, 2011-09-06 at 16:53 +0200, Philippe Gerum wrote:

On Tue, 2011-09-06 at 16:53 +0200, Philippe Gerum wrote:

On Tue, 2011-09-06 at 16:19 +0200, Gilles Chanteperdrix wrote:

On 09/06/2011 03:27 PM, Philippe Gerum wrote:

On Tue, 2011-09-06 at 13:31 +0200, Gilles Chanteperdrix wrote:

On 09/04/2011 10:52 PM, Gilles Chanteperdrix wrote:

Hi,

The first release candidate for the 2.6.0 version may be downloaded here:
http://download.gna.org/xenomai/testing/xenomai-2.6.0-rc1.tar.bz2

Hi,

currently 2.6.0-rc1 fails to build on 2.4 kernel, with errors related to vfile support. Do we really want to still support 2.4 kernels?

That would not be a massive loss, but removing linux 2.4 support is more than a few hunks here and there, so this may not be the right thing to do ATM. Besides, it would be better not to leave the few linux 2.4 users out there without upgrade path to xenomai 2.6, since this will be the last maintained version from the Xenomai 2.x architecture. That stuff does not compile likely because the Config.in bits are not up to date, blame it on me. I'll make this build over linux 2.4 and commit the result today.

No problem, I was not looking for someone to blame... Since you are at it, I have problems compiling the nios2 kernel too, but I am not sure I got the proper configuration file.

HEAD builds fine based on the attached .config.

Btw we now only support the MMU version (2.6.35.2) of this kernel over Xenomai 2.6. Reference tree is available there:

url = git://sopc.et.ntust.edu.tw/git/linux-2.6.git
branch = nios2mmu

nommu support is discontinued for nios2 - people who depend on it should stick with Xenomai 2.5.x.

Ok, still not building, maybe the commit number mentioned in the README is not up-to-date?

The commit # is correct, but I suspect that your kernel tree does not have the files normally created by the SOPC builder anymore; these can't (may not, actually) be included in the pipeline patch. In short, your tree might be missing the bits corresponding to the fpga design you build for, so basic symbols like HRCLOCK* and HRTIMER* are undefined.

I'm building for a cyclone 3c25 from the NEEK kit, with SOPC files available from arch/nios2/boards/neek. Any valuable files in there on your side? (typically, include/asm/custom_fpga.h should contain definitions for our real-time clocks and timers)

--
Philippe.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-help] Xenomai 2.6.0-rc1
Hi Gilles,

Unfortunately I didn't find the time to test this release yet. I'm just wondering if there is a fix for this problem in the 2.6.0 release: https://mail.gna.org/public/xenomai-core/2011-05/msg00028.html

We have been using the auto-relax patches on top of 2.5.6 for a long time now. We found issues with it regarding auto-relax tasks that were not being auto-relaxed anymore. Philippe made patches for that, see https://mail.gna.org/public/xenomai-help/2011-03/msg00161.html. However, locally I reverted those two patches because these introduced a memory leak in xnheap; I could only do rt_task_create()/rt_task_delete() 1024 times ;-). I thought that was the discussion of https://mail.gna.org/public/xenomai-core/2011-05/msg00028.html at that time and I don't recall a proper fix for it was provided. But I might have missed it...

Thanks,
Henri

On Sun, Sep 4, 2011 at 10:52 PM, Gilles Chanteperdrix gilles.chanteperd...@xenomai.org wrote:

Hi,

The first release candidate for the 2.6.0 version may be downloaded here:
http://download.gna.org/xenomai/testing/xenomai-2.6.0-rc1.tar.bz2

This version fixes a few issues in the 2.5.x branch which required breaking the ABI:

- user-space heap mapping;
- user-space access to thread mode;
- get threads running with SCHED_OTHER scheduling policy to automatically return to secondary mode after each primary mode only system call (except when holding a mutex);
- fix both native and posix condition variables signal handling.

It contains a few improvements as well:

- add support for CLOCK_HOST_REALTIME, a real-time clock synchronized with the Linux clock;
- factor proc filesystem handling;
- the xeno-test script has been simplified and rebased on xeno-test-run, which will allow writing custom test scripts;
- add support for the sh4 architecture;
- simplify the arm user-space configure script;
- move rtdk to the libxenomai library, printf is now rt-safe when using the posix skin;
- add support for pkg-config, the xenomai skin libraries are available each as a libxenomai_skin pkg-config package.

Regards.

--
Gilles.

___
Xenomai-help mailing list
xenomai-h...@gna.org
https://mail.gna.org/listinfo/xenomai-help

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-help] Xenomai 2.6.0-rc1
On 09/05/2011 07:14 PM, Henri Roosen wrote: Hi Gilles, Unfortunately I didn't find the time to test this release yet. I'm just wondering if there is a fix for this problem in the 2.6.0 release: https://mail.gna.org/public/xenomai-core/2011-05/msg00028.html This one is fixed, a bit differently, since we fixed the ppd handling so that the ppd is valid up to the end of a process. We are using the auto-relax patches on top of 2.5.6 for a long time now. We found issues with it regarding auto-relax tasks that were not being auto-relaxed anymore. Philippe made patches for that, see https://mail.gna.org/public/xenomai-help/2011-03/msg00161.html. Philippe's patches for rt_task_send/receive/reply should have been merged too. However, locally I reverted those two patches because these introduced a memory leak in xnheap; I could only do rt_task_create() rt_task_delete() for 1024 times ;-). I thought that was the discussion of https://mail.gna.org/public/xenomai-core/2011-05/msg00028.html at that time and I don't recall a proper fix for it was provided. But I might have missed it... This looks related to the ppd issue as well, in which case, it should have been fixed too. It would be nice if you could test the release and tell us whether you still have these issues. -- Gilles. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] xenomai-core ftrace
On 2011-09-04 07:10, rainbow wrote:

Sorry to reply so late; I did a test installing ftrace on xenomai. The following is my procedure:

#git://git.xenomai.org/xenomai-jki.git queues/ftrace
#git://git.kiszka.org/ipipe-2.6 queues/2.6.35-x86-trace
#cd queues/ftrace
#git checkout -b remotes/origin/queues/ftrace origin/queues/2.6.35-x86-trace //change to the ftrace xenomai branch
#cd ../2.6.35-x86-trace
#git checkout -b origin/queues/2.6.35-x86-trace origin/queues/2.6.35-x86-trace
#cd ../ftrace
#./scripts/prepare-kernel.sh --arch=i386 --adeos=ksrc/arch/x86/patches/adeos-ipipe-2.6.35.9-x86-2.8-04.patch --linux=../2.6.35-x86-trace/
#cd /2.6.35-x86-trace/

Then I compile the kernel, but I get the following error message:

arch/x86/kernel/ipipe.c:851: error: conflicting types for ‘update_vsyscall’
include/linux/clocksource.h:316: note: previous declaration of ‘update_vsyscall’ was here
make[2]: *** [arch/x86/kernel/ipipe.o] Error 1
make[1]: *** [arch/x86/kernel] Error 2
make: *** [arch/x86] Error 2

That's a build issue of the underlying old ipipe patch. However, it's x86-32 only, and as the documentation states, only x86-64 is supported by the ftrace patches. So build for 64 bit instead.

Jan

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] xenomai-core ftrace
You mean I should use the remotes/origin/queues/2.6.37-x86 branch and the ipipe patch for 2.6.37, then install them on x86_64, and ftrace will work? I will have a try, thank you!

2011/9/4 Jan Kiszka jan.kis...@web.de

On 2011-09-04 07:10, rainbow wrote:

Sorry to reply so late; I did a test installing ftrace on xenomai. The following is my procedure:

#git://git.xenomai.org/xenomai-jki.git queues/ftrace
#git://git.kiszka.org/ipipe-2.6 queues/2.6.35-x86-trace
#cd queues/ftrace
#git checkout -b remotes/origin/queues/ftrace origin/queues/2.6.35-x86-trace //change to the ftrace xenomai branch
#cd ../2.6.35-x86-trace
#git checkout -b origin/queues/2.6.35-x86-trace origin/queues/2.6.35-x86-trace
#cd ../ftrace
#./scripts/prepare-kernel.sh --arch=i386 --adeos=ksrc/arch/x86/patches/adeos-ipipe-2.6.35.9-x86-2.8-04.patch --linux=../2.6.35-x86-trace/
#cd /2.6.35-x86-trace/

Then I compile the kernel, but I get the following error message:

arch/x86/kernel/ipipe.c:851: error: conflicting types for ‘update_vsyscall’
include/linux/clocksource.h:316: note: previous declaration of ‘update_vsyscall’ was here
make[2]: *** [arch/x86/kernel/ipipe.o] Error 1
make[1]: *** [arch/x86/kernel] Error 2
make: *** [arch/x86] Error 2

That's a build issue of the underlying old ipipe patch. However, it's x86-32 only, and as the documentation states, only x86-64 is supported by the ftrace patches. So build for 64 bit instead.

Jan

--
Qingquan Lv
School of Information Science Engineering, Lanzhou University.
mail: lvq...@gmail.com
Do what you like, Enjoy your life.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] xenomai-core ftrace
On 2011-09-04 13:49, rainbow wrote:

You mean I should use the remotes/origin/queues/2.6.37-x86 branch and the ipipe patch for 2.6.37, then install them on x86_64, and ftrace will work? I will have a try, thank you!

Use the 2.6.35-x86-trace, it already contains the ipipe patch, and build it for x86-64.

Jan

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] xenomai-core ftrace
Is the ipipe patch the same as a patch like adeos-ipipe-2.6.37.6-x86-2.9-02.patch? I know the latter is the xenomai patch, and after I apply it, I can see the "Real-time sub-system --->" option. But if I use 2.6.35-x86-trace, which contains the ipipe patch, there is no such option.

Another problem is that there are so many xenomai gits; how can I download the correct git? I am a newbie to xenomai and I am sorry to ask so many questions, but I want to do something on xenomai :). Thank you for your detailed answers.

2011/9/4 Jan Kiszka jan.kis...@web.de

On 2011-09-04 13:49, rainbow wrote:

You mean I should use the remotes/origin/queues/2.6.37-x86 branch and the ipipe patch for 2.6.37, then install them on x86_64, and ftrace will work? I will have a try, thank you!

Use the 2.6.35-x86-trace, it already contains the ipipe patch, and build it for x86-64.

Jan

--
Qingquan Lv
School of Information Science Engineering, Lanzhou University.
mail: lvq...@gmail.com
Do what you like, Enjoy your life.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] xenomai-core ftrace
On 2011-09-04 14:21, rainbow wrote:

Is the ipipe patch the same as a patch like adeos-ipipe-2.6.37.6-x86-2.9-02.patch?

Except that the trace branch is for 2.6.35, yes. More precisely, it is now the same; I just pushed the latest version, which includes two more backported ipipe fixes.

I know the latter is the xenomai patch, and after I apply it, I can see the "Real-time sub-system --->" option. But if I use 2.6.35-x86-trace, which contains the ipipe patch, there is no such option.

That menu option is introduced by Xenomai, i.e. after running prepare-kernel.sh. You likely forgot that step. Note again that you have to use a Xenomai tree with the required ftrace patches on top if you want Xenomai to generate ftrace events as well.

Another problem is that there are so many xenomai gits; how can I download the correct git?

By cloning the git repository you obtain all available branches. You just need to checkout the desired one afterward.

I am a newbie to xenomai and I am sorry to ask so many questions, but I want to do something on xenomai :). Thank you for your detailed answers.

Setting up ftrace for Xenomai is not necessarily a newbie task, but I think I know the background of this. :)

Jan

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] xenomai-core ftrace
2011/9/4 Jan Kiszka jan.kis...@web.de

On 2011-09-04 14:21, rainbow wrote:

Is the ipipe patch the same as a patch like adeos-ipipe-2.6.37.6-x86-2.9-02.patch?

Except that the trace branch is for 2.6.35, yes. More precisely, it is now the same; I just pushed the latest version, which includes two more backported ipipe fixes.

I know the latter is the xenomai patch, and after I apply it, I can see the "Real-time sub-system --->" option. But if I use 2.6.35-x86-trace, which contains the ipipe patch, there is no such option.

That menu option is introduced by Xenomai, i.e. after running prepare-kernel.sh. You likely forgot that step.

Yes, I forgot the step. So I think I only have to run prepare-kernel.sh --arch=x86_64 --linux=2.6.35-x86-trace; I do not need the --adeos option because the 2.6.35-x86-trace branch contains the ipipe patch.

Note again that you have to use a Xenomai tree with the required ftrace patches on top if you want Xenomai to generate ftrace events as well.

By "Xenomai tree with the required ftrace patches on top", you mean the branch remotes/origin/queues/ftrace?

Another problem is that there are so many xenomai gits; how can I download the correct git?

By cloning the git repository you obtain all available branches. You just need to checkout the desired one afterward.

I am a newbie to xenomai and I am sorry to ask so many questions, but I want to do something on xenomai :). Thank you for your detailed answers.

Setting up ftrace for Xenomai is not necessarily a newbie task, but I think I know the background of this. :)

I think you really know the background :).

Jan

--
Qingquan Lv
School of Information Science Engineering, Lanzhou University.
mail: lvq...@gmail.com
Do what you like, Enjoy your life.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] xenomai-core ftrace
On 2011-09-04 15:16, rainbow wrote:

2011/9/4 Jan Kiszka jan.kis...@web.de

On 2011-09-04 14:21, rainbow wrote:

Is the ipipe patch the same as a patch like adeos-ipipe-2.6.37.6-x86-2.9-02.patch?

Except that the trace branch is for 2.6.35, yes. More precisely, it is now the same; I just pushed the latest version, which includes two more backported ipipe fixes.

I know the latter is the xenomai patch, and after I apply it, I can see the "Real-time sub-system --->" option. But if I use 2.6.35-x86-trace, which contains the ipipe patch, there is no such option.

That menu option is introduced by Xenomai, i.e. after running prepare-kernel.sh. You likely forgot that step.

Yes, I forgot the step. So I think I only have to run prepare-kernel.sh --arch=x86_64 --linux=2.6.35-x86-trace; I do not need the --adeos option because the 2.6.35-x86-trace branch contains the ipipe patch.

Note again that you have to use a Xenomai tree with the required ftrace patches on top if you want Xenomai to generate ftrace events as well.

By "Xenomai tree with the required ftrace patches on top", you mean the branch remotes/origin/queues/ftrace?

Yep. I just pushed a rebased version of current git master.

Jan

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] xenomai-core ftrace
On 2011-09-03 04:52, rainbow wrote:

Hi all, I want to use ftrace in xenomai-2.5.6, but when I use git://git.kiszka.org/ipipe.git queues/2.6.35-x86-trace to get the linux kernel, there is no option about xenomai or ipipe. If I want to apply the xenomai patch, there are some problems. How should I use ftrace on xenomai? Thanks!

First of all, make sure to read README.INSTALL in the Xenomai tree for the basic installation procedure. That git branch above replaces the installation step of picking a vanilla Linux source tree and applying the ipipe patch to it (if there is no ipipe option in the kernel config, you probably haven't checked out the right branch yet).

The next step would be running Xenomai's prepare-kernel.sh, in this case using a Xenomai tree that has the required ftrace patches, see http://permalink.gmane.org/gmane.linux.real-time.xenomai.devel/7966

Jan

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] xenomai-core ftrace
Sorry to reply so late; I did a test installing ftrace on xenomai. The following is my procedure:

#git://git.xenomai.org/xenomai-jki.git queues/ftrace
#git://git.kiszka.org/ipipe-2.6 queues/2.6.35-x86-trace
#cd queues/ftrace
#git checkout -b remotes/origin/queues/ftrace origin/queues/2.6.35-x86-trace //change to the ftrace xenomai branch
#cd ../2.6.35-x86-trace
#git checkout -b origin/queues/2.6.35-x86-trace origin/queues/2.6.35-x86-trace
#cd ../ftrace
#./scripts/prepare-kernel.sh --arch=i386 --adeos=ksrc/arch/x86/patches/adeos-ipipe-2.6.35.9-x86-2.8-04.patch --linux=../2.6.35-x86-trace/
#cd /2.6.35-x86-trace/

Then I compile the kernel, but I get the following error message:

arch/x86/kernel/ipipe.c:851: error: conflicting types for ‘update_vsyscall’
include/linux/clocksource.h:316: note: previous declaration of ‘update_vsyscall’ was here
make[2]: *** [arch/x86/kernel/ipipe.o] Error 1
make[1]: *** [arch/x86/kernel] Error 2
make: *** [arch/x86] Error 2

I am not sure whether the reason is that I got the wrong patch or that the kernel configuration is wrong. Is the procedure above right? Thanks!

2011/9/3 Jan Kiszka jan.kis...@web.de

On 2011-09-03 04:52, rainbow wrote:

Hi all, I want to use ftrace in xenomai-2.5.6, but when I use git://git.kiszka.org/ipipe.git queues/2.6.35-x86-trace to get the linux kernel, there is no option about xenomai or ipipe. If I want to apply the xenomai patch, there are some problems. How should I use ftrace on xenomai? Thanks!

First of all, make sure to read README.INSTALL in the Xenomai tree for the basic installation procedure. That git branch above replaces the installation step of picking a vanilla Linux source tree and applying the ipipe patch to it (if there is no ipipe option in the kernel config, you probably haven't checked out the right branch yet).

The next step would be running Xenomai's prepare-kernel.sh, in this case using a Xenomai tree that has the required ftrace patches, see http://permalink.gmane.org/gmane.linux.real-time.xenomai.devel/7966

Jan

--
Qingquan Lv
School of Information Science Engineering, Lanzhou University.
mail: lvq...@gmail.com
Do what you like, Enjoy your life.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] Xenomai 2.6.0, or -rc1?
On 08/30/2011 01:00 AM, Alexis Berlemont wrote:

Hi,

On Fri, Aug 26, 2011 at 2:34 PM, Gilles Chanteperdrix gilles.chanteperd...@xenomai.org wrote:

Hi,

I think it is about time we release Xenomai 2.6.0. Has anyone anything pending (maybe Alex)? Should we release an -rc first?

Yes. In my experimental branch, I have a few things which are not that experimental. I would like to push:

- a first version of Julien Delange's ni_660x driver
- Anders Blomdell's fix for duplicate symbols with comedi
- Anders Blomdell's fix in the pcimio driver (wrong IRQ number after reboot)
- some waveform generation tools (fully generic)
- an overhaul of the testing drivers (fake + loop = fake)

I will integrate them in my analogy branch and send a pull request if you are OK with that.

Ok for me.

--
Gilles.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] Xenomai 2.6.0, or -rc1?
On Tue, Aug 30, 2011 at 1:00 AM, Alexis Berlemont alexis.berlem...@gmail.com wrote:

- a first version of Julien Delange's ni_660x driver

And also the one for the 670x board, no?

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] Xenomai 2.6.0, or -rc1?
Hi,

On Fri, Aug 26, 2011 at 2:34 PM, Gilles Chanteperdrix gilles.chanteperd...@xenomai.org wrote:

Hi,

I think it is about time we release Xenomai 2.6.0. Has anyone anything pending (maybe Alex)? Should we release an -rc first?

Yes. In my experimental branch, I have a few things which are not that experimental. I would like to push:

- a first version of Julien Delange's ni_660x driver
- Anders Blomdell's fix for duplicate symbols with comedi
- Anders Blomdell's fix in the pcimio driver (wrong IRQ number after reboot)
- some waveform generation tools (fully generic)
- an overhaul of the testing drivers (fake + loop = fake)

I will integrate them in my analogy branch and send a pull request if you are OK with that.

Alexis.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] Xenomai 2.6.0, or -rc1?
On 2011-08-26 14:34, Gilles Chanteperdrix wrote: Hi, I think it is about time we release Xenomai 2.6.0. Has anyone anything pending (maybe Alex)? Should we release an -rc first? No patches ATM, but [1] is still an open bug - a bug that affects the ABI. Jan [1] http://thread.gmane.org/gmane.linux.real-time.xenomai.devel/8343 -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] Xenomai 2.6.0, or -rc1?
On Fri, 2011-08-26 at 14:34 +0200, Gilles Chanteperdrix wrote: Hi, I think it is about time we release Xenomai 2.6.0. Has anyone anything pending (maybe Alex)? Should we release an -rc first? Thanks in advance for your input. Nothing pending for 2.6, I'm focusing on 3.x now. However let's go for -rc1 first, this is a major release anyway. -- Philippe. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] Xenomai 2.6.0, or -rc1?
On 08/26/2011 03:05 PM, Jan Kiszka wrote:
 On 2011-08-26 14:34, Gilles Chanteperdrix wrote:
 Hi, I think it is about time we release Xenomai 2.6.0. Has anyone anything pending (maybe Alex)? Should we release an -rc first?
 No patches ATM, but [1] is still an open bug - a bug that affects the ABI.
 Jan
 [1] http://thread.gmane.org/gmane.linux.real-time.xenomai.devel/8343

I had forgotten about this one. So, the only real problem is when a SCHED_NOTOTHER thread switches to SCHED_OTHER. This appears to be a corner case, so I wonder if you should not simply add special treatment, only for this corner case. What I have in mind is keeping a list of xnsynch objects in kernel-space (this basically means having one more xnholder_t in the xnsynch structure), and when we trip the corner case (a thread with SCHED_FIFO switches to SCHED_OTHER), walking the list to find how many xnsynch objects the thread owns - we have that info in kernel-space - and setting the refcnt accordingly. Or does it still sound overkill?

-- Gilles. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
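[A minimal sketch of the idea above, assuming one extra holder in xnsynch plus a global queue of claimed objects; the claimq and link names are made up for illustration, while getheadq()/nextq() are the nucleus queue accessors:]

struct xnsynch {
	xnholder_t link;		/* NEW: links this synch into claimq */
	struct xnthread *owner;		/* existing: current owner thread */
	/* ... other existing fields ... */
};

static xnqueue_t claimq;		/* all claimed synch objects */

/* Called under nklock when a SCHED_FIFO thread switches to
 * SCHED_OTHER: recount the synchs it still owns so the
 * refcnt can be resynchronized. */
static int recount_owned_synchs(struct xnthread *thread)
{
	xnholder_t *holder;
	int count = 0;

	for (holder = getheadq(&claimq); holder;
	     holder = nextq(&claimq, holder)) {
		struct xnsynch *synch =
			container_of(holder, struct xnsynch, link);
		if (synch->owner == thread)
			count++;
	}
	return count;
}

[Walking the list only in this corner case would keep the common lock/unlock paths untouched, which is what makes the special treatment cheap.]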
Re: [Xenomai-core] Xenomai 2.6.0, or -rc1?
On 2011-08-26 20:07, Gilles Chanteperdrix wrote: On 08/26/2011 03:05 PM, Jan Kiszka wrote: On 2011-08-26 14:34, Gilles Chanteperdrix wrote: Hi, I think it is about time we release Xenomai 2.6.0. Has anyone anything pending (maybe Alex)? Should we release an -rc first? No patches ATM, but [1] is still an open bug - a bug that affects the ABI. Jan [1] http://thread.gmane.org/gmane.linux.real-time.xenomai.devel/8343 I had forgotten about this one. So, the only real problem is if a SCHED_NOTOTHER thread switches to SCHED_OTHER, this appears to be a corner case, so, I wonder if you should not simply add a special treatment, only for this corner case. What I have in mind is keeping a list of xnsynch in kernel-space (this basically means having an xnholder_t more in the xnsynch structure), and when we trip the corner case (thread with SCHED_FIFO switches to SCHED_OTHER), walk the list to find how many xnsynch the thread is the owner, we have that info in kernel-space, and set the refcnt accordingly. Or does it still sound overkill? Mmh, need to think about it. Yeah, we do not support PTHREAD_MUTEX_INITIALIZER, so we do not share that part of the problem with futexes. If we have all objects and can explore ownership, we can also implement robust mutexes this way, i.e. waiter signaling when the owner dies. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] Xenomai 2.6.0, or -rc1?
On 08/26/2011 08:19 PM, Jan Kiszka wrote: On 2011-08-26 20:07, Gilles Chanteperdrix wrote: On 08/26/2011 03:05 PM, Jan Kiszka wrote: On 2011-08-26 14:34, Gilles Chanteperdrix wrote: Hi, I think it is about time we release Xenomai 2.6.0. Has anyone anything pending (maybe Alex)? Should we release an -rc first? No patches ATM, but [1] is still an open bug - a bug that affects the ABI. Jan [1] http://thread.gmane.org/gmane.linux.real-time.xenomai.devel/8343 I had forgotten about this one. So, the only real problem is if a SCHED_NOTOTHER thread switches to SCHED_OTHER, this appears to be a corner case, so, I wonder if you should not simply add a special treatment, only for this corner case. What I have in mind is keeping a list of xnsynch in kernel-space (this basically means having an xnholder_t more in the xnsynch structure), and when we trip the corner case (thread with SCHED_FIFO switches to SCHED_OTHER), walk the list to find how many xnsynch the thread is the owner, we have that info in kernel-space, and set the refcnt accordingly. Or does it still sound overkill? Mmh, need to think about it. Yeah, we do not support PTHREAD_MUTEX_INITIALIZER, so we do not share that part of the problem with futexes. Actually, we could implement PTHREAD_MUTEX_INITIALIZER: when the magic is wrong, just issue a pthread_mutex_init syscall, and try locking again. But the problem is that this particular call to pthread_mutex_lock would be much heavier than locking an initialized mutex for reasons which are not obvious (besides, we would have to handle concurrency by some way, like having a pthread_once_t in pthread_mutex_t). I find not having PTHREAD_MUTEX_INITIALIZER more clear, even if this makes us not really posix compliant. If we have all objects and can explore ownership, we can also implement robust mutexes this way, i.e. waiter signaling when the owner dies. Jan -- Gilles. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
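[For illustration only, the lazy-initialization scheme discussed above could take roughly the following shape in user space; the magic values, the field layout and the CAS-based publication are assumptions, not the POSIX skin's actual code:]

#include <stdint.h>

#define MUTEX_MAGIC	0x4d757478u	/* arbitrary "initialized" marker */
#define MUTEX_BUSY	0x00000001u	/* lazy init in progress */

struct lazy_mutex {
	volatile uint32_t magic;	/* 0 in a static initializer image */
	/* ... kernel-backed mutex state would follow ... */
};

static void issue_init_syscall(struct lazy_mutex *m)
{
	/* stand-in for the pthread_mutex_init syscall mentioned above */
	(void)m;
}

static int lazy_mutex_lock(struct lazy_mutex *m)
{
	if (m->magic != MUTEX_MAGIC) {
		/* exactly one thread wins the right to initialize */
		if (__sync_bool_compare_and_swap(&m->magic, 0, MUTEX_BUSY)) {
			issue_init_syscall(m);
			__sync_synchronize();	/* publish init before magic */
			m->magic = MUTEX_MAGIC;
		} else {
			while (m->magic != MUTEX_MAGIC)
				;	/* lose: wait for the initializer */
		}
	}
	/* ... regular fast-path locking would continue here ... */
	return 0;
}

[The slow path above is what makes the first pthread_mutex_lock() on a statically initialized mutex much heavier than locking an explicitly initialized one, and the wait loop is the concurrency handling alluded to.]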
Re: [Xenomai-core] xenomai-head compile failure
On 08/09/2011 02:51 PM, Daniele Nicolodi wrote: Hello, I'm compiling xenomai-head on i386 debian/testing. I found that the file src/skins/posix/wrappers.c is missing an include of signal.h for the definition of pthread_kill(). Fixed, thanks. -- Gilles. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : rt_print: Provide rt_puts
On 07/31/2011 06:49 PM, GIT version control wrote: +int rt_puts(const char *s) +{ + return print_to_buffer(stdout, 0, RT_PRINT_MODE_PUTS, s, NULL); +} gcc for ARM chokes here: it says that NULL can not be converted to a va_list, however I try it. -- Gilles. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : rt_print: Provide rt_puts
On 2011-07-31 19:21, Gilles Chanteperdrix wrote:
 On 07/31/2011 06:49 PM, GIT version control wrote:
 +int rt_puts(const char *s)
 +{
 +	return print_to_buffer(stdout, 0, RT_PRINT_MODE_PUTS, s, NULL);
 +}
 gcc for ARM chokes here: it says that NULL can not be converted to a va_list, however I try it.

Hmm. Does this work?

diff --git a/src/skins/common/rt_print.c b/src/skins/common/rt_print.c
index 186de48..52538d8 100644
--- a/src/skins/common/rt_print.c
+++ b/src/skins/common/rt_print.c
@@ -243,7 +243,9 @@ int rt_printf(const char *format, ...)
 
 int rt_puts(const char *s)
 {
-	return print_to_buffer(stdout, 0, RT_PRINT_MODE_PUTS, s, NULL);
+	va_list dummy;
+
+	return print_to_buffer(stdout, 0, RT_PRINT_MODE_PUTS, s, dummy);
 }
 
 void rt_syslog(int priority, const char *format, ...)

Not really beautiful as well, I know.

Jan ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : rt_print: Provide rt_puts
On 07/31/2011 07:42 PM, Jan Kiszka wrote:
 On 2011-07-31 19:21, Gilles Chanteperdrix wrote:
 On 07/31/2011 06:49 PM, GIT version control wrote:
 +int rt_puts(const char *s)
 +{
 +	return print_to_buffer(stdout, 0, RT_PRINT_MODE_PUTS, s, NULL);
 +}
 gcc for ARM chokes here: it says that NULL can not be converted to a va_list, however I try it.

 Hmm. Does this work?

 diff --git a/src/skins/common/rt_print.c b/src/skins/common/rt_print.c
 index 186de48..52538d8 100644
 --- a/src/skins/common/rt_print.c
 +++ b/src/skins/common/rt_print.c
 @@ -243,7 +243,9 @@ int rt_printf(const char *format, ...)
 
  int rt_puts(const char *s)
  {
 -	return print_to_buffer(stdout, 0, RT_PRINT_MODE_PUTS, s, NULL);
 +	va_list dummy;
 +
 +	return print_to_buffer(stdout, 0, RT_PRINT_MODE_PUTS, s, dummy);
  }
 
  void rt_syslog(int priority, const char *format, ...)

 Not really beautiful as well, I know.

It seems to work now, but some later version of gcc may decide to warn us that this argument is used without being initialized...

-- Gilles. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : rt_print: Provide rt_puts
On 2011-07-31 19:46, Gilles Chanteperdrix wrote:
 On 07/31/2011 07:42 PM, Jan Kiszka wrote:
 On 2011-07-31 19:21, Gilles Chanteperdrix wrote:
 On 07/31/2011 06:49 PM, GIT version control wrote:
 +int rt_puts(const char *s)
 +{
 +	return print_to_buffer(stdout, 0, RT_PRINT_MODE_PUTS, s, NULL);
 +}
 gcc for ARM chokes here: it says that NULL can not be converted to a va_list, however I try it.

 Hmm. Does this work?

 diff --git a/src/skins/common/rt_print.c b/src/skins/common/rt_print.c
 index 186de48..52538d8 100644
 --- a/src/skins/common/rt_print.c
 +++ b/src/skins/common/rt_print.c
 @@ -243,7 +243,9 @@ int rt_printf(const char *format, ...)
 
  int rt_puts(const char *s)
  {
 -	return print_to_buffer(stdout, 0, RT_PRINT_MODE_PUTS, s, NULL);
 +	va_list dummy;
 +
 +	return print_to_buffer(stdout, 0, RT_PRINT_MODE_PUTS, s, dummy);
  }
 
  void rt_syslog(int priority, const char *format, ...)

 Not really beautiful as well, I know.

 It seems to work now, but some later version of gcc may decide to warn us that this argument is used without being initialized...

Yes. I've pushed a cleaner version.

Jan ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
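[The cleaner version itself is not shown in the thread; one plausible shape for it, assuming print_to_buffer()'s signature as visible in the patch above, is a variadic trampoline that always hands down a validly initialized va_list:]

#include <stdarg.h>
#include <stdio.h>

/* assumed from the patch context above, including RT_PRINT_MODE_PUTS */
int print_to_buffer(FILE *stream, int priority, int mode,
		    const char *format, va_list args);

static int forward_to_buffer(FILE *stream, int priority, int mode,
			     const char *format, ...)
{
	va_list args;
	int ret;

	va_start(args, format);	/* valid va_list, even with no variadic args */
	ret = print_to_buffer(stream, priority, mode, format, args);
	va_end(args);
	return ret;
}

int rt_puts(const char *s)
{
	return forward_to_buffer(stdout, 0, RT_PRINT_MODE_PUTS, s);
}

[In puts mode the va_list is never consumed, so its emptiness is harmless, and no warning about uninitialized use can arise.]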
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 2011-07-15 15:10, Jan Kiszka wrote:
 But... right now it looks like we found our primary regression: "nucleus/shadow: shorten the uninterruptible path to secondary mode". It opens a short window during relax where the migrated task may be active under both schedulers. We are currently evaluating a revert (looks good so far), and I need to work out my theory in more detail.

Looks like this commit just made a long-standing flaw in Xenomai's interrupt handling more visible: We reschedule over the interrupt stack in the Xenomai interrupt handler tails, at least on x86-64. Not sure if other archs have interrupt stacks; the point is that Xenomai's design wrongly assumes there are no such things. We were lucky so far in that the values saved on this shared stack were apparently compatible, i.e. we were overwriting them with identical or harmless values. But that's no longer true when interrupts are hitting us in the xnpod_suspend_thread path of a relaxing shadow.

Likely the only possible fix is establishing a reschedule hook for Xenomai in the interrupt exit path after the original stack is restored - just like Linux works. Requires changes to both ipipe and Xenomai unfortunately.

Jan ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 2011-07-16 10:52, Philippe Gerum wrote: On Sat, 2011-07-16 at 10:13 +0200, Jan Kiszka wrote: On 2011-07-15 15:10, Jan Kiszka wrote: But... right now it looks like we found our primary regression: nucleus/shadow: shorten the uninterruptible path to secondary mode. It opens a short windows during relax where the migrated task may be active under both schedulers. We are currently evaluating a revert (looks good so far), and I need to work out my theory in more details. Looks like this commit just made a long-standing flaw in Xenomai's interrupt handling more visible: We reschedule over the interrupt stack in the Xenomai interrupt handler tails, at least on x86-64. Not sure if other archs have interrupt stacks, the point is Xenomai's design wrongly assumes there are no such things. Fortunately, no, this is not a design issue, no such assumption was ever made, but the Xenomai core expects this to be handled on a per-arch basis with the interrupt pipeline. And that's already the problem: If Linux uses interrupt stacks, relying on ipipe to disable this during Xenomai interrupt handler execution is at best a workaround. A fragile one unless you increase the pre-thread stack size by the size of the interrupt stack. Lacking support for a generic rescheduling hook became a problem by the time Linux introduced interrupt threads. As you pointed out, there is no way to handle this via some generic Xenomai-only support. ppc64 now has separate interrupt stacks, which is why I disabled IRQSTACKS which became the builtin default at some point. Blackfin goes through a Xenomai-defined irq tail handler as well, because it may not reschedule over nested interrupt stacks. How does this arch prevent that xnpod_schedule in the generic interrupt handler tail does its normal work? Fact is that such pending problem with x86_64 was overlooked since day #1 by /me. We were lucky so far that the values saved on this shared stack were apparently compatible, means we were overwriting them with identical or harmless values. But that's no longer true when interrupts are hitting us in the xnpod_suspend_thread path of a relaxing shadow. Makes sense. It would be better to find a solution that does not make the relax path uninterruptible again for a significant amount of time. On low end platforms we support (i.e. non-x86* mainly), this causes obvious latency spots. I agree. Conceptually, the interruptible relaxation should be safe now after recent fixes. Likely the only possible fix is establishing a reschedule hook for Xenomai in the interrupt exit path after the original stack is restored - - just like Linux works. Requires changes to both ipipe and Xenomai unfortunately. __ipipe_run_irqtail() is in the I-pipe core for such purpose. If instantiated properly for x86_64, and paired with xnarch_escalate() for that arch as well, it could be an option for running the rescheduling procedure when safe. Nope, that doesn't work. The stack is switched later in the return path in entry_64.S. We need a hook there, ideally a conditional one, controlled by some per-cpu variable that is set by Xenomai on return from its interrupt handlers to signal the rescheduling need. Jan ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
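[A rough sketch of such a conditional hook; the per-CPU flag, the hook name and its call site are all invented here for illustration:]

/* kernel-side sketch; set from the tail of a Xenomai interrupt
 * handler instead of rescheduling right away over the IRQ stack */
static DEFINE_PER_CPU(int, xeno_resched_pending);

static inline void xeno_note_resched(void)
{
	per_cpu(xeno_resched_pending, raw_smp_processor_id()) = 1;
}

/* hypothetically called from the arch interrupt exit path (e.g.
 * entry_64.S), after the original task stack has been restored */
void ipipe_resched_hook(void)
{
	int cpu = raw_smp_processor_id();

	if (per_cpu(xeno_resched_pending, cpu)) {
		per_cpu(xeno_resched_pending, cpu) = 0;
		xnpod_schedule();	/* now safe: no shared IRQ stack */
	}
}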
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On Sat, 2011-07-16 at 11:15 +0200, Jan Kiszka wrote: On 2011-07-16 10:52, Philippe Gerum wrote: On Sat, 2011-07-16 at 10:13 +0200, Jan Kiszka wrote: On 2011-07-15 15:10, Jan Kiszka wrote: But... right now it looks like we found our primary regression: nucleus/shadow: shorten the uninterruptible path to secondary mode. It opens a short windows during relax where the migrated task may be active under both schedulers. We are currently evaluating a revert (looks good so far), and I need to work out my theory in more details. Looks like this commit just made a long-standing flaw in Xenomai's interrupt handling more visible: We reschedule over the interrupt stack in the Xenomai interrupt handler tails, at least on x86-64. Not sure if other archs have interrupt stacks, the point is Xenomai's design wrongly assumes there are no such things. Fortunately, no, this is not a design issue, no such assumption was ever made, but the Xenomai core expects this to be handled on a per-arch basis with the interrupt pipeline. And that's already the problem: If Linux uses interrupt stacks, relying on ipipe to disable this during Xenomai interrupt handler execution is at best a workaround. A fragile one unless you increase the pre-thread stack size by the size of the interrupt stack. Lacking support for a generic rescheduling hook became a problem by the time Linux introduced interrupt threads. Don't assume too much. What was done for ppc64 was not meant as a general policy. Again, this is a per-arch decision. As you pointed out, there is no way to handle this via some generic Xenomai-only support. ppc64 now has separate interrupt stacks, which is why I disabled IRQSTACKS which became the builtin default at some point. Blackfin goes through a Xenomai-defined irq tail handler as well, because it may not reschedule over nested interrupt stacks. How does this arch prevent that xnpod_schedule in the generic interrupt handler tail does its normal work? It polls some hw status to know whether a rescheduling would be safe. See xnarch_escalate(). Fact is that such pending problem with x86_64 was overlooked since day #1 by /me. We were lucky so far that the values saved on this shared stack were apparently compatible, means we were overwriting them with identical or harmless values. But that's no longer true when interrupts are hitting us in the xnpod_suspend_thread path of a relaxing shadow. Makes sense. It would be better to find a solution that does not make the relax path uninterruptible again for a significant amount of time. On low end platforms we support (i.e. non-x86* mainly), this causes obvious latency spots. I agree. Conceptually, the interruptible relaxation should be safe now after recent fixes. Likely the only possible fix is establishing a reschedule hook for Xenomai in the interrupt exit path after the original stack is restored - - just like Linux works. Requires changes to both ipipe and Xenomai unfortunately. __ipipe_run_irqtail() is in the I-pipe core for such purpose. If instantiated properly for x86_64, and paired with xnarch_escalate() for that arch as well, it could be an option for running the rescheduling procedure when safe. Nope, that doesn't work. The stack is switched later in the return path in entry_64.S. We need a hook there, ideally a conditional one, controlled by some per-cpu variable that is set by Xenomai on return from its interrupt handlers to signal the rescheduling need. Yes, makes sense. 
The way to make it conditional without dragging bits of Xenomai logic into the kernel innards is not obvious though. It is probably time to officially introduce exo-kernel oriented bits into the Linux thread info. PTDs have too loose semantics to be practical if we want to avoid trashing the I-cache by calling probe hooks within the dual kernel, each time we want to check some basic condition (e.g. resched needed). A backlink to a foreign TCB there would help too. Which leads us to killing the ad hoc kernel threads (and stacks) at some point, which are an absolute pain. Jan -- Philippe. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 2011-07-16 11:56, Philippe Gerum wrote: On Sat, 2011-07-16 at 11:15 +0200, Jan Kiszka wrote: On 2011-07-16 10:52, Philippe Gerum wrote: On Sat, 2011-07-16 at 10:13 +0200, Jan Kiszka wrote: On 2011-07-15 15:10, Jan Kiszka wrote: But... right now it looks like we found our primary regression: nucleus/shadow: shorten the uninterruptible path to secondary mode. It opens a short windows during relax where the migrated task may be active under both schedulers. We are currently evaluating a revert (looks good so far), and I need to work out my theory in more details. Looks like this commit just made a long-standing flaw in Xenomai's interrupt handling more visible: We reschedule over the interrupt stack in the Xenomai interrupt handler tails, at least on x86-64. Not sure if other archs have interrupt stacks, the point is Xenomai's design wrongly assumes there are no such things. Fortunately, no, this is not a design issue, no such assumption was ever made, but the Xenomai core expects this to be handled on a per-arch basis with the interrupt pipeline. And that's already the problem: If Linux uses interrupt stacks, relying on ipipe to disable this during Xenomai interrupt handler execution is at best a workaround. A fragile one unless you increase the pre-thread stack size by the size of the interrupt stack. Lacking support for a generic rescheduling hook became a problem by the time Linux introduced interrupt threads. Don't assume too much. What was done for ppc64 was not meant as a general policy. Again, this is a per-arch decision. Actually, it was the right decision, not only for ppc64: Reusing Linux interrupt stacks for Xenomai does not work. If we interrupt Linux while it is already running over the interrupt stack, the stack becomes taboo on that CPU. From that point on, no RT IRQ must run over the Linux interrupt stack as it would smash it. But then the question is why we should try to use the interrupt stacks for Xenomai at all. It's better to increase the task kernel stacks and disable interrupt stacks when ipipe is enabled. That's what I'm heading for with x86-64 now (THREAD_ORDER 2, no stack switching). What we may do is introducing per-domain interrupt stacks. But that's at best Xenomai 3 / I-pipe 3 stuff. Jan signature.asc Description: OpenPGP digital signature ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
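[Roughly what "THREAD_ORDER 2, no stack switching" would amount to on x86-64; an illustrative sketch, the actual patch is not part of this thread:]

/* arch/x86/include/asm/page_64_types.h (sketch) */
#ifdef CONFIG_IPIPE
/* 16k task stacks, large enough to also absorb interrupts that
 * would otherwise run on the separate per-CPU interrupt stack */
#define THREAD_ORDER	2
#else
#define THREAD_ORDER	1
#endif

[With the entry code's irq_stack switch disabled under CONFIG_IPIPE, rescheduling from a Xenomai interrupt handler tail then always happens on the interrupted task's own stack.]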
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 07/14/2011 10:57 PM, Jan Kiszka wrote: On 2011-07-13 21:12, Gilles Chanteperdrix wrote: On 07/13/2011 09:04 PM, Jan Kiszka wrote: On 2011-07-13 20:39, Gilles Chanteperdrix wrote: On 07/12/2011 07:43 PM, Jan Kiszka wrote: On 2011-07-12 19:38, Gilles Chanteperdrix wrote: On 07/12/2011 07:34 PM, Jan Kiszka wrote: On 2011-07-12 19:31, Gilles Chanteperdrix wrote: On 07/12/2011 02:57 PM, Jan Kiszka wrote: xnlock_put_irqrestore(nklock, s); xnpod_schedule(); } @@ -1036,6 +1043,7 @@ redo: * to process this signal anyway. */ if (rthal_current_domain == rthal_root_domain) { + XENO_BUGON(NUCLEUS, xnthread_test_info(thread, XNATOMIC)); Misleading dead code again, XNATOMIC is cleared not ten lines above. Nope, I forgot to remove that line. if (XENO_DEBUG(NUCLEUS) (!signal_pending(this_task) || this_task-state != TASK_RUNNING)) xnpod_fatal @@ -1044,6 +1052,8 @@ redo: return -ERESTARTSYS; } + xnthread_clear_info(thread, XNATOMIC); Why this? I find the xnthread_clear_info(XNATOMIC) right at the right place at the point it currently is. Nope. Now we either clear XNATOMIC after successful migration or when the signal is about to be sent (ie. in the hook). That way we can test more reliably (TM) in the gatekeeper if the thread can be migrated. Ok for adding the XNATOMIC test, because it improves the robustness, but why changing the way XNATOMIC is set and clear? Chances of breaking thing while changing code in this area are really high... The current code is (most probably) broken as it does not properly synchronizes the gatekeeper against a signaled and runaway target Linux task. We need an indication if a Linux signal will (or already has) woken up the to-be-migrated task. That task may have continued over its context, potentially on a different CPU. Providing this indication is the purpose of changing where XNATOMIC is cleared. What about synchronizing with the gatekeeper with a semaphore, as done in the first patch you sent, but doing it in xnshadow_harden, as soon as we detect that we are not back from schedule in primary mode? It seems it would avoid any further issue, as we would then be guaranteed that the thread could not switch to TASK_INTERRUPTIBLE again before the gatekeeper is finished. The problem is that the gatekeeper tests the task state without holding the task's rq lock (which is not available to us without a kernel patch). That cannot work reliably as long as we accept signals. That's why I'm trying to move state change and test under nklock. What worries me is the comment in xnshadow_harden: * gatekeeper sent us to primary mode. Since * TASK_UNINTERRUPTIBLE is unavailable to us without wrecking * the runqueue's count of uniniterruptible tasks, we just * notice the issue and gracefully fail; the caller will have * to process this signal anyway. */ Does this mean that we can not switch to TASK_UNINTERRUPTIBLE at this point? Or simply that TASK_UNINTERRUPTIBLE is not available for the business of xnshadow_harden? TASK_UNINTERRUPTIBLE is not available without patching the kernel's scheduler for the reason mentioned in the comment (the scheduler becomes confused and may pick the wrong tasks, IIRC). Does not using down/up in the taskexit event handler risk to cause the same issue? Yes, and that means the first patch is incomplete without something like the second. But I would refrain from trying to improve the gatekeeper design. I've recently mentioned this to Philippe offlist: For Xenomai 3 with some ipipe v3, we must rather patch schedule() to enable zero-switch domain migration. 
Means: enter the scheduler, let it suspend current and pick another task, but then simply escalate to the RT domain before doing any context switch. That's much cheaper than the current design and hopefully also less error-prone.

So, do you want me to merge your for-upstream branch?

You may merge up to for-upstream^, ie. without any gatekeeper fixes. I strongly suspect that there are still more races in the migration path. The crashes we face even with all patches applied may be related to a shadow task being executed under Linux and Xenomai at the same time.

Maybe we could try the following patch instead?

diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
index 01f4200..deb7620 100644
--- a/ksrc/nucleus/shadow.c
+++ b/ksrc/nucleus/shadow.c
@@ -1033,6 +1033,8 @@ redo:
 			xnpod_fatal
 			    ("xnshadow_harden() failed for thread %s[%d]",
 			     thread->name, xnthread_user_pid(thread));
+		down(&sched->gksync);
+		up(&sched->gksync);
 		return -ERESTARTSYS;
 	}

--
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 2011-07-15 14:30, Gilles Chanteperdrix wrote: On 07/14/2011 10:57 PM, Jan Kiszka wrote: On 2011-07-13 21:12, Gilles Chanteperdrix wrote: On 07/13/2011 09:04 PM, Jan Kiszka wrote: On 2011-07-13 20:39, Gilles Chanteperdrix wrote: On 07/12/2011 07:43 PM, Jan Kiszka wrote: On 2011-07-12 19:38, Gilles Chanteperdrix wrote: On 07/12/2011 07:34 PM, Jan Kiszka wrote: On 2011-07-12 19:31, Gilles Chanteperdrix wrote: On 07/12/2011 02:57 PM, Jan Kiszka wrote: xnlock_put_irqrestore(nklock, s); xnpod_schedule(); } @@ -1036,6 +1043,7 @@ redo: * to process this signal anyway. */ if (rthal_current_domain == rthal_root_domain) { +XENO_BUGON(NUCLEUS, xnthread_test_info(thread, XNATOMIC)); Misleading dead code again, XNATOMIC is cleared not ten lines above. Nope, I forgot to remove that line. if (XENO_DEBUG(NUCLEUS) (!signal_pending(this_task) || this_task-state != TASK_RUNNING)) xnpod_fatal @@ -1044,6 +1052,8 @@ redo: return -ERESTARTSYS; } +xnthread_clear_info(thread, XNATOMIC); Why this? I find the xnthread_clear_info(XNATOMIC) right at the right place at the point it currently is. Nope. Now we either clear XNATOMIC after successful migration or when the signal is about to be sent (ie. in the hook). That way we can test more reliably (TM) in the gatekeeper if the thread can be migrated. Ok for adding the XNATOMIC test, because it improves the robustness, but why changing the way XNATOMIC is set and clear? Chances of breaking thing while changing code in this area are really high... The current code is (most probably) broken as it does not properly synchronizes the gatekeeper against a signaled and runaway target Linux task. We need an indication if a Linux signal will (or already has) woken up the to-be-migrated task. That task may have continued over its context, potentially on a different CPU. Providing this indication is the purpose of changing where XNATOMIC is cleared. What about synchronizing with the gatekeeper with a semaphore, as done in the first patch you sent, but doing it in xnshadow_harden, as soon as we detect that we are not back from schedule in primary mode? It seems it would avoid any further issue, as we would then be guaranteed that the thread could not switch to TASK_INTERRUPTIBLE again before the gatekeeper is finished. The problem is that the gatekeeper tests the task state without holding the task's rq lock (which is not available to us without a kernel patch). That cannot work reliably as long as we accept signals. That's why I'm trying to move state change and test under nklock. What worries me is the comment in xnshadow_harden: * gatekeeper sent us to primary mode. Since * TASK_UNINTERRUPTIBLE is unavailable to us without wrecking * the runqueue's count of uniniterruptible tasks, we just * notice the issue and gracefully fail; the caller will have * to process this signal anyway. */ Does this mean that we can not switch to TASK_UNINTERRUPTIBLE at this point? Or simply that TASK_UNINTERRUPTIBLE is not available for the business of xnshadow_harden? TASK_UNINTERRUPTIBLE is not available without patching the kernel's scheduler for the reason mentioned in the comment (the scheduler becomes confused and may pick the wrong tasks, IIRC). Does not using down/up in the taskexit event handler risk to cause the same issue? Yes, and that means the first patch is incomplete without something like the second. But I would refrain from trying to improve the gatekeeper design. 
I've recently mentioned this to Philippe offlist: For Xenomai 3 with some ipipe v3, we must rather patch schedule() to enable zero-switch domain migration. Means: enter the scheduler, let it suspend current and pick another task, but then simply escalate to the RT domain before doing any context switch. That's much cheaper than the current design and hopefully also less error-prone.

 So, do you want me to merge your for-upstream branch?

 You may merge up to for-upstream^, ie. without any gatekeeper fixes. I strongly suspect that there are still more races in the migration path. The crashes we face even with all patches applied may be related to a shadow task being executed under Linux and Xenomai at the same time.

 Maybe we could try the following patch instead?

 diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
 index 01f4200..deb7620 100644
 --- a/ksrc/nucleus/shadow.c
 +++ b/ksrc/nucleus/shadow.c
 @@ -1033,6 +1033,8 @@ redo:
  			xnpod_fatal
  			    ("xnshadow_harden() failed for thread %s[%d]",
  			     thread->name, xnthread_user_pid(thread));
 +		down(&sched->gksync);
 +		up(&sched->gksync);
  		return -ERESTARTSYS;
  	}

I don't think we need this. But
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 2011-07-13 21:12, Gilles Chanteperdrix wrote: On 07/13/2011 09:04 PM, Jan Kiszka wrote: On 2011-07-13 20:39, Gilles Chanteperdrix wrote: On 07/12/2011 07:43 PM, Jan Kiszka wrote: On 2011-07-12 19:38, Gilles Chanteperdrix wrote: On 07/12/2011 07:34 PM, Jan Kiszka wrote: On 2011-07-12 19:31, Gilles Chanteperdrix wrote: On 07/12/2011 02:57 PM, Jan Kiszka wrote: xnlock_put_irqrestore(nklock, s); xnpod_schedule(); } @@ -1036,6 +1043,7 @@ redo: * to process this signal anyway. */ if (rthal_current_domain == rthal_root_domain) { + XENO_BUGON(NUCLEUS, xnthread_test_info(thread, XNATOMIC)); Misleading dead code again, XNATOMIC is cleared not ten lines above. Nope, I forgot to remove that line. if (XENO_DEBUG(NUCLEUS) (!signal_pending(this_task) || this_task-state != TASK_RUNNING)) xnpod_fatal @@ -1044,6 +1052,8 @@ redo: return -ERESTARTSYS; } + xnthread_clear_info(thread, XNATOMIC); Why this? I find the xnthread_clear_info(XNATOMIC) right at the right place at the point it currently is. Nope. Now we either clear XNATOMIC after successful migration or when the signal is about to be sent (ie. in the hook). That way we can test more reliably (TM) in the gatekeeper if the thread can be migrated. Ok for adding the XNATOMIC test, because it improves the robustness, but why changing the way XNATOMIC is set and clear? Chances of breaking thing while changing code in this area are really high... The current code is (most probably) broken as it does not properly synchronizes the gatekeeper against a signaled and runaway target Linux task. We need an indication if a Linux signal will (or already has) woken up the to-be-migrated task. That task may have continued over its context, potentially on a different CPU. Providing this indication is the purpose of changing where XNATOMIC is cleared. What about synchronizing with the gatekeeper with a semaphore, as done in the first patch you sent, but doing it in xnshadow_harden, as soon as we detect that we are not back from schedule in primary mode? It seems it would avoid any further issue, as we would then be guaranteed that the thread could not switch to TASK_INTERRUPTIBLE again before the gatekeeper is finished. The problem is that the gatekeeper tests the task state without holding the task's rq lock (which is not available to us without a kernel patch). That cannot work reliably as long as we accept signals. That's why I'm trying to move state change and test under nklock. What worries me is the comment in xnshadow_harden: * gatekeeper sent us to primary mode. Since * TASK_UNINTERRUPTIBLE is unavailable to us without wrecking * the runqueue's count of uniniterruptible tasks, we just * notice the issue and gracefully fail; the caller will have * to process this signal anyway. */ Does this mean that we can not switch to TASK_UNINTERRUPTIBLE at this point? Or simply that TASK_UNINTERRUPTIBLE is not available for the business of xnshadow_harden? TASK_UNINTERRUPTIBLE is not available without patching the kernel's scheduler for the reason mentioned in the comment (the scheduler becomes confused and may pick the wrong tasks, IIRC). Does not using down/up in the taskexit event handler risk to cause the same issue? Yes, and that means the first patch is incomplete without something like the second. But I would refrain from trying to improve the gatekeeper design. I've recently mentioned this to Philippe offlist: For Xenomai 3 with some ipipe v3, we must rather patch schedule() to enable zero-switch domain migration. 
Means: enter the scheduler, let it suspend current and pick another task, but then simply escalate to the RT domain before doing any context switch. That's much cheaper than the current design and hopefully also less error-prone. So, do you want me to merge your for-upstream branch? You may merge up to for-upstream^, ie. without any gatekeeper fixes. I strongly suspect that there are still more races in the migration path. The crashes we face even with all patches applied may be related to a shadow task being executed under Linux and Xenomai at the same time. Jan signature.asc Description: OpenPGP digital signature ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 07/12/2011 07:43 PM, Jan Kiszka wrote: On 2011-07-12 19:38, Gilles Chanteperdrix wrote: On 07/12/2011 07:34 PM, Jan Kiszka wrote: On 2011-07-12 19:31, Gilles Chanteperdrix wrote: On 07/12/2011 02:57 PM, Jan Kiszka wrote: xnlock_put_irqrestore(nklock, s); xnpod_schedule(); } @@ -1036,6 +1043,7 @@ redo: * to process this signal anyway. */ if (rthal_current_domain == rthal_root_domain) { + XENO_BUGON(NUCLEUS, xnthread_test_info(thread, XNATOMIC)); Misleading dead code again, XNATOMIC is cleared not ten lines above. Nope, I forgot to remove that line. if (XENO_DEBUG(NUCLEUS) (!signal_pending(this_task) || this_task-state != TASK_RUNNING)) xnpod_fatal @@ -1044,6 +1052,8 @@ redo: return -ERESTARTSYS; } + xnthread_clear_info(thread, XNATOMIC); Why this? I find the xnthread_clear_info(XNATOMIC) right at the right place at the point it currently is. Nope. Now we either clear XNATOMIC after successful migration or when the signal is about to be sent (ie. in the hook). That way we can test more reliably (TM) in the gatekeeper if the thread can be migrated. Ok for adding the XNATOMIC test, because it improves the robustness, but why changing the way XNATOMIC is set and clear? Chances of breaking thing while changing code in this area are really high... The current code is (most probably) broken as it does not properly synchronizes the gatekeeper against a signaled and runaway target Linux task. We need an indication if a Linux signal will (or already has) woken up the to-be-migrated task. That task may have continued over its context, potentially on a different CPU. Providing this indication is the purpose of changing where XNATOMIC is cleared. What about synchronizing with the gatekeeper with a semaphore, as done in the first patch you sent, but doing it in xnshadow_harden, as soon as we detect that we are not back from schedule in primary mode? It seems it would avoid any further issue, as we would then be guaranteed that the thread could not switch to TASK_INTERRUPTIBLE again before the gatekeeper is finished. What worries me is the comment in xnshadow_harden: * gatekeeper sent us to primary mode. Since * TASK_UNINTERRUPTIBLE is unavailable to us without wrecking * the runqueue's count of uniniterruptible tasks, we just * notice the issue and gracefully fail; the caller will have * to process this signal anyway. */ Does this mean that we can not switch to TASK_UNINTERRUPTIBLE at this point? Or simply that TASK_UNINTERRUPTIBLE is not available for the business of xnshadow_harden? -- Gilles. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 2011-07-13 20:39, Gilles Chanteperdrix wrote: On 07/12/2011 07:43 PM, Jan Kiszka wrote: On 2011-07-12 19:38, Gilles Chanteperdrix wrote: On 07/12/2011 07:34 PM, Jan Kiszka wrote: On 2011-07-12 19:31, Gilles Chanteperdrix wrote: On 07/12/2011 02:57 PM, Jan Kiszka wrote: xnlock_put_irqrestore(nklock, s); xnpod_schedule(); } @@ -1036,6 +1043,7 @@ redo: * to process this signal anyway. */ if (rthal_current_domain == rthal_root_domain) { +XENO_BUGON(NUCLEUS, xnthread_test_info(thread, XNATOMIC)); Misleading dead code again, XNATOMIC is cleared not ten lines above. Nope, I forgot to remove that line. if (XENO_DEBUG(NUCLEUS) (!signal_pending(this_task) || this_task-state != TASK_RUNNING)) xnpod_fatal @@ -1044,6 +1052,8 @@ redo: return -ERESTARTSYS; } +xnthread_clear_info(thread, XNATOMIC); Why this? I find the xnthread_clear_info(XNATOMIC) right at the right place at the point it currently is. Nope. Now we either clear XNATOMIC after successful migration or when the signal is about to be sent (ie. in the hook). That way we can test more reliably (TM) in the gatekeeper if the thread can be migrated. Ok for adding the XNATOMIC test, because it improves the robustness, but why changing the way XNATOMIC is set and clear? Chances of breaking thing while changing code in this area are really high... The current code is (most probably) broken as it does not properly synchronizes the gatekeeper against a signaled and runaway target Linux task. We need an indication if a Linux signal will (or already has) woken up the to-be-migrated task. That task may have continued over its context, potentially on a different CPU. Providing this indication is the purpose of changing where XNATOMIC is cleared. What about synchronizing with the gatekeeper with a semaphore, as done in the first patch you sent, but doing it in xnshadow_harden, as soon as we detect that we are not back from schedule in primary mode? It seems it would avoid any further issue, as we would then be guaranteed that the thread could not switch to TASK_INTERRUPTIBLE again before the gatekeeper is finished. The problem is that the gatekeeper tests the task state without holding the task's rq lock (which is not available to us without a kernel patch). That cannot work reliably as long as we accept signals. That's why I'm trying to move state change and test under nklock. What worries me is the comment in xnshadow_harden: * gatekeeper sent us to primary mode. Since * TASK_UNINTERRUPTIBLE is unavailable to us without wrecking * the runqueue's count of uniniterruptible tasks, we just * notice the issue and gracefully fail; the caller will have * to process this signal anyway. */ Does this mean that we can not switch to TASK_UNINTERRUPTIBLE at this point? Or simply that TASK_UNINTERRUPTIBLE is not available for the business of xnshadow_harden? TASK_UNINTERRUPTIBLE is not available without patching the kernel's scheduler for the reason mentioned in the comment (the scheduler becomes confused and may pick the wrong tasks, IIRC). But I would refrain from trying to improve the gatekeeper design. I've recently mentioned this to Philippe offlist: For Xenomai 3 with some ipipe v3, we must rather patch schedule() to enable zero-switch domain migration. Means: enter the scheduler, let it suspend current and pick another task, but then simply escalate to the RT domain before doing any context switch. That's much cheaper than the current design and hopefully also less error-prone. 
Jan ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
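[As a purely conceptual sketch, the zero-switch migration idea would hook the scheduler's tail; every name below is invented, and a real implementation would have to respect rq locking and the pipeline state:]

/* inside a hypothetical schedule() tail, after pick_next_task()
 * but before any context switch */
	if (unlikely(task_wants_rt_domain(prev))) {
		/*
		 * prev just suspended itself in order to harden;
		 * instead of switching to next and letting the
		 * gatekeeper resume prev later, escalate to the RT
		 * domain right here, so prev continues under the
		 * Xenomai scheduler with zero Linux context switches.
		 */
		escalate_to_head_domain(prev);
		return;
	}
	context_switch(rq, prev, next);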
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 07/13/2011 09:04 PM, Jan Kiszka wrote: On 2011-07-13 20:39, Gilles Chanteperdrix wrote: On 07/12/2011 07:43 PM, Jan Kiszka wrote: On 2011-07-12 19:38, Gilles Chanteperdrix wrote: On 07/12/2011 07:34 PM, Jan Kiszka wrote: On 2011-07-12 19:31, Gilles Chanteperdrix wrote: On 07/12/2011 02:57 PM, Jan Kiszka wrote: xnlock_put_irqrestore(nklock, s); xnpod_schedule(); } @@ -1036,6 +1043,7 @@ redo: * to process this signal anyway. */ if (rthal_current_domain == rthal_root_domain) { + XENO_BUGON(NUCLEUS, xnthread_test_info(thread, XNATOMIC)); Misleading dead code again, XNATOMIC is cleared not ten lines above. Nope, I forgot to remove that line. if (XENO_DEBUG(NUCLEUS) (!signal_pending(this_task) || this_task-state != TASK_RUNNING)) xnpod_fatal @@ -1044,6 +1052,8 @@ redo: return -ERESTARTSYS; } + xnthread_clear_info(thread, XNATOMIC); Why this? I find the xnthread_clear_info(XNATOMIC) right at the right place at the point it currently is. Nope. Now we either clear XNATOMIC after successful migration or when the signal is about to be sent (ie. in the hook). That way we can test more reliably (TM) in the gatekeeper if the thread can be migrated. Ok for adding the XNATOMIC test, because it improves the robustness, but why changing the way XNATOMIC is set and clear? Chances of breaking thing while changing code in this area are really high... The current code is (most probably) broken as it does not properly synchronizes the gatekeeper against a signaled and runaway target Linux task. We need an indication if a Linux signal will (or already has) woken up the to-be-migrated task. That task may have continued over its context, potentially on a different CPU. Providing this indication is the purpose of changing where XNATOMIC is cleared. What about synchronizing with the gatekeeper with a semaphore, as done in the first patch you sent, but doing it in xnshadow_harden, as soon as we detect that we are not back from schedule in primary mode? It seems it would avoid any further issue, as we would then be guaranteed that the thread could not switch to TASK_INTERRUPTIBLE again before the gatekeeper is finished. The problem is that the gatekeeper tests the task state without holding the task's rq lock (which is not available to us without a kernel patch). That cannot work reliably as long as we accept signals. That's why I'm trying to move state change and test under nklock. What worries me is the comment in xnshadow_harden: * gatekeeper sent us to primary mode. Since * TASK_UNINTERRUPTIBLE is unavailable to us without wrecking * the runqueue's count of uniniterruptible tasks, we just * notice the issue and gracefully fail; the caller will have * to process this signal anyway. */ Does this mean that we can not switch to TASK_UNINTERRUPTIBLE at this point? Or simply that TASK_UNINTERRUPTIBLE is not available for the business of xnshadow_harden? TASK_UNINTERRUPTIBLE is not available without patching the kernel's scheduler for the reason mentioned in the comment (the scheduler becomes confused and may pick the wrong tasks, IIRC). Does not using down/up in the taskexit event handler risk to cause the same issue? But I would refrain from trying to improve the gatekeeper design. I've recently mentioned this to Philippe offlist: For Xenomai 3 with some ipipe v3, we must rather patch schedule() to enable zero-switch domain migration. Means: enter the scheduler, let it suspend current and pick another task, but then simply escalate to the RT domain before doing any context switch. 
That's much cheaper than the current design and hopefully also less error-prone. So, do you want me to merge your for-upstream branch? -- Gilles. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On Wed, 2011-07-13 at 20:39 +0200, Gilles Chanteperdrix wrote: On 07/12/2011 07:43 PM, Jan Kiszka wrote: On 2011-07-12 19:38, Gilles Chanteperdrix wrote: On 07/12/2011 07:34 PM, Jan Kiszka wrote: On 2011-07-12 19:31, Gilles Chanteperdrix wrote: On 07/12/2011 02:57 PM, Jan Kiszka wrote: xnlock_put_irqrestore(nklock, s); xnpod_schedule(); } @@ -1036,6 +1043,7 @@ redo: * to process this signal anyway. */ if (rthal_current_domain == rthal_root_domain) { + XENO_BUGON(NUCLEUS, xnthread_test_info(thread, XNATOMIC)); Misleading dead code again, XNATOMIC is cleared not ten lines above. Nope, I forgot to remove that line. if (XENO_DEBUG(NUCLEUS) (!signal_pending(this_task) || this_task-state != TASK_RUNNING)) xnpod_fatal @@ -1044,6 +1052,8 @@ redo: return -ERESTARTSYS; } + xnthread_clear_info(thread, XNATOMIC); Why this? I find the xnthread_clear_info(XNATOMIC) right at the right place at the point it currently is. Nope. Now we either clear XNATOMIC after successful migration or when the signal is about to be sent (ie. in the hook). That way we can test more reliably (TM) in the gatekeeper if the thread can be migrated. Ok for adding the XNATOMIC test, because it improves the robustness, but why changing the way XNATOMIC is set and clear? Chances of breaking thing while changing code in this area are really high... The current code is (most probably) broken as it does not properly synchronizes the gatekeeper against a signaled and runaway target Linux task. We need an indication if a Linux signal will (or already has) woken up the to-be-migrated task. That task may have continued over its context, potentially on a different CPU. Providing this indication is the purpose of changing where XNATOMIC is cleared. What about synchronizing with the gatekeeper with a semaphore, as done in the first patch you sent, but doing it in xnshadow_harden, as soon as we detect that we are not back from schedule in primary mode? It seems it would avoid any further issue, as we would then be guaranteed that the thread could not switch to TASK_INTERRUPTIBLE again before the gatekeeper is finished. What worries me is the comment in xnshadow_harden: * gatekeeper sent us to primary mode. Since * TASK_UNINTERRUPTIBLE is unavailable to us without wrecking * the runqueue's count of uniniterruptible tasks, we just * notice the issue and gracefully fail; the caller will have * to process this signal anyway. */ Does this mean that we can not switch to TASK_UNINTERRUPTIBLE at this point? Or simply that TASK_UNINTERRUPTIBLE is not available for the business of xnshadow_harden? Second interpretation is correct. -- Philippe. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 07/11/2011 10:12 PM, Jan Kiszka wrote:
 On 2011-07-11 22:09, Gilles Chanteperdrix wrote:
 On 07/11/2011 10:06 PM, Jan Kiszka wrote:
 On 2011-07-11 22:02, Gilles Chanteperdrix wrote:
 On 07/11/2011 09:59 PM, Jan Kiszka wrote:
 On 2011-07-11 21:51, Gilles Chanteperdrix wrote:
 On 07/11/2011 09:16 PM, Jan Kiszka wrote:
 On 2011-07-11 21:10, Jan Kiszka wrote:
 On 2011-07-11 20:53, Gilles Chanteperdrix wrote:
 On 07/08/2011 06:29 PM, GIT version control wrote:

 @@ -2528,6 +2534,22 @@ static inline void do_taskexit_event(struct task_struct *p)
  	magic = xnthread_get_magic(thread);
 
  	xnlock_get_irqsave(&nklock, s);
 +
 +	gksched = thread->gksched;
 +	if (gksched) {
 +		xnlock_put_irqrestore(&nklock, s);

 Are we sure irqs are on here? Are you sure that what is needed is not an xnlock_clear_irqon?

 We are in the context of do_exit. Not only IRQs are on, also preemption. And surely no nklock is held.

 Furthermore, I do not understand how we synchronize with the gatekeeper; how is the gatekeeper guaranteed to wait for this assignment?

 The gatekeeper holds the gksync token while it's active. We request it, thus we wait for the gatekeeper to become idle again. While it is idle, we reset the queued reference - but I just realized that this may trample on other tasks' values. I need to add a check that the value to be null'ified is actually still ours.

 Thinking again, that's actually not a problem: gktarget is only needed while gksync is zero - but then we won't get hold of it anyway and, thus, can't cause any damage.

 Well, you make it look like it does not work. From what I understand, what you want is to set gktarget to null if a task being hardened is destroyed. But by waiting for the semaphore, you actually wait for the harden to be complete, so setting to NULL is useless. Or am I missing something else?

 Setting to NULL is probably unneeded but still better than relying on the gatekeeper never waking up spuriously and then dereferencing a stale pointer. The key element of this fix is waiting on gksync, thus on the completion of the non-RT part of the hardening. Actually, this part usually fails as the target task received a termination signal at this point.

 Yes, but since you wait on the completion of the hardening, the test if (target ...) in the gatekeeper code will always be true, because at this point the cleanup code will still be waiting for the semaphore.

 Yes, except we will ever wake up the gatekeeper later on without an updated gktarget, ie. spuriously. Better safe than sorry, this is hairy code anyway (hopefully obsolete one day).

 The gatekeeper is not woken up by posting the semaphore, the gatekeeper is woken up by the thread which is going to be hardened (and this thread is the one which waits for the semaphore).

 All true. And what is the point?

The point being, would not something like this patch be sufficient?

diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
index 01f4200..4742c02 100644
--- a/ksrc/nucleus/shadow.c
+++ b/ksrc/nucleus/shadow.c
@@ -2527,6 +2527,18 @@ static inline void do_taskexit_event(struct task_struct *p)
 	magic = xnthread_get_magic(thread);
 
 	xnlock_get_irqsave(&nklock, s);
+	if (xnthread_test_info(thread, XNATOMIC)) {
+		struct xnsched *gksched = xnpod_sched_slot(task_cpu(p));
+		xnlock_put_irqrestore(&nklock, s);
+
+		/* Thread is in flight to primary mode, wait for the
+		   gatekeeper to be done with it. */
+		down(&gksched->gksync);
+		up(&gksched->gksync);
+
+		xnlock_get_irqsave(&nklock, s);
+	}
+
 	/* Prevent wakeup call from xnshadow_unmap(). */
 	xnshadow_thrptd(p) = NULL;
 	xnthread_archtcb(thread)->user_task = NULL;

-- Gilles. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 2011-07-12 08:41, Gilles Chanteperdrix wrote:
 On 07/11/2011 10:12 PM, Jan Kiszka wrote:
 On 2011-07-11 22:09, Gilles Chanteperdrix wrote:
 On 07/11/2011 10:06 PM, Jan Kiszka wrote:
 On 2011-07-11 22:02, Gilles Chanteperdrix wrote:
 On 07/11/2011 09:59 PM, Jan Kiszka wrote:
 On 2011-07-11 21:51, Gilles Chanteperdrix wrote:
 On 07/11/2011 09:16 PM, Jan Kiszka wrote:
 On 2011-07-11 21:10, Jan Kiszka wrote:
 On 2011-07-11 20:53, Gilles Chanteperdrix wrote:
 On 07/08/2011 06:29 PM, GIT version control wrote:

 @@ -2528,6 +2534,22 @@ static inline void do_taskexit_event(struct task_struct *p)
  	magic = xnthread_get_magic(thread);
 
  	xnlock_get_irqsave(&nklock, s);
 +
 +	gksched = thread->gksched;
 +	if (gksched) {
 +		xnlock_put_irqrestore(&nklock, s);

 Are we sure irqs are on here? Are you sure that what is needed is not an xnlock_clear_irqon?

 We are in the context of do_exit. Not only IRQs are on, also preemption. And surely no nklock is held.

 Furthermore, I do not understand how we synchronize with the gatekeeper; how is the gatekeeper guaranteed to wait for this assignment?

 The gatekeeper holds the gksync token while it's active. We request it, thus we wait for the gatekeeper to become idle again. While it is idle, we reset the queued reference - but I just realized that this may trample on other tasks' values. I need to add a check that the value to be null'ified is actually still ours.

 Thinking again, that's actually not a problem: gktarget is only needed while gksync is zero - but then we won't get hold of it anyway and, thus, can't cause any damage.

 Well, you make it look like it does not work. From what I understand, what you want is to set gktarget to null if a task being hardened is destroyed. But by waiting for the semaphore, you actually wait for the harden to be complete, so setting to NULL is useless. Or am I missing something else?

 Setting to NULL is probably unneeded but still better than relying on the gatekeeper never waking up spuriously and then dereferencing a stale pointer. The key element of this fix is waiting on gksync, thus on the completion of the non-RT part of the hardening. Actually, this part usually fails as the target task received a termination signal at this point.

 Yes, but since you wait on the completion of the hardening, the test if (target ...) in the gatekeeper code will always be true, because at this point the cleanup code will still be waiting for the semaphore.

 Yes, except we will ever wake up the gatekeeper later on without an updated gktarget, ie. spuriously. Better safe than sorry, this is hairy code anyway (hopefully obsolete one day).

 The gatekeeper is not woken up by posting the semaphore, the gatekeeper is woken up by the thread which is going to be hardened (and this thread is the one which waits for the semaphore).

 All true. And what is the point?

 The point being, would not something like this patch be sufficient?

 diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
 index 01f4200..4742c02 100644
 --- a/ksrc/nucleus/shadow.c
 +++ b/ksrc/nucleus/shadow.c
 @@ -2527,6 +2527,18 @@ static inline void do_taskexit_event(struct task_struct *p)
  	magic = xnthread_get_magic(thread);
 
  	xnlock_get_irqsave(&nklock, s);
 +	if (xnthread_test_info(thread, XNATOMIC)) {
 +		struct xnsched *gksched = xnpod_sched_slot(task_cpu(p));

That's not reliable, the task might have been migrated by Linux in the meantime. We must use the stored gksched.

 +		xnlock_put_irqrestore(&nklock, s);
 +
 +		/* Thread is in flight to primary mode, wait for the
 +		   gatekeeper to be done with it. */
 +		down(&gksched->gksync);
 +		up(&gksched->gksync);
 +
 +		xnlock_get_irqsave(&nklock, s);
 +	}
 +
  	/* Prevent wakeup call from xnshadow_unmap(). */
  	xnshadow_thrptd(p) = NULL;
  	xnthread_archtcb(thread)->user_task = NULL;

Again, setting gktarget to NULL and testing for NULL is simply safer, and I see no gain in skipping that. But if you prefer the micro-optimization, I'll drop it.

Jan ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 07/12/2011 09:22 AM, Jan Kiszka wrote:
 On 2011-07-12 08:41, Gilles Chanteperdrix wrote:
 On 07/11/2011 10:12 PM, Jan Kiszka wrote:
 On 2011-07-11 22:09, Gilles Chanteperdrix wrote:
 On 07/11/2011 10:06 PM, Jan Kiszka wrote:
 On 2011-07-11 22:02, Gilles Chanteperdrix wrote:
 On 07/11/2011 09:59 PM, Jan Kiszka wrote:
 On 2011-07-11 21:51, Gilles Chanteperdrix wrote:
 On 07/11/2011 09:16 PM, Jan Kiszka wrote:
 On 2011-07-11 21:10, Jan Kiszka wrote:
 On 2011-07-11 20:53, Gilles Chanteperdrix wrote:
 On 07/08/2011 06:29 PM, GIT version control wrote:

 @@ -2528,6 +2534,22 @@ static inline void do_taskexit_event(struct task_struct *p)
  	magic = xnthread_get_magic(thread);
 
  	xnlock_get_irqsave(&nklock, s);
 +
 +	gksched = thread->gksched;
 +	if (gksched) {
 +		xnlock_put_irqrestore(&nklock, s);

 Are we sure irqs are on here? Are you sure that what is needed is not an xnlock_clear_irqon?

 We are in the context of do_exit. Not only IRQs are on, also preemption. And surely no nklock is held.

 Furthermore, I do not understand how we synchronize with the gatekeeper; how is the gatekeeper guaranteed to wait for this assignment?

 The gatekeeper holds the gksync token while it's active. We request it, thus we wait for the gatekeeper to become idle again. While it is idle, we reset the queued reference - but I just realized that this may trample on other tasks' values. I need to add a check that the value to be null'ified is actually still ours.

 Thinking again, that's actually not a problem: gktarget is only needed while gksync is zero - but then we won't get hold of it anyway and, thus, can't cause any damage.

 Well, you make it look like it does not work. From what I understand, what you want is to set gktarget to null if a task being hardened is destroyed. But by waiting for the semaphore, you actually wait for the harden to be complete, so setting to NULL is useless. Or am I missing something else?

 Setting to NULL is probably unneeded but still better than relying on the gatekeeper never waking up spuriously and then dereferencing a stale pointer. The key element of this fix is waiting on gksync, thus on the completion of the non-RT part of the hardening. Actually, this part usually fails as the target task received a termination signal at this point.

 Yes, but since you wait on the completion of the hardening, the test if (target ...) in the gatekeeper code will always be true, because at this point the cleanup code will still be waiting for the semaphore.

 Yes, except we will ever wake up the gatekeeper later on without an updated gktarget, ie. spuriously. Better safe than sorry, this is hairy code anyway (hopefully obsolete one day).

 The gatekeeper is not woken up by posting the semaphore, the gatekeeper is woken up by the thread which is going to be hardened (and this thread is the one which waits for the semaphore).

 All true. And what is the point?

 The point being, would not something like this patch be sufficient?

 diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
 index 01f4200..4742c02 100644
 --- a/ksrc/nucleus/shadow.c
 +++ b/ksrc/nucleus/shadow.c
 @@ -2527,6 +2527,18 @@ static inline void do_taskexit_event(struct task_struct *p)
  	magic = xnthread_get_magic(thread);
 
  	xnlock_get_irqsave(&nklock, s);
 +	if (xnthread_test_info(thread, XNATOMIC)) {
 +		struct xnsched *gksched = xnpod_sched_slot(task_cpu(p));

 That's not reliable, the task might have been migrated by Linux in the meantime. We must use the stored gksched.

 +		xnlock_put_irqrestore(&nklock, s);
 +
 +		/* Thread is in flight to primary mode, wait for the
 +		   gatekeeper to be done with it. */
 +		down(&gksched->gksync);
 +		up(&gksched->gksync);
 +
 +		xnlock_get_irqsave(&nklock, s);
 +	}
 +
  	/* Prevent wakeup call from xnshadow_unmap(). */
  	xnshadow_thrptd(p) = NULL;
  	xnthread_archtcb(thread)->user_task = NULL;

 Again, setting gktarget to NULL and testing for NULL is simply safer,

From my point of view, testing for NULL is misleading dead code, since it will never happen.

-- Gilles. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 07/12/2011 09:22 AM, Jan Kiszka wrote: [...] Again, setting gktarget to NULL and testing for NULL is simply safer, and I see no gain in skipping that. But if you prefer the micro-optimization, I'll drop it.

Could not we use an info bit instead of adding a pointer?

-- 
Gilles.
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 2011-07-12 12:59, Gilles Chanteperdrix wrote: [...] Could not we use an info bit instead of adding a pointer?

That's not reliable, the task might have been migrated by Linux in the meantime. We must use the stored gksched.

Jan
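To make "the stored gksched" concrete, here is a sketch of both sides under assumed names (gkwaitq and the wakeup call are illustrative, not the verbatim Xenomai API): the hardening path records which CPU's gatekeeper takes the reference, so the exit path never re-derives it from task_cpu(p), which Linux may have changed by migrating the task.

/* Harden side (sketch): record the gatekeeper before handing over. */
struct xnsched *sched = xnpod_current_sched();

down(&sched->gksync);		/* wait until that gatekeeper is idle */
thread->gksched = sched;	/* the "stored gksched" */
sched->gktarget = thread;
xnthread_set_info(thread, XNATOMIC);
wake_up_interruptible(&sched->gkwaitq);
schedule();			/* gatekeeper resumes us in primary mode */

/* Exit side (sketch): use the stored pointer, never task_cpu(p). */
gksched = thread->gksched;
if (gksched) {
	down(&gksched->gksync);	/* barrier: wait for gatekeeper idle */
	up(&gksched->gksync);
}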
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 07/12/2011 01:00 PM, Jan Kiszka wrote: [...] That's not reliable, the task might have been migrated by Linux in the meantime. We must use the stored gksched.

I mean add another info bit to mean that the task is queued for wakeup by the gatekeeper. XNGKQ, or something.

-- 
Gilles.
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 2011-07-12 13:04, Gilles Chanteperdrix wrote: [...] I mean add another info bit to mean that the task is queued for wakeup by the gatekeeper. XNGKQ, or something.

What additional value does it provide to gksched != NULL? We need that pointer anyway to identify the gatekeeper that holds a reference.

Jan
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 07/12/2011 01:06 PM, Jan Kiszka wrote: [...] What additional value does it provide to gksched != NULL? We need that pointer anyway to identify the gatekeeper that holds a reference.

No, the scheduler which holds the reference is xnpod_sched_slot(task_cpu(p))

--
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 2011-07-12 13:56, Jan Kiszka wrote: However, this parallel unsynchronized execution of the gatekeeper and its target thread leaves an increasingly bad feeling on my side. Did we really catch all corner cases now? I wouldn't guarantee that yet. Specifically as I still have an obscure crash of a Xenomai thread on Linux schedule() on my table.

What if the target thread woke up due to a signal, continued much further on a different CPU, blocked in TASK_INTERRUPTIBLE, and then the gatekeeper continued? I wish we could already eliminate this complexity and do the migration directly inside schedule()...

BTW, why do we mask out TASK_ATOMICSWITCH when checking the task state in the gatekeeper? What would happen if we included it (state == (TASK_ATOMICSWITCH | TASK_INTERRUPTIBLE))?

Jan
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 07/12/2011 01:58 PM, Jan Kiszka wrote: [...] BTW, why do we mask out TASK_ATOMICSWITCH when checking the task state in the gatekeeper? What would happen if we included it (state == (TASK_ATOMICSWITCH | TASK_INTERRUPTIBLE))?

I would tend to think that what we should check is xnthread_test_info(XNATOMIC). Or maybe check both, the interruptible state and the XNATOMIC info bit.

-- 
Gilles.
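Sketched, the combined test suggested here would make the gatekeeper's condition look roughly like this (whether these two tests alone are sufficient is exactly what the following messages question):

struct task_struct *p = xnthread_user_task(target);

/* Only complete the migration if the Linux task still sleeps
 * interruptibly (modulo the TASK_ATOMICSWITCH marker) AND is still
 * flagged as in flight to primary mode. */
if ((p->state & ~TASK_ATOMICSWITCH) == TASK_INTERRUPTIBLE &&
    xnthread_test_info(target, XNATOMIC)) {
	/* safe (modulo the races discussed below) to resume the shadow */
	xnpod_resume_thread(target, XNRELAX);
}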
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 2011-07-12 14:06, Gilles Chanteperdrix wrote: [...] I would tend to think that what we should check is xnthread_test_info(XNATOMIC). Or maybe check both, the interruptible state and the XNATOMIC info bit.

Actually, neither the info bits nor the task state is sufficiently synchronized against the gatekeeper yet. We need to hold a shared lock when testing and resetting the state. I'm not sure yet if that is fixable given the gatekeeper architecture.

Jan
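To spell out the window Jan means, here is a sketch of the racy interleaving, followed by the locked re-test that would close it on the gatekeeper side, assuming the signal path clears XNATOMIC under the same nklock:

/*
 * Racy interleaving without a shared lock (sketch):
 *
 *   gatekeeper CPU                     signal delivery CPU
 *   --------------                     -------------------
 *   reads task state: INTERRUPTIBLE
 *                                      signal wakes the target task;
 *                                      it runs on, maybe migrates,
 *                                      blocks INTERRUPTIBLE elsewhere
 *   xnpod_resume_thread(target)   <-   resumes a shadow whose Linux
 *                                      side is no longer parked in
 *                                      the hardening path
 */
xnlock_get_irqsave(&nklock, s);
/* Re-test under nklock: do_sigwake_event() clears XNATOMIC under the
 * same lock, so a signaled target can no longer be resumed here. */
if (xnthread_test_info(target, XNATOMIC))
	xnpod_resume_thread(target, XNRELAX);
xnlock_put_irqrestore(&nklock, s);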
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On Tue, 2011-07-12 at 14:57 +0200, Jan Kiszka wrote: [...] Actually, neither the info bits nor the task state is sufficiently synchronized against the gatekeeper yet. We need to hold a shared lock when testing and resetting the state. I'm not sure yet if that is fixable given the gatekeeper architecture.

This may work (on top of the exit-race fix):

diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
index 50dcf43..90feb16 100644
--- a/ksrc/nucleus/shadow.c
+++ b/ksrc/nucleus/shadow.c
@@ -913,20 +913,27 @@ static int gatekeeper_thread(void *data)
 		if ((xnthread_user_task(target)->state & ~TASK_ATOMICSWITCH)
 		    == TASK_INTERRUPTIBLE) {
 			rpi_pop(target);
 			xnlock_get_irqsave(&nklock, s);
-#ifdef CONFIG_SMP
+
 			/*
-			 * If the task changed its CPU while in
-			 * secondary mode, change the CPU of the
-			 * underlying Xenomai shadow too. We do not
-			 * migrate the thread timers here, it would
-			 * not work. For a full migration comprising
-			 * timers, using xnpod_migrate_thread is
-			 * required.
+			 * Recheck XNATOMIC to avoid waking the shadow if the
+			 * Linux task received a signal meanwhile.
 			 */
-			if (target->sched != sched)
-				xnsched_migrate_passive(target, sched);
+			if (xnthread_test_info(target, XNATOMIC)) {
+#ifdef CONFIG_SMP
+				/*
+				 * If the task changed its CPU while in
+				 * secondary mode, change the CPU of the
+				 * underlying Xenomai shadow too. We do not
+				 * migrate the thread timers here, it would
+				 * not work. For a full migration comprising
+				 * timers, using xnpod_migrate_thread is
+				 * required.
+				 */
+				if (target->sched != sched)
+					xnsched_migrate_passive(target, sched);
 #endif /* CONFIG_SMP */
-			xnpod_resume_thread(target, XNRELAX);
+				xnpod_resume_thread(target, XNRELAX);
+			}
 			xnlock_put_irqrestore(&nklock, s);
 			xnpod_schedule();
 		}
@@ -1036,6 +1043,7 @@ redo:
 	 * to process this signal anyway.
 	 */
 	if (rthal_current_domain == rthal_root_domain) {
+		XENO_BUGON(NUCLEUS, xnthread_test_info(thread, XNATOMIC));
 		if (XENO_DEBUG(NUCLEUS) && (!signal_pending(this_task)
 		    || this_task->state != TASK_RUNNING))
 			xnpod_fatal
@@ -1044,6 +1052,8 @@ redo:
 		return -ERESTARTSYS;
 	}
 
+	xnthread_clear_info(thread, XNATOMIC);
+
 	/* current is now running into the Xenomai domain. */
 	thread->gksched = NULL;
 	sched = xnsched_finish_unlocked_switch(thread->sched);
@@ -2650,6 +2660,8 @@ static inline void do_sigwake_event(struct task_struct *p)
 
 	xnlock_get_irqsave(&nklock, s);
 
+	xnthread_clear_info(thread, XNATOMIC);
+
 	if ((p->ptrace & PT_PTRACED) &&
 	    !xnthread_test_state(thread, XNDEBUG)) {
 		sigset_t pending;

It totally ignores RPI and PREEMPT_RT for now.

RPI is broken anyway, I want to drop RPI in v3 for sure because it is misleading people. I'm still pondering whether we should do that earlier during the 2.6 timeframe. Ripping it out would allow to use solely XNATOMIC as condition in the gatekeeper. /me is now looking to get
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 2011-07-12 17:48, Philippe Gerum wrote: [...] RPI is broken anyway, I want to drop RPI in v3 for sure because it is misleading people. I'm still pondering whether we should do that earlier during the 2.6 timeframe.

That would only leave us with XNATOMIC being used under PREEMPT-RT for signaling LO_GKWAKE_REQ on schedule out while my patch may clear it on signal
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 07/12/2011 02:57 PM, Jan Kiszka wrote:

 			xnlock_put_irqrestore(&nklock, s);
 			xnpod_schedule();
 		}
@@ -1036,6 +1043,7 @@ redo:
 	 * to process this signal anyway.
 	 */
 	if (rthal_current_domain == rthal_root_domain) {
+		XENO_BUGON(NUCLEUS, xnthread_test_info(thread, XNATOMIC));

Misleading dead code again, XNATOMIC is cleared not ten lines above.

 		if (XENO_DEBUG(NUCLEUS) && (!signal_pending(this_task)
 		    || this_task->state != TASK_RUNNING))
 			xnpod_fatal
@@ -1044,6 +1052,8 @@ redo:
 		return -ERESTARTSYS;
 	}
 
+	xnthread_clear_info(thread, XNATOMIC);

Why this? I find the xnthread_clear_info(XNATOMIC) right at the right place at the point it currently is.

 	/* current is now running into the Xenomai domain. */
 	thread->gksched = NULL;
 	sched = xnsched_finish_unlocked_switch(thread->sched);
@@ -2650,6 +2660,8 @@ static inline void do_sigwake_event(struct task_struct *p)
 
 	xnlock_get_irqsave(&nklock, s);
 
+	xnthread_clear_info(thread, XNATOMIC);
+

Ditto.

-- 
Gilles.
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 2011-07-12 19:31, Gilles Chanteperdrix wrote: [...]

+		XENO_BUGON(NUCLEUS, xnthread_test_info(thread, XNATOMIC));

Misleading dead code again, XNATOMIC is cleared not ten lines above.

Nope, I forgot to remove that line.

+	xnthread_clear_info(thread, XNATOMIC);

Why this? I find the xnthread_clear_info(XNATOMIC) right at the right place at the point it currently is.

Nope. Now we either clear XNATOMIC after successful migration or when the signal is about to be sent (i.e. in the hook). That way we can test more reliably (TM) in the gatekeeper if the thread can be migrated.

Jan
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 07/12/2011 07:34 PM, Jan Kiszka wrote: [...] Nope. Now we either clear XNATOMIC after successful migration or when the signal is about to be sent (i.e. in the hook). That way we can test more reliably (TM) in the gatekeeper if the thread can be migrated.

OK for adding the XNATOMIC test, because it improves the robustness, but why change the way XNATOMIC is set and cleared? Chances of breaking things while changing code in this area are really high...

-- 
Gilles.
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 2011-07-12 19:38, Gilles Chanteperdrix wrote: [...] OK for adding the XNATOMIC test, because it improves the robustness, but why change the way XNATOMIC is set and cleared? Chances of breaking things while changing code in this area are really high...

The current code is (most probably) broken as it does not properly synchronize the gatekeeper against a signaled and runaway target Linux task. We need an indication whether a Linux signal will wake up (or already has woken up) the to-be-migrated task. That task may have continued over its context, potentially on a different CPU. Providing this indication is the purpose of changing where XNATOMIC is cleared.

Jan
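Reduced to its essence (surrounding ptrace handling elided), the signal-side half of that indication is the do_sigwake_event() hunk from the patch above:

static inline void do_sigwake_event(struct task_struct *p)
{
	struct xnthread *thread = xnshadow_thread(p);
	spl_t s;

	xnlock_get_irqsave(&nklock, s);

	/* A Linux signal is about to wake this task: invalidate any
	 * migration in flight so the gatekeeper's XNATOMIC re-test
	 * fails instead of resuming a runaway shadow. */
	xnthread_clear_info(thread, XNATOMIC);

	/* ... existing ptrace/XNDEBUG handling continues here ... */

	xnlock_put_irqrestore(&nklock, s);
}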
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 07/08/2011 06:29 PM, GIT version control wrote:

@@ -2528,6 +2534,22 @@ static inline void do_taskexit_event(struct task_struct *p)
 	magic = xnthread_get_magic(thread);
 
 	xnlock_get_irqsave(&nklock, s);
+
+	gksched = thread->gksched;
+	if (gksched) {
+		xnlock_put_irqrestore(&nklock, s);

Are we sure irqs are on here? Are you sure that what is needed is not an xnlock_clear_irqon?

Furthermore, I do not understand how we synchronize with the gatekeeper; how is the gatekeeper guaranteed to wait for this assignment?

-- 
Gilles.
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 2011-07-11 20:53, Gilles Chanteperdrix wrote: [...] Are we sure irqs are on here? Are you sure that what is needed is not an xnlock_clear_irqon?

We are in the context of do_exit. Not only are IRQs on, but also preemption. And surely no nklock is held.

Furthermore, I do not understand how we synchronize with the gatekeeper; how is the gatekeeper guaranteed to wait for this assignment?

The gatekeeper holds the gksync token while it's active. We request it, thus we wait for the gatekeeper to become idle again. While it is idle, we reset the queued reference - but I just realized that this may trample on other tasks' values. I need to add a check that the value to be nulled is actually still ours.

Jan
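The down()+up() pair described here is the classic semaphore barrier idiom: take the token only to prove the worker is idle, then hand it straight back. A self-contained userspace illustration using POSIX semaphores (not the kernel semaphore API, and the sleep-based sequencing is purely for demonstration):

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <unistd.h>

static sem_t gksync;		/* 1 = gatekeeper idle, 0 = busy */

static void *gatekeeper(void *arg)
{
	sem_wait(&gksync);	/* become active: hold the token */
	puts("gatekeeper: hardening target");
	sleep(1);		/* stands in for the migration work */
	puts("gatekeeper: idle again");
	sem_post(&gksync);	/* release the token */
	return NULL;
}

int main(void)
{
	pthread_t gk;

	sem_init(&gksync, 0, 1);
	pthread_create(&gk, NULL, gatekeeper, NULL);
	usleep(100000);		/* crude: let the gatekeeper grab the token */

	/* Exit-path barrier: down + up waits for the gatekeeper to go
	 * idle without holding it off afterwards. */
	sem_wait(&gksync);
	sem_post(&gksync);
	puts("exit path: gatekeeper is idle, safe to tear down");

	pthread_join(gk, NULL);
	return 0;
}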
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 2011-07-11 21:10, Jan Kiszka wrote: [...] While it is idle, we reset the queued reference - but I just realized that this may trample on other tasks' values. I need to add a check that the value to be nulled is actually still ours.

Thinking again, that's actually not a problem: gktarget is only needed while gksync is zero - but then we won't get hold of it anyway and, thus, can't cause any damage.

Jan
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 07/11/2011 09:16 PM, Jan Kiszka wrote: On 2011-07-11 21:10, Jan Kiszka wrote:

(...)

Thinking again, that's actually not a problem: gktarget is only needed while gksync is zero - but then we won't get hold of it anyway and, thus, can't cause any damage.

Well, you make it look like it does not work. From what I understand, what you want is to set gktarget to NULL if a task being hardened is destroyed. But by waiting for the semaphore, you actually wait for the hardening to be complete, so setting it to NULL is useless. Or am I missing something else?

-- Gilles.
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 2011-07-11 21:51, Gilles Chanteperdrix wrote:

(...)

Well, you make it look like it does not work. From what I understand, what you want is to set gktarget to NULL if a task being hardened is destroyed. But by waiting for the semaphore, you actually wait for the hardening to be complete, so setting it to NULL is useless. Or am I missing something else?

Setting it to NULL is probably unneeded, but still better than relying on the gatekeeper never waking up spuriously and then dereferencing a stale pointer. The key element of this fix is waiting on gksync, thus on the completion of the non-RT part of the hardening. Actually, this part usually fails, as the target task received a termination signal at this point.

Jan
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 07/11/2011 09:59 PM, Jan Kiszka wrote:

(...)

Setting it to NULL is probably unneeded, but still better than relying on the gatekeeper never waking up spuriously and then dereferencing a stale pointer. The key element of this fix is waiting on gksync, thus on the completion of the non-RT part of the hardening. Actually, this part usually fails, as the target task received a termination signal at this point.

Yes, but since you wait on the completion of the hardening, the test if (target ...) in the gatekeeper code will always be true, because at this point the cleanup code will still be waiting for the semaphore.

-- Gilles.
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 2011-07-11 22:02, Gilles Chanteperdrix wrote:

(...)

Yes, but since you wait on the completion of the hardening, the test if (target ...) in the gatekeeper code will always be true, because at this point the cleanup code will still be waiting for the semaphore.

Yes, except if we ever wake up the gatekeeper later on without an updated gktarget, i.e. spuriously. Better safe than sorry, this is hairy code anyway (hopefully obsolete one day).

Jan
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 07/11/2011 10:06 PM, Jan Kiszka wrote:

(...)

Yes, except if we ever wake up the gatekeeper later on without an updated gktarget, i.e. spuriously. Better safe than sorry, this is hairy code anyway (hopefully obsolete one day).

The gatekeeper is not woken up by posting the semaphore; the gatekeeper is woken up by the thread which is going to be hardened (and this thread is the one which waits for the semaphore).

-- Gilles.
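Restated as code, a simplified model of the wakeup protocol as described in this thread; gkwaitq, try_migrate and the loop structure are assumptions made for illustration, and the real nucleus code orders these steps more carefully:

    /* The gatekeeper sleeps on its wait queue, not on gksync. */
    static int gatekeeper_thread(void *arg)
    {
            struct xnsched *sched = arg;

            for (;;) {
                    up(&sched->gksync);  /* idle: others may proceed */
                    wait_event_interruptible(sched->gkwaitq,
                                             sched->gktarget != NULL);
                    down(&sched->gksync); /* busy: others must wait */
                    try_migrate(sched->gktarget); /* hypothetical */
                    sched->gktarget = NULL;
            }
            return 0;
    }

    /* The to-be-hardened thread wakes the gatekeeper itself, then
     * schedules away; further gksync synchronization is elided. */
    static void harden(struct xnthread *thread, struct xnsched *sched)
    {
            sched->gktarget = thread;
            set_current_state(TASK_INTERRUPTIBLE);
            wake_up_interruptible_sync(&sched->gkwaitq);
            schedule(); /* resumes in primary mode on success */
    }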
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 2011-07-11 22:09, Gilles Chanteperdrix wrote:

(...)

The gatekeeper is not woken up by posting the semaphore; the gatekeeper is woken up by the thread which is going to be hardened (and this thread is the one which waits for the semaphore).

All true. And what is the point?

Jan
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Allow drop_u_mode syscall from any context
On 2011-06-28 23:29, Gilles Chanteperdrix wrote: On 06/28/2011 11:01 PM, GIT version control wrote:

Module: xenomai-jki Branch: for-upstream Commit: 5597470d84584846875e8a35309e6302c768addf URL: http://git.xenomai.org/?p=xenomai-jki.git;a=commit;h=5597470d84584846875e8a35309e6302c768addf Author: Jan Kiszka jan.kis...@siemens.com Date: Tue Jun 28 22:10:07 2011 +0200

nucleus: Allow drop_u_mode syscall from any context

xnshadow_sys_drop_u_mode already checks if the caller is a shadow. It does that without issuing a warning message if the check fails - in contrast to do_hisyscall_event. As user space may call this cleanup service even for non-shadow threads (e.g. after shadow creation failed), we had better silence this warning.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com

Jan, I have a branch here which allocates u_mode in the shared heap, so this syscall is about to become unnecessary. See: http://git.xenomai.org/?p=xenomai-gch.git;a=shortlog;h=refs/heads/u_mode

Even better. When do you plan to merge all this? I'd like to finally fix the various MPS fastlock breakages, specifically as they overlap with other issues there.

Jan
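The behaviour the commit relies on, sketched; the function body is paraphrased from the commit message rather than copied from the tree, and the return value is an assumption:

    static int xnshadow_sys_drop_u_mode(void)
    {
            struct xnthread *thread = xnshadow_thread(current);

            /*
             * Silently ignore callers that never became shadows,
             * e.g. user space cleaning up after a failed shadow
             * creation -- no warning, unlike do_hisyscall_event.
             */
            if (thread == NULL)
                    return -EPERM;

            /* ... drop the per-thread u_mode mapping ... */
            return 0;
    }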
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Allow drop_u_mode syscall from any context
On 06/29/2011 09:06 AM, Jan Kiszka wrote: On 2011-06-28 23:29, Gilles Chanteperdrix wrote:

(...)

Jan, I have a branch here which allocates u_mode in the shared heap, so this syscall is about to become unnecessary. See: http://git.xenomai.org/?p=xenomai-gch.git;a=shortlog;h=refs/heads/u_mode

Even better. When do you plan to merge all this? I'd like to finally fix the various MPS fastlock breakages, specifically as they overlap with other issues there.

I would like to give Philippe a chance to have a look at it before it is merged (especially the commit using an Adeos ptd). Normally, the MPS fastlock issues are solved in this branch.

-- Gilles.
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Allow drop_u_mode syscall from any context
On 2011-06-29 09:25, Gilles Chanteperdrix wrote:

(...)

I would like to give Philippe a chance to have a look at it before it is merged (especially the commit using an Adeos ptd).

OK.

Normally, the MPS fastlock issues are solved in this branch.

Only the previously discussed leak. MPS-disabled is still oopsing, and error cleanup is also broken - but not only for MPS. I'll base my fixes on top of your branch.

Jan
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Allow drop_u_mode syscall from any context
On 06/28/2011 11:01 PM, GIT version control wrote:

Module: xenomai-jki Branch: for-upstream Commit: 5597470d84584846875e8a35309e6302c768addf URL: http://git.xenomai.org/?p=xenomai-jki.git;a=commit;h=5597470d84584846875e8a35309e6302c768addf Author: Jan Kiszka jan.kis...@siemens.com Date: Tue Jun 28 22:10:07 2011 +0200

nucleus: Allow drop_u_mode syscall from any context

xnshadow_sys_drop_u_mode already checks if the caller is a shadow. It does that without issuing a warning message if the check fails - in contrast to do_hisyscall_event. As user space may call this cleanup service even for non-shadow threads (e.g. after shadow creation failed), we had better silence this warning.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com

Jan, I have a branch here which allocates u_mode in the shared heap, so this syscall is about to become unnecessary. See: http://git.xenomai.org/?p=xenomai-gch.git;a=shortlog;h=refs/heads/u_mode

-- Gilles.
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix interrupt handler tails
On 2011-06-19 17:41, Gilles Chanteperdrix wrote:

Merged your whole branch, but took the liberty to change it a bit (replacing the commit concerning unlocked context switches with comment changes only, and changing the commit about xntbase_tick).

What makes splmax() redundant for the unlocked context switch case? IMO that bug is still present. We can clean up xnintr_clock_handler a bit after the changes; I will follow up with a patch.

Jan

-- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix interrupt handler tails
On 06/20/2011 06:43 PM, Jan Kiszka wrote:

(...)

What makes splmax() redundant for the unlocked context switch case? IMO that bug is still present.

No, the bug is between my keyboard and chair. On architectures with unlocked context switches, the Linux task switch still happens with irqs off; only the mm switch happens with irqs on.

-- Gilles.
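A sketch of the split Gilles describes, for an architecture with unlocked (preemptible) context switches; switch_mm is the regular kernel helper, while do_switch_registers and the wrapper itself are hypothetical names:

    static void unlocked_switch_sketch(struct task_struct *prev,
                                       struct task_struct *next)
    {
            /*
             * The expensive part -- switching the mm, with its cache
             * and TLB work on some architectures -- may run with hard
             * irqs enabled...
             */
            switch_mm(prev->active_mm, next->mm, next);

            /*
             * ...but the register/stack switch itself must not be
             * interrupted: registers are spilled relative to the SP,
             * then the SP is changed, non-atomically.
             */
            local_irq_disable_hw();
            do_switch_registers(prev, next); /* hypothetical */
            local_irq_enable_hw();
    }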
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix interrupt handler tails
On 2011-06-20 19:33, Gilles Chanteperdrix wrote:

(...)

No, the bug is between my keyboard and chair. On architectures with unlocked context switches, the Linux task switch still happens with irqs off; only the mm switch happens with irqs on.

Then why do we call xnlock_get_irqsave in xnsched_finish_unlocked_switch? Why not simply xnlock_get if irqs are off anyway?

Jan
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix interrupt handler tails
On 06/20/2011 09:38 PM, Jan Kiszka wrote:

(...)

Then why do we call xnlock_get_irqsave in xnsched_finish_unlocked_switch? Why not simply xnlock_get if irqs are off anyway?

Because of the Xenomai task switch, not the Linux task switch.

-- Gilles.
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix interrupt handler tails
On 2011-06-20 21:41, Gilles Chanteperdrix wrote:

(...)

Because of the Xenomai task switch, not the Linux task switch.

--verbose please.

Jan
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix interrupt handler tails
On 06/20/2011 09:41 PM, Jan Kiszka wrote:

(...)

--verbose please.

There are two kinds of task switches: switches between Linux tasks, handled by the Linux kernel function/macro/inline asm switch_to(), and switches between Xenomai tasks, handled by the function/macro/inline asm xnarch_switch_to(). Since a Linux kernel context switch may still be interrupted by a (primary mode) interrupt which could decide to switch context, it cannot happen with interrupts enabled, due to the way it works (it spills the registers to a place relative to the SP, then changes the SP, non-atomically). Xenomai context switches run no such risk, so they may happen entirely with irqs on.

In case of relax, the two halves of the context switch are not of the same kind. The first half is a Xenomai switch, but the second half is the epilogue of a Linux context switch (which, by the way, is why we need to skip all the housekeeping in __xnpod_schedule in that case, and also why we go to all the pain of keeping the two kinds of context switch compatible). Hence, even on machines with unlocked context switches, irqs are off at this point.

Hope this is more clear.

-- Gilles.
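In other words, the switch tail cannot assume a fixed irq state, which is why _irqsave is used. A sketch of that decision only (the body is abridged, xnpod_current_sched is used as the scheduler accessor, and the caller is assumed to release nklock; not the exact nucleus code):

    static struct xnsched *finish_unlocked_switch_sketch(void)
    {
            spl_t s;

            /*
             * Reached either from a pure Xenomai switch tail (irqs
             * possibly still on) or from the Linux switch epilogue on
             * relax (irqs off): _irqsave covers both cases, a plain
             * xnlock_get would not.
             */
            xnlock_get_irqsave(&nklock, s);

            /* The CPU may have changed across the switch: re-read. */
            return xnpod_current_sched();
    }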
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix interrupt handler tails
On 2011-06-20 21:51, Gilles Chanteperdrix wrote:

(...)

There are two kinds of task switches (...) Hence, even on machines with unlocked context switches, irqs are off at this point. Hope this is more clear.

That's all clear. But it's unclear how this maps to our key question. Can you point out where in those paths irqs are disabled again (after entering xnarch_switch_to) and left off, for each of the unlocked switching archs? I'm still skeptical that the need for disabled irqs during the thread switch on some archs also leads to unconditionally disabled hard irqs when returning from xnarch_switch_to. But even if that is all the case today, we had better set this requirement in stone:

diff --git a/ksrc/nucleus/pod.c b/ksrc/nucleus/pod.c
index f2fc2ab..c4c5807 100644
--- a/ksrc/nucleus/pod.c
+++ b/ksrc/nucleus/pod.c
@@ -2273,6 +2273,8 @@ reschedule:
    xnpod_switch_to(sched, prev, next);

+   XENO_BUGON(NUCLEUS, !irqs_disabled_hw());
+
 #ifdef CONFIG_XENO_OPT_PERVASIVE
    /*
    * Test whether we transitioned from primary mode to secondary

[ just demonstrating, would require some cleanup ]

Jan
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix interrupt handler tails
On 06/20/2011 10:41 PM, Jan Kiszka wrote:

xnarch_switch_to is the central entry point for everyone. It may decide to branch to switch_to or __switch_to, or it simply handles everything on its own, depending on the arch.

No, the Linux kernel does not know anything about xnarch_switch_to, so the schedule() function continues to use switch_to happily. xnarch_switch_to is only used to switch from xnthread_t to xnthread_t, by __xnpod_schedule(). Now, that some architecture (namely x86) decides that xnarch_switch_to should use switch_to (or more likely an inner __switch_to) when the xnthread_t has a non-NULL user_task member is an implementation detail.

Can you point out where in those paths irqs are disabled again (after entering xnarch_switch_to)

They are not disabled again after xnarch_switch_to; they are disabled when starting switch_to.

and left off for each of the unlocked switching archs? (...) But even if that is all the case today, we had better set this requirement in stone: (...) + XENO_BUGON(NUCLEUS, !irqs_disabled_hw());

You misunderstand me: only after the second-half context switch in the case of xnshadow_relax are the interrupts disabled, because this second half-switch started as a switch_to and not an xnarch_switch_to, i.e. started as:

    #define switch_to(prev, next, last)                              \
    do {                                                             \
            local_irq_disable_hw_cond();                             \
            last = __switch_to(prev, task_thread_info(prev),         \
                               task_thread_info(next));              \
            local_irq_enable_hw_cond();                              \
    } while (0)

(on ARM, for instance). But that is true, we could assert this in the shadow epilogue case.

-- Gilles.
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix interrupt handler tails
On 2011-06-20 22:52, Gilles Chanteperdrix wrote:

(...)

No, the Linux kernel does not know anything about xnarch_switch_to, so the schedule() function continues to use switch_to happily. (...) They are not disabled again after xnarch_switch_to; they are disabled when starting switch_to. (...) You misunderstand me: only after the second-half context switch in the case of xnshadow_relax are the interrupts disabled, because this second half-switch started as a switch_to and not an xnarch_switch_to (...) (on ARM, for instance).

OK, that's now clear, thanks.

But that is true, we could assert this in the shadow epilogue case.

I've queued a patch.

Jan
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix interrupt handler tails
On 06/18/2011 03:58 PM, Jan Kiszka wrote: On 2011-06-18 15:12, Gilles Chanteperdrix wrote: On 06/18/2011 03:07 PM, Jan Kiszka wrote: On 2011-06-18 14:56, Gilles Chanteperdrix wrote: On 06/18/2011 02:10 PM, Jan Kiszka wrote: On 2011-06-18 14:09, Gilles Chanteperdrix wrote: On 06/18/2011 12:21 PM, Jan Kiszka wrote: On 2011-06-17 20:55, Gilles Chanteperdrix wrote: On 06/17/2011 07:03 PM, Jan Kiszka wrote: On 2011-06-17 18:53, Gilles Chanteperdrix wrote: On 06/17/2011 04:38 PM, GIT version control wrote:

Module: xenomai-jki Branch: for-upstream Commit: 7203b1a66ca0825d5bcda1c3abab9ca048177914 URL: http://git.xenomai.org/?p=xenomai-jki.git;a=commit;h=7203b1a66ca0825d5bcda1c3abab9ca048177914 Author: Jan Kiszka jan.kis...@siemens.com Date: Fri Jun 17 09:46:19 2011 +0200

nucleus: Fix interrupt handler tails

Our current interrupt handlers assume that they leave over the same task and CPU they entered. But commit f6af9b831c broke this assumption: xnpod_schedule invoked from the handler tail can now actually trigger a domain migration, and that can also include a CPU migration. This causes subtle corruptions, as invalid xnstat_exectime_t objects may be restored and - even worse - we may improperly flush XNHTICK of the old CPU, leaving Linux timer-wise dead there (as happened to us). Fix this by moving the XNHTICK replay and exectime accounting before the scheduling point. Note that this introduces a tiny imprecision in the accounting.

I am not sure I understand why moving the XNHTICK replay is needed: if we switch to secondary mode, the HTICK is handled by xnpod_schedule anyway, or am I missing something?

The replay can work on an invalid sched (after CPU migration in secondary mode). We could reload the sched, but just moving the replay is simpler.

But does it not remove the purpose of this delayed replay?

Hmm, yes, in the corner case of a coalesced timed RT task wakeup and host tick over a root thread. Well, then we actually have to reload sched and keep the ordering to catch that as well.

Note that if you want to reload the sched, you also have to shut interrupts off, because upon return from xnpod_schedule after migration, interrupts are on.

That would be another severe bug if we left an interrupt handler with hard IRQs enabled - the interrupt tail code of ipipe would break. Fortunately, only xnpod_suspend_thread re-enables IRQs and returns. xnpod_schedule also re-enables but then terminates the context (in xnshadow_exit). So we are safe.

I do not think we are, at least on platforms where context switches happen with irqs on.

Can you sketch a problematic path?

On platforms with IPIPE_WANT_PREEMPTIBLE_SWITCH on, all context switches happen with irqs on. So, in particular, the context switch to a relaxed task happens with irqs on. In __xnpod_schedule, we then return from xnpod_switch_to with irqs on, and so return from __xnpod_schedule with irqs on.

    /* We are returning to xnshadow_relax via xnpod_suspend_thread,
       do nothing, xnpod_suspend_thread will re-enable interrupts. */

Looks like this comment is outdated. I think we had best fix this in __xnpod_schedule by disabling irqs there instead of forcing otherwise redundant disabling into all handler return paths.

I agree. I've queued a corresponding patch, and also one to clean up that special handshake between xnshadow_relax and xnpod_suspend_thread a bit. Consider both as RFC.
Merged your whole branch, but took the liberty to change it a bit (replacing the commit concerning unlocked context switches with comment changes only, and changing the commit about xntbase_tick).

-- Gilles.
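The direction agreed on in the quoted exchange - letting __xnpod_schedule itself restore a hard-irqs-off state before returning into an interrupt handler tail - could be sketched like this; irqs_disabled_hw and local_irq_disable_hw are the I-pipe primitives already referenced in this thread, and the actual queued patch may differ:

    /* Tail of __xnpod_schedule(), sketched. */
    xnpod_switch_to(sched, prev, next);

    /*
     * On IPIPE_WANT_PREEMPTIBLE_SWITCH architectures a purely
     * Xenomai switch may return here with hard irqs enabled, but
     * interrupt handler tails rely on them being off again.
     */
    if (!irqs_disabled_hw())
            local_irq_disable_hw();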
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix interrupt handler tails
On 2011-06-17 20:55, Gilles Chanteperdrix wrote: On 06/17/2011 07:03 PM, Jan Kiszka wrote: On 2011-06-17 18:53, Gilles Chanteperdrix wrote: On 06/17/2011 04:38 PM, GIT version control wrote:

Module: xenomai-jki Branch: for-upstream Commit: 7203b1a66ca0825d5bcda1c3abab9ca048177914 URL: http://git.xenomai.org/?p=xenomai-jki.git;a=commit;h=7203b1a66ca0825d5bcda1c3abab9ca048177914 Author: Jan Kiszka jan.kis...@siemens.com Date: Fri Jun 17 09:46:19 2011 +0200

nucleus: Fix interrupt handler tails

Our current interrupt handlers assume that they leave over the same task and CPU they entered. But commit f6af9b831c broke this assumption: xnpod_schedule invoked from the handler tail can now actually trigger a domain migration, and that can also include a CPU migration. This causes subtle corruptions, as invalid xnstat_exectime_t objects may be restored and - even worse - we may improperly flush XNHTICK of the old CPU, leaving Linux timer-wise dead there (as happened to us). Fix this by moving the XNHTICK replay and exectime accounting before the scheduling point. Note that this introduces a tiny imprecision in the accounting.

I am not sure I understand why moving the XNHTICK replay is needed: if we switch to secondary mode, the HTICK is handled by xnpod_schedule anyway, or am I missing something?

The replay can work on an invalid sched (after CPU migration in secondary mode). We could reload the sched, but just moving the replay is simpler.

But does it not remove the purpose of this delayed replay?

Hmm, yes, in the corner case of a coalesced timed RT task wakeup and host tick over a root thread. Well, then we actually have to reload sched and keep the ordering to catch that as well.

Note that if you want to reload the sched, you also have to shut interrupts off, because upon return from xnpod_schedule after migration, interrupts are on.

That would be another severe bug if we left an interrupt handler with hard IRQs enabled - the interrupt tail code of ipipe would break. Fortunately, only xnpod_suspend_thread re-enables IRQs and returns. xnpod_schedule also re-enables but then terminates the context (in xnshadow_exit). So we are safe.

Jan
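For reference, the reordering the commit performs in the clock interrupt tail, sketched. The statements are abridged, testbits/xnintr_host_tick/xnstat_exectime_switch are assumed helper names, and only the ordering constraint is the point:

    /* Tail of xnintr_clock_handler(), sketched. */

    /*
     * Replay the host tick and account exec time first, while sched
     * still refers to the CPU this handler entered on.
     */
    if (testbits(sched->status, XNHTICK))
            xnintr_host_tick(sched);
    xnstat_exectime_switch(sched, prev_account); /* illustrative */

    /*
     * Only then reschedule: this may migrate the current task to
     * another CPU (or domain), invalidating the sched reference.
     */
    xnpod_schedule();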