Re: [Xenomai-core] Pending ownership and resource stealing
Gilles Chanteperdrix wrote:
> Philippe Gerum wrote:
> > Gilles Chanteperdrix wrote:
> > > Philippe Gerum wrote:
> > > > prio(task1) > prio(task2)
> > > >
> > > > 1. task1 grabs a resource
> > > > 2. task1 sleeps for some time
> > > > 3. task2 blocks requesting the resource
> > > > 4. task1 wakes up from the sleep and releases the resource to task2
> > > > 5. task1 wants the resource back immediately and calls
> > > > xnsynch_sleep_on() since the ownership has been transferred to task2
> > > > since step 4.
> > > > 6a. old way: task1 would block and task2 would run anyway, with a PIP
> > > > boost, blocking task1 until the resource is released
> > > > 6b. new way: task1 steals the resource previously granted to task2
> > > > directly from xnsynch_sleep_on(), but doing so, nobody downstream has
> > > > had a chance to update any skin-specific data, such as an additional
> > > > "owner" field.
> > >
> > > Posix skin mutexes work the new way without calling xnsynch_sleep_on, so
> > > probably need fixing.
> >
> > I don't see any additional information maintained by the skin, aside from
> > the sem->count field, so that should be ok as it is. Is there anything
> > else recorded for tracking the current ownership of a sem-mutex object?
>
> The mutex->count field is unconditionally set to 0 and task2 woken up
> when task1 releases the mutex, so that when task1 wants the mutex again,
> it will appear as free, and task1 will get it without calling
> xnsynch_sleep_on; task2 will eventually wake up spuriously or not,
> depending on whether the mutex is free. So, setting the count to 0 when
> waking up seems wrong.

We just need to set it to 1 instead, just as if we were returning from
mutex_trylock_internal() successfully.

> In a nutshell, the posix skin should be changed to use the new feature
> instead of implementing its own behaviour.

There is no new feature per se, just an additional case upon return from
xnsynch_sleep_on() to take into account.

--
Philippe.
___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] Pending ownership and resource stealing
Philippe Gerum wrote:
> Gilles Chanteperdrix wrote:
> > Philippe Gerum wrote:
> > > prio(task1) > prio(task2)
> > >
> > > 1. task1 grabs a resource
> > > 2. task1 sleeps for some time
> > > 3. task2 blocks requesting the resource
> > > 4. task1 wakes up from the sleep and releases the resource to task2
> > > 5. task1 wants the resource back immediately and calls
> > > xnsynch_sleep_on() since the ownership has been transferred to task2
> > > since step 4.
> > > 6a. old way: task1 would block and task2 would run anyway, with a PIP
> > > boost, blocking task1 until the resource is released
> > > 6b. new way: task1 steals the resource previously granted to task2
> > > directly from xnsynch_sleep_on(), but doing so, nobody downstream has
> > > had a chance to update any skin-specific data, such as an additional
> > > "owner" field.
> >
> > Posix skin mutexes work the new way without calling xnsynch_sleep_on, so
> > probably need fixing.
>
> I don't see any additional information maintained by the skin, aside from
> the sem->count field, so that should be ok as it is. Is there anything
> else recorded for tracking the current ownership of a sem-mutex object?

The mutex->count field is unconditionally set to 0 and task2 woken up when
task1 releases the mutex, so that when task1 wants the mutex again, it
will appear as free, and task1 will get it without calling
xnsynch_sleep_on; task2 will eventually wake up spuriously or not,
depending on whether the mutex is free. So, setting the count to 0 when
waking up seems wrong.

In a nutshell, the posix skin should be changed to use the new feature
instead of implementing its own behaviour.

--
Gilles Chanteperdrix.
Re: [Xenomai-core] Pending ownership and resource stealing
Philippe Gerum wrote:
> Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> The only thing that should change downstream compared to the previous
>>> behaviour is that xnsynch_sleep_on() might unblock immediately at skin
>>> level without any xnsynch_wakeup_sleeper() calls being previously
>>> invoked, since the blocking call does the stealing during the pending
>>> ownership window.
>>>
>>> This means that skins now _must_ fix their internal state when
>>> unblocked from xnsynch_sleep_on() if they happen to track their own
>>> resource owner field for instance, since they might become the owner
>>> of such resource without any unlock/release/whatever routine being
>>> called at the said skin level. I've fixed a couple of skins for that
>>> purpose (not checked RTDM btw), but it would be safer if you could
>>> double-check the impact of such change on the interfaces you've
>>> crafted.
>>
>> Well, if this means that once you have called xnsynch_wakeup_sleeper()
>> for some lower-prio task, you must call xnsynch_sleep_on() to steal it
>> for a higher-prio task, then RTDM needs fixing (it only sets a private
>> lock bit in this case).
>
> No need to call xnsynch_sleep_on() more than usually done; just have a
> look at native/mutex.c in rt_mutex_lock(), and follow the code labeled
> grab_mutex; it should give you the proper illustration of the issue.

I did so and discovered that priority inheritance was broken for RTDM
mutexes right from the beginning. In case such a mutex was entered
uncontended, the owner was not recorded. Oops... This caused no crash due
to the check "owner != NULL" in xnsynch_sleep_on [1]. But this also made
me wonder if it is a reasonable state for a PIP synch object to be
acquired without having an owner. If not, we should rather bail out in
some way here.

Anyway, RTDM seems to work fine now (trunk and 2.1.x), even with a
reduced rtdm_mutex_t data structure and shrunken code (rtdm_mutex_unlock
became trivial by only using xnsynch_t).
Further tests will follow during the day. Oh, and the lock-stealing code
behaved as expected so far.

Jan

[1] http://www.rts.uni-hannover.de/xenomai/lxr/source/ksrc/nucleus/synch.c#L179
Re: [Xenomai-core] Pending ownership and resource stealing
Gilles Chanteperdrix wrote:
> Philippe Gerum wrote:
> > prio(task1) > prio(task2)
> >
> > 1. task1 grabs a resource
> > 2. task1 sleeps for some time
> > 3. task2 blocks requesting the resource
> > 4. task1 wakes up from the sleep and releases the resource to task2
> > 5. task1 wants the resource back immediately and calls
> > xnsynch_sleep_on() since the ownership has been transferred to task2
> > since step 4.
> > 6a. old way: task1 would block and task2 would run anyway, with a PIP
> > boost, blocking task1 until the resource is released
> > 6b. new way: task1 steals the resource previously granted to task2
> > directly from xnsynch_sleep_on(), but doing so, nobody downstream has
> > had a chance to update any skin-specific data, such as an additional
> > "owner" field.
>
> Posix skin mutexes work the new way without calling xnsynch_sleep_on, so
> probably need fixing.

I don't see any additional information maintained by the skin, aside from
the sem->count field, so that should be ok as it is. Is there anything
else recorded for tracking the current ownership of a sem-mutex object?

--
Philippe.
Re: [Xenomai-core] Pending ownership and resource stealing
Jan Kiszka wrote:
> Philippe Gerum wrote:
> > Jan Kiszka wrote:
> > > Besides this, we then may want to consider if introducing a pending
> > > ownership of synch objects is worthwhile to improve efficiency of PIP
> > > users. Not critical, but if it comes at a reasonable price... Will
> > > try to draft something.
> >
> > I've committed an implementation of this stuff to the trunk, tested on
> > your testcase over the simulator. So far, it's ok.
>
> I'll give up, you are too fast for me. ;)
>
> > The only thing that should change downstream compared to the previous
> > behaviour is that xnsynch_sleep_on() might unblock immediately at skin
> > level without any xnsynch_wakeup_sleeper() calls being previously
> > invoked, since the blocking call does the stealing during the pending
> > ownership window.
> >
> > This means that skins now _must_ fix their internal state when
> > unblocked from xnsynch_sleep_on() if they happen to track their own
> > resource owner field for instance, since they might become the owner
> > of such resource without any unlock/release/whatever routine being
> > called at the said skin level. I've fixed a couple of skins for that
> > purpose (not checked RTDM btw), but it would be safer if you could
> > double-check the impact of such change on the interfaces you've
> > crafted.
>
> Well, if this means that once you have called xnsynch_wakeup_sleeper()
> for some lower-prio task, you must call xnsynch_sleep_on() to steal it
> for a higher-prio task, then RTDM needs fixing (it only sets a private
> lock bit in this case).

No need to call xnsynch_sleep_on() more than usually done; just have a
look at native/mutex.c in rt_mutex_lock(), and follow the code labeled
grab_mutex; it should give you the proper illustration of the issue.

> > This change only affects PIP-enabled synchronization objects in a
> > reasonably limited manner and seems to behave properly, but please,
> > give this code hell on your side too.
>
> Will do.
>
> Jan
>
> PS: The usage can be checked also via the cross reference:
> http://www.rts.uni-hannover.de/xenomai/lxr/ident?v=SVN-trunk;i=XNSYNCH_PIP

--
Philippe.
Re: [Xenomai-core] Pending ownership and resource stealing
Philippe Gerum wrote:
> prio(task1) > prio(task2)
>
> 1. task1 grabs a resource
> 2. task1 sleeps for some time
> 3. task2 blocks requesting the resource
> 4. task1 wakes up from the sleep and releases the resource to task2
> 5. task1 wants the resource back immediately and calls
> xnsynch_sleep_on() since the ownership has been transferred to task2
> since step 4.
> 6a. old way: task1 would block and task2 would run anyway, with a PIP
> boost, blocking task1 until the resource is released
> 6b. new way: task1 steals the resource previously granted to task2
> directly from xnsynch_sleep_on(), but doing so, nobody downstream has
> had a chance to update any skin-specific data, such as an additional
> "owner" field.

Posix skin mutexes work the new way without calling xnsynch_sleep_on, so
probably need fixing.

--
Gilles Chanteperdrix.
Re: [Xenomai-core] Pending ownership and resource stealing
Gilles Chanteperdrix wrote:
> Philippe Gerum wrote:
> > This means that skins now _must_ fix their internal state when
> > unblocked from xnsynch_sleep_on() if they happen to track their own
> > resource owner field for instance, since they might become the owner
> > of such resource without any unlock/release/whatever routine being
> > called at the said skin level. I've fixed a couple of skins for that
> > purpose (not checked RTDM btw), but it would be safer if you could
> > double-check the impact of such change on the interfaces you've
> > crafted.
>
> When waking up, how do we know that we are in the problematic situation?

The only possible issue with the new implementation arises when the skin
maintains its own specific data for tracking ownership of a resource,
complementing the synch::owner field of the core nucleus resource object.
The example code Jan has posted lately to raise the concern about the
XNBREAK issue also demonstrates the issue of pending ownership, i.e.:

prio(task1) > prio(task2)

1. task1 grabs a resource
2. task1 sleeps for some time
3. task2 blocks requesting the resource
4. task1 wakes up from the sleep and releases the resource to task2
5. task1 wants the resource back immediately and calls
xnsynch_sleep_on() since the ownership has been transferred to task2
since step 4.
6a. old way: task1 would block and task2 would run anyway, with a PIP
boost, blocking task1 until the resource is released
6b. new way: task1 steals the resource previously granted to task2
directly from xnsynch_sleep_on(), but doing so, nobody downstream has
had a chance to update any skin-specific data, such as an additional
"owner" field.

The caller of xnsynch_sleep_on() in native/mutex.c illustrates this (i.e.
the grab_mutex label).

> Can we do the housekeeping in the callback registered with
> xnsynch_register_cleanup, or are you talking of a different situation?

It's different. The cleanup handler as described by Jan is only there to
perform some housekeeping chores on synch objects which get forcibly
released because their previous owner went away (typically
xnpod_delete()).

--
Philippe.
Re: [Xenomai-core] Pending ownership and resource stealing
Philippe Gerum wrote:
> Jan Kiszka wrote:
>
>> Besides this, we then may want to consider if introducing a pending
>> ownership of synch objects is worthwhile to improve efficiency of PIP
>> users. Not critical, but if it comes at a reasonable price... Will try
>> to draft something.
>
> I've committed an implementation of this stuff to the trunk, tested on
> your testcase over the simulator. So far, it's ok.

I'll give up, you are too fast for me. ;)

> The only thing that should change downstream compared to the previous
> behaviour is that xnsynch_sleep_on() might unblock immediately at skin
> level without any xnsynch_wakeup_sleeper() calls being previously
> invoked, since the blocking call does the stealing during the pending
> ownership window.
>
> This means that skins now _must_ fix their internal state when unblocked
> from xnsynch_sleep_on() if they happen to track their own resource owner
> field for instance, since they might become the owner of such resource
> without any unlock/release/whatever routine being called at the said
> skin level. I've fixed a couple of skins for that purpose (not checked
> RTDM btw), but it would be safer if you could double-check the impact of
> such change on the interfaces you've crafted.

Well, if this means that once you have called xnsynch_wakeup_sleeper()
for some lower-prio task, you must call xnsynch_sleep_on() to steal it
for a higher-prio task, then RTDM needs fixing (it only sets a private
lock bit in this case).

> This change only affects PIP-enabled synchronization objects in a
> reasonably limited manner and seems to behave properly, but please, give
> this code hell on your side too.

Will do.

Jan

PS: The usage can be checked also via the cross reference:
http://www.rts.uni-hannover.de/xenomai/lxr/ident?v=SVN-trunk;i=XNSYNCH_PIP
Re: [Xenomai-core] Pending ownership and resource stealing
Philippe Gerum wrote:
> This means that skins now _must_ fix their internal state when unblocked
> from xnsynch_sleep_on() if they happen to track their own resource owner
> field for instance, since they might become the owner of such resource
> without any unlock/release/whatever routine being called at the said
> skin level. I've fixed a couple of skins for that purpose (not checked
> RTDM btw), but it would be safer if you could double-check the impact of
> such change on the interfaces you've crafted.

When waking up, how do we know that we are in the problematic situation?
Can we do the housekeeping in the callback registered with
xnsynch_register_cleanup, or are you talking of a different situation?

--
Gilles Chanteperdrix.