Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors

2010-03-01 Thread Jan Kiszka
Jan Kiszka wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> GIT version control wrote:
 diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c
 index 11f47c0..a896031 100644
 --- a/ksrc/skins/posix/mq.c
 +++ b/ksrc/skins/posix/mq.c
 @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct xnselector *selector,
return 0;
  
unlock_and_error:
 +  xnfree(binding):
xnlock_put_irqrestore(&nklock, s);
return err;
  }
>>> Ok. Will pull. But I really need to fix that.
>>>
>> Ack - now that I see it myself.
>>
> 
> I fixed this in my branch and added another patch to transform EIDRM
> into EBADF when selecting a (half-)deleted RTDM device. Please merge.
> 

Wait! When the sync object behind some file descriptor is deleted but
the descriptor itself is still existing, we rather have to return that
fd signaled from select() instead of letting the call fail. I need to
look into this again.

Jan



___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors

2010-03-01 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
> Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
 GIT version control wrote:
> diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c
> index 11f47c0..a896031 100644
> --- a/ksrc/skins/posix/mq.c
> +++ b/ksrc/skins/posix/mq.c
> @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct 
> xnselector *selector,
>   return 0;
>  
>unlock_and_error:
> + xnfree(binding):
>   xnlock_put_irqrestore(&nklock, s);
>   return err;
>  }
 Ok. Will pull. But I really need to fix that.

>>> Ack - now that I see it myself.
>>>
>> I fixed this in my branch and added another patch to transform EIDRM
>> into EBADF when selecting a (half-)deleted RTDM device. Please merge.
>>
> 
> Wait! When the sync object behind some file descriptor is deleted but
> the descriptor itself is still existing, we rather have to return that
> fd signaled from select() instead of letting the call fail. I need to
> look into this again.

It looks to me like a transitory state; we can wait for the sync object
to be deleted to have the fd destructor signaled. It should not be long.

> 
> Jan
> 


-- 
Gilles.



Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors

2010-03-01 Thread Jan Kiszka
Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Jan Kiszka wrote:
 Gilles Chanteperdrix wrote:
> GIT version control wrote:
>> diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c
>> index 11f47c0..a896031 100644
>> --- a/ksrc/skins/posix/mq.c
>> +++ b/ksrc/skins/posix/mq.c
>> @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct 
>> xnselector *selector,
>>  return 0;
>>  
>>unlock_and_error:
>> +xnfree(binding):
>>  xnlock_put_irqrestore(&nklock, s);
>>  return err;
>>  }
> Ok. Will pull. But I really need to fix that.
>
 Ack - now that I see it myself.

>>> I fixed this in my branch and added another patch to transform EIDRM
>>> into EBADF when selecting a (half-)deleted RTDM device. Please merge.
>>>
>> Wait! When the sync object behind some file descriptor is deleted but
>> the descriptor itself is still existing, we rather have to return that
>> fd signaled from select() instead of letting the call fail. I need to
>> look into this again.
> 
> It looks to me like a transitory state, we can wait for the sync object
> to be deleted to have the fd destructor signaled. It should not be long.

That's not an issue of waiting for this. See e.g. TCP: peer closes
connection -> internal sync objects will be destroyed (to make
read/write fail). But the fd will remain valid until the local side
closes it as well.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors

2010-03-01 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Jan Kiszka wrote:
 Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> GIT version control wrote:
>>> diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c
>>> index 11f47c0..a896031 100644
>>> --- a/ksrc/skins/posix/mq.c
>>> +++ b/ksrc/skins/posix/mq.c
>>> @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct 
>>> xnselector *selector,
>>> return 0;
>>>  
>>>unlock_and_error:
>>> +   xnfree(binding):
>>> xnlock_put_irqrestore(&nklock, s);
>>> return err;
>>>  }
>> Ok. Will pull. But I really need to fix that.
>>
> Ack - now that I see it myself.
>
 I fixed this in my branch and added another patch to transform EIDRM
 into EBADF when selecting a (half-)deleted RTDM device. Please merge.

>>> Wait! When the sync object behind some file descriptor is deleted but
>>> the descriptor itself is still existing, we rather have to return that
>>> fd signaled from select() instead of letting the call fail. I need to
>>> look into this again.
>> It looks to me like a transitory state, we can wait for the sync object
>> to be deleted to have the fd destructor signaled. It should not be long.
> 
> That's not an issue of waiting for this. See e.g. TCP: peer closes
> connection -> internal sync objects will be destroyed (to make
> read/write fail). But the fd will remain valid until the local side
> closes it as well.

It looks to me like this is going to complicate things a lot, and will
be a source of regressions. Why can we not have sync objects be
destroyed when the fd is really destroyed and use a status bit of some
kind to signal read/write that the fd was closed by peer?

> 
> Jan
> 


-- 
Gilles.



Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors

2010-03-01 Thread Jan Kiszka
Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
 Jan Kiszka wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> GIT version control wrote:
 diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c
 index 11f47c0..a896031 100644
 --- a/ksrc/skins/posix/mq.c
 +++ b/ksrc/skins/posix/mq.c
 @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct 
 xnselector *selector,
return 0;
  
unlock_and_error:
 +  xnfree(binding):
xnlock_put_irqrestore(&nklock, s);
return err;
  }
>>> Ok. Will pull. But I really need to fix that.
>>>
>> Ack - now that I see it myself.
>>
> I fixed this in my branch and added another patch to transform EIDRM
> into EBADF when selecting a (half-)deleted RTDM device. Please merge.
>
 Wait! When the sync object behind some file descriptor is deleted but
 the descriptor itself is still existing, we rather have to return that
 fd signaled from select() instead of letting the call fail. I need to
 look into this again.
>>> It looks to me like a transitory state, we can wait for the sync object
>>> to be deleted to have the fd destructor signaled. It should not be long.
>> That's not an issue of waiting for this. See e.g. TCP: peer closes
>> connection -> internal sync objects will be destroyed (to make
>> read/write fail). But the fd will remain valid until the local side
>> closes it as well.
> 
> It looks to me like this is going to complicate things a lot, and will
> be a source of regressions. Why can we not have sync objects be
> destroyed when the fd is really destroyed and use a status bit of some
> kind to signal read/write that the fd was closed by peer?

It is way easier and more consistent to unblock readers and writers via
destroying the sync object than to signal it and add tests for specific
states to detect that. Keep in mind that this pattern is in use even
without select support. Diverging from it just to add select awareness
to some driver would be a step back.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors

2010-03-01 Thread Jan Kiszka
Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
> Wait! When the sync object behind some file descriptor is deleted but
> the descriptor itself is still existing, we rather have to return that
> fd signaled from select() instead of letting the call fail. I need to
> look into this again.
 It looks to me like a transitory state, we can wait for the sync object
 to be deleted to have the fd destructor signaled. It should not be long.
>>> That's not an issue of waiting for this. See e.g. TCP: peer closes
>>> connection -> internal sync objects will be destroyed (to make
>>> read/write fail). But the fd will remain valid until the local side
>>> closes it as well.
>> It looks to me like this is going to complicate things a lot, and will
>> be a source of regressions. Why can we not have sync objects be
>> destroyed when the fd is really destroyed and use a status bit of some
>> kind to signal read/write that the fd was closed by peer?
> 
> It is way easier and more consistent to unblock reader and writers via
> destroying the sync object than to signal it and add tests for specific
> states to detect that. Keep in mind that this pattern is in use even
> without select support. Diverging from it just to add select awareness
> to some driver would be a step back.
> 

First draft of a fix that so far does what it is supposed to do,
comments welcome:

diff --git a/ksrc/skins/posix/syscall.c b/ksrc/skins/posix/syscall.c
index 959b61c..4e46cb6 100644
--- a/ksrc/skins/posix/syscall.c
+++ b/ksrc/skins/posix/syscall.c
@@ -2298,24 +2298,30 @@ static int select_bind_one(struct xnselector *selector, unsigned type, int fd)
 }
 
 static int select_bind_all(struct xnselector *selector,
-  fd_set *fds[XNSELECT_MAX_TYPES], int nfds)
+  fd_set *in_fds[XNSELECT_MAX_TYPES],
+  fd_set *out_fds[XNSELECT_MAX_TYPES], int nfds)
 {
unsigned fd, type;
+   int pending = 0;
int err;
 
for (type = 0; type < XNSELECT_MAX_TYPES; type++) {
-   fd_set *set = fds[type];
+   fd_set *set = in_fds[type];
if (set)
for (fd = find_first_bit(set->fds_bits, nfds);
 fd < nfds;
 fd = find_next_bit(set->fds_bits, nfds, fd + 1)) {
err = select_bind_one(selector, type, fd);
-   if (err)
-   return err;
+   if (err) {
+   if (err != -EIDRM)
+   return err;
+   __FD_SET(fd, out_fds[type]);
+   pending++;
+   }
}
}
 
-   return 0;
+   return pending;
 }
 
 /* int select(int, fd_set *, fd_set *, fd_set *, struct timeval *) */
@@ -2387,7 +2393,7 @@ static int __select(struct pt_regs *regs)
 
/* Bind directly the file descriptors, we do not need to go
   through xnselect returning -ECHRNG */
-   if ((err = select_bind_all(selector, in_fds, nfds)))
+   if ((err = select_bind_all(selector, in_fds, out_fds, nfds)))
return err;
}
 
@@ -2395,7 +2401,8 @@ static int __select(struct pt_regs *regs)
err = xnselect(selector, out_fds, in_fds, nfds, timeout, mode);
 
if (err == -ECHRNG) {
-   int err = select_bind_all(selector, out_fds, nfds);
+   int err = select_bind_all(selector, out_fds, out_fds,
+ nfds);
if (err)
return err;
}



Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors

2010-03-01 Thread Jan Kiszka
Jan Kiszka wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
 Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Wait! When the sync object behind some file descriptor is deleted but
>> the descriptor itself is still existing, we rather have to return that
>> fd signaled from select() instead of letting the call fail. I need to
>> look into this again.
> It looks to me like a transitory state, we can wait for the sync object
> to be deleted to have the fd destructor signaled. It should not be long.
 That's not an issue of waiting for this. See e.g. TCP: peer closes
 connection -> internal sync objects will be destroyed (to make
 read/write fail). But the fd will remain valid until the local side
 closes it as well.
>>> It looks to me like this is going to complicate things a lot, and will
>>> be a source of regressions. Why can we not have sync objects be
>>> destroyed when the fd is really destroyed and use a status bit of some
>>> kind to signal read/write that the fd was closed by peer?
>> It is way easier and more consistent to unblock reader and writers via
>> destroying the sync object than to signal it and add tests for specific
>> states to detect that. Keep in mind that this pattern is in use even
>> without select support. Diverging from it just to add select awareness
>> to some driver would be a step back.
>>
> 
> First draft of a fix that so far does what it is supposed to do,
> comments welcome:
> 
> diff --git a/ksrc/skins/posix/syscall.c b/ksrc/skins/posix/syscall.c
> index 959b61c..4e46cb6 100644
> --- a/ksrc/skins/posix/syscall.c
> +++ b/ksrc/skins/posix/syscall.c
> @@ -2298,24 +2298,30 @@ static int select_bind_one(struct xnselector 
> *selector, unsigned type, int fd)
>  }
>  
>  static int select_bind_all(struct xnselector *selector,
> -fd_set *fds[XNSELECT_MAX_TYPES], int nfds)
> +fd_set *in_fds[XNSELECT_MAX_TYPES],
> +fd_set *out_fds[XNSELECT_MAX_TYPES], int nfds)
>  {
>   unsigned fd, type;
> + int pending = 0;
>   int err;
>  
>   for (type = 0; type < XNSELECT_MAX_TYPES; type++) {
> - fd_set *set = fds[type];
> + fd_set *set = in_fds[type];
>   if (set)
>   for (fd = find_first_bit(set->fds_bits, nfds);
>fd < nfds;
>fd = find_next_bit(set->fds_bits, nfds, fd + 1)) {
>   err = select_bind_one(selector, type, fd);
> - if (err)
> - return err;
> + if (err) {
> + if (err != -EIDRM)
> + return err;
> + __FD_SET(fd, out_fds[type]);
> + pending++;
> + }
>   }
>   }
>  
> - return 0;
> + return pending;
>  }
>  
>  /* int select(int, fd_set *, fd_set *, fd_set *, struct timeval *) */
> @@ -2387,7 +2393,7 @@ static int __select(struct pt_regs *regs)
>  
>   /* Bind directly the file descriptors, we do not need to go
>  through xnselect returning -ECHRNG */
> - if ((err = select_bind_all(selector, in_fds, nfds)))
> + if ((err = select_bind_all(selector, in_fds, out_fds, nfds)))
>   return err;
>   }
>  
> @@ -2395,7 +2401,8 @@ static int __select(struct pt_regs *regs)
>   err = xnselect(selector, out_fds, in_fds, nfds, timeout, mode);
>  
>   if (err == -ECHRNG) {
> - int err = select_bind_all(selector, out_fds, nfds);
> + int err = select_bind_all(selector, out_fds, out_fds,
> +   nfds);

Mmh, this is probably not a good idea; I need to truly split the in (aka
"not yet bound") and out (aka "pending") sets here as well.

>   if (err)
>   return err;
>   }

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors

2010-03-01 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
> Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
 GIT version control wrote:
> diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c
> index 11f47c0..a896031 100644
> --- a/ksrc/skins/posix/mq.c
> +++ b/ksrc/skins/posix/mq.c
> @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct 
> xnselector *selector,
>   return 0;
>  
>unlock_and_error:
> + xnfree(binding):
>   xnlock_put_irqrestore(&nklock, s);
>   return err;
>  }
 Ok. Will pull. But I really need to fix that.

>>> Ack - now that I see it myself.
>>>
>> I fixed this in my branch and added another patch to transform EIDRM
>> into EBADF when selecting a (half-)deleted RTDM device. Please merge.
>>
> Wait! When the sync object behind some file descriptor is deleted but
> the descriptor itself is still existing, we rather have to return that
> fd signaled from select() instead of letting the call fail. I need to
> look into this again.
 It looks to me like a transitory state, we can wait for the sync object
 to be deleted to have the fd destructor signaled. It should not be long.
>>> That's not an issue of waiting for this. See e.g. TCP: peer closes
>>> connection -> internal sync objects will be destroyed (to make
>>> read/write fail). But the fd will remain valid until the local side
>>> closes it as well.
>> It looks to me like this is going to complicate things a lot, and will
>> be a source of regressions. Why can we not have sync objects be
>> destroyed when the fd is really destroyed and use a status bit of some
>> kind to signal read/write that the fd was closed by peer?
> 
> It is way easier and more consistent to unblock reader and writers via
> destroying the sync object than to signal it and add tests for specific
> states to detect that. Keep in mind that this pattern is in use even
> without select support. Diverging from it just to add select awareness
> to some driver would be a step back.

Ok. As you wish. But in that case, please provide a unit test case which
we can run on all architectures to validate your modifications. We will
not put this test case in the xenomai repository but will create a
repository for test cases later on.

-- 
Gilles.



Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors

2010-03-01 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
> Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
 GIT version control wrote:
> diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c
> index 11f47c0..a896031 100644
> --- a/ksrc/skins/posix/mq.c
> +++ b/ksrc/skins/posix/mq.c
> @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct 
> xnselector *selector,
>   return 0;
>  
>unlock_and_error:
> + xnfree(binding):
>   xnlock_put_irqrestore(&nklock, s);
>   return err;
>  }
 Ok. Will pull. But I really need to fix that.

>>> Ack - now that I see it myself.
>>>
>> I fixed this in my branch and added another patch to transform EIDRM
>> into EBADF when selecting a (half-)deleted RTDM device. Please merge.
>>
> Wait! When the sync object behind some file descriptor is deleted but
> the descriptor itself is still existing, we rather have to return that
> fd signaled from select() instead of letting the call fail. I need to
> look into this again.
 It looks to me like a transitory state, we can wait for the sync object
 to be deleted to have the fd destructor signaled. It should not be long.
>>> That's not an issue of waiting for this. See e.g. TCP: peer closes
>>> connection -> internal sync objects will be destroyed (to make
>>> read/write fail). But the fd will remain valid until the local side
>>> closes it as well.
>> It looks to me like this is going to complicate things a lot, and will
>> be a source of regressions. Why can we not have sync objects be
>> destroyed when the fd is really destroyed and use a status bit of some
>> kind to signal read/write that the fd was closed by peer?
> 
> It is way easier and more consistent to unblock reader and writers via
> destroying the sync object than to signal it and add tests for specific
> states to detect that.

From my point of view, having an object with partially destroyed parts
is asking for trouble.

-- 
Gilles.



Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors

2010-03-01 Thread Jan Kiszka
Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
 Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Jan Kiszka wrote:
 Gilles Chanteperdrix wrote:
> GIT version control wrote:
>> diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c
>> index 11f47c0..a896031 100644
>> --- a/ksrc/skins/posix/mq.c
>> +++ b/ksrc/skins/posix/mq.c
>> @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct 
>> xnselector *selector,
>>  return 0;
>>  
>>unlock_and_error:
>> +xnfree(binding):
>>  xnlock_put_irqrestore(&nklock, s);
>>  return err;
>>  }
> Ok. Will pull. But I really need to fix that.
>
 Ack - now that I see it myself.

>>> I fixed this in my branch and added another patch to transform EIDRM
>>> into EBADF when selecting a (half-)deleted RTDM device. Please merge.
>>>
>> Wait! When the sync object behind some file descriptor is deleted but
>> the descriptor itself is still existing, we rather have to return that
>> fd signaled from select() instead of letting the call fail. I need to
>> look into this again.
> It looks to me like a transitory state, we can wait for the sync object
> to be deleted to have the fd destructor signaled. It should not be long.
 That's not an issue of waiting for this. See e.g. TCP: peer closes
 connection -> internal sync objects will be destroyed (to make
 read/write fail). But the fd will remain valid until the local side
 closes it as well.
>>> It looks to me like this is going to complicate things a lot, and will
>>> be a source of regressions. Why can we not have sync objects be
>>> destroyed when the fd is really destroyed and use a status bit of some
>>> kind to signal read/write that the fd was closed by peer?
>> It is way easier and more consistent to unblock reader and writers via
>> destroying the sync object than to signal it and add tests for specific
>> states to detect that. Keep in mind that this pattern is in use even
>> without select support. Diverging from it just to add select awareness
>> to some driver would be a step back.
> 
> Ok. As you wish. But in that case, please provide a unit test case which
> we can run on all architectures to validate your modifications. We will
> not put this test case in xenomai repository but will create a
> repository for test cases later on.

As it requires quite some infrastructure to get there (or am I missing
some preexisting select unit test?), I can just point to RTnet + TCP for
now: run rtnet/examples/xenomai/posix/rttcp-server + client over rtlo
and terminate the client prematurely. This does not happen if you run
the very same test without RTnet loaded (i.e. against Linux' TCP).

Just pushed the corresponding fix.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] Yet another ((weak)) bug

2010-03-01 Thread Jan Kiszka
Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Stefan Kisdaroczi wrote:
 Am 12.02.2010 17:22, schrieb Jan Kiszka:
> libxnskin or so?
>
> Jan
>
 libxenomai ?
>>> Ok. Let's go for libxenomai. I will try and do that in the next few days.
>>>
>> Already had a chance to look into this?
> 
> No. I was off a few days. I just looked at it in the train, but could
> not test it yet.

If there is anything I can do to help accelerating it, please let me
know. We need to bake a new Xenomai package this week for our customer,
and at some point I need to decide between upstream and my temporary
fix. Upstream is generally preferred - if available.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] Yet another ((weak)) bug

2010-03-01 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
 Stefan Kisdaroczi wrote:
> Am 12.02.2010 17:22, schrieb Jan Kiszka:
>> libxnskin or so?
>>
>> Jan
>>
> libxenomai ?
 Ok. Let's go for libxenomai. I will try and do that in the next few days.

>>> Already had a chance to look into this?
>> No. I was off a few days. I just looked at it in the train, but could
>> not test it yet.
> 
> If there is anything I can do to help accelerating it, please let me
> know. We need to bake a new Xenomai package this week for our customer,
> and at some point I need to decide between upstream and my temporary
> fix. Upstream is generally preferred - if available.

Yes, I have been a bit delayed, but should handle this this week, so as
to be ready for a release next week-end.

> 
> Jan
> 


-- 
Gilles.



Re: [Xenomai-core] Yet another ((weak)) bug

2010-03-01 Thread Jan Kiszka
Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
 Gilles Chanteperdrix wrote:
> Stefan Kisdaroczi wrote:
>> Am 12.02.2010 17:22, schrieb Jan Kiszka:
>>> libxnskin or so?
>>>
>>> Jan
>>>
>> libxenomai ?
> Ok. Let's go for libxenomai. I will try and do that in the next few days.
>
 Already had a chance to look into this?
>>> No. I was off a few days. I just looked at it in the train, but could
>>> not test it yet.
>> If there is anything I can do to help accelerating it, please let me
>> know. We need to bake a new Xenomai package this week for our customer,
>> and at some point I need to decide between upstream and my temporary
>> fix. Upstream is generally preferred - if available.
> 
> Yes, I have been a bit delayed, but should handle this this week, so as
> to be ready for a release next week-end.

Anything in some private branch already?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors

2010-03-01 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Jan Kiszka wrote:
 Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> GIT version control wrote:
>>> diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c
>>> index 11f47c0..a896031 100644
>>> --- a/ksrc/skins/posix/mq.c
>>> +++ b/ksrc/skins/posix/mq.c
>>> @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct 
>>> xnselector *selector,
>>> return 0;
>>>  
>>>unlock_and_error:
>>> +   xnfree(binding):
>>> xnlock_put_irqrestore(&nklock, s);
>>> return err;
>>>  }
>> Ok. Will pull. But I really need to fix that.
>>
> Ack - now that I see it myself.
>
 I fixed this in my branch and added another patch to transform EIDRM
 into EBADF when selecting a (half-)deleted RTDM device. Please merge.

>>> Wait! When the sync object behind some file descriptor is deleted but
>>> the descriptor itself is still existing, we rather have to return that
>>> fd signaled from select() instead of letting the call fail. I need to
>>> look into this again.
>> It looks to me like a transitory state, we can wait for the sync object
>> to be deleted to have the fd destructor signaled. It should not be long.
> That's not an issue of waiting for this. See e.g. TCP: peer closes
> connection -> internal sync objects will be destroyed (to make
> read/write fail). But the fd will remain valid until the local side
> closes it as well.
 It looks to me like this is going to complicate things a lot, and will
 be a source of regressions. Why can we not have sync objects be
 destroyed when the fd is really destroyed and use a status bit of some
 kind to signal read/write that the fd was closed by peer?
>>> It is way easier and more consistent to unblock reader and writers via
>>> destroying the sync object than to signal it and add tests for specific
>>> states to detect that. Keep in mind that this pattern is in use even
>>> without select support. Diverging from it just to add select awareness
>>> to some driver would be a step back.
>> Ok. As you wish. But in that case, please provide a unit test case which
>> we can run on all architectures to validate your modifications. We will
>> not put this test case in xenomai repository but will create a
>> repository for test cases later on.
> 
> As it requires quite some infrastructure to get there (or do I miss some
> preexisting select unit test?), I can just point to RTnet + TCP for now:
> run rtnet/examples/xenomai/posix/rttcp-server + client over rtlo and
> terminate the client prematurely. This does not happen if you run the
> very same test without RTnet loaded (ie. against Linux' TCP).
> 
> Just pushed the corresponding fix.

I really do not understand what you are trying to do. What is the
problem exactly, and how do you fix it? You are reserving 384 more bytes
on the stack. What for?

-- 
Gilles.



Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors

2010-03-01 Thread Jan Kiszka
Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
 Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
 Jan Kiszka wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> GIT version control wrote:
 diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c
 index 11f47c0..a896031 100644
 --- a/ksrc/skins/posix/mq.c
 +++ b/ksrc/skins/posix/mq.c
 @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct 
 xnselector *selector,
return 0;
  
unlock_and_error:
 +  xnfree(binding):
xnlock_put_irqrestore(&nklock, s);
return err;
  }
>>> Ok. Will pull. But I really need to fix that.
>>>
>> Ack - now that I see it myself.
>>
> I fixed this in my branch and added another patch to transform EIDRM
> into EBADF when selecting a (half-)deleted RTDM device. Please merge.
>
 Wait! When the sync object behind some file descriptor is deleted but
 the descriptor itself is still existing, we rather have to return that
 fd signaled from select() instead of letting the call fail. I need to
 look into this again.
>>> It looks to me like a transitory state, we can wait for the sync object
>>> to be deleted to have the fd destructor signaled. It should not be long.
>> That's not an issue of waiting for this. See e.g. TCP: peer closes
>> connection -> internal sync objects will be destroyed (to make
>> read/write fail). But the fd will remain valid until the local side
>> closes it as well.
> It looks to me like this is going to complicate things a lot, and will
> be a source of regressions. Why can we not have sync objects be
> destroyed when the fd is really destroyed and use a status bit of some
> kind to signal read/write that the fd was closed by peer?
 It is way easier and more consistent to unblock reader and writers via
 destroying the sync object than to signal it and add tests for specific
 states to detect that. Keep in mind that this pattern is in use even
 without select support. Diverging from it just to add select awareness
 to some driver would be a step back.
>>> Ok. As you wish. But in that case, please provide a unit test case which
>>> we can run on all architectures to validate your modifications. We will
>>> not put this test case in xenomai repository but will create a
>>> repository for test cases later on.
>> As it requires quite some infrastructure to get there (or do I miss some
>> preexisting select unit test?), I can just point to RTnet + TCP for now:
>> run rtnet/examples/xenomai/posix/rttcp-server + client over rtlo and
>> terminate the client prematurely. This does not happen if you run the
>> very same test without RTnet loaded (ie. against Linux' TCP).
>>
>> Just pushed the corresponding fix.
> 
> I really do not understand what you are trying to do. What is the
> problem exactly, and how do you fix it? You are reserving 384 more bytes
> on the stack. What for?
> 

"We already return an fd as pending if the corresponding xnsynch object
is deleted while waiting on it. Extend select to do the same if the
object is already marked deleted on binding.

This fixes the return code of select on half-closed RTnet TCP sockets from
-EIDRM to > 0 (for pending fds)."

HTH.

Regarding the additional stack requirements - yes, not nice. Maybe we
can avoid this by clearing the in_fds in select_bind_all while
processing it unless an fd is marked deleted. Would also avoid passing a
separate fds to it.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors

2010-03-01 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
> Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
 GIT version control wrote:
> diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c
> index 11f47c0..a896031 100644
> --- a/ksrc/skins/posix/mq.c
> +++ b/ksrc/skins/posix/mq.c
> @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct 
> xnselector *selector,
>   return 0;
>  
>unlock_and_error:
> + xnfree(binding):
>   xnlock_put_irqrestore(&nklock, s);
>   return err;
>  }
 Ok. Will pull. But I really need to fix that.

>>> Ack - now that I see it myself.
>>>
>> I fixed this in my branch and added another patch to transform EIDRM
>> into EBADF when selecting a (half-)deleted RTDM device. Please merge.
>>
> Wait! When the sync object behind some file descriptor is deleted but
> the descriptor itself is still existing, we rather have to return that
> fd signaled from select() instead of letting the call fail. I need to
> look into this again.
 It looks to me like a transitory state, we can wait for the sync object
 to be deleted to have the fd destructor signaled. It should not be 
 long.
>>> That's not an issue of waiting for this. See e.g. TCP: peer closes
>>> connection -> internal sync objects will be destroyed (to make
>>> read/write fail). But the fd will remain valid until the local side
>>> closes it as well.
>> It looks to me like this is going to complicate things a lot, and will
>> be a source of regressions. Why can we not have sync objects be
>> destroyed when the fd is really destroyed and use a status bit of some
>> kind to signal read/write that the fd was closed by peer?
> It is way easier and more consistent to unblock reader and writers via
> destroying the sync object than to signal it and add tests for specific
> states to detect that. Keep in mind that this pattern is in use even
> without select support. Diverging from it just to add select awareness
> to some driver would be a step back.
 Ok. As you wish. But in that case, please provide a unit test case which
 we can run on all architectures to validate your modifications. We will
 not put this test case in xenomai repository but will create a
 repository for test cases later on.
>>> As it requires quite some infrastructure to get there (or do I miss some
>>> preexisting select unit test?), I can just point to RTnet + TCP for now:
>>> run rtnet/examples/xenomai/posix/rttcp-server + client over rtlo and
>>> terminate the client prematurely. This does not happen if you run the
>>> very same test without RTnet loaded (ie. against Linux' TCP).
>>>
>>> Just pushed the corresponding fix.
>> I really do not understand what you are trying to do. What is the
>> problem exactly, and how do you fix it? You are reserving 384 more bytes
>> on the stack. What for?
>>
> 
> "We already return an fd as pending if the corresponding xnsynch object
> is deleted while waiting on it. Extend select to do the same if the
> object is already marked deleted on binding.
> 
> This fixes the return code of select on half-closed RTnet TCP sockets from
> -EIDRM to > 0 (for pending fds)."
> 
> HTH.
> 
> Regarding the additional stack requirements - yes, not nice. Maybe we
> can avoid this by clearing the in_fds in select_bind_all while
> processing it unless an fd is marked deleted. Would also avoid passing a
> separate fds to it.

I do not like calling FD_ZERO all the time either. Please take the time
to explain your implementation to me (I understood the reason, not the
implementation itself), and to think calmly about the implementation. I
am not going to merge anything before tonight anyway.

-- 
Gilles.



Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors

2010-03-01 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
> Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
 GIT version control wrote:
> diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c
> index 11f47c0..a896031 100644
> --- a/ksrc/skins/posix/mq.c
> +++ b/ksrc/skins/posix/mq.c
> @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct 
> xnselector *selector,
>   return 0;
>  
>unlock_and_error:
> + xnfree(binding):
>   xnlock_put_irqrestore(&nklock, s);
>   return err;
>  }
 Ok. Will pull. But I really need to fix that.

>>> Ack - now that I see it myself.
>>>
>> I fixed this in my branch and added another patch to transform EIDRM
>> into EBADF when selecting a (half-)deleted RTDM device. Please merge.
>>
> Wait! When the sync object behind some file descriptor is deleted but
> the descriptor itself is still existing, we rather have to return that
> fd signaled from select() instead of letting the call fail. I need to
> look into this again.
 It looks to me like a transitory state, we can wait for the sync object
 to be deleted to have the fd destructor signaled. It should not be 
 long.
>>> That's not an issue of waiting for this. See e.g. TCP: peer closes
>>> connection -> internal sync objects will be destroyed (to make
>>> read/write fail). But the fd will remain valid until the local side
>>> closes it as well.
>> It looks to me like this is going to complicate things a lot, and will
>> be a source of regressions. Why can we not have sync objects be
>> destroyed when the fd is really destroyed and use a status bit of some
>> kind to signal read/write that the fd was closed by peer?
> It is way easier and more consistent to unblock reader and writers via
> destroying the sync object than to signal it and add tests for specific
> states to detect that. Keep in mind that this pattern is in use even
> without select support. Diverging from it just to add select awareness
> to some driver would be a step back.
 Ok. As you wish. But in that case, please provide a unit test case which
 we can run on all architectures to validate your modifications. We will
 not put this test case in xenomai repository but will create a
 repository for test cases later on.
>>> As it requires quite some infrastructure to get there (or do I miss some
>>> preexisting select unit test?), I can just point to RTnet + TCP for now:
>>> run rtnet/examples/xenomai/posix/rttcp-server + client over rtlo and
>>> terminate the client prematurely. This does not happen if you run the
>>> very same test without RTnet loaded (ie. against Linux' TCP).
>>>
>>> Just pushed the corresponding fix.
>> I really do not understand what you are trying to do. What is the
>> problem exactly, and how do you fix it? You are reserving 384 more bytes
>> on the stack. What for?
>>
> 
> "We already return an fd as pending if the corresponding xnsynch object
> is deleted while waiting on it. Extend select to do the same if the
> object is already marked deleted on binding.
> 
> This fixes the return code of select on half-closed RTnet TCP sockets from
> -EIDRM to > 0 (for pending fds)."
> 
> HTH.
> 
> Regarding the additional stack requirements - yes, not nice. Maybe we
> can avoid this by clearing the in_fds in select_bind_all while
> processing it unless an fd is marked deleted. Would also avoid passing a
> separate fds to it.

To be more precise, it looks to me like each call to bind_one is
supposed to set the bit of the corresponding fd, so, if bind_one fails
because the fd is in the transitory state (the one which I do not like),
bind_all should simply set this bit to 1. And there does not seem to be
any need for additional parameters, additional space on stack, or
anything else.

-- 
Gilles.



Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors

2010-03-01 Thread Jan Kiszka
Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
 Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
 Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Jan Kiszka wrote:
 Gilles Chanteperdrix wrote:
> GIT version control wrote:
>> diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c
>> index 11f47c0..a896031 100644
>> --- a/ksrc/skins/posix/mq.c
>> +++ b/ksrc/skins/posix/mq.c
>> @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct 
>> xnselector *selector,
>>  return 0;
>>  
>>unlock_and_error:
>> +xnfree(binding):
>>  xnlock_put_irqrestore(&nklock, s);
>>  return err;
>>  }
> Ok. Will pull. But I really need to fix that.
>
 Ack - now that I see it myself.

>>> I fixed this in my branch and added another patch to transform EIDRM
>>> into EBADF when selecting a (half-)deleted RTDM device. Please 
>>> merge.
>>>
>> Wait! When the sync object behind some file descriptor is deleted but
>> the descriptor itself is still existing, we rather have to return 
>> that
>> fd signaled from select() instead of letting the call fail. I need to
>> look into this again.
> It looks to me like a transitory state, we can wait for the sync 
> object
> to be deleted to have the fd destructor signaled. It should not be 
> long.
 That's not an issue of waiting for this. See e.g. TCP: peer closes
 connection -> internal sync objects will be destroyed (to make
 read/write fail). But the fd will remain valid until the local side
 closes it as well.
>>> It looks to me like this is going to complicate things a lot, and will
>>> be a source of regressions. Why can we not have sync objects be
>>> destroyed when the fd is really destroyed and use a status bit of some
>>> kind to signal read/write that the fd was closed by peer?
>> It is way easier and more consistent to unblock reader and writers via
>> destroying the sync object than to signal it and add tests for specific
>> states to detect that. Keep in mind that this pattern is in use even
>> without select support. Diverging from it just to add select awareness
>> to some driver would be a step back.
> Ok. As you wish. But in that case, please provide a unit test case which
> we can run on all architectures to validate your modifications. We will
> not put this test case in xenomai repository but will create a
> repository for test cases later on.
 As it requires quite some infrastructure to get there (or do I miss some
 preexisting select unit test?), I can just point to RTnet + TCP for now:
 run rtnet/examples/xenomai/posix/rttcp-server + client over rtlo and
 terminate the client prematurely. This does not happen if you run the
 very same test without RTnet loaded (ie. against Linux' TCP).

 Just pushed the corresponding fix.
>>> I really do not understand what you are trying to do. What is the
>>> problem exactly, and how do you fix it? You are reserving 384 more bytes
>>> on the stack. What for?
>>>
>> "We already return an fd as pending if the corresponding xnsynch object
>> is deleted while waiting on it. Extend select to do the same if the
>> object is already marked deleted on binding.
>>
>> This fixes the return code of select on half-closed RTnet TCP sockets from
>> -EIDRM to > 0 (for pending fds)."
>>
>> HTH.
>>
>> Regarding the additional stack requirements - yes, not nice. Maybe we
>> can avoid this by clearing the in_fds in select_bind_all while
>> processing it unless an fd is marked deleted. Would also avoid passing a
>> separate fds to it.
> 
> To be more precise, it looks to me like each call to bind_one is
> supposed to set the bit of the corresponding fd, so, if bind_one fails
> because the fd is in the transitory state (the one which I do not like),
> bind_all should simply set this bit to 1. And there does not seem to be
> any need for additional parameters, additional space on stack, or
> anything else.

Right, the trick is likely to properly maintain the output fds of
bind_all in that not only pending fds are set, but others are cleared -
avoids the third bitmap. Still playing with such an approach.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors

2010-03-01 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Jan Kiszka wrote:
 Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> GIT version control wrote:
>>> diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c
>>> index 11f47c0..a896031 100644
>>> --- a/ksrc/skins/posix/mq.c
>>> +++ b/ksrc/skins/posix/mq.c
>>> @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct 
>>> xnselector *selector,
>>> return 0;
>>>  
>>>unlock_and_error:
>>> +   xnfree(binding):
>>> xnlock_put_irqrestore(&nklock, s);
>>> return err;
>>>  }
>> Ok. Will pull. But I really need to fix that.
>>
> Ack - now that I see it myself.
>
 I fixed this in my branch and added another patch to transform 
 EIDRM
 into EBADF when selecting a (half-)deleted RTDM device. Please 
 merge.

>>> Wait! When the sync object behind some file descriptor is deleted 
>>> but
>>> the descriptor itself is still existing, we rather have to return 
>>> that
>>> fd signaled from select() instead of letting the call fail. I need 
>>> to
>>> look into this again.
>> It looks to me like a transitory state, we can wait for the sync 
>> object
>> to be deleted to have the fd destructor signaled. It should not be 
>> long.
> That's not an issue of waiting for this. See e.g. TCP: peer closes
> connection -> internal sync objects will be destroyed (to make
> read/write fail). But the fd will remain valid until the local side
> closes it as well.
 It looks to me like this is going to complicate things a lot, and will
 be a source of regressions. Why can we not have sync objects be
 destroyed when the fd is really destroyed and use a status bit of some
 kind to signal read/write that the fd was closed by peer?
>>> It is way easier and more consistent to unblock reader and writers via
>>> destroying the sync object than to signal it and add tests for specific
>>> states to detect that. Keep in mind that this pattern is in use even
>>> without select support. Diverging from it just to add select awareness
>>> to some driver would be a step back.
>> Ok. As you wish. But in that case, please provide a unit test case which
>> we can run on all architectures to validate your modifications. We will
>> not put this test case in xenomai repository but will create a
>> repository for test cases later on.
> As it requires quite some infrastructure to get there (or do I miss some
> preexisting select unit test?), I can just point to RTnet + TCP for now:
> run rtnet/examples/xenomai/posix/rttcp-server + client over rtlo and
> terminate the client prematurely. This does not happen if you run the
> very same test without RTnet loaded (ie. against Linux' TCP).
>
> Just pushed the corresponding fix.
 I really do not understand what you are trying to do. What is the
 problem exactly, and how do you fix it? You are reserving 384 more bytes
 on the stack. What for?

>>> "We already return an fd as pending if the corresponding xnsynch object
>>> is deleted while waiting on it. Extend select to do the same if the
>>> object is already marked deleted on binding.
>>>
>>> This fixes the return code of select on half-closed RTnet TCP sockets from
>>> -EIDRM to > 0 (for pending fds)."
>>>
>>> HTH.
>>>
>>> Regarding the additional stack requirements - yes, not nice. Maybe we
>>> can avoid this by clearing the in_fds in select_bind_all while
>>> processing it unless an fd is marked deleted. Would also avoid passing a
>>> separate fds to it.
>> To be more precise, it looks to me like each call to bind_one is
>> supposed to set the bit of the corresponding fd, so, if bind_one fails
>> because the fd is in the transitory state (the one which I do not like),
>> bind_all should simply set this bit to 1. And there does not seem to be
>> any need for additional parameters, additional space on stack, or
>> anything else.
> 
> Right, the trick is likely to properly maintain the output fds of
> bind_all in that not only pending fds are set, but others are cleared -
> avoids the third bitmap. Still playing with such an approach.

xnselect_bind, called by bind_one should take care of setting/clearing
individual bits, at least, it is the way it currently works.

-- 
Gilles.
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors

2010-03-01 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Jan Kiszka wrote:
 Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> GIT version control wrote:
>>> diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c
>>> index 11f47c0..a896031 100644
>>> --- a/ksrc/skins/posix/mq.c
>>> +++ b/ksrc/skins/posix/mq.c
>>> @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct 
>>> xnselector *selector,
>>> return 0;
>>>  
>>>unlock_and_error:
>>> +   xnfree(binding):
>>> xnlock_put_irqrestore(&nklock, s);
>>> return err;
>>>  }
>> Ok. Will pull. But I really need to fix that.
>>
> Ack - now that I see it myself.
>
 I fixed this in my branch and added another patch to transform 
 EIDRM
 into EBADF when selecting a (half-)deleted RTDM device. Please 
 merge.

>>> Wait! When the sync object behind some file descriptor is deleted 
>>> but
>>> the descriptor itself is still existing, we rather have to return 
>>> that
>>> fd signaled from select() instead of letting the call fail. I need 
>>> to
>>> look into this again.
>> It looks to me like a transitory state, we can wait for the sync 
>> object
>> to be deleted to have the fd destructor signaled. It should not be 
>> long.
> That's not an issue of waiting for this. See e.g. TCP: peer closes
> connection -> internal sync objects will be destroyed (to make
> read/write fail). But the fd will remain valid until the local side
> closes it as well.
 It looks to me like this is going to complicate things a lot, and will
 be a source of regressions. Why can we not have sync objects be
 destroyed when the fd is really destroyed and use a status bit of some
 kind to signal read/write that the fd was closed by peer?
>>> It is way easier and more consistent to unblock reader and writers via
>>> destroying the sync object than to signal it and add tests for specific
>>> states to detect that. Keep in mind that this pattern is in use even
>>> without select support. Diverging from it just to add select awareness
>>> to some driver would be a step back.
>> Ok. As you wish. But in that case, please provide a unit test case which
>> we can run on all architectures to validate your modifications. We will
>> not put this test case in xenomai repository but will create a
>> repository for test cases later on.
> As it requires quite some infrastructure to get there (or do I miss some
> preexisting select unit test?), I can just point to RTnet + TCP for now:
> run rtnet/examples/xenomai/posix/rttcp-server + client over rtlo and
> terminate the client prematurely. This does not happen if you run the
> very same test without RTnet loaded (ie. against Linux' TCP).
>
> Just pushed the corresponding fix.
 I really do not understand what you are trying to do. What is the
 problem exactly, and how do you fix it? You are reserving 384 more bytes
 on the stack. What for?

>>> "We already return an fd as pending if the corresponding xnsynch object
>>> is deleted while waiting on it. Extend select to do the same if the
>>> object is already marked deleted on binding.
>>>
>>> This fixes the return code of select on half-closed RTnet TCP sockets from
>>> -EIDRM to > 0 (for pending fds)."
>>>
>>> HTH.
>>>
>>> Regarding the additional stack requirements - yes, not nice. Maybe we
>>> can avoid this by clearing the in_fds in select_bind_all while
>>> processing it unless an fd is marked deleted. Would also avoid passing a
>>> separate fds to it.
>> To be more precise, it looks to me like each call to bind_one is
>> supposed to set the bit of the corresponding fd, so, if bind_one fails
>> because the fd is in the transitory state (the one which I do not like),
>> bind_all should simply set this bit to 1. And there does not seem to be
>> any need for additional parameters, additional space on stack, or
>> anything else.
> 
> Right, the trick is likely to properly maintain the output fds of
> bind_all in that not only pending fds are set, but others are cleared -
> avoids the third bitmap. Still playing with such an approach.

Another precision, xnselect_bind accesses directly the "pending" set of
the xnselector structure, so, you have the right to do that for the fds
which failed binding but which you want to mark as pending.

Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors

2010-03-01 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
> Right, the trick is likely to properly maintain the output fds of
> bind_all in that not only pending fds are set, but others are cleared -
> avoids the third bitmap. Still playing with such an approach.

What we are trying to do, in a nutshell, is to create a notion of
stateless binding between a file descriptor and an xnselector. If we
really want to do this, we will have to extend the xnselect interface.
But this looks really wrong to me.

It seems possible to keep the bindings around as long as the file
descriptor exists. The binding is here for this reason, to express the
link between the file descriptor and the thread which runs select, and
its lifecycle should be the same as the file descriptor. And if you want
to signal that such a file descriptor is ready for reading/writing, even
if it is closed, you should be using xnselect_signal. The interface is
there, it is meant for that.

So, the more I think about it, the more I think that we are going in the
wrong direction.

-- 
Gilles.



Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors

2010-03-01 Thread Jan Kiszka
Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> To be more precise, it looks to me like each call to bind_one is
>>> supposed to set the bit of the corresponding fd, so, if bind_one fails
>>> because the fd is in the transitory state (the one which I do not like),
>>> bind_all should simply set this bit to 1. And there does not seem to be
>>> any need for additional parameters, additional space on stack, or
>>> anything else.
>> Right, the trick is likely to properly maintain the output fds of
>> bind_all in that not only pending fds are set, but others are cleared -
>> avoids the third bitmap. Still playing with such an approach.
> 
> Another precision, xnselect_bind accesses directly the "pending" set of
> the xnselector structure, so, you have the right to do that for the fds
> which failed binding but which you want to mark as pending.

Right, /me was blind, the issue can trivially be fixed inside RTDM by
binding as pending even if the object is marked deleted. I'm now just
cleaning up some other RTDM bits at this chance, pull request will follow.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] Xenomai in Debian

2010-03-01 Thread Gilles Chanteperdrix
Roland Stigge wrote:
> Hi Gilles,
> 
> first - I'm sorry if you sometimes feel offended by my work on Xenomai 
> in Debian. I understand that you are very much connected to your project 
> and want to have it working perfectly everywhere.
> 
> Unfortunately, my time to work on this is limited and the last uploads 
> were work in progress - to provide latest Xenomai in Debian. Further 
> work on it was planned for this weekend.
> 
> But please also understand that Debian developers will possibly 
> prioritize work on upstream packages where they feel their work is 
> appreciated. So please think about your tone before sending email and 
> driving people away from Xenomai.

Ok. Please accept my sincere apologies. I understand that I am in no
position to ask anything the way I did it.

Trying to be more positive, would it help you, or anybody else, if we
created a xenomai-announce mailing list, where we would only publish
releases and release-related information?

-- 
Gilles.



[Xenomai-core] [pull request] RTDM fixes and cleanups

2010-03-01 Thread Jan Kiszka
The following changes since commit 95412b8c10ba7ba199089a3cbb60c67c8d718186:
  Gilles Chanteperdrix (1):
timeconv: revert 97b78d45bf4c5571fccd9675fae8bb008a341769

are available in the git repository at:

  git://git.xenomai.org/xenomai-jki.git for-upstream

Jan Kiszka (5):
  RTDM+POSIX: Avoid leaking binding objects on errors
  RTDM: Bind deleted sem/event objects, but mark them pending
  RTDM: Consistently use xnsynch_test_flags to test for RTDM_SYNCH_DELETED
  RTDM: Avoid calling cleanup_instance with held spin lock
  RTDM: Add smp barrier to rtdm_context_unlock

 include/rtdm/rtdm_driver.h |1 +
 ksrc/skins/posix/mq.c  |1 +
 ksrc/skins/rtdm/core.c |   45 ++-
 ksrc/skins/rtdm/drvlib.c   |   40 --
 4 files changed, 46 insertions(+), 41 deletions(-)



Re: [Xenomai-core] Xenomai in Debian

2010-03-01 Thread Roland Stigge
Hi Gilles,

thanks for your mail!

And don't worry - there's an updated Xenomai version in Debian sid now
thanks to everyone from the Xenomai dev list. Sven put quite some effort
into syncing the two projects. If there are any problems, feel free to
"bug" me. ;-)

Regarding the xenomai-announce list, this sounds like a good idea since
many projects have it, including Debian. For me personally, my work
largely depends on free resources and the urgency of the Debian release
schedule. Btw: the Debian watch system automatically generates an email
when xenomai gets a new release. :-) So there is no hurry here either.

See you,

bye,
  Roland




Re: [Xenomai-core] Yet another ((weak)) bug

2010-03-01 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Stefan Kisdaroczi wrote:
>>> Am 12.02.2010 17:22, schrieb Jan Kiszka:
 libxnskin or so?

 Jan

>>> libxenomai ?
>> Ok. Let's go for libxenomai. I will try and do that in the next few days.
>>
> Already had a chance to look into this?
 No. I was off a few days. I just looked at it in the train, but could
 not test it yet.
>>> If there is anything I can do to help accelerating it, please let me
>>> know. We need to bake a new Xenomai package this week for our customer,
>>> and at some point I need to decide between upstream and my temporary
>>> fix. Upstream is generally preferred - if available.
>> Yes, I have been a bit delayed, but should handle this this week, so as
>> to be ready for a release next week-end.
> 
> Anything in some private branch already?

Ok. Everything pushed. It works on ARM and x86. I have not tested it in the
multilib and/or dlopen case, so please test it if you can.

> 
> Jan
> 


-- 
Gilles.
