Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors
Jan Kiszka wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> GIT version control wrote:
>>>> diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c
>>>> index 11f47c0..a896031 100644
>>>> --- a/ksrc/skins/posix/mq.c
>>>> +++ b/ksrc/skins/posix/mq.c
>>>> @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct xnselector *selector,
>>>> 	return 0;
>>>>
>>>>   unlock_and_error:
>>>> +	xnfree(binding);
>>>> 	xnlock_put_irqrestore(&nklock, s);
>>>> 	return err;
>>>> }
>>> Ok. Will pull. But I really need to fix that.
>>>
>> Ack - now that I see it myself.
>>
> I fixed this in my branch and added another patch to transform EIDRM
> into EBADF when selecting a (half-)deleted RTDM device. Please merge.
>

Wait! When the sync object behind some file descriptor is deleted but
the descriptor itself still exists, we rather have to return that fd
signaled from select() instead of letting the call fail. I need to look
into this again.

Jan

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors
Jan Kiszka wrote:
> Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
>>>> GIT version control wrote:
>>>>> diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c
>>>>> index 11f47c0..a896031 100644
>>>>> --- a/ksrc/skins/posix/mq.c
>>>>> +++ b/ksrc/skins/posix/mq.c
>>>>> @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct xnselector *selector,
>>>>> 	return 0;
>>>>>
>>>>>   unlock_and_error:
>>>>> +	xnfree(binding);
>>>>> 	xnlock_put_irqrestore(&nklock, s);
>>>>> 	return err;
>>>>> }
>>>> Ok. Will pull. But I really need to fix that.
>>>>
>>> Ack - now that I see it myself.
>>>
>> I fixed this in my branch and added another patch to transform EIDRM
>> into EBADF when selecting a (half-)deleted RTDM device. Please merge.
>>
> Wait! When the sync object behind some file descriptor is deleted but
> the descriptor itself still exists, we rather have to return that fd
> signaled from select() instead of letting the call fail. I need to
> look into this again.

It looks to me like a transitory state, we can wait for the sync object
to be deleted to have the fd destructor signaled. It should not be long.

>
> Jan
>

--
Gilles.
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors
Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Jan Kiszka wrote:
>>>> Gilles Chanteperdrix wrote:
>>>>> GIT version control wrote:
>>>>>> diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c
>>>>>> index 11f47c0..a896031 100644
>>>>>> --- a/ksrc/skins/posix/mq.c
>>>>>> +++ b/ksrc/skins/posix/mq.c
>>>>>> @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct xnselector *selector,
>>>>>> 	return 0;
>>>>>>
>>>>>>   unlock_and_error:
>>>>>> +	xnfree(binding);
>>>>>> 	xnlock_put_irqrestore(&nklock, s);
>>>>>> 	return err;
>>>>>> }
>>>>> Ok. Will pull. But I really need to fix that.
>>>>>
>>>> Ack - now that I see it myself.
>>>>
>>> I fixed this in my branch and added another patch to transform EIDRM
>>> into EBADF when selecting a (half-)deleted RTDM device. Please merge.
>>>
>> Wait! When the sync object behind some file descriptor is deleted but
>> the descriptor itself still exists, we rather have to return that fd
>> signaled from select() instead of letting the call fail. I need to
>> look into this again.
>
> It looks to me like a transitory state, we can wait for the sync object
> to be deleted to have the fd destructor signaled. It should not be long.

That's not an issue of waiting for this. See e.g. TCP: peer closes
connection -> internal sync objects will be destroyed (to make
read/write fail). But the fd will remain valid until the local side
closes it as well.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors
Jan Kiszka wrote: > Gilles Chanteperdrix wrote: >> Jan Kiszka wrote: >>> Jan Kiszka wrote: Jan Kiszka wrote: > Gilles Chanteperdrix wrote: >> GIT version control wrote: >>> diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c >>> index 11f47c0..a896031 100644 >>> --- a/ksrc/skins/posix/mq.c >>> +++ b/ksrc/skins/posix/mq.c >>> @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct >>> xnselector *selector, >>> return 0; >>> >>>unlock_and_error: >>> + xnfree(binding): >>> xnlock_put_irqrestore(&nklock, s); >>> return err; >>> } >> Ok. Will pull. But I really need to fix that. >> > Ack - now that I see it myself. > I fixed this in my branch and added another patch to transform EIDRM into EBADF when selecting a (half-)deleted RTDM device. Please merge. >>> Wait! When the sync object behind some file descriptor is deleted but >>> the descriptor itself is still existing, we rather have to return that >>> fd signaled from select() instead of letting the call fail. I beed to >>> look into this again. >> It looks to me like a transitory state, we can wait for the sync object >> to be deleted to have the fd destructor signaled. It should not be long. > > That's not an issue of waiting for this. See e.g. TCP: peer closes > connection -> internal sync objects will be destroyed (to make > read/write fail). But the fd will remain valid until the local side > closes it as well. It looks to me like this is going to complicate things a lot, and will be a source of regressions. Why can we not have sync objects be destroyed when the fd is really destroyed and use a status bit of some kind to signal read/write that the fd was closed by peer? > > Jan > -- Gilles. ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors
Gilles Chanteperdrix wrote: > Jan Kiszka wrote: >> Gilles Chanteperdrix wrote: >>> Jan Kiszka wrote: Jan Kiszka wrote: > Jan Kiszka wrote: >> Gilles Chanteperdrix wrote: >>> GIT version control wrote: diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c index 11f47c0..a896031 100644 --- a/ksrc/skins/posix/mq.c +++ b/ksrc/skins/posix/mq.c @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct xnselector *selector, return 0; unlock_and_error: + xnfree(binding): xnlock_put_irqrestore(&nklock, s); return err; } >>> Ok. Will pull. But I really need to fix that. >>> >> Ack - now that I see it myself. >> > I fixed this in my branch and added another patch to transform EIDRM > into EBADF when selecting a (half-)deleted RTDM device. Please merge. > Wait! When the sync object behind some file descriptor is deleted but the descriptor itself is still existing, we rather have to return that fd signaled from select() instead of letting the call fail. I beed to look into this again. >>> It looks to me like a transitory state, we can wait for the sync object >>> to be deleted to have the fd destructor signaled. It should not be long. >> That's not an issue of waiting for this. See e.g. TCP: peer closes >> connection -> internal sync objects will be destroyed (to make >> read/write fail). But the fd will remain valid until the local side >> closes it as well. > > It looks to me like this is going to complicate things a lot, and will > be a source of regressions. Why can we not have sync objects be > destroyed when the fd is really destroyed and use a status bit of some > kind to signal read/write that the fd was closed by peer? It is way easier and more consistent to unblock reader and writers via destroying the sync object than to signal it and add tests for specific states to detect that. Keep in mind that this pattern is in use even without select support. Diverging from it just to add select awareness to some driver would be a step back. 
Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors
Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
>>>> Jan Kiszka wrote:
>>>>> Wait! When the sync object behind some file descriptor is deleted but
>>>>> the descriptor itself still exists, we rather have to return that fd
>>>>> signaled from select() instead of letting the call fail. I need to
>>>>> look into this again.
>>>> It looks to me like a transitory state, we can wait for the sync object
>>>> to be deleted to have the fd destructor signaled. It should not be long.
>>> That's not an issue of waiting for this. See e.g. TCP: peer closes
>>> connection -> internal sync objects will be destroyed (to make
>>> read/write fail). But the fd will remain valid until the local side
>>> closes it as well.
>> It looks to me like this is going to complicate things a lot, and will
>> be a source of regressions. Why can we not have sync objects be
>> destroyed when the fd is really destroyed and use a status bit of some
>> kind to signal read/write that the fd was closed by peer?
> It is way easier and more consistent to unblock readers and writers via
> destroying the sync object than to signal it and add tests for specific
> states to detect that. Keep in mind that this pattern is in use even
> without select support. Diverging from it just to add select awareness
> to some driver would be a step back.

First draft of a fix that so far does what it is supposed to do,
comments welcome:

diff --git a/ksrc/skins/posix/syscall.c b/ksrc/skins/posix/syscall.c
index 959b61c..4e46cb6 100644
--- a/ksrc/skins/posix/syscall.c
+++ b/ksrc/skins/posix/syscall.c
@@ -2298,24 +2298,30 @@ static int select_bind_one(struct xnselector *selector, unsigned type, int fd)
 }
 
 static int select_bind_all(struct xnselector *selector,
-			   fd_set *fds[XNSELECT_MAX_TYPES], int nfds)
+			   fd_set *in_fds[XNSELECT_MAX_TYPES],
+			   fd_set *out_fds[XNSELECT_MAX_TYPES], int nfds)
 {
 	unsigned fd, type;
+	int pending = 0;
 	int err;
 
 	for (type = 0; type < XNSELECT_MAX_TYPES; type++) {
-		fd_set *set = fds[type];
+		fd_set *set = in_fds[type];
 		if (set)
 			for (fd = find_first_bit(set->fds_bits, nfds);
 			     fd < nfds;
 			     fd = find_next_bit(set->fds_bits, nfds, fd + 1)) {
 				err = select_bind_one(selector, type, fd);
-				if (err)
-					return err;
+				if (err) {
+					if (err != -EIDRM)
+						return err;
+					__FD_SET(fd, out_fds[type]);
+					pending++;
+				}
 			}
 	}
 
-	return 0;
+	return pending;
 }
 
 /* int select(int, fd_set *, fd_set *, fd_set *, struct timeval *) */
@@ -2387,7 +2393,7 @@ static int __select(struct pt_regs *regs)
 
 		/* Bind directly the file descriptors, we do not need to go
 		   through xnselect returning -ECHRNG */
-		if ((err = select_bind_all(selector, in_fds, nfds)))
+		if ((err = select_bind_all(selector, in_fds, out_fds, nfds)))
 			return err;
 	}
 
@@ -2395,7 +2401,8 @@ static int __select(struct pt_regs *regs)
 	err = xnselect(selector, out_fds, in_fds, nfds, timeout, mode);
 
 	if (err == -ECHRNG) {
-		int err = select_bind_all(selector, out_fds, nfds);
+		int err = select_bind_all(selector, out_fds, out_fds,
+					  nfds);
 		if (err)
 			return err;
 	}
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors
Jan Kiszka wrote: > Jan Kiszka wrote: >> Gilles Chanteperdrix wrote: >>> Jan Kiszka wrote: Gilles Chanteperdrix wrote: > Jan Kiszka wrote: >> Wait! When the sync object behind some file descriptor is deleted but >> the descriptor itself is still existing, we rather have to return that >> fd signaled from select() instead of letting the call fail. I beed to >> look into this again. > It looks to me like a transitory state, we can wait for the sync object > to be deleted to have the fd destructor signaled. It should not be long. That's not an issue of waiting for this. See e.g. TCP: peer closes connection -> internal sync objects will be destroyed (to make read/write fail). But the fd will remain valid until the local side closes it as well. >>> It looks to me like this is going to complicate things a lot, and will >>> be a source of regressions. Why can we not have sync objects be >>> destroyed when the fd is really destroyed and use a status bit of some >>> kind to signal read/write that the fd was closed by peer? >> It is way easier and more consistent to unblock reader and writers via >> destroying the sync object than to signal it and add tests for specific >> states to detect that. Keep in mind that this pattern is in use even >> without select support. Diverging from it just to add select awareness >> to some driver would be a step back. 
>
> First draft of a fix that so far does what it is supposed to do,
> comments welcome:
>
> diff --git a/ksrc/skins/posix/syscall.c b/ksrc/skins/posix/syscall.c
> index 959b61c..4e46cb6 100644
> --- a/ksrc/skins/posix/syscall.c
> +++ b/ksrc/skins/posix/syscall.c
> @@ -2298,24 +2298,30 @@ static int select_bind_one(struct xnselector *selector, unsigned type, int fd)
>  }
> 
>  static int select_bind_all(struct xnselector *selector,
> -			   fd_set *fds[XNSELECT_MAX_TYPES], int nfds)
> +			   fd_set *in_fds[XNSELECT_MAX_TYPES],
> +			   fd_set *out_fds[XNSELECT_MAX_TYPES], int nfds)
>  {
>  	unsigned fd, type;
> +	int pending = 0;
>  	int err;
> 
>  	for (type = 0; type < XNSELECT_MAX_TYPES; type++) {
> -		fd_set *set = fds[type];
> +		fd_set *set = in_fds[type];
>  		if (set)
>  			for (fd = find_first_bit(set->fds_bits, nfds);
>  			     fd < nfds;
>  			     fd = find_next_bit(set->fds_bits, nfds, fd + 1)) {
>  				err = select_bind_one(selector, type, fd);
> -				if (err)
> -					return err;
> +				if (err) {
> +					if (err != -EIDRM)
> +						return err;
> +					__FD_SET(fd, out_fds[type]);
> +					pending++;
> +				}
>  			}
>  	}
> 
> -	return 0;
> +	return pending;
>  }
> 
>  /* int select(int, fd_set *, fd_set *, fd_set *, struct timeval *) */
> @@ -2387,7 +2393,7 @@ static int __select(struct pt_regs *regs)
> 
>  		/* Bind directly the file descriptors, we do not need to go
>  		   through xnselect returning -ECHRNG */
> -		if ((err = select_bind_all(selector, in_fds, nfds)))
> +		if ((err = select_bind_all(selector, in_fds, out_fds, nfds)))
>  			return err;
>  	}
> 
> @@ -2395,7 +2401,8 @@ static int __select(struct pt_regs *regs)
>  	err = xnselect(selector, out_fds, in_fds, nfds, timeout, mode);
> 
>  	if (err == -ECHRNG) {
> -		int err = select_bind_all(selector, out_fds, nfds);
> +		int err = select_bind_all(selector, out_fds, out_fds,
> +					  nfds);

Mmh, this is probably not a good idea - we need to truly split the in
(aka "not yet bound") and out (aka "pending") sets here as well.

>  		if (err)
>  			return err;
>  	}

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors
Jan Kiszka wrote: > Gilles Chanteperdrix wrote: >> Jan Kiszka wrote: >>> Gilles Chanteperdrix wrote: Jan Kiszka wrote: > Jan Kiszka wrote: >> Jan Kiszka wrote: >>> Gilles Chanteperdrix wrote: GIT version control wrote: > diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c > index 11f47c0..a896031 100644 > --- a/ksrc/skins/posix/mq.c > +++ b/ksrc/skins/posix/mq.c > @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct > xnselector *selector, > return 0; > >unlock_and_error: > + xnfree(binding): > xnlock_put_irqrestore(&nklock, s); > return err; > } Ok. Will pull. But I really need to fix that. >>> Ack - now that I see it myself. >>> >> I fixed this in my branch and added another patch to transform EIDRM >> into EBADF when selecting a (half-)deleted RTDM device. Please merge. >> > Wait! When the sync object behind some file descriptor is deleted but > the descriptor itself is still existing, we rather have to return that > fd signaled from select() instead of letting the call fail. I beed to > look into this again. It looks to me like a transitory state, we can wait for the sync object to be deleted to have the fd destructor signaled. It should not be long. >>> That's not an issue of waiting for this. See e.g. TCP: peer closes >>> connection -> internal sync objects will be destroyed (to make >>> read/write fail). But the fd will remain valid until the local side >>> closes it as well. >> It looks to me like this is going to complicate things a lot, and will >> be a source of regressions. Why can we not have sync objects be >> destroyed when the fd is really destroyed and use a status bit of some >> kind to signal read/write that the fd was closed by peer? > > It is way easier and more consistent to unblock reader and writers via > destroying the sync object than to signal it and add tests for specific > states to detect that. Keep in mind that this pattern is in use even > without select support. 
> Diverging from it just to add select awareness
> to some driver would be a step back.

Ok. As you wish. But in that case, please provide a unit test case which
we can run on all architectures to validate your modifications. We will
not put this test case in the xenomai repository but will create a
repository for test cases later on.

--
Gilles.
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors
Jan Kiszka wrote: > Gilles Chanteperdrix wrote: >> Jan Kiszka wrote: >>> Gilles Chanteperdrix wrote: Jan Kiszka wrote: > Jan Kiszka wrote: >> Jan Kiszka wrote: >>> Gilles Chanteperdrix wrote: GIT version control wrote: > diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c > index 11f47c0..a896031 100644 > --- a/ksrc/skins/posix/mq.c > +++ b/ksrc/skins/posix/mq.c > @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct > xnselector *selector, > return 0; > >unlock_and_error: > + xnfree(binding): > xnlock_put_irqrestore(&nklock, s); > return err; > } Ok. Will pull. But I really need to fix that. >>> Ack - now that I see it myself. >>> >> I fixed this in my branch and added another patch to transform EIDRM >> into EBADF when selecting a (half-)deleted RTDM device. Please merge. >> > Wait! When the sync object behind some file descriptor is deleted but > the descriptor itself is still existing, we rather have to return that > fd signaled from select() instead of letting the call fail. I beed to > look into this again. It looks to me like a transitory state, we can wait for the sync object to be deleted to have the fd destructor signaled. It should not be long. >>> That's not an issue of waiting for this. See e.g. TCP: peer closes >>> connection -> internal sync objects will be destroyed (to make >>> read/write fail). But the fd will remain valid until the local side >>> closes it as well. >> It looks to me like this is going to complicate things a lot, and will >> be a source of regressions. Why can we not have sync objects be >> destroyed when the fd is really destroyed and use a status bit of some >> kind to signal read/write that the fd was closed by peer? > > It is way easier and more consistent to unblock reader and writers via > destroying the sync object than to signal it and add tests for specific > states to detect that. >From my point of view, having an object with partially destroyed parts is asking for trouble. -- Gilles. 
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors
Gilles Chanteperdrix wrote: > Jan Kiszka wrote: >> Gilles Chanteperdrix wrote: >>> Jan Kiszka wrote: Gilles Chanteperdrix wrote: > Jan Kiszka wrote: >> Jan Kiszka wrote: >>> Jan Kiszka wrote: Gilles Chanteperdrix wrote: > GIT version control wrote: >> diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c >> index 11f47c0..a896031 100644 >> --- a/ksrc/skins/posix/mq.c >> +++ b/ksrc/skins/posix/mq.c >> @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct >> xnselector *selector, >> return 0; >> >>unlock_and_error: >> +xnfree(binding): >> xnlock_put_irqrestore(&nklock, s); >> return err; >> } > Ok. Will pull. But I really need to fix that. > Ack - now that I see it myself. >>> I fixed this in my branch and added another patch to transform EIDRM >>> into EBADF when selecting a (half-)deleted RTDM device. Please merge. >>> >> Wait! When the sync object behind some file descriptor is deleted but >> the descriptor itself is still existing, we rather have to return that >> fd signaled from select() instead of letting the call fail. I beed to >> look into this again. > It looks to me like a transitory state, we can wait for the sync object > to be deleted to have the fd destructor signaled. It should not be long. That's not an issue of waiting for this. See e.g. TCP: peer closes connection -> internal sync objects will be destroyed (to make read/write fail). But the fd will remain valid until the local side closes it as well. >>> It looks to me like this is going to complicate things a lot, and will >>> be a source of regressions. Why can we not have sync objects be >>> destroyed when the fd is really destroyed and use a status bit of some >>> kind to signal read/write that the fd was closed by peer? >> It is way easier and more consistent to unblock reader and writers via >> destroying the sync object than to signal it and add tests for specific >> states to detect that. Keep in mind that this pattern is in use even >> without select support. 
>> Diverging from it just to add select awareness
>> to some driver would be a step back.
>
> Ok. As you wish. But in that case, please provide a unit test case which
> we can run on all architectures to validate your modifications. We will
> not put this test case in the xenomai repository but will create a
> repository for test cases later on.

As it requires quite some infrastructure to get there (or do I miss some
preexisting select unit test?), I can just point to RTnet + TCP for now:
run rtnet/examples/xenomai/posix/rttcp-server + client over rtlo and
terminate the client prematurely. This failure does not happen if you
run the very same test without RTnet loaded (i.e. against Linux' TCP).

Just pushed the corresponding fix.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
Re: [Xenomai-core] Yet another ((weak)) bug
Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Stefan Kisdaroczi wrote:
>>>> On 12.02.2010 17:22, Jan Kiszka wrote:
>>>>> libxnskin or so?
>>>>>
>>>>> Jan
>>>>>
>>>> libxenomai ?
>>> Ok. Let's go for libxenomai. I will try and do that in the next few days.
>>>
>> Already had a chance to look into this?
>
> No. I was off a few days. I just looked at it in the train, but could
> not test it yet.

If there is anything I can do to help accelerate it, please let me
know. We need to bake a new Xenomai package this week for our customer,
and at some point I need to decide between upstream and my temporary
fix. Upstream is generally preferred - if available.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
Re: [Xenomai-core] Yet another ((weak)) bug
Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
>>>> Stefan Kisdaroczi wrote:
>>>>> On 12.02.2010 17:22, Jan Kiszka wrote:
>>>>>> libxnskin or so?
>>>>>>
>>>>>> Jan
>>>>>>
>>>>> libxenomai ?
>>>> Ok. Let's go for libxenomai. I will try and do that in the next few days.
>>>>
>>> Already had a chance to look into this?
>> No. I was off a few days. I just looked at it in the train, but could
>> not test it yet.
>
> If there is anything I can do to help accelerate it, please let me
> know. We need to bake a new Xenomai package this week for our customer,
> and at some point I need to decide between upstream and my temporary
> fix. Upstream is generally preferred - if available.

Yes, I have been a bit delayed, but should handle this this week, so as
to be ready for a release next weekend.

>
> Jan
>

--
Gilles.
Re: [Xenomai-core] Yet another ((weak)) bug
Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
>>>> Gilles Chanteperdrix wrote:
>>>>> Stefan Kisdaroczi wrote:
>>>>>> On 12.02.2010 17:22, Jan Kiszka wrote:
>>>>>>> libxnskin or so?
>>>>>>>
>>>>>>> Jan
>>>>>>>
>>>>>> libxenomai ?
>>>>> Ok. Let's go for libxenomai. I will try and do that in the next few days.
>>>>>
>>>> Already had a chance to look into this?
>>> No. I was off a few days. I just looked at it in the train, but could
>>> not test it yet.
>> If there is anything I can do to help accelerate it, please let me
>> know. We need to bake a new Xenomai package this week for our customer,
>> and at some point I need to decide between upstream and my temporary
>> fix. Upstream is generally preferred - if available.
>
> Yes, I have been a bit delayed, but should handle this this week, so as
> to be ready for a release next weekend.

Anything in some private branch already?

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors
Jan Kiszka wrote: > Gilles Chanteperdrix wrote: >> Jan Kiszka wrote: >>> Gilles Chanteperdrix wrote: Jan Kiszka wrote: > Gilles Chanteperdrix wrote: >> Jan Kiszka wrote: >>> Jan Kiszka wrote: Jan Kiszka wrote: > Gilles Chanteperdrix wrote: >> GIT version control wrote: >>> diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c >>> index 11f47c0..a896031 100644 >>> --- a/ksrc/skins/posix/mq.c >>> +++ b/ksrc/skins/posix/mq.c >>> @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct >>> xnselector *selector, >>> return 0; >>> >>>unlock_and_error: >>> + xnfree(binding): >>> xnlock_put_irqrestore(&nklock, s); >>> return err; >>> } >> Ok. Will pull. But I really need to fix that. >> > Ack - now that I see it myself. > I fixed this in my branch and added another patch to transform EIDRM into EBADF when selecting a (half-)deleted RTDM device. Please merge. >>> Wait! When the sync object behind some file descriptor is deleted but >>> the descriptor itself is still existing, we rather have to return that >>> fd signaled from select() instead of letting the call fail. I beed to >>> look into this again. >> It looks to me like a transitory state, we can wait for the sync object >> to be deleted to have the fd destructor signaled. It should not be long. > That's not an issue of waiting for this. See e.g. TCP: peer closes > connection -> internal sync objects will be destroyed (to make > read/write fail). But the fd will remain valid until the local side > closes it as well. It looks to me like this is going to complicate things a lot, and will be a source of regressions. Why can we not have sync objects be destroyed when the fd is really destroyed and use a status bit of some kind to signal read/write that the fd was closed by peer? >>> It is way easier and more consistent to unblock reader and writers via >>> destroying the sync object than to signal it and add tests for specific >>> states to detect that. 
Keep in mind that this pattern is in use even
>>> without select support. Diverging from it just to add select awareness
>>> to some driver would be a step back.
>> Ok. As you wish. But in that case, please provide a unit test case which
>> we can run on all architectures to validate your modifications. We will
>> not put this test case in the xenomai repository but will create a
>> repository for test cases later on.
> As it requires quite some infrastructure to get there (or do I miss some
> preexisting select unit test?), I can just point to RTnet + TCP for now:
> run rtnet/examples/xenomai/posix/rttcp-server + client over rtlo and
> terminate the client prematurely. This failure does not happen if you
> run the very same test without RTnet loaded (i.e. against Linux' TCP).
>
> Just pushed the corresponding fix.

I really do not understand what you are trying to do. What is the
problem exactly, and how do you fix it? You are reserving 384 more bytes
on the stack. What for?

--
Gilles.
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors
Gilles Chanteperdrix wrote: > Jan Kiszka wrote: >> Gilles Chanteperdrix wrote: >>> Jan Kiszka wrote: Gilles Chanteperdrix wrote: > Jan Kiszka wrote: >> Gilles Chanteperdrix wrote: >>> Jan Kiszka wrote: Jan Kiszka wrote: > Jan Kiszka wrote: >> Gilles Chanteperdrix wrote: >>> GIT version control wrote: diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c index 11f47c0..a896031 100644 --- a/ksrc/skins/posix/mq.c +++ b/ksrc/skins/posix/mq.c @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct xnselector *selector, return 0; unlock_and_error: + xnfree(binding): xnlock_put_irqrestore(&nklock, s); return err; } >>> Ok. Will pull. But I really need to fix that. >>> >> Ack - now that I see it myself. >> > I fixed this in my branch and added another patch to transform EIDRM > into EBADF when selecting a (half-)deleted RTDM device. Please merge. > Wait! When the sync object behind some file descriptor is deleted but the descriptor itself is still existing, we rather have to return that fd signaled from select() instead of letting the call fail. I beed to look into this again. >>> It looks to me like a transitory state, we can wait for the sync object >>> to be deleted to have the fd destructor signaled. It should not be long. >> That's not an issue of waiting for this. See e.g. TCP: peer closes >> connection -> internal sync objects will be destroyed (to make >> read/write fail). But the fd will remain valid until the local side >> closes it as well. > It looks to me like this is going to complicate things a lot, and will > be a source of regressions. Why can we not have sync objects be > destroyed when the fd is really destroyed and use a status bit of some > kind to signal read/write that the fd was closed by peer? It is way easier and more consistent to unblock reader and writers via destroying the sync object than to signal it and add tests for specific states to detect that. Keep in mind that this pattern is in use even without select support. 
Diverging from it just to add select awareness
to some driver would be a step back.
>>> Ok. As you wish. But in that case, please provide a unit test case which
>>> we can run on all architectures to validate your modifications. We will
>>> not put this test case in the xenomai repository but will create a
>>> repository for test cases later on.
>> As it requires quite some infrastructure to get there (or do I miss some
>> preexisting select unit test?), I can just point to RTnet + TCP for now:
>> run rtnet/examples/xenomai/posix/rttcp-server + client over rtlo and
>> terminate the client prematurely. This failure does not happen if you
>> run the very same test without RTnet loaded (i.e. against Linux' TCP).
>>
>> Just pushed the corresponding fix.
>
> I really do not understand what you are trying to do. What is the
> problem exactly, and how do you fix it? You are reserving 384 more bytes
> on the stack. What for?

"We already return an fd as pending if the corresponding xnsynch object
is deleted while waiting on it. Extend select to do the same if the
object is already marked deleted on binding. This fixes the return code
of select on half-closed RTnet TCP sockets from -EIDRM to > 0 (for
pending fds)."

HTH. Regarding the additional stack requirements - yes, not nice. Maybe
we can avoid this by clearing the in_fds in select_bind_all while
processing it unless an fd is marked deleted. That would also avoid
passing a separate fds set to it.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors
Jan Kiszka wrote: > Gilles Chanteperdrix wrote: >> Jan Kiszka wrote: >>> Gilles Chanteperdrix wrote: Jan Kiszka wrote: > Gilles Chanteperdrix wrote: >> Jan Kiszka wrote: >>> Gilles Chanteperdrix wrote: Jan Kiszka wrote: > Jan Kiszka wrote: >> Jan Kiszka wrote: >>> Gilles Chanteperdrix wrote: GIT version control wrote: > diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c > index 11f47c0..a896031 100644 > --- a/ksrc/skins/posix/mq.c > +++ b/ksrc/skins/posix/mq.c > @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct > xnselector *selector, > return 0; > >unlock_and_error: > + xnfree(binding): > xnlock_put_irqrestore(&nklock, s); > return err; > } Ok. Will pull. But I really need to fix that. >>> Ack - now that I see it myself. >>> >> I fixed this in my branch and added another patch to transform EIDRM >> into EBADF when selecting a (half-)deleted RTDM device. Please merge. >> > Wait! When the sync object behind some file descriptor is deleted but > the descriptor itself is still existing, we rather have to return that > fd signaled from select() instead of letting the call fail. I beed to > look into this again. It looks to me like a transitory state, we can wait for the sync object to be deleted to have the fd destructor signaled. It should not be long. >>> That's not an issue of waiting for this. See e.g. TCP: peer closes >>> connection -> internal sync objects will be destroyed (to make >>> read/write fail). But the fd will remain valid until the local side >>> closes it as well. >> It looks to me like this is going to complicate things a lot, and will >> be a source of regressions. Why can we not have sync objects be >> destroyed when the fd is really destroyed and use a status bit of some >> kind to signal read/write that the fd was closed by peer? > It is way easier and more consistent to unblock reader and writers via > destroying the sync object than to signal it and add tests for specific > states to detect that. 
Keep in mind that this pattern is in use even > without select support. Diverging from it just to add select awareness > to some driver would be a step back. Ok. As you wish. But in that case, please provide a unit test case which we can run on all architectures to validate your modifications. We will not put this test case in xenomai repository but will create a repository for test cases later on. >>> As it requires quite some infrastructure to get there (or do I miss some >>> preexisting select unit test?), I can just point to RTnet + TCP for now: >>> run rtnet/examples/xenomai/posix/rttcp-server + client over rtlo and >>> terminate the client prematurely. This does not happen if you run the >>> very same test without RTnet loaded (i.e. against Linux's TCP). >>> >>> Just pushed the corresponding fix. >> I really do not understand what you are trying to do. What is the >> problem exactly, and how do you fix it? You are reserving 384 more bytes >> on the stack. What for? >> > > "We already return an fd as pending if the corresponding xnsynch object > is deleted while waiting on it. Extend select to do the same if the > object is already marked deleted on binding. > > This fixes the return code of select on half-closed RTnet TCP sockets from > -EIDRM to > 0 (for pending fds)." > > HTH. > > Regarding the additional stack requirements - yes, not nice. Maybe we > can avoid this by clearing the in_fds in select_bind_all while > processing it unless an fd is marked deleted. Would also avoid passing a > separate fds to it. I do not like calling FD_ZERO all the time either. Please take the time to explain your implementation to me (I understood the reason, not the implementation itself), and to think calmly about the implementation. I am not going to merge anything before tonight anyway. -- Gilles.
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors
Jan Kiszka wrote: > Gilles Chanteperdrix wrote: >> Jan Kiszka wrote: >>> Gilles Chanteperdrix wrote: Jan Kiszka wrote: > Gilles Chanteperdrix wrote: >> Jan Kiszka wrote: >>> Gilles Chanteperdrix wrote: Jan Kiszka wrote: > Jan Kiszka wrote: >> Jan Kiszka wrote: >>> Gilles Chanteperdrix wrote: GIT version control wrote: > diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c > index 11f47c0..a896031 100644 > --- a/ksrc/skins/posix/mq.c > +++ b/ksrc/skins/posix/mq.c > @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct > xnselector *selector, > return 0; > >unlock_and_error: > + xnfree(binding): > xnlock_put_irqrestore(&nklock, s); > return err; > } Ok. Will pull. But I really need to fix that. >>> Ack - now that I see it myself. >>> >> I fixed this in my branch and added another patch to transform EIDRM >> into EBADF when selecting a (half-)deleted RTDM device. Please merge. >> > Wait! When the sync object behind some file descriptor is deleted but > the descriptor itself is still existing, we rather have to return that > fd signaled from select() instead of letting the call fail. I beed to > look into this again. It looks to me like a transitory state, we can wait for the sync object to be deleted to have the fd destructor signaled. It should not be long. >>> That's not an issue of waiting for this. See e.g. TCP: peer closes >>> connection -> internal sync objects will be destroyed (to make >>> read/write fail). But the fd will remain valid until the local side >>> closes it as well. >> It looks to me like this is going to complicate things a lot, and will >> be a source of regressions. Why can we not have sync objects be >> destroyed when the fd is really destroyed and use a status bit of some >> kind to signal read/write that the fd was closed by peer? > It is way easier and more consistent to unblock reader and writers via > destroying the sync object than to signal it and add tests for specific > states to detect that. 
Keep in mind that this pattern is in use even > without select support. Diverging from it just to add select awareness > to some driver would be a step back. Ok. As you wish. But in that case, please provide a unit test case which we can run on all architectures to validate your modifications. We will not put this test case in xenomai repository but will create a repository for test cases later on. >>> As it requires quite some infrastructure to get there (or do I miss some >>> preexisting select unit test?), I can just point to RTnet + TCP for now: >>> run rtnet/examples/xenomai/posix/rttcp-server + client over rtlo and >>> terminate the client prematurely. This does not happen if you run the >>> very same test without RTnet loaded (ie. against Linux' TCP). >>> >>> Just pushed the corresponding fix. >> I really do not understand what you are trying to do. What is the >> problem exactly, and how do you fix it? You are reserving 384 more bytes >> on the stack. What for? >> > > "We already return an fd as pending if the corresponding xnsynch object > is delete while waiting on it. Extend select to do the same if the > object is already marked deleted on binding. > > This fixes the return code select on half-closed RTnet TCP sockets from > -EIDRM to > 0 (for pending fds)." > > HTH. > > Regarding the additional stack requirements - yes, not nice. Maybe we > can avoid this by clearing the in_fds in select_bind_all while > processing it unless an fd is marked deleted. Would also avoid passing a > separate fds to it. To be more precise, it looks to me like each call to bind_one is supposed to set the bit of the corresponding fd, so, if bind_one fails because the fd is in the transitory state (the one which I do not like), bind_all should simply set this bit to 1. And there does not seem to be any need for additional parameters, additional space on stack, or anything else. -- Gilles. 
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors
Gilles Chanteperdrix wrote: > Jan Kiszka wrote: >> Gilles Chanteperdrix wrote: >>> Jan Kiszka wrote: Gilles Chanteperdrix wrote: > Jan Kiszka wrote: >> Gilles Chanteperdrix wrote: >>> Jan Kiszka wrote: Gilles Chanteperdrix wrote: > Jan Kiszka wrote: >> Jan Kiszka wrote: >>> Jan Kiszka wrote: Gilles Chanteperdrix wrote: > GIT version control wrote: >> diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c >> index 11f47c0..a896031 100644 >> --- a/ksrc/skins/posix/mq.c >> +++ b/ksrc/skins/posix/mq.c >> @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct >> xnselector *selector, >> return 0; >> >>unlock_and_error: >> +xnfree(binding): >> xnlock_put_irqrestore(&nklock, s); >> return err; >> } > Ok. Will pull. But I really need to fix that. > Ack - now that I see it myself. >>> I fixed this in my branch and added another patch to transform EIDRM >>> into EBADF when selecting a (half-)deleted RTDM device. Please >>> merge. >>> >> Wait! When the sync object behind some file descriptor is deleted but >> the descriptor itself is still existing, we rather have to return >> that >> fd signaled from select() instead of letting the call fail. I beed to >> look into this again. > It looks to me like a transitory state, we can wait for the sync > object > to be deleted to have the fd destructor signaled. It should not be > long. That's not an issue of waiting for this. See e.g. TCP: peer closes connection -> internal sync objects will be destroyed (to make read/write fail). But the fd will remain valid until the local side closes it as well. >>> It looks to me like this is going to complicate things a lot, and will >>> be a source of regressions. Why can we not have sync objects be >>> destroyed when the fd is really destroyed and use a status bit of some >>> kind to signal read/write that the fd was closed by peer? 
>> It is way easier and more consistent to unblock reader and writers via >> destroying the sync object than to signal it and add tests for specific >> states to detect that. Keep in mind that this pattern is in use even >> without select support. Diverging from it just to add select awareness >> to some driver would be a step back. > Ok. As you wish. But in that case, please provide a unit test case which > we can run on all architectures to validate your modifications. We will > not put this test case in xenomai repository but will create a > repository for test cases later on. As it requires quite some infrastructure to get there (or do I miss some preexisting select unit test?), I can just point to RTnet + TCP for now: run rtnet/examples/xenomai/posix/rttcp-server + client over rtlo and terminate the client prematurely. This does not happen if you run the very same test without RTnet loaded (ie. against Linux' TCP). Just pushed the corresponding fix. >>> I really do not understand what you are trying to do. What is the >>> problem exactly, and how do you fix it? You are reserving 384 more bytes >>> on the stack. What for? >>> >> "We already return an fd as pending if the corresponding xnsynch object >> is delete while waiting on it. Extend select to do the same if the >> object is already marked deleted on binding. >> >> This fixes the return code select on half-closed RTnet TCP sockets from >> -EIDRM to > 0 (for pending fds)." >> >> HTH. >> >> Regarding the additional stack requirements - yes, not nice. Maybe we >> can avoid this by clearing the in_fds in select_bind_all while >> processing it unless an fd is marked deleted. Would also avoid passing a >> separate fds to it. > > To be more precise, it looks to me like each call to bind_one is > supposed to set the bit of the corresponding fd, so, if bind_one fails > because the fd is in the transitory state (the one which I do not like), > bind_all should simply set this bit to 1. 
And there does not seem to be > any need for additional parameters, additional space on stack, or > anything else. Right, the trick is likely to properly maintain the output fds of bind_all so that not only are pending fds set but others are cleared - that avoids the third bitmap. Still playing with such an approach. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors
Jan Kiszka wrote: > Gilles Chanteperdrix wrote: >> Jan Kiszka wrote: >>> Gilles Chanteperdrix wrote: Jan Kiszka wrote: > Gilles Chanteperdrix wrote: >> Jan Kiszka wrote: >>> Gilles Chanteperdrix wrote: Jan Kiszka wrote: > Gilles Chanteperdrix wrote: >> Jan Kiszka wrote: >>> Jan Kiszka wrote: Jan Kiszka wrote: > Gilles Chanteperdrix wrote: >> GIT version control wrote: >>> diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c >>> index 11f47c0..a896031 100644 >>> --- a/ksrc/skins/posix/mq.c >>> +++ b/ksrc/skins/posix/mq.c >>> @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct >>> xnselector *selector, >>> return 0; >>> >>>unlock_and_error: >>> + xnfree(binding): >>> xnlock_put_irqrestore(&nklock, s); >>> return err; >>> } >> Ok. Will pull. But I really need to fix that. >> > Ack - now that I see it myself. > I fixed this in my branch and added another patch to transform EIDRM into EBADF when selecting a (half-)deleted RTDM device. Please merge. >>> Wait! When the sync object behind some file descriptor is deleted >>> but >>> the descriptor itself is still existing, we rather have to return >>> that >>> fd signaled from select() instead of letting the call fail. I beed >>> to >>> look into this again. >> It looks to me like a transitory state, we can wait for the sync >> object >> to be deleted to have the fd destructor signaled. It should not be >> long. > That's not an issue of waiting for this. See e.g. TCP: peer closes > connection -> internal sync objects will be destroyed (to make > read/write fail). But the fd will remain valid until the local side > closes it as well. It looks to me like this is going to complicate things a lot, and will be a source of regressions. Why can we not have sync objects be destroyed when the fd is really destroyed and use a status bit of some kind to signal read/write that the fd was closed by peer? 
>>> It is way easier and more consistent to unblock reader and writers via >>> destroying the sync object than to signal it and add tests for specific >>> states to detect that. Keep in mind that this pattern is in use even >>> without select support. Diverging from it just to add select awareness >>> to some driver would be a step back. >> Ok. As you wish. But in that case, please provide a unit test case which >> we can run on all architectures to validate your modifications. We will >> not put this test case in xenomai repository but will create a >> repository for test cases later on. > As it requires quite some infrastructure to get there (or do I miss some > preexisting select unit test?), I can just point to RTnet + TCP for now: > run rtnet/examples/xenomai/posix/rttcp-server + client over rtlo and > terminate the client prematurely. This does not happen if you run the > very same test without RTnet loaded (ie. against Linux' TCP). > > Just pushed the corresponding fix. I really do not understand what you are trying to do. What is the problem exactly, and how do you fix it? You are reserving 384 more bytes on the stack. What for? >>> "We already return an fd as pending if the corresponding xnsynch object >>> is delete while waiting on it. Extend select to do the same if the >>> object is already marked deleted on binding. >>> >>> This fixes the return code select on half-closed RTnet TCP sockets from >>> -EIDRM to > 0 (for pending fds)." >>> >>> HTH. >>> >>> Regarding the additional stack requirements - yes, not nice. Maybe we >>> can avoid this by clearing the in_fds in select_bind_all while >>> processing it unless an fd is marked deleted. Would also avoid passing a >>> separate fds to it. >> To be more precise, it looks to me like each call to bind_one is >> supposed to set the bit of the corresponding fd, so, if bind_one fails >> because the fd is in the transitory state (the one which I do not like), >> bind_all should simply set this bit to 1. 
And there does not seem to be >> any need for additional parameters, additional space on stack, or >> anything else. > > Right, the trick is likely to properly maintain the output fds of > bind_all in that not only pending fds are set, but others are cleared - > avoids the third bitmap. Still playing with such an approach. xnselect_bind, called by bind_one, should take care of setting/clearing individual bits; at least, that is the way it currently works. -- Gilles.
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors
Jan Kiszka wrote: > Gilles Chanteperdrix wrote: >> Jan Kiszka wrote: >>> Gilles Chanteperdrix wrote: Jan Kiszka wrote: > Gilles Chanteperdrix wrote: >> Jan Kiszka wrote: >>> Gilles Chanteperdrix wrote: Jan Kiszka wrote: > Gilles Chanteperdrix wrote: >> Jan Kiszka wrote: >>> Jan Kiszka wrote: Jan Kiszka wrote: > Gilles Chanteperdrix wrote: >> GIT version control wrote: >>> diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c >>> index 11f47c0..a896031 100644 >>> --- a/ksrc/skins/posix/mq.c >>> +++ b/ksrc/skins/posix/mq.c >>> @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct >>> xnselector *selector, >>> return 0; >>> >>>unlock_and_error: >>> + xnfree(binding): >>> xnlock_put_irqrestore(&nklock, s); >>> return err; >>> } >> Ok. Will pull. But I really need to fix that. >> > Ack - now that I see it myself. > I fixed this in my branch and added another patch to transform EIDRM into EBADF when selecting a (half-)deleted RTDM device. Please merge. >>> Wait! When the sync object behind some file descriptor is deleted >>> but >>> the descriptor itself is still existing, we rather have to return >>> that >>> fd signaled from select() instead of letting the call fail. I beed >>> to >>> look into this again. >> It looks to me like a transitory state, we can wait for the sync >> object >> to be deleted to have the fd destructor signaled. It should not be >> long. > That's not an issue of waiting for this. See e.g. TCP: peer closes > connection -> internal sync objects will be destroyed (to make > read/write fail). But the fd will remain valid until the local side > closes it as well. It looks to me like this is going to complicate things a lot, and will be a source of regressions. Why can we not have sync objects be destroyed when the fd is really destroyed and use a status bit of some kind to signal read/write that the fd was closed by peer? 
>>> It is way easier and more consistent to unblock reader and writers via >>> destroying the sync object than to signal it and add tests for specific >>> states to detect that. Keep in mind that this pattern is in use even >>> without select support. Diverging from it just to add select awareness >>> to some driver would be a step back. >> Ok. As you wish. But in that case, please provide a unit test case which >> we can run on all architectures to validate your modifications. We will >> not put this test case in xenomai repository but will create a >> repository for test cases later on. > As it requires quite some infrastructure to get there (or do I miss some > preexisting select unit test?), I can just point to RTnet + TCP for now: > run rtnet/examples/xenomai/posix/rttcp-server + client over rtlo and > terminate the client prematurely. This does not happen if you run the > very same test without RTnet loaded (ie. against Linux' TCP). > > Just pushed the corresponding fix. I really do not understand what you are trying to do. What is the problem exactly, and how do you fix it? You are reserving 384 more bytes on the stack. What for? >>> "We already return an fd as pending if the corresponding xnsynch object >>> is delete while waiting on it. Extend select to do the same if the >>> object is already marked deleted on binding. >>> >>> This fixes the return code select on half-closed RTnet TCP sockets from >>> -EIDRM to > 0 (for pending fds)." >>> >>> HTH. >>> >>> Regarding the additional stack requirements - yes, not nice. Maybe we >>> can avoid this by clearing the in_fds in select_bind_all while >>> processing it unless an fd is marked deleted. Would also avoid passing a >>> separate fds to it. >> To be more precise, it looks to me like each call to bind_one is >> supposed to set the bit of the corresponding fd, so, if bind_one fails >> because the fd is in the transitory state (the one which I do not like), >> bind_all should simply set this bit to 1. 
And there does not seem to be >> any need for additional parameters, additional space on stack, or >> anything else. > > Right, the trick is likely to properly maintain the output fds of > bind_all in that not only pending fds are set, but others are cleared - > avoids the third bitmap. Still playing with such an approach. Another precision, xnselect_bind accesses directly the "pending" set of the xnselector structure, so, you have the right to do that for the fds which failed binding but which you want to mark as pending. -- Gilles.
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors
Jan Kiszka wrote: > Right, the trick is likely to properly maintain the output fds of > bind_all in that not only pending fds are set, but others are cleared - > avoids the third bitmap. Still playing with such an approach. What we are trying to do, in a nutshell, is to create a notion of stateless binding between a file descriptor and an xnselector. If we really want to do this, we will have to extend the xnselect interface. But this looks really wrong to me. It seems possible to keep the bindings around as long as the file descriptor exists. The binding is here for this reason, to express the link between the file descriptor and the thread which runs select, and its lifecycle should be the same as the file descriptor. And if you want to signal that such a file descriptor is ready for reading/writing, even if it is closed, you should be using xnselect_signal. The interface is there, it is meant for that. So, the more I think about it, the more I think that we are going in the wrong direction. -- Gilles.
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors
Gilles Chanteperdrix wrote: > Jan Kiszka wrote: >> Gilles Chanteperdrix wrote: >>> To be more precise, it looks to me like each call to bind_one is >>> supposed to set the bit of the corresponding fd, so, if bind_one fails >>> because the fd is in the transitory state (the one which I do not like), >>> bind_all should simply set this bit to 1. And there does not seem to be >>> any need for additional parameters, additional space on stack, or >>> anything else. >> Right, the trick is likely to properly maintain the output fds of >> bind_all in that not only pending fds are set, but others are cleared - >> avoids the third bitmap. Still playing with such an approach. > > Another precision, xnselect_bind accesses directly the "pending" set of > the xnselector structure, so, you have the right to do that for the fds > which failed binding but which you want to mark as pending. Right, /me was blind, the issue can trivially be fixed inside RTDM by binding as pending even if the object is marked deleted. I'm now just cleaning up some other RTDM bits at this chance, pull request will follow. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Xenomai-core] Xenomai in Debian
Roland Stigge wrote: > Hi Gilles, > > first - I'm sorry if you sometimes feel offended by my work on Xenomai > in Debian. I understand that you are very much connected to your project > and want to have it working perfectly everywhere. > > Unfortunately, my time to work on this is limited and the last uploads > were work in progress - to provide latest Xenomai in Debian. Further > work on it was planned for this weekend. > > But please also understand that Debian developers will possibly > prioritize work on upstream packages where they feel their work is > appreciated. So please think about your tone before sending email and > driving people away from Xenomai. Ok. Please accept my sincere apologies. I understand that I am in no position to ask anything the way I did it. Trying to be more positive, would it help you, or anybody else, if we created a xenomai-announce mailing list, where we would only publish releases and release information? -- Gilles.
[Xenomai-core] [pull request] RTDM fixes and cleanups
The following changes since commit 95412b8c10ba7ba199089a3cbb60c67c8d718186:

  Gilles Chanteperdrix (1):
        timeconv: revert 97b78d45bf4c5571fccd9675fae8bb008a341769

are available in the git repository at:

  git://git.xenomai.org/xenomai-jki.git for-upstream

Jan Kiszka (5):
      RTDM+POSIX: Avoid leaking binding objects on errors
      RTDM: Bind deleted sem/event objects, but mark them pending
      RTDM: Consistently use xnsynch_test_flags to test for RTDM_SYNCH_DELETED
      RTDM: Avoid calling cleanup_instance with held spin lock
      RTDM: Add smp barrier to rtdm_context_unlock

 include/rtdm/rtdm_driver.h |  1 +
 ksrc/skins/posix/mq.c      |  1 +
 ksrc/skins/rtdm/core.c     | 45 ++-
 ksrc/skins/rtdm/drvlib.c   | 40 --
 4 files changed, 46 insertions(+), 41 deletions(-)
Re: [Xenomai-core] Xenomai in Debian
Hi Gilles, thanks for your mail! And don't worry - there's an updated Xenomai version in Debian sid now thanks to everyone from the Xenomai dev list. Sven put quite some effort into syncing the two projects. If there are any problems, feel free to "bug" me. ;-) Regarding the xenomai-announce list, this sounds like a good idea since many projects have it, including Debian. For me personally, my work depends largely on free resources and Debian release-schedule urgency. Btw: the Debian watch system automatically generates an email when xenomai has a new release. :-) So also no hurry here. See you, bye, Roland
Re: [Xenomai-core] Yet another ((weak)) bug
Jan Kiszka wrote: > Gilles Chanteperdrix wrote: >> Jan Kiszka wrote: >>> Gilles Chanteperdrix wrote: Jan Kiszka wrote: > Gilles Chanteperdrix wrote: >> Stefan Kisdaroczi wrote: >>> On 12.02.2010 17:22, Jan Kiszka wrote: libxnskin or so? Jan >>> libxenomai ? >> Ok. Let's go for libxenomai. I will try and do that in the next few days. >> > Already had a chance to look into this? No. I was off a few days. I just looked at it on the train, but could not test it yet. >>> If there is anything I can do to help accelerate it, please let me >>> know. We need to bake a new Xenomai package this week for our customer, >>> and at some point I need to decide between upstream and my temporary >>> fix. Upstream is generally preferred - if available. >> Yes, I have been a bit delayed, but should handle this this week, so as >> to be ready for a release next weekend. > > Anything in some private branch already? Ok. Everything pushed. It works on ARM and x86. I have not tested in the multilib and/or dlopen case, so please test it if you can. > > Jan > -- Gilles.