RE: Pollable Semaphores

2005-01-24 Thread Robert White
The other obvious problem is that if there are contenders on the semaphore, the
semaphore may well not remain acquirable between the return from select and the 
call
to actually acquire the semaphore.

What I'd think you need is a "device" that you open, then attach a semaphore to 
(with
an ioctl()?), then you would write increment/decrement actions to the device 
file
descriptor and it would reply with the results.

By making the write and read actions be "normal" and "atomic" you end up with 
normal
produce/consume file operations and so normal interaction with poll/select.

I would imagine [off the top of my head] that each device node would have to 
have
some kind of tasklet-like (but blockable) context that could wait for the 
semaphore
in a non-blocking way.  (Actually I'd imagine a reverse hook in the semaphore 
itself
that would contain a linked-list of inactive tasklet references which would 
each be
activated on a suitable semaphore operation (increment).

The short version is that the "available" state must be atomic with the actual
decrement operation or you will probably end up blocking in the decrement after 
the
select() anyway.  Once you factor that in, all sorts of more-classic models 
suggest
themselves.

Rob White,
Casabyte, Inc.



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: Pollable Semaphores

2005-01-24 Thread Robert White
The other obvious problem is that if there are contenders on the semaphore, the
semaphore may well not remain acquirable between the return from select and the 
call
to actually acquire the semaphore.

What I'd think you need is a device that you open, then attach a semaphore to 
(with
an ioctl()?), then you would write increment/decrement actions to the device 
file
descriptor and it would reply with the results.

By making the write and read actions be normal and atomic you end up with 
normal
produce/consume file operations and so normal interaction with poll/select.

I would imagine [off the top of my head] that each device node would have to 
have
some kind of tasklet-like (but blockable) context that could wait for the 
semaphore
in a non-blocking way.  (Actually I'd imagine a reverse hook in the semaphore 
itself
that would contain a linked-list of inactive tasklet references which would 
each be
activated on a suitable semaphore operation (increment).

The short version is that the available state must be atomic with the actual
decrement operation or you will probably end up blocking in the decrement after 
the
select() anyway.  Once you factor that in, all sorts of more-classic models 
suggest
themselves.

Rob White,
Casabyte, Inc.



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Pollable Semaphores

2005-01-22 Thread Chris Wright
* Ulrich Drepper ([EMAIL PROTECTED]) wrote:
> On Fri, 21 Jan 2005 23:05:04 -0800, Chris Wright <[EMAIL PROTECTED]> wrote:
> > Yeah, here it is.  I refreshed it against a current kernel.  It passes my
> > same old test, where I select on /proc//status fd in exceptfds.
> 
> Looks certainly attractive to me.  Nice small patch.  How quickly
> after the death of the process is proc_pid_flush() called?

I don't think it's called until it's reaped.

> If this could go in and the futex stuff is handled, there is "only"
> async I/O to handle.  After that we could finally create a uniform
> event mechanism at userlevel which binds all these events (I/O,
> process/thread termination, sync primitives) together.  Maybe support
> for legacy sync primitives (SysV semaphores, msg queues) is needed as
> well, don't know yet.  Note that I assume that polling of POSIX
> mqueues works as it did the last time I tried it.

It should, AFAIK nothing there has changed.

thanks,
-chris
-- 
Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Pollable Semaphores

2005-01-22 Thread Chris Wright
* Ulrich Drepper ([EMAIL PROTECTED]) wrote:
 On Fri, 21 Jan 2005 23:05:04 -0800, Chris Wright [EMAIL PROTECTED] wrote:
  Yeah, here it is.  I refreshed it against a current kernel.  It passes my
  same old test, where I select on /proc/pid/status fd in exceptfds.
 
 Looks certainly attractive to me.  Nice small patch.  How quickly
 after the death of the process is proc_pid_flush() called?

I don't think it's called until it's reaped.

 If this could go in and the futex stuff is handled, there is only
 async I/O to handle.  After that we could finally create a uniform
 event mechanism at userlevel which binds all these events (I/O,
 process/thread termination, sync primitives) together.  Maybe support
 for legacy sync primitives (SysV semaphores, msg queues) is needed as
 well, don't know yet.  Note that I assume that polling of POSIX
 mqueues works as it did the last time I tried it.

It should, AFAIK nothing there has changed.

thanks,
-chris
-- 
Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Pollable Semaphores

2005-01-21 Thread Ulrich Drepper
On Fri, 21 Jan 2005 23:05:04 -0800, Chris Wright <[EMAIL PROTECTED]> wrote:
> Yeah, here it is.  I refreshed it against a current kernel.  It passes my
> same old test, where I select on /proc//status fd in exceptfds.

Looks certainly attractive to me.  Nice small patch.  How quickly
after the death of the process is proc_pid_flush() called?

If this could go in and the futex stuff is handled, there is "only"
async I/O to handle.  After that we could finally create a uniform
event mechanism at userlevel which binds all these events (I/O,
process/thread termination, sync primitives) together.  Maybe support
for legacy sync primitives (SysV semaphores, msg queues) is needed as
well, don't know yet.  Note that I assume that polling of POSIX
mqueues works as it did the last time I tried it.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Pollable Semaphores

2005-01-21 Thread Chris Wright
* Chris Wright ([EMAIL PROTECTED]) wrote:
> * Ulrich Drepper ([EMAIL PROTECTED]) wrote:
> > And is another thing to consider.  There is at least one other event
> > which should be pollable: process (maybe threads) deaths.  I was
> > hoping that we get support for this, perhaps in the form of polling
> > the /proc/PID directory.  For poll(), a POLLERR value could mean the
> > process/thread died.  For select(), once again a  bit in the except
> > array could be set.
> 
> I have a simple patch that does just that.  It worked after brief testing,
> then I never went back to look at it any more.  I'll see if I can't dig
> it up, maybe it's useful.

Yeah, here it is.  I refreshed it against a current kernel.  It passes my
same old test, where I select on /proc//status fd in exceptfds.

= fs/proc/base.c 1.86 vs edited =
--- 1.86/fs/proc/base.c 2005-01-10 17:29:31 -08:00
+++ edited/fs/proc/base.c   2005-01-21 22:51:00 -08:00
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "internal.h"
 
 /*
@@ -519,8 +520,21 @@ static ssize_t proc_info_read(struct fil
return length;
 }
 
+static unsigned int proc_info_poll(struct file *file, poll_table *wait)
+{
+   struct inode *inode = file->f_dentry->d_inode;
+   struct task_struct *task = proc_task(inode);
+   wait_queue_head_t *pid_wait = _I(task->proc_dentry->d_inode)->wait;
+
+   poll_wait(file, pid_wait, wait);
+   if (!pid_alive(task))
+   return POLLPRI;
+   return 0;
+}
+
 static struct file_operations proc_info_file_operations = {
.read   = proc_info_read,
+   .poll   = proc_info_poll,
 };
 
 static int mem_open(struct inode* inode, struct file* file)
@@ -1489,6 +1503,8 @@ void proc_pid_flush(struct dentry *proc_
 {
might_sleep();
if(proc_dentry != NULL) {
+   wait_queue_head_t *pid_wait=_I(proc_dentry->d_inode)->wait;
+   wake_up_interruptible(pid_wait);
shrink_dcache_parent(proc_dentry);
dput(proc_dentry);
}
= fs/proc/inode.c 1.31 vs edited =
--- 1.31/fs/proc/inode.c2005-01-04 18:48:14 -08:00
+++ edited/fs/proc/inode.c  2005-01-21 22:48:06 -08:00
@@ -97,6 +97,7 @@ static struct inode *proc_alloc_inode(st
ei->type = 0;
ei->op.proc_get_link = NULL;
ei->pde = NULL;
+   init_waitqueue_head(>wait);
inode = >vfs_inode;
inode->i_mtime = inode->i_atime = inode->i_ctime = CURRENT_TIME;
return inode;
= include/linux/proc_fs.h 1.41 vs edited =
--- 1.41/include/linux/proc_fs.h2005-01-07 21:44:33 -08:00
+++ edited/include/linux/proc_fs.h  2005-01-21 22:48:06 -08:00
@@ -243,6 +243,7 @@ struct proc_inode {
int (*proc_read)(struct task_struct *task, char *page);
} op;
struct proc_dir_entry *pde;
+   wait_queue_head_t wait;
struct inode vfs_inode;
 };
 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Pollable Semaphores

2005-01-21 Thread Chris Wright
* Ulrich Drepper ([EMAIL PROTECTED]) wrote:
> And is another thing to consider.  There is at least one other event
> which should be pollable: process (maybe threads) deaths.  I was
> hoping that we get support for this, perhaps in the form of polling
> the /proc/PID directory.  For poll(), a POLLERR value could mean the
> process/thread died.  For select(), once again a  bit in the except
> array could be set.

I have a simple patch that does just that.  It worked after brief testing,
then I never went back to look at it any more.  I'll see if I can't dig
it up, maybe it's useful.

thanks,
-chris
-- 
Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Pollable Semaphores

2005-01-21 Thread Ulrich Drepper
On Fri, 21 Jan 2005 17:17:51 -0600, Brent Casavant <[EMAIL PROTECTED]> wrote:

>   2. select/poll on the fd return EWOULDBLOCK if the current value of
>  the futex is not equal to the value of interest.  Otherwise it
>  behaves as FUTEX_FD currently does.

This is the problematic part.  The expected value, as you suggested,
can be handled with a write() and since the expected value is often
constant, this is a low-overhead method.

But the poll() interface is not so easy.  You cannot change the poll()
semantic to return such an error.  It makes really no sense.

What I thought could be done is to define instead a new POLL* constant
which signals the EWOULDBLOCK condition of the futex() syscall in the
revents member.  The poll/epoll syscall would do it's normal work and
just fill all the appropriate revents.  A futex value mismatch would
mean the call is not blocking at all, just as available data would be
for POLLIN.

For select, I would use the exception bitmap.  The bit is set for
futex fds in the EWOULDBLOCK case.

All this _could_ work.  But we've been bitten quite a few times in the
past.  There might be special cases which may need at least some
additional functionality.  This should be taken into account in the
original design.

So, if people are interested in this, code something up and try it. 
Stress it as much as you can.  I would oppose adding any new futex
interface created at a hunch if I'd be Andrew.

And is another thing to consider.  There is at least one other event
which should be pollable: process (maybe threads) deaths.  I was
hoping that we get support for this, perhaps in the form of polling
the /proc/PID directory.  For poll(), a POLLERR value could mean the
process/thread died.  For select(), once again a  bit in the except
array could be set.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Pollable Semaphores

2005-01-21 Thread Davide Libenzi
On Fri, 21 Jan 2005, Brandon Corey wrote:

> I'm trying to find out if there is a pollable semaphore equivalent on Linux.
> 
> The main idea of a "pollable semaphore", is a semaphore with a related
> file descriptor.  The file descriptor can be used to select() when the
> semaphore is acquirable.  This provides a convenient way for users to
> implement code synchronization between threads, where multiple file
> descriptors are already being selected against.
> 
> We have a pollable semaphore implementation on IRIX that provides this
> functionality.  The API consists of a handful of calls for creation and
> destruction of pollable semaphores, as well as a means to attach them
> to a file descriptor.  Beyond that, from the users point of view, they're
> just treated as any other file descriptor.
> 
> These calls are routed through a library and then passed off to a kernel
> driver that handles the events.  If someone selects against a semaphore
> when it's unaquirable, the driver sleeps on a synchronization variable.
> When the semaphore is subsequently made aquirable, the driver will wake up
> any waiters.  Multiple pollable semaphores mixed with other file
> descriptors can be selected against, and a wakeup will occur when any of
> the semaphores become acquirable.
> 
> Is anyone aware of any equivalent functionality?

I used pipe-based semaphores when I need that functionality (call psem_down_fd()
to get the pollable fd):

http://www.xmailserver.org/pipe-sem.c
http://www.xmailserver.org/pipe-sem.h

They have the problem of the maximum pipe buffer size that affects the 
maximum count, but in my case it was fine. Or at least bugs did not come 
biting me at the time ;)



- Davide

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Pollable Semaphores

2005-01-21 Thread Brent Casavant
On Fri, 21 Jan 2005, Roland Dreier wrote:

> Brandon> I'm trying to find out if there is a pollable semaphore
> Brandon> equivalent on Linux.  The main idea of a "pollable
> Brandon> semaphore", is a semaphore with a related file
> Brandon> descriptor.  The file descriptor can be used to select()
> Brandon> when the semaphore is acquirable.  This provides a
> Brandon> convenient way for users to implement code
> Brandon> synchronization between threads, where multiple file
> Brandon> descriptors are already being selected against.
> 
> Yes, I believe futexes and specifically FUTEX_FD can be used to
> implement this.  See http://people.redhat.com/~drepper/futex.pdf for
> full details.

Thanks for pointing out that paper, I wasn't aware of it (Brandon and
I talked about this problem before he posted).

The problem listed in section 9 of that paper is a showstopper to using
FUTEX_FD.  It's the very problem I ran into when trying to brainstorm a
wrapper library call around it to roughly implement the behavior Brandon
is looking for.

An additional major inconvenience is that the file descriptor can only
be used once, after which it must be closed and another reopened.  If
somehow the poll/select call could reuse the same file descriptor we
could avoid a whole bunch of library glue goo to make it work that
way (i.e. special library routines similar to poll(2) and select(2)
which do some fake file descriptor table hand waving to make it look
like there's reusable futex fds).

Perhaps a new FUTEX_POLLFD would be a reasonable solution for both
problems?  The semantics would be a bit different that FUTEX_FD.

It could solve the race condition (i.e. the problem in section 9 of
the paper) by the following:

  1. A write to the fd is used to set the value of interest analagous to
 the second parmeter to FUTEX_WAIT.  This value is stored on a
 per-fd (or maybe per-fd per-thread if it's not possible to have
 multiple fds per futex) basis.
  2. select/poll on the fd return EWOULDBLOCK if the current value of
 the futex is not equal to the value of interest.  Otherwise it
 behaves as FUTEX_FD currently does.

I think it's even possible that this behavior could be wedged into
the current FUTEX_FD, as a write(2) to this fd is meaningless on it
at present.  It's not a perfect solution, however, in that it would
be possible for one or more of the futexes that are the target of a
select(2) call to be updated so rapidly that we can never make progress
in the "write value then select" loop.

I'm not sure how to fix the reusability problem, as I haven't
determined the technical reason behind the current one-shot design.
Maybe it could be as simple as using another currently unused call
such as read(2) to reset the fd?  But I suspect it's harder than
that, otherwise it wouldn't be a one-shot in the first place.

Brent

-- 
Brent Casavant  If you had nothing to fear,
[EMAIL PROTECTED]how then could you be brave?
Silicon Graphics, Inc.-- Queen Dama, Source Wars
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Pollable Semaphores

2005-01-21 Thread Roland Dreier
Brandon> I'm trying to find out if there is a pollable semaphore
Brandon> equivalent on Linux.  The main idea of a "pollable
Brandon> semaphore", is a semaphore with a related file
Brandon> descriptor.  The file descriptor can be used to select()
Brandon> when the semaphore is acquirable.  This provides a
Brandon> convenient way for users to implement code
Brandon> synchronization between threads, where multiple file
Brandon> descriptors are already being selected against.

Yes, I believe futexes and specifically FUTEX_FD can be used to
implement this.  See http://people.redhat.com/~drepper/futex.pdf for
full details.

 - Roland
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Pollable Semaphores

2005-01-21 Thread Roland Dreier
Brandon I'm trying to find out if there is a pollable semaphore
Brandon equivalent on Linux.  The main idea of a pollable
Brandon semaphore, is a semaphore with a related file
Brandon descriptor.  The file descriptor can be used to select()
Brandon when the semaphore is acquirable.  This provides a
Brandon convenient way for users to implement code
Brandon synchronization between threads, where multiple file
Brandon descriptors are already being selected against.

Yes, I believe futexes and specifically FUTEX_FD can be used to
implement this.  See http://people.redhat.com/~drepper/futex.pdf for
full details.

 - Roland
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Pollable Semaphores

2005-01-21 Thread Brent Casavant
On Fri, 21 Jan 2005, Roland Dreier wrote:

 Brandon I'm trying to find out if there is a pollable semaphore
 Brandon equivalent on Linux.  The main idea of a pollable
 Brandon semaphore, is a semaphore with a related file
 Brandon descriptor.  The file descriptor can be used to select()
 Brandon when the semaphore is acquirable.  This provides a
 Brandon convenient way for users to implement code
 Brandon synchronization between threads, where multiple file
 Brandon descriptors are already being selected against.
 
 Yes, I believe futexes and specifically FUTEX_FD can be used to
 implement this.  See http://people.redhat.com/~drepper/futex.pdf for
 full details.

Thanks for pointing out that paper, I wasn't aware of it (Brandon and
I talked about this problem before he posted).

The problem listed in section 9 of that paper is a showstopper to using
FUTEX_FD.  It's the very problem I ran into when trying to brainstorm a
wrapper library call around it to roughly implement the behavior Brandon
is looking for.

An additional major inconvenience is that the file descriptor can only
be used once, after which it must be closed and another reopened.  If
somehow the poll/select call could reuse the same file descriptor we
could avoid a whole bunch of library glue goo to make it work that
way (i.e. special library routines similar to poll(2) and select(2)
which do some fake file descriptor table hand waving to make it look
like there's reusable futex fds).

Perhaps a new FUTEX_POLLFD would be a reasonable solution for both
problems?  The semantics would be a bit different that FUTEX_FD.

It could solve the race condition (i.e. the problem in section 9 of
the paper) by the following:

  1. A write to the fd is used to set the value of interest analagous to
 the second parmeter to FUTEX_WAIT.  This value is stored on a
 per-fd (or maybe per-fd per-thread if it's not possible to have
 multiple fds per futex) basis.
  2. select/poll on the fd return EWOULDBLOCK if the current value of
 the futex is not equal to the value of interest.  Otherwise it
 behaves as FUTEX_FD currently does.

I think it's even possible that this behavior could be wedged into
the current FUTEX_FD, as a write(2) to this fd is meaningless on it
at present.  It's not a perfect solution, however, in that it would
be possible for one or more of the futexes that are the target of a
select(2) call to be updated so rapidly that we can never make progress
in the write value then select loop.

I'm not sure how to fix the reusability problem, as I haven't
determined the technical reason behind the current one-shot design.
Maybe it could be as simple as using another currently unused call
such as read(2) to reset the fd?  But I suspect it's harder than
that, otherwise it wouldn't be a one-shot in the first place.

Brent

-- 
Brent Casavant  If you had nothing to fear,
[EMAIL PROTECTED]how then could you be brave?
Silicon Graphics, Inc.-- Queen Dama, Source Wars
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Pollable Semaphores

2005-01-21 Thread Davide Libenzi
On Fri, 21 Jan 2005, Brandon Corey wrote:

 I'm trying to find out if there is a pollable semaphore equivalent on Linux.
 
 The main idea of a pollable semaphore, is a semaphore with a related
 file descriptor.  The file descriptor can be used to select() when the
 semaphore is acquirable.  This provides a convenient way for users to
 implement code synchronization between threads, where multiple file
 descriptors are already being selected against.
 
 We have a pollable semaphore implementation on IRIX that provides this
 functionality.  The API consists of a handful of calls for creation and
 destruction of pollable semaphores, as well as a means to attach them
 to a file descriptor.  Beyond that, from the users point of view, they're
 just treated as any other file descriptor.
 
 These calls are routed through a library and then passed off to a kernel
 driver that handles the events.  If someone selects against a semaphore
 when it's unaquirable, the driver sleeps on a synchronization variable.
 When the semaphore is subsequently made aquirable, the driver will wake up
 any waiters.  Multiple pollable semaphores mixed with other file
 descriptors can be selected against, and a wakeup will occur when any of
 the semaphores become acquirable.
 
 Is anyone aware of any equivalent functionality?

I used pipe-based semaphores when I need that functionality (call psem_down_fd()
to get the pollable fd):

http://www.xmailserver.org/pipe-sem.c
http://www.xmailserver.org/pipe-sem.h

They have the problem of the maximum pipe buffer size that affects the 
maximum count, but in my case it was fine. Or at least bugs did not come 
biting me at the time ;)



- Davide

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Pollable Semaphores

2005-01-21 Thread Ulrich Drepper
On Fri, 21 Jan 2005 17:17:51 -0600, Brent Casavant [EMAIL PROTECTED] wrote:

   2. select/poll on the fd return EWOULDBLOCK if the current value of
  the futex is not equal to the value of interest.  Otherwise it
  behaves as FUTEX_FD currently does.

This is the problematic part.  The expected value, as you suggested,
can be handled with a write() and since the expected value is often
constant, this is a low-overhead method.

But the poll() interface is not so easy.  You cannot change the poll()
semantic to return such an error.  It makes really no sense.

What I thought could be done is to define instead a new POLL* constant
which signals the EWOULDBLOCK condition of the futex() syscall in the
revents member.  The poll/epoll syscall would do it's normal work and
just fill all the appropriate revents.  A futex value mismatch would
mean the call is not blocking at all, just as available data would be
for POLLIN.

For select, I would use the exception bitmap.  The bit is set for
futex fds in the EWOULDBLOCK case.

All this _could_ work.  But we've been bitten quite a few times in the
past.  There might be special cases which may need at least some
additional functionality.  This should be taken into account in the
original design.

So, if people are interested in this, code something up and try it. 
Stress it as much as you can.  I would oppose adding any new futex
interface created at a hunch if I'd be Andrew.

And is another thing to consider.  There is at least one other event
which should be pollable: process (maybe threads) deaths.  I was
hoping that we get support for this, perhaps in the form of polling
the /proc/PID directory.  For poll(), a POLLERR value could mean the
process/thread died.  For select(), once again a  bit in the except
array could be set.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Pollable Semaphores

2005-01-21 Thread Chris Wright
* Ulrich Drepper ([EMAIL PROTECTED]) wrote:
 And is another thing to consider.  There is at least one other event
 which should be pollable: process (maybe threads) deaths.  I was
 hoping that we get support for this, perhaps in the form of polling
 the /proc/PID directory.  For poll(), a POLLERR value could mean the
 process/thread died.  For select(), once again a  bit in the except
 array could be set.

I have a simple patch that does just that.  It worked after brief testing,
then I never went back to look at it any more.  I'll see if I can't dig
it up, maybe it's useful.

thanks,
-chris
-- 
Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Pollable Semaphores

2005-01-21 Thread Chris Wright
* Chris Wright ([EMAIL PROTECTED]) wrote:
 * Ulrich Drepper ([EMAIL PROTECTED]) wrote:
  And is another thing to consider.  There is at least one other event
  which should be pollable: process (maybe threads) deaths.  I was
  hoping that we get support for this, perhaps in the form of polling
  the /proc/PID directory.  For poll(), a POLLERR value could mean the
  process/thread died.  For select(), once again a  bit in the except
  array could be set.
 
 I have a simple patch that does just that.  It worked after brief testing,
 then I never went back to look at it any more.  I'll see if I can't dig
 it up, maybe it's useful.

Yeah, here it is.  I refreshed it against a current kernel.  It passes my
same old test, where I select on /proc/pid/status fd in exceptfds.

= fs/proc/base.c 1.86 vs edited =
--- 1.86/fs/proc/base.c 2005-01-10 17:29:31 -08:00
+++ edited/fs/proc/base.c   2005-01-21 22:51:00 -08:00
@@ -32,6 +32,7 @@
 #include linux/mount.h
 #include linux/security.h
 #include linux/ptrace.h
+#include linux/poll.h
 #include internal.h
 
 /*
@@ -519,8 +520,21 @@ static ssize_t proc_info_read(struct fil
return length;
 }
 
+static unsigned int proc_info_poll(struct file *file, poll_table *wait)
+{
+   struct inode *inode = file-f_dentry-d_inode;
+   struct task_struct *task = proc_task(inode);
+   wait_queue_head_t *pid_wait = PROC_I(task-proc_dentry-d_inode)-wait;
+
+   poll_wait(file, pid_wait, wait);
+   if (!pid_alive(task))
+   return POLLPRI;
+   return 0;
+}
+
 static struct file_operations proc_info_file_operations = {
.read   = proc_info_read,
+   .poll   = proc_info_poll,
 };
 
 static int mem_open(struct inode* inode, struct file* file)
@@ -1489,6 +1503,8 @@ void proc_pid_flush(struct dentry *proc_
 {
might_sleep();
if(proc_dentry != NULL) {
+   wait_queue_head_t *pid_wait=PROC_I(proc_dentry-d_inode)-wait;
+   wake_up_interruptible(pid_wait);
shrink_dcache_parent(proc_dentry);
dput(proc_dentry);
}
= fs/proc/inode.c 1.31 vs edited =
--- 1.31/fs/proc/inode.c2005-01-04 18:48:14 -08:00
+++ edited/fs/proc/inode.c  2005-01-21 22:48:06 -08:00
@@ -97,6 +97,7 @@ static struct inode *proc_alloc_inode(st
ei-type = 0;
ei-op.proc_get_link = NULL;
ei-pde = NULL;
+   init_waitqueue_head(ei-wait);
inode = ei-vfs_inode;
inode-i_mtime = inode-i_atime = inode-i_ctime = CURRENT_TIME;
return inode;
= include/linux/proc_fs.h 1.41 vs edited =
--- 1.41/include/linux/proc_fs.h2005-01-07 21:44:33 -08:00
+++ edited/include/linux/proc_fs.h  2005-01-21 22:48:06 -08:00
@@ -243,6 +243,7 @@ struct proc_inode {
int (*proc_read)(struct task_struct *task, char *page);
} op;
struct proc_dir_entry *pde;
+   wait_queue_head_t wait;
struct inode vfs_inode;
 };
 
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Pollable Semaphores

2005-01-21 Thread Ulrich Drepper
On Fri, 21 Jan 2005 23:05:04 -0800, Chris Wright [EMAIL PROTECTED] wrote:
 Yeah, here it is.  I refreshed it against a current kernel.  It passes my
 same old test, where I select on /proc/pid/status fd in exceptfds.

Looks certainly attractive to me.  Nice small patch.  How quickly
after the death of the process is proc_pid_flush() called?

If this could go in and the futex stuff is handled, there is only
async I/O to handle.  After that we could finally create a uniform
event mechanism at userlevel which binds all these events (I/O,
process/thread termination, sync primitives) together.  Maybe support
for legacy sync primitives (SysV semaphores, msg queues) is needed as
well, don't know yet.  Note that I assume that polling of POSIX
mqueues works as it did the last time I tried it.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/