Re: In-kernel process exit hooks?

2016-01-09 Thread Paul Goyette

Let me rephrase - this is the only explanation I'm going to provide:

_I_ am not going to remove it.  If others feel so strongly that they 
would rather remove existing functionality (as ugly as it is), then 
_they_ can do the deed.




On Sat, 9 Jan 2016, Wolfgang Solfrank wrote:


Hi,


We all agree that filemon(4) is an ugly hack.  It probably should never
have gotten committed.  But it is there now, and there are a (very) few
use-cases.  So we don't want to remove it without having a replacement
implementation.


Well, can you explain?  Why would we not want to remove it and be done
with that nonsense?

Ciao,
Wolfgang
--
wolfg...@solfrank.net   Wolfgang Solfrank



+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+--+--++


Re: In-kernel process exit hooks?

2016-01-09 Thread Mateusz Guzik
On Fri, Jan 08, 2016 at 07:08:19AM +, David Holland wrote:
> On Thu, Jan 07, 2016 at 07:34:33AM +0800, Paul Goyette wrote:
>  > Based on internal implementation of filemon(4), there is an ordering
>  > requirement imposed on the sequence of events that occur when a process
>  > using /dev/filemon exits.  In particular, the file descriptor on which the
>  > device is open needs to be closed prior to closing the descriptor to which
>  > avent/activity records are being logged.  (Otherwise, because of an extra
>  > reference on the log file, fd_close() will hang indefinitely.)
> 
> Looking at the filemon code... it is completely wrong. It should not
> be using file handles like that, especially not in a device driver.
> Rather than adding new hacks to the system to work around it being
> silly, it should be redone properly.
> 
> For an example of the right way to do this kind of thing, look in
> kern_acct.c.
> 

I would like to add my $0,03 seeing that there is time invested into
what I consider a lost cause.

I have to agree that filemon is misdesigned and of low quality in
general.

While I don't know all the details, it is clear that the purpose would
be much better served by ktrace and I would argue efforts should be
spent there.

bmake takes filemon output and copies it into some file.

>From usability perspective what seems to be needed is few hacks to
ktrace to only log the stuff make is interested in and parsing code for
bmake to translate to what looks like current filemon format for
aformentioned copying.

I only had a brief look at ktrace implementation. It uses global locks,
so that would have to be worked on, but it should be trivial to at least
make tracees using different output files not compete with each other.

So, what's so fitting about ktrace design: tracing is tied to the target
process, can react to things like changing credentials and reparenting
(fork + parent exit), but most importantly it does not make assumptions
about things which can change without its control (see below).

Filemon monitors stuff at syscall level, stores the fd and remembers
pids.

Let's see how this leads to trouble. Some of these problems were also
present in FreeBSD implementation (and some still are). A bonus
interesting fact is that implementations have diverged, although are
still fundamentally broken.

http://nxr.netbsd.org/xref/src/sys/dev/filemon/filemon.c#filemon_ioctl
>case FILEMON_SET_PID:
>/* Set the monitored process ID - if allowed. */
>mutex_enter(proc_lock);
>tp = proc_find(*((pid_t *) data));

There are no steps taken to keep tp around nor in a certain state (see
below). proc_find only finds the pointer.

>mutex_exit(proc_lock);
>if (tp == NULL ||
>tp->p_emul != _netbsd) {

tp could have exited and be freed by now. Now that the comparison was
done it could have p_emul changed.

>error = ESRCH;
>break;
>}
>
>error = kauth_authorize_process(curproc->p_cred,
>KAUTH_PROCESS_CANSEE, tp,
>KAUTH_ARG(KAUTH_REQ_PROCESS_CANSEE_ENTRY), NULL, NULL);

Now that the check is pased the process could have executed something
setuid.

I'm somewhat surprised this code is here. Should not
kauth_authorize_process assert that tp has stable credentials in some
way, in effect just crashing filemon consumers?

>if (!error) {
>filemon->fm_pid = tp->p_pid;
>}

Pid is stored, but tp is not held. What if the 'traced' process exits?
grep only reveals the following bit in filemon_wrapper_sys_exit:

>if (filemon->fm_pid == curproc->p_pid)
>filemon_printf(filemon, "# Bye bye\n");

So filemon will continue tracing anything which happened to reuse the
pid.

"Tracing points" are at syscall layer. With only pid remembered this
adds up to serious breakage induced by the need to look the filemon structure
up each time, and based on unreliable data.

>static struct filemon *
>filemon_pid_check(struct proc * p)
>{   
>struct filemon *filemon;
>struct proc * lp;
>
>if (!TAILQ_EMPTY(_inuse)) {
>while (p) {
>/*
> * make sure p cannot exit
> * until we have moved on to p_pptr
> */
>rw_enter(>p_reflock, RW_READER);
>TAILQ_FOREACH(filemon, _inuse, fm_link) {
>if (p->p_pid == filemon->fm_pid) {
>rw_exit(>p_reflock);
>return (filemon);
>}
>}

But p_pid should be constant for 'struct proc' lifetime, so locking this
early is wasteful.

>lp = 

Re: In-kernel process exit hooks?

2016-01-09 Thread Mateusz Guzik
On Sat, Jan 09, 2016 at 08:25:05AM +0100, Mateusz Guzik wrote:
> On Sat, Jan 09, 2016 at 02:25:02PM +0800, Paul Goyette wrote:
> > On Sat, 9 Jan 2016, Mateusz Guzik wrote:
> > 
> > >While I don't know all the details, it is clear that the purpose would
> > >be much better served by ktrace and I would argue efforts should be
> > >spent there.
> > 
> > filemon's purpose is somewhat different than that of ktrace.  There
> > is a limited set of events that filemon captures, and it only
> > captures successful operations.  Its output is quite simple and easy
> > to parse by a human of shell script.
> > 
> 
> ktrace already have limited support for specifying what events are
> wanted. One can add a special filemon-like mode.
> 
> The output of kdump is definitely less friendly for scripts, but it is
> way easier to write a script formatting it than it is to fix filemon.
> 
> > >>From usability perspective what seems to be needed is few hacks to
> > >ktrace to only log the stuff make is interested in and parsing code for
> > >bmake to translate to what looks like current filemon format for
> > >aformentioned copying.
> > 
> > Perhaps.  But filemon is also a run-time loadable module (and it is
> > preferred that it NOT be built-in to any kernels).  If I remember
> > correctly, ktrace has hooks all over the system and does not readily
> > lend itself to modularization.
> > 
> 
> I doubt that compiling in ktrace on a which can handle filemon is an
> issue, and even then it should outweight hops around filemon.
> 
> For correct of a minimalistic safe filemon module the kernel would
> already have to grow hooks in few places (see below).
> 
> 
> > >Filemon monitors stuff at syscall level, stores the fd and remembers
> > >pids.
> > 
> > The fd is no longer being stored...  :)  But yes, it does still
> > remember pids, so it can determine if an event belongs to a monitored
> > process tree.  Again IIRC, ktrace is somewhat more intrusive, in that
> > it manipulates various process flags which can affect behavior of the
> > target (monitored) process.
> 
> It has a dedicated field in struct proc which is the only sensible way
> of implementing this I can think of.
> 
> > >>   error = kauth_authorize_process(curproc->p_cred,
> > >>   KAUTH_PROCESS_CANSEE, tp,
> > >>   KAUTH_ARG(KAUTH_REQ_PROCESS_CANSEE_ENTRY), NULL, NULL);
> > >
> > >Now that the check is pased the process could have executed something
> > >setuid.
> > 
> > Which process could have executed something setuid?  How would that
> > matter?  (I will confess I haven't checked, but I would be surprised
> > if "executing something setuid" would change the result of kauth()
> > check. But I could be wrong here.)
> > 
> 
> You start monitoring given process and proceed to execve a setuid file,
> which gives you information what it is doing, even though the
> kauth_authorize_process check would fail. This is basically what you
> yourself stated below about pid reuse.
> 
> So this either needs to prevent processes from changing credentials when
> traced or stop tracing when such a change occurs. The former seems like
> a rather bad idea and latter is already done by ktrace.
> 
> > >I'm somewhat surprised this code is here. Should not
> > >kauth_authorize_process assert that tp has stable credentials in some
> > >way, in effect just crashing filemon consumers?
> > >
> > >>   if (!error) {
> > >>   filemon->fm_pid = tp->p_pid;
> > >>   }
> > 
> > Why does the target process need stable credentials?  It's not doing
> > anything for filemon.
> > 
> 
> See below.
> 
> > >Pid is stored, but tp is not held. What if the 'traced' process exits?
> > >grep only reveals the following bit in filemon_wrapper_sys_exit:
> > >
> > >>   if (filemon->fm_pid == curproc->p_pid)
> > >>   filemon_printf(filemon, "# Bye bye\n");
> > >
> > >So filemon will continue tracing anything which happened to reuse the
> > >pid.
> > 
> > Yes, this is an outstanding issue.  More particularly, the new user
> > of the pid might not be accessible by the monitoring process (ie, it
> > would fail the earlier kauth() check).  But holding the pointer to
> > the proc isn't really the answer either - although less likely it is
> > entirely possible that the address could have been reused.
> > 
> 
> What you need is the ability to clear tracing bits on process exit.
> ktrace definitely has relevant bits in place already.
> 
> > (It's too bad we don't have "generation numbers" on the pid, which
> > would easily detect reuse.)
> 
> That would require struct proc to be type stable, which seems wasteful.
> 
> > >>   lp = p;
> > >>   p = p->p_pptr;
> > >>   rw_exit(>p_reflock);
> > >
> > >I guess the lock is needed to read the pointer to the parent.
> > 
> > Correct.
> > 
> > >Nothing was done to keep the parent around, which means it could 

Re: In-kernel process exit hooks?

2016-01-09 Thread Mateusz Guzik
On Sat, Jan 09, 2016 at 02:25:02PM +0800, Paul Goyette wrote:
> On Sat, 9 Jan 2016, Mateusz Guzik wrote:
> 
> >While I don't know all the details, it is clear that the purpose would
> >be much better served by ktrace and I would argue efforts should be
> >spent there.
> 
> filemon's purpose is somewhat different than that of ktrace.  There
> is a limited set of events that filemon captures, and it only
> captures successful operations.  Its output is quite simple and easy
> to parse by a human of shell script.
> 

ktrace already have limited support for specifying what events are
wanted. One can add a special filemon-like mode.

The output of kdump is definitely less friendly for scripts, but it is
way easier to write a script formatting it than it is to fix filemon.

> >>From usability perspective what seems to be needed is few hacks to
> >ktrace to only log the stuff make is interested in and parsing code for
> >bmake to translate to what looks like current filemon format for
> >aformentioned copying.
> 
> Perhaps.  But filemon is also a run-time loadable module (and it is
> preferred that it NOT be built-in to any kernels).  If I remember
> correctly, ktrace has hooks all over the system and does not readily
> lend itself to modularization.
> 

I doubt that compiling in ktrace on a which can handle filemon is an
issue, and even then it should outweight hops around filemon.

For correct of a minimalistic safe filemon module the kernel would
already have to grow hooks in few places (see below).


> >Filemon monitors stuff at syscall level, stores the fd and remembers
> >pids.
> 
> The fd is no longer being stored...  :)  But yes, it does still
> remember pids, so it can determine if an event belongs to a monitored
> process tree.  Again IIRC, ktrace is somewhat more intrusive, in that
> it manipulates various process flags which can affect behavior of the
> target (monitored) process.

It has a dedicated field in struct proc which is the only sensible way
of implementing this I can think of.

> >>   error = kauth_authorize_process(curproc->p_cred,
> >>   KAUTH_PROCESS_CANSEE, tp,
> >>   KAUTH_ARG(KAUTH_REQ_PROCESS_CANSEE_ENTRY), NULL, NULL);
> >
> >Now that the check is pased the process could have executed something
> >setuid.
> 
> Which process could have executed something setuid?  How would that
> matter?  (I will confess I haven't checked, but I would be surprised
> if "executing something setuid" would change the result of kauth()
> check. But I could be wrong here.)
> 

You start monitoring given process and proceed to execve a setuid file,
which gives you information what it is doing, even though the
kauth_authorize_process check would fail. This is basically what you
yourself stated below about pid reuse.

So this either needs to prevent processes from changing credentials when
traced or stop tracing when such a change occurs. The former seems like
a rather bad idea and latter is already done by ktrace.

> >I'm somewhat surprised this code is here. Should not
> >kauth_authorize_process assert that tp has stable credentials in some
> >way, in effect just crashing filemon consumers?
> >
> >>   if (!error) {
> >>   filemon->fm_pid = tp->p_pid;
> >>   }
> 
> Why does the target process need stable credentials?  It's not doing
> anything for filemon.
> 

See below.

> >Pid is stored, but tp is not held. What if the 'traced' process exits?
> >grep only reveals the following bit in filemon_wrapper_sys_exit:
> >
> >>   if (filemon->fm_pid == curproc->p_pid)
> >>   filemon_printf(filemon, "# Bye bye\n");
> >
> >So filemon will continue tracing anything which happened to reuse the
> >pid.
> 
> Yes, this is an outstanding issue.  More particularly, the new user
> of the pid might not be accessible by the monitoring process (ie, it
> would fail the earlier kauth() check).  But holding the pointer to
> the proc isn't really the answer either - although less likely it is
> entirely possible that the address could have been reused.
> 

What you need is the ability to clear tracing bits on process exit.
ktrace definitely has relevant bits in place already.

> (It's too bad we don't have "generation numbers" on the pid, which
> would easily detect reuse.)

That would require struct proc to be type stable, which seems wasteful.

> >>   lp = p;
> >>   p = p->p_pptr;
> >>   rw_exit(>p_reflock);
> >
> >I guess the lock is needed to read the pointer to the parent.
> 
> Correct.
> 
> >Nothing was done to keep the parent around, which means it could have
> >exited and be freed by now. Which in turn makes the second loop
> >iteration a use-after-free.
> 
> Hmmm, this code used to work.  Looks like the locking got messed up
> somewhere.  It should do (in pseudo code - I'm too lazy to write it
> all out!)
> 
>   lock(p)
>   while (p != 

Re: In-kernel process exit hooks?

2016-01-09 Thread Wolfgang Solfrank

Hi,


We all agree that filemon(4) is an ugly hack.  It probably should never
have gotten committed.  But it is there now, and there are a (very) few
use-cases.  So we don't want to remove it without having a replacement
implementation.


Well, can you explain?  Why would we not want to remove it and be done
with that nonsense?

Ciao,
Wolfgang
--
wolfg...@solfrank.net   Wolfgang Solfrank


Re: In-kernel process exit hooks?

2016-01-08 Thread Paul Goyette

On Fri, 8 Jan 2016, Paul Goyette wrote:


Hmmm, option #3 doesn't work so well, after all!

I tried it when both the target and monitor processes were children of the 
same shell process.  Thus all three of these share the same 'struct file' for 
their stdout, and the filemon_fd_close() callback cannot tell which one is 
being closed.  The all have the same fd (just in different descriptor 
tables?)


So next I'm going to try kre's suggestion about using a "proper" refcnt 
mechanism and see if that helps.


Robert Elz wrote:


Actually, thinking through this more, why not just "fix" filemon to
make a proper reference to the file, instead of the half baked thing
it is currently doing.

That is, instead of a fd_getfile() without an (almost immediate)
fd_putfile() so keeping ff_refcount incremented, in filemon, just do

   fp = fd_getfile(...);
   fp->f_count++;
   fd_putfile(...);

so filemon has a proper reference to the file*.   When it is done, it
just closes it, like any other file would be closed (which decrements
f_count and frees the struct file if it goes to 0).

Wouldn't this solve all the problems, keep the semantics the same, and
just generally be cleaner?


Option 4 works!  See attached diffs.  Thanks, kre!


I'll commit this in a day or two unless I receive some extremely 
negative feedback.  I know that filemon(4) is not the best code, and it 
really needs a total redesign and rewrite, but that's going to take a 
fair amount of time.  Until then, I think with these changes that we'll 
have a safe-and-stable implementation.  (And if it is decided to remove 
the current code completely, at least we can have a safe-and-stable copy 
in the attic!)



+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+--+--++Index: filemon.c
===
RCS file: /cvsroot/src/sys/dev/filemon/filemon.c,v
retrieving revision 1.24
diff -u -p -r1.24 filemon.c
--- filemon.c   5 Jan 2016 22:08:54 -   1.24
+++ filemon.c   8 Jan 2016 06:39:58 -
@@ -222,7 +222,6 @@ filemon_open(dev_t dev, int oflags __unu
 
filemon = kmem_alloc(sizeof(struct filemon), KM_SLEEP);
rw_init(>fm_mtx);
-   filemon->fm_fd = -1;
filemon->fm_fp = NULL;
filemon->fm_pid = curproc->p_pid;
 
@@ -265,7 +264,7 @@ filemon_close(struct file * fp)
 */
rw_enter(>fm_mtx, RW_WRITER);
if (filemon->fm_fp) {
-   fd_putfile(filemon->fm_fd); /* release our reference */
+   closef(filemon->fm_fp); /* release our reference */
filemon->fm_fp = NULL;
}
rw_exit(>fm_mtx);
@@ -279,6 +278,7 @@ static int
 filemon_ioctl(struct file * fp, u_long cmd, void *data)
 {
int error = 0;
+   int fd;
struct filemon *filemon;
struct proc *tp;
 
@@ -301,14 +301,16 @@ filemon_ioctl(struct file * fp, u_long c
 
/* First, release any current output file descriptor */
if (filemon->fm_fp)
-   fd_putfile(filemon->fm_fd);
+   closef(filemon->fm_fp);
 
/* Now set up the new one */
-   filemon->fm_fd = *((int *) data);
-   if ((filemon->fm_fp = fd_getfile(filemon->fm_fd)) == NULL) {
+   fd = *((int *) data);
+   if ((filemon->fm_fp = fd_getfile(fd)) == NULL) {
error = EBADF;
break;
}
+   filemon->fm_fp->f_count++;
+   fd_putfile(fd);
/* Write the file header. */
filemon_comment(filemon);
break;
Index: filemon.h
===
RCS file: /cvsroot/src/sys/dev/filemon/filemon.h,v
retrieving revision 1.8
diff -u -p -r1.8 filemon.h
--- filemon.h   25 Nov 2015 07:34:49 -  1.8
+++ filemon.h   8 Jan 2016 06:39:58 -
@@ -41,7 +41,6 @@ struct filemon {
char fm_fname1[MAXPATHLEN];/* Temporary filename buffer. */
char fm_fname2[MAXPATHLEN];/* Temporary filename buffer. */
char fm_msgbufr[32 + 2 * MAXPATHLEN];   /* Output message buffer. */
-   int fm_fd;  /* Output fd */
struct file *fm_fp; /* Output file pointer. */
krwlock_t fm_mtx;   /* Lock mutex for this filemon. */
TAILQ_ENTRY(filemon) fm_link;   /* Link into the in-use list. */


Re: In-kernel process exit hooks?

2016-01-08 Thread bch
On 1/7/16, David Holland  wrote:
> On Fri, Jan 08, 2016 at 07:08:19AM +, David Holland wrote:
>  > For an example of the right way to do this kind of thing, look in
>  > kern_acct.c.
>
> Better example: sys_fktrace, since that uses a file handle. And it
> does virtually the same thing that filemon's trying to do, except much
> more thoroughly.

Out of curiousity, I'm trying to use this interface, but getting:

/usr/include/sys/ktrace.h:83:20: error: field '_ts' has incomplete type
struct timespec _ts;

Does this look familiar to anybody ?

I also needed to manually include  for MAXCOMLEN (used in
 L66)


Cheers,

-bch



> --
> David A. Holland
> dholl...@netbsd.org
>


Re: In-kernel process exit hooks?

2016-01-08 Thread Paul Goyette

On Fri, 8 Jan 2016, bch wrote:


On 1/7/16, David Holland  wrote:

On Fri, Jan 08, 2016 at 07:08:19AM +, David Holland wrote:
> For an example of the right way to do this kind of thing, look in
> kern_acct.c.

Better example: sys_fktrace, since that uses a file handle. And it
does virtually the same thing that filemon's trying to do, except much
more thoroughly.


Out of curiousity, I'm trying to use this interface, but getting:

/usr/include/sys/ktrace.h:83:20: error: field '_ts' has incomplete type
   struct timespec _ts;

Does this look familiar to anybody ?


Perhaps manually include  ?



I also needed to manually include  for MAXCOMLEN (used in
 L66)


Cheers,

-bch




--
David A. Holland
dholl...@netbsd.org



!DSPAM:568f7d53105566048743086!




+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+--+--++


Re: In-kernel process exit hooks?

2016-01-08 Thread bch
On 1/8/16, Paul Goyette  wrote:
> On Fri, 8 Jan 2016, bch wrote:
>
>> On 1/7/16, David Holland  wrote:
>>> On Fri, Jan 08, 2016 at 07:08:19AM +, David Holland wrote:
>>> > For an example of the right way to do this kind of thing, look in
>>> > kern_acct.c.
>>>
>>> Better example: sys_fktrace, since that uses a file handle. And it
>>> does virtually the same thing that filemon's trying to do, except much
>>> more thoroughly.
>>
>> Out of curiousity, I'm trying to use this interface, but getting:
>>
>> /usr/include/sys/ktrace.h:83:20: error: field '_ts' has incomplete type
>>struct timespec _ts;
>>
>> Does this look familiar to anybody ?
>
> Perhaps manually include  ?

That's it.


Thanks.

>> I also needed to manually include  for MAXCOMLEN (used in
>>  L66)
>>
>>
>> Cheers,
>>
>> -bch
>>
>>
>>
>>> --
>>> David A. Holland
>>> dholl...@netbsd.org
>>>
>>
>> !DSPAM:568f7d53105566048743086!
>>
>>
>
> +--+--++
> | Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
> | (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
> | Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
> +--+--++
>


Re: In-kernel process exit hooks?

2016-01-08 Thread David Young
On Fri, Jan 08, 2016 at 12:52:16PM +0700, Robert Elz wrote:
> Date:Fri, 8 Jan 2016 11:22:28 +0800 (PHT)
> From:Paul Goyette 
> Message-ID:  
> 
>   | Is there a "supported" interface for detaching the file (or descriptor) 
>   | from the process without closing it?
> 
> Actually, thinking through this more, why not just "fix" filemon to make
> a proper reference to the file, instead of the half baked thing it is
> currently doing.

Yes, please! :-)

Furthermore, stick the file into LWP 0's descriptor table so that you
can see it with fstat.  It's a little more code to write---I wrote it
for gre(4)---but it's well worth the visibility.

Dave

-- 
David Young
dyo...@pobox.comUrbana, IL(217) 721-9981


Re: In-kernel process exit hooks?

2016-01-08 Thread bch
On 1/8/16, Taylor R Campbell  wrote:
>Date: Fri, 8 Jan 2016 17:13:54 +0800 (PHT)
>From: Paul Goyette 
>
>On Fri, 8 Jan 2016, bch wrote:
>
>> Out of curiousity, I'm trying to use this interface, but getting:
>>
>> /usr/include/sys/ktrace.h:83:20: error: field '_ts' has incomplete
> type
>>struct timespec _ts;
>>
>> Does this look familiar to anybody ?
>
>Perhaps manually include  ?
>
> Shouldn't be necessary.  If  uses struct timespec, it
> should include .
>

http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=50633

http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=50634


Re: In-kernel process exit hooks?

2016-01-08 Thread Paul Goyette

On Fri, 8 Jan 2016, Taylor R Campbell wrote:


  Date: Fri, 8 Jan 2016 16:52:28 +0800 (PHT)
  From: Paul Goyette 

  Robert Elz wrote:

  > That is, instead of a fd_getfile() without an (almost immediate)
  > fd_putfile() so keeping ff_refcount incremented, in filemon, just do
  >
  >fp = fd_getfile(...);
  >fp->f_count++;
  >fd_putfile(...);
  >
  > so filemon has a proper reference to the file*.   When it is done, it
  > just closes it, like any other file would be closed (which decrements
  > f_count and frees the struct file if it goes to 0).

  Option 4 works!  See attached diffs.  Thanks, kre!

This is essentially the same as fd_getfile2, except with a lot more
logic involved, including fiddly details that you have to take care
of.  And...

  I'll commit this in a day or two unless I receive some extremely
  negative feedback.

  @@ -301,14 +301,16 @@ filemon_ioctl(struct file * fp, u_long c
  [...]
  -   filemon->fm_fd = *((int *) data);
  -   if ((filemon->fm_fp = fd_getfile(filemon->fm_fd)) == NULL) {
  +   fd = *((int *) data);
  +   if ((filemon->fm_fp = fd_getfile(fd)) == NULL) {
  error = EBADF;
  break;
  }
  +   filemon->fm_fp->f_count++;
  +   fd_putfile(fd);

...you missed one: you can't touch fp->f_count without holding
fp->f_lock.

So I reiterate my suggestion to use fd_getfile2:

fd = *((int *)data);
if ((filemon->fm_fp = fd_getfile2(curproc, fd)) == NULL) {
error = EBADF;
break;
}


Ack - let me test and confirm that it works.



+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+--+--++


Re: In-kernel process exit hooks?

2016-01-08 Thread Paul Goyette

On Fri, 8 Jan 2016, Taylor R Campbell wrote:


  > Next, I'm going to have a look at fd_getfile2()/fclose() and see if that
  > is a viable approach.

  Hmmm.  The comments in kern/kern_descrip.c for fd_getfile2() require
  that the process is not allowed to fork or exit "across this call"
  (which I assume means "until the reference to the struct file is
  released").

I don't think that's what it means.  I think it means the process
whose descriptor we're asking for can't fork or exit *during the call*
to fd_getfile2.  In this case, it's the current process, so that's not
an issue.

The reason it is an issue stated in a comment is that the routine
appears to have been written for the purpose of procfs, in which case
the calling process is often not the same as the process passed to
fd_getfile2.


Ah, OK, that makes sense.




+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+--+--++


Re: In-kernel process exit hooks?

2016-01-08 Thread David Young
On Fri, Jan 08, 2016 at 10:47:08AM -0600, David Young wrote:
> On Fri, Jan 08, 2016 at 12:52:16PM +0700, Robert Elz wrote:
> > Date:Fri, 8 Jan 2016 11:22:28 +0800 (PHT)
> > From:Paul Goyette 
> > Message-ID:  
> > 
> >   | Is there a "supported" interface for detaching the file (or descriptor) 
> >   | from the process without closing it?
> > 
> > Actually, thinking through this more, why not just "fix" filemon to make
> > a proper reference to the file, instead of the half baked thing it is
> > currently doing.
> 
> Yes, please! :-)
> 
> Furthermore, stick the file into LWP 0's descriptor table so that you

Oops, meant PID 0's.

Dave

-- 
David Young
dyo...@pobox.comUrbana, IL(217) 721-9981


Re: In-kernel process exit hooks?

2016-01-08 Thread Paul Goyette

On Sat, 9 Jan 2016, Mateusz Guzik wrote:


While I don't know all the details, it is clear that the purpose would
be much better served by ktrace and I would argue efforts should be
spent there.


filemon's purpose is somewhat different than that of ktrace.  There is a 
limited set of events that filemon captures, and it only captures 
successful operations.  Its output is quite simple and easy to parse by 
a human of shell script.



bmake takes filemon output and copies it into some file.


Yes.  In "meta-mode" it will use the output from filemon to capture 
dependency information.



From usability perspective what seems to be needed is few hacks to

ktrace to only log the stuff make is interested in and parsing code for
bmake to translate to what looks like current filemon format for
aformentioned copying.


Perhaps.  But filemon is also a run-time loadable module (and it is 
preferred that it NOT be built-in to any kernels).  If I remember 
correctly, ktrace has hooks all over the system and does not readily 
lend itself to modularization.



I only had a brief look at ktrace implementation. It uses global locks,
so that would have to be worked on, but it should be trivial to at least
make tracees using different output files not compete with each other.

So, what's so fitting about ktrace design: tracing is tied to the target
process, can react to things like changing credentials and reparenting
(fork + parent exit), but most importantly it does not make assumptions
about things which can change without its control (see below).

Filemon monitors stuff at syscall level, stores the fd and remembers
pids.


The fd is no longer being stored...  :)  But yes, it does still remember 
pids, so it can determine if an event belongs to a monitored process 
tree.  Again IIRC, ktrace is somewhat more intrusive, in that it 
manipulates various process flags which can affect behavior of the 
target (monitored) process.



Let's see how this leads to trouble. Some of these problems were also
present in FreeBSD implementation (and some still are). A bonus
interesting fact is that implementations have diverged, although are
still fundamentally broken.

http://nxr.netbsd.org/xref/src/sys/dev/filemon/filemon.c#filemon_ioctl

   case FILEMON_SET_PID:
   /* Set the monitored process ID - if allowed. */
   mutex_enter(proc_lock);
   tp = proc_find(*((pid_t *) data));


There are no steps taken to keep tp around nor in a certain state (see
below). proc_find only finds the pointer.


We really don't need further access to the struct proc...


   mutex_exit(proc_lock);
   if (tp == NULL ||
   tp->p_emul != _netbsd) {


tp could have exited and be freed by now. Now that the comparison was
done it could have p_emul changed.


The mutex_exit() has been moved to after the last use of data which is 
protected by the mutex.



   error = ESRCH;
   break;
   }

   error = kauth_authorize_process(curproc->p_cred,
   KAUTH_PROCESS_CANSEE, tp,
   KAUTH_ARG(KAUTH_REQ_PROCESS_CANSEE_ENTRY), NULL, NULL);


Now that the check is pased the process could have executed something
setuid.


Which process could have executed something setuid?  How would that 
matter?  (I will confess I haven't checked, but I would be surprised if 
"executing something setuid" would change the result of kauth() check. 
But I could be wrong here.)



I'm somewhat surprised this code is here. Should not
kauth_authorize_process assert that tp has stable credentials in some
way, in effect just crashing filemon consumers?


   if (!error) {
   filemon->fm_pid = tp->p_pid;
   }


Why does the target process need stable credentials?  It's not doing 
anything for filemon.



Pid is stored, but tp is not held. What if the 'traced' process exits?
grep only reveals the following bit in filemon_wrapper_sys_exit:


   if (filemon->fm_pid == curproc->p_pid)
   filemon_printf(filemon, "# Bye bye\n");


So filemon will continue tracing anything which happened to reuse the
pid.


Yes, this is an outstanding issue.  More particularly, the new user of 
the pid might not be accessible by the monitoring process (ie, it would 
fail the earlier kauth() check).  But holding the pointer to the proc 
isn't really the answer either - although less likely it is entirely 
possible that the address could have been reused.


(It's too bad we don't have "generation numbers" on the pid, which would 
easily detect reuse.)



"Tracing points" are at syscall layer. With only pid remembered this
adds up to serious breakage induced by the need to look the filemon structure
up each time, and based on unreliable data.


static struct filemon *
filemon_pid_check(struct proc * p)
{
   struct filemon *filemon;
   struct proc * lp;

   if 

Re: In-kernel process exit hooks?

2016-01-08 Thread Paul Goyette

I'm not going to address individual points

We all agree that filemon(4) is an ugly hack.  It probably should never
have gotten committed.  But it is there now, and there are a (very) few
use-cases.  So we don't want to remove it without having a replacement
implementation.  (Although some of the issues, like not noticing a pid
being resued, could conceivably be considered security-related - ie,
information disclosure of a process to which you don't have access...)

I'll be working on a replacement, but it will take a while to get it
right.  At least several months, perhaps even longer.


make(1) (or bmake) doesn't have to move very far to avoid filemon(4). 
The only usage of filemon within make is for "meta-mode" which is only 
really understood by its author.  Indeed, filemon was created just for 
this one use-case;  any other use of filemon is truly accidental!


OK, I will address just one of your points individually:


The thing is there is filemon and bmake in FreeBSD as well. While I
don't have the time to modify ktrace for this purpose, I would
definitely help both projects if bmake started moving away from
filemon.

However, this is just my $0,03 as I don't contribute to this project.


Nothing prevents you from working on this and contributing any working 
code.  We're all volunteers here.


:)


+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+--+--++


Re: In-kernel process exit hooks?

2016-01-08 Thread David Holland
On Sat, Jan 09, 2016 at 08:25:05AM +0100, Mateusz Guzik wrote:
 > >if (!mutex_tryenter(parent->p_lock)) {
 > >mutex_exit(t->p_lock);
 > >mutex_enter(parent->p_lock);
 > 
 > As a side note this looks like a bug. t->p_lock is not relocked, so the
 > code ends up without the lock held. There is a similar sample in fork1.

I just fixed this. Thanks.

-- 
David A. Holland
dholl...@netbsd.org


Re: In-kernel process exit hooks?

2016-01-08 Thread Paul Goyette

So I reiterate my suggestion to use fd_getfile2:

fd = *((int *)data);
if ((filemon->fm_fp = fd_getfile2(curproc, fd)) == NULL) {
error = EBADF;
break;
}


Ack - let me test and confirm that it works.


Yes, it works just fine.  Revised diffs are attached.

So, I'll plan on committing this in the next couple of days.  Unless 
there is strong objection raised.




+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+--+--++Index: filemon.c
===
RCS file: /cvsroot/src/sys/dev/filemon/filemon.c,v
retrieving revision 1.26
diff -u -p -r1.26 filemon.c
--- filemon.c   8 Jan 2016 08:57:14 -   1.26
+++ filemon.c   9 Jan 2016 04:05:21 -
@@ -222,7 +222,6 @@ filemon_open(dev_t dev, int oflags __unu
 
filemon = kmem_alloc(sizeof(struct filemon), KM_SLEEP);
rw_init(>fm_mtx);
-   filemon->fm_fd = -1;
filemon->fm_fp = NULL;
filemon->fm_pid = curproc->p_pid;
 
@@ -265,7 +264,7 @@ filemon_close(struct file * fp)
 */
rw_enter(>fm_mtx, RW_WRITER);
if (filemon->fm_fp) {
-   fd_putfile(filemon->fm_fd); /* release our reference */
+   closef(filemon->fm_fp); /* release our reference */
filemon->fm_fp = NULL;
}
rw_exit(>fm_mtx);
@@ -279,6 +278,7 @@ static int
 filemon_ioctl(struct file * fp, u_long cmd, void *data)
 {
int error = 0;
+   int fd;
struct filemon *filemon;
struct proc *tp;
 
@@ -301,11 +301,11 @@ filemon_ioctl(struct file * fp, u_long c
 
/* First, release any current output file descriptor */
if (filemon->fm_fp)
-   fd_putfile(filemon->fm_fd);
+   closef(filemon->fm_fp);
 
/* Now set up the new one */
-   filemon->fm_fd = *((int *) data);
-   if ((filemon->fm_fp = fd_getfile(filemon->fm_fd)) == NULL) {
+   fd = *((int *) data);
+   if ((filemon->fm_fp = fd_getfile2(curproc, fd)) == NULL) {
error = EBADF;
break;
}
Index: filemon.h
===
RCS file: /cvsroot/src/sys/dev/filemon/filemon.h,v
retrieving revision 1.8
diff -u -p -r1.8 filemon.h
--- filemon.h   25 Nov 2015 07:34:49 -  1.8
+++ filemon.h   9 Jan 2016 04:05:21 -
@@ -41,7 +41,6 @@ struct filemon {
char fm_fname1[MAXPATHLEN];/* Temporary filename buffer. */
char fm_fname2[MAXPATHLEN];/* Temporary filename buffer. */
char fm_msgbufr[32 + 2 * MAXPATHLEN];   /* Output message buffer. */
-   int fm_fd;  /* Output fd */
struct file *fm_fp; /* Output file pointer. */
krwlock_t fm_mtx;   /* Lock mutex for this filemon. */
TAILQ_ENTRY(filemon) fm_link;   /* Link into the in-use list. */


Re: In-kernel process exit hooks?

2016-01-08 Thread Taylor R Campbell
   Date: Fri, 8 Jan 2016 16:52:28 +0800 (PHT)
   From: Paul Goyette 

   Robert Elz wrote:

   > That is, instead of a fd_getfile() without an (almost immediate)
   > fd_putfile() so keeping ff_refcount incremented, in filemon, just do
   >
   >fp = fd_getfile(...);
   >fp->f_count++;
   >fd_putfile(...);
   >
   > so filemon has a proper reference to the file*.   When it is done, it
   > just closes it, like any other file would be closed (which decrements
   > f_count and frees the struct file if it goes to 0).

   Option 4 works!  See attached diffs.  Thanks, kre!

This is essentially the same as fd_getfile2, except with a lot more
logic involved, including fiddly details that you have to take care
of.  And...

   I'll commit this in a day or two unless I receive some extremely 
   negative feedback.

   @@ -301,14 +301,16 @@ filemon_ioctl(struct file * fp, u_long c
   [...]
   -   filemon->fm_fd = *((int *) data);
   -   if ((filemon->fm_fp = fd_getfile(filemon->fm_fd)) == NULL) {
   +   fd = *((int *) data);
   +   if ((filemon->fm_fp = fd_getfile(fd)) == NULL) {
   error = EBADF;
   break;
   }
   +   filemon->fm_fp->f_count++;
   +   fd_putfile(fd);

...you missed one: you can't touch fp->f_count without holding
fp->f_lock.

So I reiterate my suggestion to use fd_getfile2:

fd = *((int *)data);
if ((filemon->fm_fp = fd_getfile2(curproc, fd)) == NULL) {
error = EBADF;
break;
}


Re: In-kernel process exit hooks?

2016-01-08 Thread David Young
On Fri, Jan 08, 2016 at 12:26:14PM +0700, Robert Elz wrote:
> Date:Fri, 8 Jan 2016 11:22:28 +0800 (PHT)
> From:Paul Goyette 
> Message-ID:  
> 
>   | Is there a "supported" interface for detaching the file (or descriptor) 
>   | from the process without closing it?
> 
> Inside the kernel you want to follow the exact same procedure as would
> be done by
> 
>   newfd = dup(oldfd);
>   close(oldfd);
> 
> except instead of dup (and assigning to a newfd in the process) we
> take the file reference and stick it in filemon.   There's nothing
> magic about this step.  What magic there is (though barely worthy of
> the title) would be in ensuring that filemon properly releases the file
> when it is closing.

Years ago I added to gre(4) an ioctl that a user thread can use to
delegate to the kernel a UDP socket that carries tunnel traffic.  I
think that that code should cover at least the dup(2) part of Robert's
suggestion.

Dave

-- 
David Young
dyo...@pobox.comUrbana, IL(217) 721-9981


Re: In-kernel process exit hooks?

2016-01-08 Thread Taylor R Campbell
   Date: Fri, 8 Jan 2016 17:13:54 +0800 (PHT)
   From: Paul Goyette 

   On Fri, 8 Jan 2016, bch wrote:

   > Out of curiousity, I'm trying to use this interface, but getting:
   >
   > /usr/include/sys/ktrace.h:83:20: error: field '_ts' has incomplete type
   >struct timespec _ts;
   >
   > Does this look familiar to anybody ?

   Perhaps manually include  ?

Shouldn't be necessary.  If  uses struct timespec, it
should include .


Re: In-kernel process exit hooks?

2016-01-08 Thread Taylor R Campbell
   Date: Fri, 8 Jan 2016 09:00:03 +0800 (PHT)
   From: Paul Goyette 

   On Fri, 8 Jan 2016, Paul Goyette wrote:

   > The saga continues!   :)
   >
   > 
   >
   > Next, I'm going to have a look at fd_getfile2()/fclose() and see if that
   > is a viable approach.

   Hmmm.  The comments in kern/kern_descrip.c for fd_getfile2() require 
   that the process is not allowed to fork or exit "across this call" 
   (which I assume means "until the reference to the struct file is 
   released").

I don't think that's what it means.  I think it means the process
whose descriptor we're asking for can't fork or exit *during the call*
to fd_getfile2.  In this case, it's the current process, so that's not
an issue.

The reason it is an issue stated in a comment is that the routine
appears to have been written for the purpose of procfs, in which case
the calling process is often not the same as the process passed to
fd_getfile2.


Re: In-kernel process exit hooks?

2016-01-07 Thread Paul Goyette

On Fri, 8 Jan 2016, Paul Goyette wrote:


The saga continues!   :)



Next, I'm going to have a look at fd_getfile2()/fclose() and see if that
is a viable approach.


Hmmm.  The comments in kern/kern_descrip.c for fd_getfile2() require 
that the process is not allowed to fork or exit "across this call" 
(which I assume means "until the reference to the struct file is 
released").


I'm not sure how to prevent this.  I could probably re-use the exithook 
mechanism from the previous approach to handle the exit case, but how to 
prevent fork() is another beast entirely (I think).  And to make things 
worse, the example code in the filemon(4) man page explicitly calls 
fork().


I'm quickly running out of ideas here...  I'm strongly leaning towards 
leaving the code alone, and more clearly documenting the conditions 
under which the hang occurs (ie, closure of the activity-log fd prior to 
disassociated the activity-log from the /dev/filemon fd, whether the 
close is done explicitly or as part of the sequential close that occurs 
at process exit).


Someone (Martin@, I think) earlier suggested modifying things to use 
cv_timedwait() in fd_close(), and modifying fd_free() to retry.  This 
might help in the process exit scenario, but doesn't address the case 
where the application process explicitly closes the file with the extra 
reference.


It might also be possible to modify fd_close() to have a 
filemon-specific close routine, similar to what is currently being done 
for knote_fdclose().  But this seems rather heavy-handed for something 
that has such a limited use-case as filemon(4).


The only other approach I can think of is to modify filemon(4) so it 
does not directly write any data to the activity log.  Rather than 
having the application provide a log fd (on which the extra reference 
needs to be taken), the application would read(2) from the filemon fd 
and handle the "logging" itself.  This would mean two additional trips 
across the kernel/userland boundary for every event, which feels like a 
rather costly approach.  It would also mean modifying make(1) (the only 
"production" use-case for filemon(4)).



Any other suggestions would be appreciated.


+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+--+--++


Re: In-kernel process exit hooks?

2016-01-07 Thread Paul Goyette

The saga continues!   :)

I previously posted one set of patches that changes filemon to use 
fd_{get,put}file() only when the activity-log file is actually being 
accessed.  This works, but doesn't prevent or detect when the 
application program closes-and-reopens that fd.


Attached to this Email is a separate patch, using exithooks.  The 
current exithook implementation is extended to have an additional 
"phase" argument, with values of EXITHOOK_BEFORE_CLOSE and 
EXITHOOK_AFTER_CLOSE.  (Existing uses of exithooks in sys_aio, sysv_sem, 
and rump are updated to use EXITHOOK_AFTER_CLOSE.)


Filemon(4) now establishes an EXITHOOK_BEFORE_CLOSE which searches the 
process's file descriptor table for entries to /dev/filemon.  If any are 
found, the associated activity-log file is disassociated from the 
filemon fd and the extra reference is released.


(Note that establishing the exithook is actually done through a config 
finalizer routine, since the built-in MODULE_CLASS_DRIVER modules are 
initialized before the exithook stuff in exec_init().)


This exithook approach also works.  There is still one problem, however. 
If the application program attempts to explicitly close the log-file fd 
while it is still associated with the filemon, we get the same hang.



Next, I'm going to have a look at fd_getfile2()/fclose() and see if that 
is a viable approach.



+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+--+--++Index: dev/filemon/filemon.c
===
RCS file: /cvsroot/src/sys/dev/filemon/filemon.c,v
retrieving revision 1.24
diff -u -p -r1.24 filemon.c
--- dev/filemon/filemon.c   5 Jan 2016 22:08:54 -   1.24
+++ dev/filemon/filemon.c   7 Jan 2016 23:54:37 -
@@ -43,6 +43,8 @@ __KERNEL_RCSID(0, "$NetBSD: filemon.c,v 
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include "filemon.h"
 #include "ioconf.h"
@@ -89,6 +91,75 @@ static TAILQ_HEAD(, filemon) filemons_in
 static int logLevel = LOG_DEBUG;
 #endif
 
+/*
+ * Use an exithook to make sure that the filemon device is closed and
+ * our additional reference to the output file is released before the
+ * normal closure of open files.  The extra reference would otherwise
+ * cause process exit to hang indefinitely.
+ *
+ * We need to use a config_finalize() routine to register the hook
+ * due to system initialization sequence.
+ */
+void *hook = NULL; /* cookie from exithook_establish() */
+
+static int filemon_register_hook(device_t dev __unused);
+static voidfilemon_exithook(struct proc *p, void *arg);
+
+static int
+filemon_register_hook(device_t dev __unused)
+{
+
+   if (hook == NULL)
+   hook = exithook_establish(filemon_exithook, hook,
+   EXITHOOK_BEFORE_CLOSE);
+printf("%s: hook %p\n", __func__, hook);
+   if (hook != NULL)
+   return 0;
+   return ENXIO;
+}
+
+static void
+filemon_exithook(struct proc *p, void *arg)
+{
+   int fd;
+   struct filemon  *filemon;
+   fdfile_t*ff;
+   filedesc_t  *fdp;
+   fdtab_t *dt;
+
+   fdp = p->p_fd;
+   dt = fdp->fd_dt;
+   for (fd = 0; fd < dt->dt_nfiles; fd++) {
+   ff = dt->dt_ff[fd];
+   KASSERT(fd >= NDFDFILE ||
+   ff == (fdfile_t *)fdp->fd_dfdfile[fd]);
+   if (ff == NULL || ff->ff_file == NULL)
+   continue;
+   if (ff->ff_file->f_type != DTYPE_MISC ||
+   ff->ff_file->f_ops  != _fileops)
+   continue;
+printf("%s: found pid %d fd %d\n", __func__, p->p_pid, fd);
+   rw_enter(_mtx, RW_WRITER);
+   filemon = ff->ff_file->f_data;
+   if (filemon != NULL) {
+
+   /*
+* If we have an output file, release our
+* reference to it.
+*/
+   rw_enter(>fm_mtx, RW_WRITER);
+   if (filemon->fm_fp) {
+printf("%s: releasing %d\n", __func__, filemon->fm_fd);
+   KASSERT(p == curproc);
+   fd_putfile(filemon->fm_fd);
+   filemon->fm_fp = NULL;
+   }
+   rw_exit(>fm_mtx);
+   }
+   rw_exit(_mtx);
+   }
+}
+
 void
 filemon_output(struct filemon * filemon, char *msg, size_t len)
 {
@@ -216,6 +287,9 @@ filemon_open(dev_t dev, int oflags __unu
struct file *fp;
int error, fd;
 
+   if (hook == NULL)
+   return ENXIO;
+
/* falloc() will 

RE: In-kernel process exit hooks?

2016-01-07 Thread Terry Moore
Will the hang occur if make dies due to a signal (control-C, or kill -9, for
example)?

--Terry

> -Original Message-
> From: tech-kern-ow...@netbsd.org [mailto:tech-kern-ow...@netbsd.org] On
> Behalf Of Paul Goyette
> Sent: Thursday, January 7, 2016 17:00
> To: Taylor R Campbell <campbell+netbsd-tech-k...@mumble.net>
> Cc: tech-kern@netbsd.org
> Subject: Re: In-kernel process exit hooks?
> 
> On Fri, 8 Jan 2016, Paul Goyette wrote:
> 
> > The saga continues!   :)
> >
> > 
> >
> > Next, I'm going to have a look at fd_getfile2()/fclose() and see if
> > that is a viable approach.
> 
> Hmmm.  The comments in kern/kern_descrip.c for fd_getfile2() require that
> the process is not allowed to fork or exit "across this call"
> (which I assume means "until the reference to the struct file is released").
> 
> I'm not sure how to prevent this.  I could probably re-use the exithook
> mechanism from the previous approach to handle the exit case, but how to
> prevent fork() is another beast entirely (I think).  And to make things
> worse, the example code in the filemon(4) man page explicitly calls fork().
> 
> I'm quickly running out of ideas here...  I'm strongly leaning towards
> leaving the code alone, and more clearly documenting the conditions under
> which the hang occurs (ie, closure of the activity-log fd prior to
> disassociated the activity-log from the /dev/filemon fd, whether the close
> is done explicitly or as part of the sequential close that occurs at process
> exit).
> 
> Someone (Martin@, I think) earlier suggested modifying things to use
> cv_timedwait() in fd_close(), and modifying fd_free() to retry.  This might
> help in the process exit scenario, but doesn't address the case where the
> application process explicitly closes the file with the extra reference.
> 
> It might also be possible to modify fd_close() to have a filemon-specific
> close routine, similar to what is currently being done for knote_fdclose().
> But this seems rather heavy-handed for something that has such a limited
> use-case as filemon(4).
> 
> The only other approach I can think of is to modify filemon(4) so it does
> not directly write any data to the activity log.  Rather than having the
> application provide a log fd (on which the extra reference needs to be
> taken), the application would read(2) from the filemon fd and handle the
> "logging" itself.  This would mean two additional trips across the
> kernel/userland boundary for every event, which feels like a rather costly
> approach.  It would also mean modifying make(1) (the only "production" use-
> case for filemon(4)).
> 
> 
> Any other suggestions would be appreciated.
> 
> 
> +--+--++
> | Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
> | (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
> | Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
> +--+--++



Re: In-kernel process exit hooks?

2016-01-07 Thread Robert Elz
Date:Fri, 8 Jan 2016 09:00:03 +0800 (PHT)
From:Paul Goyette 
Message-ID:  

  | Any other suggestions would be appreciated.

Another possibility would be to detach the logged-to file from the
process when logging is enabled (making the ioctl that attaches it
also be notionally a close from the process point of view).   But
keeping the reference in kernel.  When logging is disabled (for whatever
reason) do the kernel fd_close() at that point).

Do this, and the process can no longer do an explicit close to cause
a hang, and exit() won't close it in the normal sequence either, only
closing the log (or detaching it from the fd) would close that file.

That's also a change of filemon semantics, but I can't imagine that the
application (make) really wants to be accessing the file while filemon is
writing to it, does it?   And if it did, it could always just explicitly
open it again (though I haven't really thought through the effects of a dup()
at this point - perhaps forbid ioctl'ing to a file that has a ref
count > 1, so we know the fd is the only reference to it, once
the ioctl "closes" it, a dup can't happen, obviously.

Accessing the file via an independent open (before or after the ioctl)
should not have any of these problems, that's a different struct file.

kre



Re: In-kernel process exit hooks?

2016-01-07 Thread Paul Goyette

On Fri, 8 Jan 2016, Robert Elz wrote:


Another possibility would be to detach the logged-to file from the
process when logging is enabled (making the ioctl that attaches it
also be notionally a close from the process point of view).   But
keeping the reference in kernel.  When logging is disabled (for whatever
reason) do the kernel fd_close() at that point).


Is there a "supported" interface for detaching the file (or descriptor) 
from the process without closing it?



Do this, and the process can no longer do an explicit close to cause
a hang, and exit() won't close it in the normal sequence either, only
closing the log (or detaching it from the fd) would close that file.

That's also a change of filemon semantics, but I can't imagine that the
application (make) really wants to be accessing the file while filemon is
writing to it, does it?   And if it did, it could always just explicitly
open it again (though I haven't really thought through the effects of a dup()
at this point - perhaps forbid ioctl'ing to a file that has a ref
count > 1, so we know the fd is the only reference to it, once
the ioctl "closes" it, a dup can't happen, obviously.


I expect that make(1) assumes the log-fd is still available, and just 
follows the example code in filemon(4) man page: when finished logging, 
reset the file-pointer to 0 and start reading the data.



Accessing the file via an independent open (before or after the ioctl)
should not have any of these problems, that's a different struct file.


Yeah, I was trying to avoid the change in semantics.  :)  The fewer 
things I touch, the fewer things will go wrong, and I definitely don't 
want to break make, which would result in difficulties making[sic] a new 
version.  :)



+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+--+--++


Re: In-kernel process exit hooks?

2016-01-07 Thread Paul Goyette

OK, I now have a third way of handling the problem.

To recap the three options (refer to the attachments)

1. fd_getfile

   This option calls fd_getfile() each time it needs to access the
   activity log, and calls fd_putfile() after each use.  This way,
   the additional reference on the file descriptor lasts only for a
   short period of time, and does not exist at any time that it can
   be passed to fd_close().

   The biggest drawback here is that the user-land application can
   close() and re-open() this fd, possibly referring to a different
   file; this will not be visible to filemon, which will write new
   event records to any file that happens to be opened on this fd.

2. exithook

   This option extends the existing exithook() mechanism to have
   multiple "phases", one of which can happen before the process
   exit code calls fd_free() (which in turn calls fd_close() for all
   open file descriptors).  The exithook registered by filemon finds
   any usages of filemon and resets the activity-log, which releases
   the extra reference to the log fd.

   This option works well for normal process exit (including signal),
   but does not resolve the problem if the application itself calls
   close() on the log fd.  In that situation, the process will still
   hang.

   Additionally, setting this up correctly is awkward, due to the
   order in which kernel components are initialized.  (Modules of
   CLASS_DRIVER get loaded and initialized before exec_init() can
   set up the hook mechanisms, so we need to use a config_finalizer
   to establish the exit hook.)

3. filemon-fd_close

   This solution introduces a new, filemon-specific callback in
   fd_close() (but only if the filemon module is loaded or built-in).
   Each time a file descriptor is passed to fd_close, the callback
   is invoked.  The callback checks each usage of /dev/filemon and
   if that usage is logging activity to the file being closed, the
   activity-log is reset, releasing the extra reference.  Thus,
   after we return to fd_close() the reference count is normal and
   the file gets properly closed.

   The only drawback I see here is the additional overhead of the
   callback, on every call to fd_close().  The code catches every
   occurrence of the "hang" that I can find, and handles it cleanly.

I still need to think about the fd_getfile2()/fclose() approach to see 
if it meets our needs, but the comments that prohibit calling fork() 
would seem to preclude this mechanism.


The "detach file from userland" approach suggested by kre would also 
likely work, but I'm reluctant to change the semantics of filemon.






+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+--+--++Index: filemon.c
===
RCS file: /cvsroot/src/sys/dev/filemon/filemon.c,v
retrieving revision 1.24
diff -u -p -r1.24 filemon.c
--- filemon.c   5 Jan 2016 22:08:54 -   1.24
+++ filemon.c   7 Jan 2016 06:28:01 -
@@ -94,9 +94,21 @@ filemon_output(struct filemon * filemon,
 {
struct uio auio;
struct iovec aiov;
+   struct file *fp; /* Output file pointer. */
 
-   if (filemon->fm_fp == NULL)
+
+   if (filemon->fm_fp == NULL) /* No output file specified */
+   return;
+   if ((fp = fd_getfile(filemon->fm_fd)) != filemon->fm_fp) {
+   /*
+* The file descriptor refers to a different file
+* than when it was passed to SET_FD.  So assume we
+* never had an output file.
+*/
+   filemon->fm_fp = NULL;
+   filemon->fm_fd = -1;
return;
+   }
 
aiov.iov_base = msg;
aiov.iov_len = len;
@@ -122,6 +134,7 @@ filemon_output(struct filemon * filemon,
(*filemon->fm_fp->f_ops->fo_write) (filemon->fm_fp,
&(filemon->fm_fp->f_offset),
, curlwp->l_cred, FOF_UPDATE_OFFSET);
+   fd_putfile(filemon->fm_fd); /* release our reference */
 }
 
 void
@@ -264,10 +277,7 @@ filemon_close(struct file * fp)
 * once we have exclusive access, it should never be used again
 */
rw_enter(>fm_mtx, RW_WRITER);
-   if (filemon->fm_fp) {
-   fd_putfile(filemon->fm_fd); /* release our reference */
-   filemon->fm_fp = NULL;
-   }
+   filemon->fm_fp = NULL;
rw_exit(>fm_mtx);
rw_destroy(>fm_mtx);
kmem_free(filemon, sizeof(struct filemon));
@@ -309,6 +319,7 @@ filemon_ioctl(struct file * fp, u_long c
error = EBADF;
break;
}
+   

Re: In-kernel process exit hooks?

2016-01-07 Thread bch
On 1/7/16, Paul Goyette  wrote:
> On Fri, 8 Jan 2016, Paul Goyette wrote:
>
>> The saga continues!   :)
>>
>> 
>>
>> Next, I'm going to have a look at fd_getfile2()/fclose() and see if that
>> is a viable approach.
>
> Hmmm.  The comments in kern/kern_descrip.c for fd_getfile2() require
> that the process is not allowed to fork or exit "across this call"
> (which I assume means "until the reference to the struct file is
> released").
>
> I'm not sure how to prevent this.  I could probably re-use the exithook
> mechanism from the previous approach to handle the exit case, but how to
> prevent fork() is another beast entirely (I think).  And to make things
> worse, the example code in the filemon(4) man page explicitly calls
> fork().
>
> I'm quickly running out of ideas here...  I'm strongly leaning towards
> leaving the code alone, and more clearly documenting the conditions
> under which the hang occurs (ie, closure of the activity-log fd prior to
> disassociated the activity-log from the /dev/filemon fd, whether the
> close is done explicitly or as part of the sequential close that occurs
> at process exit).
>
> Someone (Martin@, I think) earlier suggested modifying things to use
> cv_timedwait() in fd_close(), and modifying fd_free() to retry.  This
> might help in the process exit scenario, but doesn't address the case
> where the application process explicitly closes the file with the extra
> reference.
>
> It might also be possible to modify fd_close() to have a
> filemon-specific close routine, similar to what is currently being done
> for knote_fdclose().  But this seems rather heavy-handed for something
> that has such a limited use-case as filemon(4).
>
> The only other approach I can think of is to modify filemon(4) so it
> does not directly write any data to the activity log.  Rather than
> having the application provide a log fd (on which the extra reference
> needs to be taken), the application would read(2) from the filemon fd
> and handle the "logging" itself.  This would mean two additional trips
> across the kernel/userland boundary for every event, which feels like a
> rather costly approach.  It would also mean modifying make(1) (the only
> "production" use-case for filemon(4)).

It adds a degree of complexity, but one could also have both
interfaces (i.e.: filemon(4) does the writing, or the app recv's the
data from filemon and does the writing.

> Any other suggestions would be appreciated.
>
>
> +--+--++
> | Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
> | (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
> | Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
> +--+--++
>


RE: In-kernel process exit hooks?

2016-01-07 Thread Paul Goyette

On Thu, 7 Jan 2016, Terry Moore wrote:


Will the hang occur if make dies due to a signal (control-C, or kill -9, for
example)?


It depends on which fd is numerically lower.  If the /dev/filemon fd is 
lower, it will get closed first, which removes the reference to the 
log-file fd.  If the log-file fd is numerically lower, it will get 
closed first, and will hang waiting for its reference count to clear.


In practice, I expect that make(1) will open /dev/filemon first, get an 
fd, and then open the log file and get another fd.  The fd's are 
assigned on a first-free basis, so /dev/filemon gets the lower number, 
and no crash.  But no guarantee, either!



+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+--+--++


Re: In-kernel process exit hooks?

2016-01-07 Thread David Holland
On Thu, Jan 07, 2016 at 07:34:33AM +0800, Paul Goyette wrote:
 > Based on internal implementation of filemon(4), there is an ordering
 > requirement imposed on the sequence of events that occur when a process
 > using /dev/filemon exits.  In particular, the file descriptor on which the
 > device is open needs to be closed prior to closing the descriptor to which
 > avent/activity records are being logged.  (Otherwise, because of an extra
 > reference on the log file, fd_close() will hang indefinitely.)

Looking at the filemon code... it is completely wrong. It should not
be using file handles like that, especially not in a device driver.
Rather than adding new hacks to the system to work around it being
silly, it should be redone properly.

For an example of the right way to do this kind of thing, look in
kern_acct.c.

-- 
David A. Holland
dholl...@netbsd.org


Re: In-kernel process exit hooks?

2016-01-07 Thread David Holland
On Fri, Jan 08, 2016 at 11:22:28AM +0800, Paul Goyette wrote:
 > Yeah, I was trying to avoid the change in semantics.  :)  The fewer things
 > I touch, the fewer things will go wrong, and I definitely don't want to
 > break make, which would result in difficulties making[sic] a new version.
 > :)

The affected code in make is only sjg's meta-mode thing, which nobody
besides him really understands what it does. So while breaking it
isn't desirable, the risk of getting tangled in it yourself is
negligible.

-- 
David A. Holland
dholl...@netbsd.org


Re: In-kernel process exit hooks?

2016-01-07 Thread Robert Elz
Date:Fri, 8 Jan 2016 11:22:28 +0800 (PHT)
From:Paul Goyette 
Message-ID:  

  | Is there a "supported" interface for detaching the file (or descriptor) 
  | from the process without closing it?

Inside the kernel you want to follow the exact same procedure as would
be done by

newfd = dup(oldfd);
close(oldfd);

except instead of dup (and assigning to a newfd in the process) we
take the file reference and stick it in filemon.   There's nothing
magic about this step.  What magic there is (though barely worthy of
the title) would be in ensuring that filemon properly releases the file
when it is closing.

  | I expect that make(1) assumes the log-fd is still available, and just 
  | follows the example code in filemon(4) man page: when finished logging, 
  | reset the file-pointer to 0 and start reading the data.

make(1) can be changed...   After all, it was added originally, and that
took code changes to make happen.   Further, make must work OK without it,
none of my kernels have filemon included (and they don't load modules),
and make works just fine.

  | Yeah, I was trying to avoid the change in semantics.  :)

Generally a good thing, and if there's a clean way to make it happen,
worth doing.   But this one is a case where that almost certainly doesn't
really matter.

kre



Re: In-kernel process exit hooks?

2016-01-07 Thread Robert Elz
Date:Fri, 8 Jan 2016 11:22:28 +0800 (PHT)
From:Paul Goyette 
Message-ID:  

  | Is there a "supported" interface for detaching the file (or descriptor) 
  | from the process without closing it?

Actually, thinking through this more, why not just "fix" filemon to make
a proper reference to the file, instead of the half baked thing it is
currently doing.

That is, instead of a fd_getfile() without an (almost immediate) fd_putfile()
so keeping ff_refcount incremented, in filemon, just do

fp = fd_getfile(...);
fp->f_count++;
fd_putfile(...);

so filemon has a proper reference to the file*.   When it is done, it
just closes it, like any other file would be closed (which decrements
f_count and frees the struct file if it goes to 0).

Wouldn't this solve all the problems, keep the semantics the same, and
just generally be cleaner?

Is there some particular benefit filemon gets (aside from a little less
bookkeeping work) by sticking to its half baked reference grabbing scheme?

kre



Re: In-kernel process exit hooks?

2016-01-07 Thread Paul Goyette

Hmmm, option #3 doesn't work so well, after all!

I tried it when both the target and monitor processes were children of 
the same shell process.  Thus all three of these share the same 'struct 
file' for their stdout, and the filemon_fd_close() callback cannot tell 
which one is being closed.  The all have the same fd (just in different 
descriptor tables?)


So next I'm going to try kre's suggestion about using a "proper" refcnt 
mechanism and see if that helps.


This filemon thing is a real beast!  :)



On Fri, 8 Jan 2016, Paul Goyette wrote:


OK, I now have a third way of handling the problem.

To recap the three options (refer to the attachments)

1. fd_getfile

  This option calls fd_getfile() each time it needs to access the
  activity log, and calls fd_putfile() after each use.  This way,
  the additional reference on the file descriptor lasts only for a
  short period of time, and does not exist at any time that it can
  be passed to fd_close().

  The biggest drawback here is that the user-land application can
  close() and re-open() this fd, possibly referring to a different
  file; this will not be visible to filemon, which will write new
  event records to any file that happens to be opened on this fd.

2. exithook

  This option extends the existing exithook() mechanism to have
  multiple "phases", one of which can happen before the process
  exit code calls fd_free() (which in turn calls fd_close() for all
  open file descriptors).  The exithook registered by filemon finds
  any usages of filemon and resets the activity-log, which releases
  the extra reference to the log fd.

  This option works well for normal process exit (including signal),
  but does not resolve the problem if the application itself calls
  close() on the log fd.  In that situation, the process will still
  hang.

  Additionally, setting this up correctly is awkward, due to the
  order in which kernel components are initialized.  (Modules of
  CLASS_DRIVER get loaded and initialized before exec_init() can
  set up the hook mechanisms, so we need to use a config_finalizer
  to establish the exit hook.)

3. filemon-fd_close

  This solution introduces a new, filemon-specific callback in
  fd_close() (but only if the filemon module is loaded or built-in).
  Each time a file descriptor is passed to fd_close, the callback
  is invoked.  The callback checks each usage of /dev/filemon and
  if that usage is logging activity to the file being closed, the
  activity-log is reset, releasing the extra reference.  Thus,
  after we return to fd_close() the reference count is normal and
  the file gets properly closed.

  The only drawback I see here is the additional overhead of the
  callback, on every call to fd_close().  The code catches every
  occurrence of the "hang" that I can find, and handles it cleanly.

I still need to think about the fd_getfile2()/fclose() approach to see if it 
meets our needs, but the comments that prohibit calling fork() would seem to 
preclude this mechanism.


The "detach file from userland" approach suggested by kre would also likely 
work, but I'm reluctant to change the semantics of filemon.






+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+--+--++

!DSPAM:568f3f46191303231618539!



+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+--+--++


Re: In-kernel process exit hooks?

2016-01-07 Thread David Holland
On Fri, Jan 08, 2016 at 07:08:19AM +, David Holland wrote:
 > For an example of the right way to do this kind of thing, look in
 > kern_acct.c.

Better example: sys_fktrace, since that uses a file handle. And it
does virtually the same thing that filemon's trying to do, except much
more thoroughly.

-- 
David A. Holland
dholl...@netbsd.org


Re: In-kernel process exit hooks?

2016-01-07 Thread Paul Goyette

On Thu, 7 Jan 2016, Taylor R Campbell wrote:


  > Another possibility would be to change filemon(4) to do fd_getfile
  > each it needs to use the file descriptor.  This makes it a little more
  > brittle (fails if you close the descriptor), but would sidestep the
  > problem.

  Hmmm, perhaps.  Failure would not be a problem, since we would just
  revert to the initial "output file unspecified" conditions.

  I think I like this approach.  :)  I'll give it a try.

Is it supposed to monitor file activity past a fork and exec and
recursively through all subprocesses, or is it supposed to apply only
to the current process?

If it's supposed to work across fork/exec/, then this won't work --
but neither will the current approach!  I think fd_putfile will just
close whatever random descriptor happens to be in that slot, which may
be unrelated if userland (perhaps in some deeply nested child) did
dup2.


What I tried was to use

fd_getfile(fd)
write activity record
fd_putfile(fd)

It seemed to work, but thinking back on it, I have no idea which 
process's descriptor was being used for the lookup - probably curproc 
which would not be correct.  (It would need to be the process which has 
/dev/filemon open.)



  > Another possibility would be to use fd_getfile2/closef, which holds a
  > reference only on the file, instead of fd_getfile/fd_putfile, which
  > holds a reference on the file and on the descriptor.  Releasing the
  > reference to the file may not a problem, even though releasing the
  > reference to the descriptor is a problem as you found.

  Would this prevent the file descriptor from being closed behind our
  backs?

The descriptor could be closed but the file will persist.


Hmmm. I will have to consider this option.


+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+--+--++


Re: In-kernel process exit hooks?

2016-01-06 Thread Paul Goyette

On Wed, 6 Jan 2016, Taylor R Campbell wrote:


Another possibility would be to change filemon(4) to do fd_getfile
each it needs to use the file descriptor.  This makes it a little more
brittle (fails if you close the descriptor), but would sidestep the
problem.


Hmmm, perhaps.  Failure would not be a problem, since we would just 
revert to the initial "output file unspecified" conditions.


I think I like this approach.  :)  I'll give it a try.


Another possibility would be to use fd_getfile2/closef, which holds a
reference only on the file, instead of fd_getfile/fd_putfile, which
holds a reference on the file and on the descriptor.  Releasing the
reference to the file may not a problem, even though releasing the
reference to the descriptor is a problem as you found.


Would this prevent the file descriptor from being closed behind our 
backs?



+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+--+--++


Re: In-kernel process exit hooks?

2016-01-06 Thread Paul Goyette

> Another possibility would be to change filemon(4) to do fd_getfile
> each it needs to use the file descriptor.  This makes it a little
> more brittle (fails if you close the descriptor), but would sidestep
> the problem.

Hmmm, perhaps.  Failure would not be a problem, since we would just
revert to the initial "output file unspecified" conditions.

I think I like this approach.  :)  I'll give it a try.


This actually works quite well.  Please see the attached diffs for your 
review.


One possible problem is what happens if the monitoring program closes 
the file descriptor, and then re-uses that fd?  I've included a check to 
compare the original 'struct file *' pointer with the current one, which 
will catch "some" instances, but not guaranteed to catch them all.  It 
could be a bit of a surprise if filemon output shows up in unexpected 
places.  :)


Because of this potential for surprising the user, I think I'm still 
leaning to my earlier proposal of extending exithook processing.  But 
given the limited number of use-cases for filemon, I could live with 
making the fd_getfile()-only-when-you-need-it change instead.



+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+--+--++Index: filemon.c
===
RCS file: /cvsroot/src/sys/dev/filemon/filemon.c,v
retrieving revision 1.24
diff -u -p -r1.24 filemon.c
--- filemon.c   5 Jan 2016 22:08:54 -   1.24
+++ filemon.c   7 Jan 2016 00:45:46 -
@@ -94,9 +94,21 @@ filemon_output(struct filemon * filemon,
 {
struct uio auio;
struct iovec aiov;
+   struct file *fp; /* Output file pointer. */
 
-   if (filemon->fm_fp == NULL)
+
+   if (filemon->fm_fp == NULL) /* No output file specified */
+   return;
+   if ((fp = fd_getfile(filemon->fm_fd)) != filemon->fm_fp) {
+   /*
+* The file descriptor refers to a different file
+* than when it was passed to SET_FD.  So assume we
+* never had an output file.
+*/
+   filemon->fm_fp = NULL;
+   filemon->fm_fd = -1;
return;
+   }
 
aiov.iov_base = msg;
aiov.iov_len = len;
@@ -122,6 +134,7 @@ filemon_output(struct filemon * filemon,
(*filemon->fm_fp->f_ops->fo_write) (filemon->fm_fp,
&(filemon->fm_fp->f_offset),
, curlwp->l_cred, FOF_UPDATE_OFFSET);
+   fd_putfile(filemon->fm_fd); /* release our reference */
 }
 
 void
@@ -264,10 +277,7 @@ filemon_close(struct file * fp)
 * once we have exclusive access, it should never be used again
 */
rw_enter(>fm_mtx, RW_WRITER);
-   if (filemon->fm_fp) {
-   fd_putfile(filemon->fm_fd); /* release our reference */
-   filemon->fm_fp = NULL;
-   }
+   filemon->fm_fp = NULL;
rw_exit(>fm_mtx);
rw_destroy(>fm_mtx);
kmem_free(filemon, sizeof(struct filemon));
@@ -309,6 +319,7 @@ filemon_ioctl(struct file * fp, u_long c
error = EBADF;
break;
}
+   fd_putfile(filemon->fm_fd);
/* Write the file header. */
filemon_comment(filemon);
break;


RE: In-kernel process exit hooks?

2016-01-06 Thread Paul Goyette

On Wed, 6 Jan 2016, Terry Moore wrote:


You may well be right. From looking at the man page, fd_getfile() assumes the
the file is already open. Is there an additional spec_write() after the
fd_getfile()? I don't see it in you patch.


spec_write() is called via the dereferencing at the end of 
filemon_output() ...




In any case, I was using writabality as an example. It's really fragile to
depend on grabbing a file handle and assuming it's what you had before. The
security analysis of that is essentially open-ended -- and has to be revisited
every time the behavior of files as seen by fd_getfile() changes [therefore is
an eternal burden], whereas the analysis of adding an additional exit hook is
trivial, and as far as I can see, never has to be revisited.


Point taken.  In this case, write access is checked on each call, so 
that's not a problem.  But without holding the fd_getfile() reference, 
the application program is indeed frre to switch things out from under 
us.  I've attempted to minimize that by comparing the pointers to the 
'struct file' but it doesn't guarantee that things have not changed.


One more reason for us to retain the extra reference as originally 
written, and then modify exithooks as previously proposed to clean 
things up in the correct order.




I understand everyone wanting to be conservative about not adding new
facilities to the kernel, but sometimes a new facility actually saves a lot of
effort overall.

You're doing the work, so having made my point, I'll let you make your
decision.

--Terry


-Original Message-
From: tech-kern-ow...@netbsd.org [mailto:tech-kern-ow...@netbsd.org] On
Behalf Of Paul Goyette
Sent: Wednesday, January 6, 2016 21:31
To: Terry Moore <t...@mcci.com>
Cc: tech-kern@netbsd.org
Subject: RE: In-kernel process exit hooks?

I'm pretty sure that the mode check done at the beginning of
spec_write() will ensure that the file is opened with write access.

:)


On Wed, 6 Jan 2016, Terry Moore wrote:


Isn't there a security risk with the fd_getfile() approach? This
sounds (on the face of it) similar to the kinds of problems that led
tmpnam(3) to be deprecated? For example, what if the monitoring
program deliberately points the fd at a file that it opened as read-only;

will filemon then write to it?


--Terry


-Original Message-
From: tech-kern-ow...@netbsd.org [mailto:tech-kern-ow...@netbsd.org]
On Behalf Of Paul Goyette
Sent: Wednesday, January 6, 2016 16:55
To: Taylor R Campbell <campbell+netbsd-tech-k...@mumble.net>
Cc: tech-kern@netbsd.org
Subject: Re: In-kernel process exit hooks?


Another possibility would be to change filemon(4) to do fd_getfile
each it needs to use the file descriptor.  This makes it a little
more brittle (fails if you close the descriptor), but would
sidestep the problem.


Hmmm, perhaps.  Failure would not be a problem, since we would just
revert to the initial "output file unspecified" conditions.

I think I like this approach.  :)  I'll give it a try.


This actually works quite well.  Please see the attached diffs for
your review.

One possible problem is what happens if the monitoring program closes
the file descriptor, and then re-uses that fd?  I've included a check
to compare the original 'struct file *' pointer with the current one,
which will catch "some" instances, but not guaranteed to catch them
all.  It could be a bit of a surprise if filemon output shows up in
unexpected places.  :)

Because of this potential for surprising the user, I think I'm still
leaning to my earlier proposal of extending exithook processing.  But
given the limited number of use-cases for filemon, I could live with
making the fd_getfile()-only-when-you-need-it change instead.


+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at
| netbsd.org |
+--+--++





+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+--+--++





+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+--+--++


RE: In-kernel process exit hooks?

2016-01-06 Thread Paul Goyette
I'm pretty sure that the mode check done at the beginning of 
spec_write() will ensure that the file is opened with write access.


:)


On Wed, 6 Jan 2016, Terry Moore wrote:


Isn't there a security risk with the fd_getfile() approach? This sounds (on
the face of it) similar to the kinds of problems that led tmpnam(3) to be
deprecated? For example, what if the monitoring program deliberately points
the fd at a file that it opened as read-only; will filemon then write to it?

--Terry


-Original Message-
From: tech-kern-ow...@netbsd.org [mailto:tech-kern-ow...@netbsd.org] On
Behalf Of Paul Goyette
Sent: Wednesday, January 6, 2016 16:55
To: Taylor R Campbell <campbell+netbsd-tech-k...@mumble.net>
Cc: tech-kern@netbsd.org
Subject: Re: In-kernel process exit hooks?


Another possibility would be to change filemon(4) to do fd_getfile
each it needs to use the file descriptor.  This makes it a little
more brittle (fails if you close the descriptor), but would sidestep
the problem.


Hmmm, perhaps.  Failure would not be a problem, since we would just
revert to the initial "output file unspecified" conditions.

I think I like this approach.  :)  I'll give it a try.


This actually works quite well.  Please see the attached diffs for your
review.

One possible problem is what happens if the monitoring program closes
the file descriptor, and then re-uses that fd?  I've included a check to
compare the original 'struct file *' pointer with the current one, which
will catch "some" instances, but not guaranteed to catch them all.  It
could be a bit of a surprise if filemon output shows up in unexpected
places.  :)

Because of this potential for surprising the user, I think I'm still
leaning to my earlier proposal of extending exithook processing.  But
given the limited number of use-cases for filemon, I could live with
making the fd_getfile()-only-when-you-need-it change instead.


+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+--+--++





+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+--+--++


Re: In-kernel process exit hooks?

2016-01-06 Thread Taylor R Campbell
Another possibility would be to change filemon(4) to do fd_getfile
each it needs to use the file descriptor.  This makes it a little more
brittle (fails if you close the descriptor), but would sidestep the
problem.

Another possibility would be to use fd_getfile2/closef, which holds a
reference only on the file, instead of fd_getfile/fd_putfile, which
holds a reference on the file and on the descriptor.  Releasing the
reference to the file may not a problem, even though releasing the
reference to the descriptor is a problem as you found.