Re: Kill NFS-only kqueue poller thread

2020-05-31 Thread Visa Hankala
On Sun, May 31, 2020 at 09:40:47AM +0200, Martin Pieuchot wrote:
> On 30/05/20(Sat) 15:49, Visa Hankala wrote:
> > [...] 
> > Local filesystems can observe changes at the source, which makes polling
> > unnecessary. NFS clients do not have that benefit. The NFSv3 protocol
> > lacks a mechanism to notify clients of changes.
> > 
> > The NFS polling mechanism is in use for example when running tail -f
> > on a remote file. Directory monitoring can be utilized for example
> > by a directory browser application or a distributed batch processing
> > system.
> > 
> > > > It is true that the code is not perfect, but having even basic and
> > > > best-effort functionality with kevent(2) and NFS is better than nothing
> > > > in my opinion. The kqueue subsystem should not dumb down for emulating
> > > > poll(2) and select(2).
> > > 
> > > In that case we've to accept or work-around with the fact that a thread
> > > is allocated, an operation that can fail, in the kqfilter().  Do you have
> > > an opinion on that?  What should we do for other FSs is uniformity wanted?
> > 
> > Can't the emulation of poll(2) and select(2) operate without the poller?
> > These system calls are supposed to return true for regular files for
> > reading and writing. No polling of remote state is needed for that.
> > 
> > The read behaviour of poll(2) and select(2) differs from kevent(2)'s
> > EVFILT_READ. Consequently, there has to be an indicator that makes
> > f_event work in poll(2)/select(2) mode. I guess the same indicator can
> > control nfs_kqfilter().
> 
> So are you saying that poll(2) is broken on NFS and suggest that I extend
> the kqfilter to match this behavior?

I am not saying that. I am saying that poll(2) and kevent(2) have
different semantics with regular files. Except that the write filter
is missing at the moment, the current NFS poll(2) and kevent(2)
behaviour looks about similar to the other filesystems.

poll(2) returns true immediately, whereas kevent(2) EVFILT_READ returns
true once the file pointer is not at the end of file (unless NOTE_EOF
is set).

For the semantics of poll(2), the NFS poller is pointless. However,
the poller is necessary for approximating kevent(2) semantics on NFS.



Re: Kill NFS-only kqueue poller thread

2020-05-31 Thread Martin Pieuchot
On 30/05/20(Sat) 15:49, Visa Hankala wrote:
> [...] 
> Local filesystems can observe changes at the source, which makes polling
> unnecessary. NFS clients do not have that benefit. The NFSv3 protocol
> lacks a mechanism to notify clients of changes.
> 
> The NFS polling mechanism is in use for example when running tail -f
> on a remote file. Directory monitoring can be utilized for example
> by a directory browser application or a distributed batch processing
> system.
> 
> > > It is true that the code is not perfect, but having even basic and
> > > best-effort functionality with kevent(2) and NFS is better than nothing
> > > in my opinion. The kqueue subsystem should not dumb down for emulating
> > > poll(2) and select(2).
> > 
> > In that case we've to accept or work-around with the fact that a thread
> > is allocated, an operation that can fail, in the kqfilter().  Do you have
> > an opinion on that?  What should we do for other FSs is uniformity wanted?
> 
> Can't the emulation of poll(2) and select(2) operate without the poller?
> These system calls are supposed to return true for regular files for
> reading and writing. No polling of remote state is needed for that.
> 
> The read behaviour of poll(2) and select(2) differs from kevent(2)'s
> EVFILT_READ. Consequently, there has to be an indicator that makes
> f_event work in poll(2)/select(2) mode. I guess the same indicator can
> control nfs_kqfilter().

So are you saying that poll(2) is broken on NFS and suggest that I extend
the kqfilter to match this behavior?



Re: Kill NFS-only kqueue poller thread

2020-05-31 Thread Martin Pieuchot
On 30/05/20(Sat) 09:25, Theo de Raadt wrote:
> [...] 
> What does this have to do with threads?  That is an implimentation detail.

This implementation detail is specific to NFS, no other FS do anything
like that.  So I'm questioning whether calling kthread_create(9) inside a
kqueue(2) handler, which is called from kevent(2) and could be called
from poll(2) is the wanted behavior.

> This is about behaviour.  If you open a FFS directory, do you eventually
> get updates?

FFS and all the other FS have the same behavior when it comes to poll(2)
and kqueue(2).  They don't check for I/O and return true.  NFS tries to
be clever for kqueue(2)-only, I'm questioning the added value of that
cleverness.

> Should NFS not try to act the same?

That's exactly what my diff is doing: bring it in sync with other FS.



Re: Kill NFS-only kqueue poller thread

2020-05-30 Thread Visa Hankala
On Sat, May 30, 2020 at 03:34:06PM +0200, Martin Pieuchot wrote:
> On 30/05/20(Sat) 09:22, Visa Hankala wrote:
> > On Thu, May 28, 2020 at 12:11:20PM +0200, Martin Pieuchot wrote:
> > > When it comes to kqueue filters NFS is special.  A custom thread is
> > > created when the first event is registered.  Its purpose is to poll
> > > for changes every 2.5sec.  This logic has been inherited from NetBSD
> > > and is not present in FreeBSD.
> > > 
> > > Since filt_nfsread() only check `n_size' of a given nfsnode having a
> > > different context to emulate stat(2) behavior in kernel is questionable.
> > > So the diff below gets rid of this logic.  The rationals are KISS,
> > > coherency with nfs_poll() and match what other FS kqueue filters do.
> > > 
> > > While here add a missing write filter.
> > > 
> > > Matching the nfs_poll() logic and the write filter are necessary to keep
> > > the current behavior of poll(2) & select(2) once they start querying
> > > kqfilter handlers.
> > 
> > I think it is not good to remove nfs_kqpoll(). The function does more
> > than just supports the read filter. It generates various vnode events.
> > If the poller is removed, it becomes impossible for example to monitor
> > an NFS directory with kevent(2).
> 
> What do you mean with "impossible to monitor a NFS directory"?  Are you
> saying that NFS being is the only FS to have a specific thread to implement
> a poller is wanted feature?  If it is wanted, why is it NFS only?  Which
> use case involve monitoring a NFS directory with kevent(2)?  When are we
> aware of the existence of a "nfskqpoll" kernel thread?

Local filesystems can observe changes at the source, which makes polling
unnecessary. NFS clients do not have that benefit. The NFSv3 protocol
lacks a mechanism to notify clients of changes.

The NFS polling mechanism is in use for example when running tail -f
on a remote file. Directory monitoring can be utilized for example
by a directory browser application or a distributed batch processing
system.

> > It is true that the code is not perfect, but having even basic and
> > best-effort functionality with kevent(2) and NFS is better than nothing
> > in my opinion. The kqueue subsystem should not dumb down for emulating
> > poll(2) and select(2).
> 
> In that case we've to accept or work-around with the fact that a thread
> is allocated, an operation that can fail, in the kqfilter().  Do you have
> an opinion on that?  What should we do for other FSs is uniformity wanted?

Can't the emulation of poll(2) and select(2) operate without the poller?
These system calls are supposed to return true for regular files for
reading and writing. No polling of remote state is needed for that.

The read behaviour of poll(2) and select(2) differs from kevent(2)'s
EVFILT_READ. Consequently, there has to be an indicator that makes
f_event work in poll(2)/select(2) mode. I guess the same indicator can
control nfs_kqfilter().



Re: Kill NFS-only kqueue poller thread

2020-05-30 Thread Martin Pieuchot
On 30/05/20(Sat) 09:22, Visa Hankala wrote:
> On Thu, May 28, 2020 at 12:11:20PM +0200, Martin Pieuchot wrote:
> > When it comes to kqueue filters NFS is special.  A custom thread is
> > created when the first event is registered.  Its purpose is to poll
> > for changes every 2.5sec.  This logic has been inherited from NetBSD
> > and is not present in FreeBSD.
> > 
> > Since filt_nfsread() only check `n_size' of a given nfsnode having a
> > different context to emulate stat(2) behavior in kernel is questionable.
> > So the diff below gets rid of this logic.  The rationals are KISS,
> > coherency with nfs_poll() and match what other FS kqueue filters do.
> > 
> > While here add a missing write filter.
> > 
> > Matching the nfs_poll() logic and the write filter are necessary to keep
> > the current behavior of poll(2) & select(2) once they start querying
> > kqfilter handlers.
> 
> I think it is not good to remove nfs_kqpoll(). The function does more
> than just supports the read filter. It generates various vnode events.
> If the poller is removed, it becomes impossible for example to monitor
> an NFS directory with kevent(2).

What do you mean with "impossible to monitor a NFS directory"?  Are you
saying that NFS being is the only FS to have a specific thread to implement
a poller is wanted feature?  If it is wanted, why is it NFS only?  Which
use case involve monitoring a NFS directory with kevent(2)?  When are we
aware of the existence of a "nfskqpoll" kernel thread?

> It is true that the code is not perfect, but having even basic and
> best-effort functionality with kevent(2) and NFS is better than nothing
> in my opinion. The kqueue subsystem should not dumb down for emulating
> poll(2) and select(2).

In that case we've to accept or work-around with the fact that a thread
is allocated, an operation that can fail, in the kqfilter().  Do you have
an opinion on that?  What should we do for other FSs is uniformity wanted?



Re: Kill NFS-only kqueue poller thread

2020-05-30 Thread Visa Hankala
On Thu, May 28, 2020 at 12:11:20PM +0200, Martin Pieuchot wrote:
> When it comes to kqueue filters NFS is special.  A custom thread is
> created when the first event is registered.  Its purpose is to poll
> for changes every 2.5sec.  This logic has been inherited from NetBSD
> and is not present in FreeBSD.
> 
> Since filt_nfsread() only check `n_size' of a given nfsnode having a
> different context to emulate stat(2) behavior in kernel is questionable.
> So the diff below gets rid of this logic.  The rationals are KISS,
> coherency with nfs_poll() and match what other FS kqueue filters do.
> 
> While here add a missing write filter.
> 
> Matching the nfs_poll() logic and the write filter are necessary to keep
> the current behavior of poll(2) & select(2) once they start querying
> kqfilter handlers.

I think it is not good to remove nfs_kqpoll(). The function does more
than just supports the read filter. It generates various vnode events.
If the poller is removed, it becomes impossible for example to monitor
an NFS directory with kevent(2).

It is true that the code is not perfect, but having even basic and
best-effort functionality with kevent(2) and NFS is better than nothing
in my opinion. The kqueue subsystem should not dumb down for emulating
poll(2) and select(2).



Kill NFS-only kqueue poller thread

2020-05-28 Thread Martin Pieuchot
When it comes to kqueue filters NFS is special.  A custom thread is
created when the first event is registered.  Its purpose is to poll
for changes every 2.5sec.  This logic has been inherited from NetBSD
and is not present in FreeBSD.

Since filt_nfsread() only check `n_size' of a given nfsnode having a
different context to emulate stat(2) behavior in kernel is questionable.
So the diff below gets rid of this logic.  The rationals are KISS,
coherency with nfs_poll() and match what other FS kqueue filters do.

While here add a missing write filter.

Matching the nfs_poll() logic and the write filter are necessary to keep
the current behavior of poll(2) & select(2) once they start querying
kqfilter handlers.

ok?

Index: nfs/nfs_kq.c
===
RCS file: /cvs/src/sys/nfs/nfs_kq.c,v
retrieving revision 1.30
diff -u -p -r1.30 nfs_kq.c
--- nfs/nfs_kq.c7 Apr 2020 13:27:52 -   1.30
+++ nfs/nfs_kq.c28 May 2020 09:52:52 -
@@ -32,183 +32,26 @@
 
 #include 
 #include 
-#include 
-#include 
 #include 
-#include 
 #include 
-#include 
 #include 
-#include 
-#include 
-#include 
 
-#include 
 #include 
 #include 
 #include 
 #include 
 
-void   nfs_kqpoll(void *);
-
 void   filt_nfsdetach(struct knote *);
 intfilt_nfsread(struct knote *, long);
+intfilt_nfswrite(struct knote *, long);
 intfilt_nfsvnode(struct knote *, long);
 
-struct kevq {
-   SLIST_ENTRY(kevq)   kev_link;
-   struct vnode*vp;
-   u_int   usecount;
-   u_int   flags;
-#define KEVQ_BUSY  0x01/* currently being processed */
-#define KEVQ_WANT  0x02/* want to change this entry */
-   struct timespec omtime; /* old modification time */
-   struct timespec octime; /* old change time */
-   nlink_t onlink; /* old number of references to file */
-};
-SLIST_HEAD(kevqlist, kevq);
-
-struct rwlock nfskevq_lock = RWLOCK_INITIALIZER("nfskqlk");
-struct proc *pnfskq;
-struct kevqlist kevlist = SLIST_HEAD_INITIALIZER(kevlist);
-
-/*
- * This quite simplistic routine periodically checks for server changes
- * of any of the watched files every NFS_MINATTRTIMO/2 seconds.
- * Only changes in size, modification time, change time and nlinks
- * are being checked, everything else is ignored.
- * The routine only calls VOP_GETATTR() when it's likely it would get
- * some new data, i.e. when the vnode expires from attrcache. This
- * should give same result as periodically running stat(2) from userland,
- * while keeping CPU/network usage low, and still provide proper kevent
- * semantics.
- * The poller thread is created when first vnode is added to watch list,
- * and exits when the watch list is empty. The overhead of thread creation
- * isn't really important, neither speed of attach and detach of knote.
- */
-/* ARGSUSED */
-void
-nfs_kqpoll(void *arg)
-{
-   struct kevq *ke;
-   struct vattr attr;
-   struct proc *p = pnfskq;
-   u_quad_t osize;
-   int error;
-
-   for(;;) {
-   rw_enter_write(_lock);
-   SLIST_FOREACH(ke, , kev_link) {
-   struct nfsnode *np = VTONFS(ke->vp);
-
-#ifdef DEBUG
-   printf("nfs_kqpoll on: ");
-   VOP_PRINT(ke->vp);
-#endif
-   /* skip if still in attrcache */
-   if (nfs_getattrcache(ke->vp, ) != ENOENT)
-   continue;
-
-   /*
-* Mark entry busy, release lock and check
-* for changes.
-*/
-   ke->flags |= KEVQ_BUSY;
-   rw_exit_write(_lock);
-
-   /* save v_size, nfs_getattr() updates it */
-   osize = np->n_size;
-
-   error = VOP_GETATTR(ke->vp, , p->p_ucred, p);
-   if (error == ESTALE) {
-   NFS_INVALIDATE_ATTRCACHE(np);
-   VN_KNOTE(ke->vp, NOTE_DELETE);
-   goto next;
-   }
-
-   /* following is a bit fragile, but about best
-* we can get */
-   if (attr.va_size != osize) {
-   int flags = NOTE_WRITE;
-
-   if (attr.va_size > osize)
-   flags |= NOTE_EXTEND;
-   else
-   flags |= NOTE_TRUNCATE;
-
-   VN_KNOTE(ke->vp, flags);
-   ke->omtime = attr.va_mtime;
-   } else if (attr.va_mtime.tv_sec != ke->omtime.tv_sec
-   || attr.va_mtime.tv_nsec != ke->omtime.tv_nsec) {
-   VN_KNOTE(ke->vp,