Re: Making forced unmounts work

2012-12-08 Thread J. Hannken-Illjes
The more I think the more I just want to remove forced unmounts.

To take some examples:

- A hard,nointr NFS mount hanging because the server stops responding.

Even if it were possible to use fstrans here (and it would become ugly),
it would not help.  The root node of the mount will likely be locked by
the first thread trying to look up, so unmount won't even be able to look
up the mount point.  If it were possible to run `mount -u' or `umount',
it should be possible to update the mount to `soft,intr' and proceed as
usual: kill the threads blocking the unmount, then unmount.

- Ejecting a removable device currently mounted.

In this case the disk driver should error out and all threads should
come back with an I/O-error.  If this is not the case the driver is buggy.

- Refcount and other bugs

... should be fixed ...

--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)



Re: Making forced unmounts work

2012-12-08 Thread Christos Zoulas
In article 31490263-5a8e-411a-bb57-f7fc5cffc...@eis.cs.tu-bs.de,
J. Hannken-Illjes hann...@eis.cs.tu-bs.de wrote:
The more I think the more I just want to remove forced unmounts.

I think that any operation that cannot be undone (and requires reboot
to be undone) makes the OS less resilient to failure.

To take some examples:

- A hard,nointr NFS mount hanging because the server stops responding.

Even if it were possible to use fstrans here (and it would become ugly)
it would not help.  The root node of the mount will likely be locked by
the first thread trying to look up, so unmount won't even be able to look
up the mount point.  If it were possible to run `mount -u' or `umount',
it should be possible to update the mount to `soft,intr' and proceed as
usual: kill the threads blocking the unmount, then unmount.

Store the normalized mount path with the mountpoint, look it up in the
mount list, make all blocked threads return an I/O error from the current
operation, etc.

christos



Re: Making forced unmounts work

2012-12-07 Thread Michael van Elst
hann...@eis.cs.tu-bs.de (J. Hannken-Illjes) writes:

On Dec 6, 2012, at 8:32 AM, Michael van Elst mlel...@serpens.de wrote:

 hann...@eis.cs.tu-bs.de (J. Hannken-Illjes) writes:
 
 David wants forced unmounts to work even if a thread gets stuck
 permanently in a vnode operation.
 
 How can it get stuck (short of bugs) ?

Here we are talking about bugs only.


I am interested in the case where a mounted device is detached and the
umount blocks forever. Whether that is considered a bug or a design
decision, I don't care.


-- 
Michael van Elst
Internet: mlel...@serpens.de
A potential Snark may lurk in every tree.


Re: Making forced unmounts work

2012-12-06 Thread J. Hannken-Illjes
On Dec 6, 2012, at 8:32 AM, Michael van Elst mlel...@serpens.de wrote:

 hann...@eis.cs.tu-bs.de (J. Hannken-Illjes) writes:
 
 David wants forced unmounts to work even if a thread gets stuck
 permanently in a vnode operation.
 
 How can it get stuck (short of bugs) ?

Here we are talking about bugs only.

--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)



Re: Making forced unmounts work

2012-12-06 Thread Martin Husemann
I am interested, but I lack significant vnode clue. So, sorry if answers
are obvious - they are not to me.

About the only situation in daily life where I ever wanted to use forced
unmounts instead of rebooting a machine (or before the machine rebooted
itself as a direct consequence of the bug that caused the initial issue)
is an NFS mount combined with what probably are network driver bugs (or
maybe hardware), and it is almost reproducible at will.

I have multiple NFS servers where I had better force all clients to use
TCP or smaller write windows, otherwise the clients will lock up on bigger
writes.  AFAICT the nfs level creates the necessary packets (repeatedly,
after appropriate timeouts etc.) and passes them to the driver, but the
driver (or the hardware, and most likely it is the receiving side on the
server) reproducibly drops them, so no overall progress happens.

Would this situation be fixed by your patch or would the nfs layer still
hold the reader lock?

Martin


Re: Making forced unmounts work

2012-12-06 Thread J. Hannken-Illjes
On Dec 6, 2012, at 10:14 AM, Martin Husemann mar...@duskware.de wrote:

 I am interested, but I lack significant vnode clue. So, sorry if answers
 are obvious - they are not to me.
 
 About the only situation in daily life where I ever wanted to use forced
 unmounts instead of rebooting a machine (or before the machine rebooted
 itself as a direct consequence of the bug that caused the initial issue)
 is an NFS mount combined with what probably are network driver bugs (or
 maybe hardware), and it is almost reproducible at will.
 
 I have multiple NFS servers where I had better force all clients to use
 TCP or smaller write windows, otherwise the clients will lock up on bigger
 writes.  AFAICT the nfs level creates the necessary packets (repeatedly,
 after appropriate timeouts etc.) and passes them to the driver, but the
 driver (or the hardware, and most likely it is the receiving side on the
 server) reproducibly drops them, so no overall progress happens.
 
 Would this situation be fixed by your patch or would the nfs layer still
 hold the reader lock?

For now my patch only covers ffs and msdosfs file systems.

In general it should be possible to add fstrans to the nfs file system
making it possible to suspend it.  I did not dig deeper but as the nfs
client already has timeout mechanisms it should be possible to cover
this kind of problem.

--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)



Re: Making forced unmounts work

2012-12-06 Thread Julian Yon
On Thu, 6 Dec 2012 08:23:07 +0100
J. Hannken-Illjes hann...@eis.cs.tu-bs.de wrote:

 Looks like this thread is dead.  No one besides David Holland is
 interested and David objects.  I take back my proposal.

Have been reading the discussion. Don't assume that no contribution
means no interest!

 David wants forced unmounts to work even if a thread gets stuck
 permanently in a vnode operation.  I don't see a way to safely
 reclaim a vnode from a file system when this vnode is in use by
 another thread.

I think you could take some inspiration from Linux here: it has a very
handy umount -l which detaches the filesystem from the tree, but defers
the rest of the unmount/cleanup until the fs is no longer busy. This
can help in situations where you're not trying to physically remove
media but just need to reclaim the mount point, and therefore have no
immediate need for a umount -f. Because no process can open anything
new on the detached fs, if it eventually unwedges itself somehow it
won't get rewedged.


Julian

-- 
3072D/F3A66B3A Julian Yon (2012 General Use) pgp.2...@jry.me




Re: Making forced unmounts work

2012-12-06 Thread David Holland
On Thu, Dec 06, 2012 at 10:32:01AM +, Julian Yon wrote:
  I think you could take some inspiration from Linux here: it has a very
  handy umount -l which detaches the filesystem from the tree, but defers
  the rest of the unmount/cleanup until the fs is no longer busy. This
  can help in situations where you're not trying to physically remove
  media but just need to reclaim the mount point, and therefore have no
  immediate need for a umount -f. Because no process can open anything
  new on the detached fs, if it eventually unwedges itself somehow it
  won't get rewedged.

Yes, I've talked about supporting that elsewhere (maybe not on
tech-kern, or at least not recently) -- allowing it has basically no
downside, but it requires infrastructure we don't currently have.

(Most notably, if you umount -l a hanging nfs volume and then later
want to umount -f it, you need to be able to address the leftovers
somehow; currently because it isn't in the fs namespace any more and
it isn't associated with a device, there's no way to refer to it. I'm
not sure how, if at all, Linux copes with this case.)

Also it requires splitting the current struct mount into two parts,
the filesystem and the mounting of that filesystem; but that's
desirable in the long run for other reasons as well, such as
supporting plan9-type rebind mounts.

-- 
David A. Holland
dholl...@netbsd.org


Re: Making forced unmounts work

2012-12-06 Thread David Holland
On Tue, Dec 04, 2012 at 05:31:59PM +0100, J. Hannken-Illjes wrote:
   A suspended fs has the guarantee that no other thread will be inside
   fstrans_suspend / fstrans_done of any vnode operation.
   
   Threads stuck permanently as in (c) are impossible to catch.
   
   ...doesn't that mean the (c) threads are going to be holding read
   locks, so the suspend will hang forever?
  
  Yes.
  
   We have to assume there will be at least one such thread, as people
   don't generally attempt umount -f unless something's wedged.
  
  If forced unmounts have to work with vnode operations that will not
  leave a fs the only solution will be to disable forced unmounts and
  leave reboot as the last option.  How could we safely determine if
  a vnode operation just takes some time or is stuck forever?  What should
  we do with threads that are stuck forever?

You can't. That's why umount -f is currently unsafe. We can probably
make it more safe (and as just pointed out elsewhere, linux-style lazy
unmounts are safe and can often be used as a substitute) but making it
fully safe will be difficult.

The basic use case for umount -f is when you have processes hanging
uninterruptibly waiting for a hard-mounted NFS server that's gone away
and won't be coming back anytime soon.

The other common case is when you accidentally ejected a removable
device without unmounting it, and didn't notice until reinserting it
isn't an option. Usually what causes you to notice is accessing the
device and failing, and that often results in a stuck process.

If there isn't a process hanging, generally you can clear away any
processes holding the volume busy and do an ordinary umount.

The third case I guess is when vnode refcount bugs have left you with
a busy fs that nothing's actually using; in this case there won't be
processes hanging; but by the time you've confirmed this, there also
won't be processes using the fs, so the current scheme is more or less
safe enough...

  Looks like this thread is dead.  No one besides David Holland is
  interested and David objects.  I take back my proposal.

I think fstrans may not be the mechanism we want, but I don't think we
should give up.

   - Neither VOP_GETPAGES() nor VOP_PUTPAGES() can block or sleep here,
     as they run part or all of the operation with the vnode interlock held.
   
   Ugh. But this doesn't, for example, preclude atomically incrementing a
   per-cpu counter or setting a flag in the lwp structure.
  
  Sure, but we need a point in time where it is safe to mark a vnode dead
  and that means to reclaim its inode or other fs-private state.  This is
  not just accounting -- we must be sure no thread is using fs-private data
  and I don't see a way without suspending vnode operations.

The way to do this is to divide the dead operation into two parts:
first change the ops vector (or set a dead flag) so no new threads can
enter, then when the threads already in the fs are gone or otherwise
neutralized, delete the private data.

Or we could just leave the private data. One of the advantages of
setting a dead flag (and testing it in vnode_if.c) instead of changing
the ops vector is that the vnode doesn't lose its attachment to the fs
and can be cleaned up properly later if it ever gets released.

As for "neutralized", one possible scheme is to mark the vnodes dead,
wait for a few seconds, and then do something evil to the blocked
threads still using the fs; that requires knowing which threads they
are, which is the hard part. Any simple scheme will end up being a
hotspot for cache contention.

See the "kicking everybody out of the softc" thread from a couple of
years ago for one possible approach to that.

   - Accessing devices and fifos through a file system cannot be tracked
at the vnode_if.c level.  Take ufs_spec_read() for example:
   
 fstrans_start(...);
 VTOI(vp)-i_flag |= IN_ACCESS;
 fstrans_done(...);
 return VOCALL(spec_vnodeop_p, ...)
   
Here the VOCALL may sleep forever if the device has no data.
   
   This I don't understand. To the extent it's in the fs while it's
   doing this, vnode_if.c can know that, because it called ufs_spec_read
   and ufs_spec_read hasn't returned yet. To the extent that it's in
   specfs, specfs won't itself ever be umount -f'd, since it isn't
   mounted, so there's no danger.
  
  Ok, then we would have to use VOCALL rather than vnode_if.c and add the
  accounting here.

No, I don't see why. vnode_if.c calls ufs_spec_read(); until
ufs_spec_read() returns, as far as tracking done in vnode_if.c is
concerned, the thread doing it is still in the fs.

   If umount -f currently blats over the data that specfs uses, then it's
   currently not safe, but I don't see how it's different from any other
   similar case with ufs data.
   
   I expect I'm missing something.
  
  So another approach is to serialize vclean(), at least when cleaning
  active vnodes, and have VOCALL() take care of the vnode currently being
  cleaned.  We would need just 

Re: Making forced unmounts work

2012-12-06 Thread Taylor R Campbell
   Date: Thu, 6 Dec 2012 08:23:07 +0100
   From: J. Hannken-Illjes hann...@eis.cs.tu-bs.de

   Looks like this thread is dead.  No one besides David Holland is
   interested and David objects.  I take back my proposal.

I'm interested, but I haven't fit enough of the vnode life cycle or
the fstrans mechanism into my head to assess how well it is likely to
work, and I don't think it's a good strategy in general to put more
boilerplate into every vop implementation -- there is far too much
boilerplate already that lots of vop implementations get wrong.

As a step toward fixing the first problem, could you perhaps write
down the rules about how fstrans currently composes with vnode locks
and operations and life cycle?  I seem to recall there was some issue
a year or so ago about the ordering of vput and fstrans_done in
ufs_rmdir, and the nature of the issue wasn't clear to me from the man
pages or a cursory examination of the source, which suggested to me
that I don't understand fstrans well enough to reason about patches
like this.


Re: Making forced unmounts work

2012-12-06 Thread YAMAMOTO Takashi
hi,

 Forced unmounts will most likely panic the kernel.  The main problem is
 other threads running a vnode operation when we come to clean and
 reclaim an active vnode and therefore change its operation vector and
 destroy the file system private data without locking or synchronisation.
 
 One solution is to suspend the file system while it gets unmounted.
 This way all other threads running a vnode operation will wait for
 fstrans_start() to succeed.  By changing fstrans_start() to detect a
 now-dead vnode, it is possible to restart the vnode operation with the
 new operations vector.
 
 In short the attached diff:
 
 - Adds a new kernel-internal errno ERESTARTVOP and changes VCALL() to
   restart a vnode operation once it returns ERESTARTVOP.
 
 - Changes fstrans_start() to take an optional `hint vnode' and return
   ERESTARTVOP if the vnode becomes dead.
 
 - Rearranges all file system operations to call fstrans_start() before
   accessing file system private data and check (and return) an error.
 
 - Changes vfs_suspend() to optionally suspend a file system without
   syncing it to disk.  This feature is used when revoking a vnode.
 
 - Changes dounmount() to suspend a file system during the unmount.
 
 - Keeps the `struct mp' until all (dead) vnodes disappear.
 
 - Adds some missing operations to deadfs.
 
 Comments or objections?

thanks for working on this.

if possible, it's better to avoid taking an rwlock in every VOP.
IMO umount and revoke, rather than every VOP, ought to do the hard work.
have you considered doing it at upper levels, like files, descriptors,
mmaps, etc.?

eg.
1. iterate all processes to mark their files revoked
2. make long-sleepers (fifo, nfs rpcs, etc) abort
3. wait for reference counts to settle

you may need to send IPIs as some operations are barrier-less.
(fd_getfile/putfile)

iirc dragonflybsd implements revoke at the file or descriptor level.

YAMAMOTO Takashi

 
 --
 J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)


Re: Making forced unmounts work

2012-12-05 Thread J. Hannken-Illjes
Looks like this thread is dead.  No one besides David Holland is
interested and David objects.  I take back my proposal.


David wants to track information about threads running a vnode
operation from vnode_if.c.  I have no idea how this could be done
without knowing the file system implementation.

David wants forced unmounts to work even if a thread gets stuck
permanently in a vnode operation.  I don't see a way to safely
reclaim a vnode from a file system when this vnode is in use by
another thread.

--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)



Re: Making forced unmounts work

2012-12-04 Thread David Holland
On Sun, Dec 02, 2012 at 05:29:01PM +0100, J. Hannken-Illjes wrote:
  I'm convinced -- having fstrans_start() return ERESTART is the way to go.

Ok then :-)

   Also I wonder if there's any way to accomplish this that doesn't
   require adding fstrans calls to every operation in every fs.
   
   Not in a clean way. We would need some kind of reference counting for
   vnode operations and that is quite impossible as vnode operations on
   devices or fifos sometimes wait forever and are called from other fs
   like ufsspec_read() for example.  How could we protect UFS updating
   access times here?
   
   I'm not entirely convinced of that. There are basically three
   problems: (a) new incoming threads, (b) threads that are already in
   the fs and running, and (c) threads that are already in the fs and
   that are stuck more or less permanently because something broke.
   
   Admittedly I don't really understand how fstrans suspending works.
   Does it keep track of all the threads that are in the fs, so the (b)
   ones can be interrupted somehow, or so we at least can wait until all
   of them either leave the fs or enter fstrans somewhere and stall?
  
  Fstrans is a recursive rw-lock. Vnode operations take the reader lock
  on entry.  To suspend a fs we take the writer lock and therefore
  it will catch both (a) and (b).  New threads will block on first entry,
  threads already inside the fs will continue until they leave the fs
  and block on next entry.

Ok, fair enough, but...

  A suspended fs has the guarantee that no other thread will be inside
  fstrans_suspend / fstrans_done of any vnode operation.
  
  Threads stuck permanently as in (c) are impossible to catch.

...doesn't that mean the (c) threads are going to be holding read
locks, so the suspend will hang forever?

We have to assume there will be at least one such thread, as people
don't generally attempt umount -f unless something's wedged.

   If we're going to track that information we should really do it from
   vnode_if.c, both to avoid having to modify every fs and to make sure
   all fses support it correctly. (We also need to be careful about how
   it's done to avoid causing massive lock contention; that's why such
   logic doesn't already exist.)
  
  Some cases are nearly impossible to track at the vnode_if.c level:
  
  - Neither VOP_GETPAGES() nor VOP_PUTPAGES() can block or sleep here,
    as they run part or all of the operation with the vnode interlock held.

Ugh. But this doesn't, for example, preclude atomically incrementing a
per-cpu counter or setting a flag in the lwp structure.

  - Accessing devices and fifos through a file system cannot be tracked
at the vnode_if.c level.  Take ufs_spec_read() for example:
  
   fstrans_start(...);
   VTOI(vp)->i_flag |= IN_ACCESS;
   fstrans_done(...);
   return VOCALL(spec_vnodeop_p, ...)
  
Here the VOCALL may sleep forever if the device has no data.

This I don't understand. To the extent it's in the fs while it's
doing this, vnode_if.c can know that, because it called ufs_spec_read
and ufs_spec_read hasn't returned yet. To the extent that it's in
specfs, specfs won't itself ever be umount -f'd, since it isn't
mounted, so there's no danger.

If umount -f currently blats over the data that specfs uses, then it's
currently not safe, but I don't see how it's different from any other
similar case with ufs data.

I expect I'm missing something.

-- 
David A. Holland
dholl...@netbsd.org


Re: Making forced unmounts work

2012-12-02 Thread J . Hannken-Illjes
On Dec 2, 2012, at 5:00 AM, David Holland dholland-t...@netbsd.org wrote:

 On Thu, Nov 29, 2012 at 06:19:37PM +0100, J. Hannken-Illjes wrote:
 In short the attached diff:
 
 - Adds a new kernel-internal errno ERESTARTVOP and changes VCALL() to
 restart a vnode operation once it returns ERESTARTVOP.
 
 - Changes fstrans_start() to take an optional `hint vnode' and return
 ERESTARTVOP if the vnode becomes dead.
 
 Is there any major reason we can't just use ERESTART and rerun the
 whole syscall?
 
 Not all vnode operations come from a syscall and to me it looks cleaner
 to use one private errno for exactly this purpose.
 
 Could be. All those places are supposed to be capable of coping with
 ERESTART though (otherwise, they break if it happens) so it shouldn't
 make much difference. And if it does make a difference somewhere, that
 should be fixed... regardless of ERESTART for signals, we want FS
 operations to be able to bail and restart for transaction abort.

I'm convinced -- having fstrans_start() return ERESTART is the way to go.

snip

 Also I wonder if there's any way to accomplish this that doesn't
 require adding fstrans calls to every operation in every fs.
 
 Not in a clean way. We would need some kind of reference counting for
 vnode operations and that is quite impossible as vnode operations on
 devices or fifos sometimes wait forever and are called from other fs
 like ufsspec_read() for example.  How could we protect UFS updating
 access times here?
 
 I'm not entirely convinced of that. There are basically three
 problems: (a) new incoming threads, (b) threads that are already in
 the fs and running, and (c) threads that are already in the fs and
 that are stuck more or less permanently because something broke.
 
 Admittedly I don't really understand how fstrans suspending works.
 Does it keep track of all the threads that are in the fs, so the (b)
 ones can be interrupted somehow, or so we at least can wait until all
 of them either leave the fs or enter fstrans somewhere and stall?

Fstrans is a recursive rw-lock. Vnode operations take the reader lock
on entry.  To suspend a fs we take the writer lock and therefore
it will catch both (a) and (b).  New threads will block on first entry,
threads already inside the fs will continue until they leave the fs
and block on next entry.

A suspended fs has the guarantee that no other thread will be inside
fstrans_suspend / fstrans_done of any vnode operation.

Threads stuck permanently as in (c) are impossible to catch.
 
 If we're going to track that information we should really do it from
 vnode_if.c, both to avoid having to modify every fs and to make sure
 all fses support it correctly. (We also need to be careful about how
 it's done to avoid causing massive lock contention; that's why such
 logic doesn't already exist.)

Some cases are nearly impossible to track at the vnode_if.c level:

- Neither VOP_GETPAGES() nor VOP_PUTPAGES() can block or sleep here,
  as they run part or all of the operation with the vnode interlock held.

- Accessing devices and fifos through a file system cannot be tracked
  at the vnode_if.c level.  Take ufs_spec_read() for example:

fstrans_start(...);
VTOI(vp)->i_flag |= IN_ACCESS;
fstrans_done(...);
return VOCALL(spec_vnodeop_p, ...)

  Here the VOCALL may sleep forever if the device has no data.

 If, however, fstrans isn't tracking that information, I don't see how
 suspending the fs helps deal with the (b) threads, because if they're
 currently running they can continue to chew on fs-specific data for
 arbitrarily long before they run into anything that stalls them, and
 there's no way to know when they're done or how many of them there
 are.
 
 I don't really see what the issue with ufsspec_read() is, however, so
 we may be talking past each other.


--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)



Re: Making forced unmounts work

2012-12-01 Thread David Holland
On Thu, Nov 29, 2012 at 06:19:37PM +0100, J. Hannken-Illjes wrote:
   In short the attached diff:
   
   - Adds a new kernel-internal errno ERESTARTVOP and changes VCALL() to
restart a vnode operation once it returns ERESTARTVOP.
   
   - Changes fstrans_start() to take an optional `hint vnode' and return
ERESTARTVOP if the vnode becomes dead.
   
   Is there any major reason we can't just use ERESTART and rerun the
   whole syscall?
  
  Not all vnode operations come from a syscall and to me it looks cleaner
  to use one private errno for exactly this purpose.

Could be. All those places are supposed to be capable of coping with
ERESTART though (otherwise, they break if it happens) so it shouldn't
make much difference. And if it does make a difference somewhere, that
should be fixed... regardless of ERESTART for signals, we want FS
operations to be able to bail and restart for transaction abort.

   I see there are two references to ERESTARTVOP in genfs_io.c, and I
   don't see what they're for without digging deeper, but given that they
   appear to make locking behavior depend on the error condition maybe it
   would be better not to do that too. :-/
  
  This is the wonderful world of VOP_GETPAGES() and VOP_PUTPAGES().  Both
  are called with the vnode interlock held, and by the time it is needed
  and possible to check the vnode, the interlock has been released.  When
  these operations return ERESTARTVOP we have to take the interlock again
  because dead_getpages() and dead_putpages() need it on entry (just to
  release it).
  
  It is possible to directly return the error from genfs_XXXpages() though.
  To me it looks clearer to always go the ERESTARTVOP route.

Ugh.

I don't like having the locking behavior be different for different
error cases; it's asking for trouble in the long run. I think this
ends up being a reason to use ERESTART instead.

   Also I wonder if there's any way to accomplish this that doesn't
   require adding fstrans calls to every operation in every fs.
  
  Not in a clean way. We would need some kind of reference counting for
  vnode operations and that is quite impossible as vnode operations on
  devices or fifos sometimes wait forever and are called from other fs
  like ufsspec_read() for example.  How could we protect UFS updating
  access times here?

I'm not entirely convinced of that. There are basically three
problems: (a) new incoming threads, (b) threads that are already in
the fs and running, and (c) threads that are already in the fs and
that are stuck more or less permanently because something broke.

Admittedly I don't really understand how fstrans suspending works.
Does it keep track of all the threads that are in the fs, so the (b)
ones can be interrupted somehow, or so we at least can wait until all
of them either leave the fs or enter fstrans somewhere and stall?

If we're going to track that information we should really do it from
vnode_if.c, both to avoid having to modify every fs and to make sure
all fses support it correctly. (We also need to be careful about how
it's done to avoid causing massive lock contention; that's why such
logic doesn't already exist.)

If, however, fstrans isn't tracking that information, I don't see how
suspending the fs helps deal with the (b) threads, because if they're
currently running they can continue to chew on fs-specific data for
arbitrarily long before they run into anything that stalls them, and
there's no way to know when they're done or how many of them there
are.

I don't really see what the issue with ufsspec_read() is, however, so
we may be talking past each other.

-- 
David A. Holland
dholl...@netbsd.org


Re: Making forced unmounts work

2012-11-29 Thread David Holland
On Mon, Nov 26, 2012 at 03:06:34PM +0100, J. Hannken-Illjes wrote:
  In short the attached diff:
  
  - Adds a new kernel-internal errno ERESTARTVOP and changes VCALL() to
restart a vnode operation once it returns ERESTARTVOP.
  
  - Changes fstrans_start() to take an optional `hint vnode' and return
ERESTARTVOP if the vnode becomes dead.

Is there any major reason we can't just use ERESTART and rerun the
whole syscall?

I see there are two references to ERESTARTVOP in genfs_io.c, and I
don't see what they're for without digging deeper, but given that they
appear to make locking behavior depend on the error condition maybe it
would be better not to do that too. :-/

Also I wonder if there's any way to accomplish this that doesn't
require adding fstrans calls to every operation in every fs.

More later on hopefully when I have more time to look...

-- 
David A. Holland
dholl...@netbsd.org


Re: Making forced unmounts work

2012-11-29 Thread J. Hannken-Illjes
On Nov 29, 2012, at 5:17 PM, David Holland dholland-t...@netbsd.org wrote:

 On Mon, Nov 26, 2012 at 03:06:34PM +0100, J. Hannken-Illjes wrote:
 In short the attached diff:
 
 - Adds a new kernel-internal errno ERESTARTVOP and changes VCALL() to
  restart a vnode operation once it returns ERESTARTVOP.
 
 - Changes fstrans_start() to take an optional `hint vnode' and return
  ERESTARTVOP if the vnode becomes dead.
 
 Is there any major reason we can't just use ERESTART and rerun the
 whole syscall?

Not all vnode operations come from a syscall and to me it looks cleaner
to use one private errno for exactly this purpose.

 I see there are two references to ERESTARTVOP in genfs_io.c, and I
 don't see what they're for without digging deeper, but given that they
 appear to make locking behavior depend on the error condition maybe it
 would be better not to do that too. :-/

This is the wonderful world of VOP_GETPAGES() and VOP_PUTPAGES().  Both
are called with the vnode interlock held, and by the time it is needed
and possible to check the vnode, the interlock has been released.  When
these operations return ERESTARTVOP we have to take the interlock again
because dead_getpages() and dead_putpages() need it on entry (just to
release it).

It is possible to directly return the error from genfs_XXXpages() though.
To me it looks clearer to always go the ERESTARTVOP route.

 Also I wonder if there's any way to accomplish this that doesn't
 require adding fstrans calls to every operation in every fs.

Not in a clean way.  We would need some kind of reference counting for
vnode operations and that is quite impossible as vnode operations on
devices or fifos sometimes wait forever and are called from other fs
like ufsspec_read() for example.  How could we protect UFS updating
access times here?

--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)



Re: Making forced unmounts work

2012-11-27 Thread David Young
On Mon, Nov 26, 2012 at 03:06:34PM +0100, J. Hannken-Illjes wrote:
 Comments or objections?

I'm wondering if this will fix the bugs in 'mount -u -r /xyz' where a
FFS is mounted read-write at /xyz?  Sorry, I don't remember any longer
what the bugs were.

Dave

-- 
David Young
dyo...@pobox.com                Urbana, IL                (217) 721-9981


Re: Making forced unmounts work

2012-11-27 Thread J. Hannken-Illjes

On Nov 28, 2012, at 3:59 AM, David Young dyo...@pobox.com wrote:

 On Mon, Nov 26, 2012 at 03:06:34PM +0100, J. Hannken-Illjes wrote:
 Comments or objections?
 
 I'm wondering if this will fix the bugs in 'mount -u -r /xyz' where a
 FFS is mounted read-write at /xyz?  Sorry, I don't remember any longer
 what the bugs were.

This one will not fix the rw->ro remount problem, where data or metadata
gets written to a file system mounted read-only.  Stay tuned...

--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)