Robert Thurlow wrote:
> Mike Gerdts wrote:
>
>> When would it be desirable for umountall to unmount file systems in
>> all zones?
>
> If we're bringing the system-as-a-whole down, we try to stop
> zones, but if one of them fails to shut down, it is better
> to try to unmount the filesystems mounted in them to free
> resources on the servers.  This is nice behaviour on a
> network; without it, other clients will be unable to get
> locks and state until lease periods expire.
>
>> During system shutdown, all zones should be down before the autofs and
>> nfs client services stop in the global zones.  In the event that some
>> zone is not shut down, this means that it is likely stuck in a
>> shutting down state and any calls to unmount "stuck" nfs mounts in
>> that zone will result in a hung system call and an SMF stop method
>> timeout.
>
> It's easy, especially during development, to get zones that
> won't shut down - all you need is one vnode refcount wrong,
> for example.  We're not ever going to be able to guarantee
> that such a refcount leak won't escape into the wild (and
> several have done so).  I have also seen global zone threads
> able to do the unmount logic successfully on behalf of a
> zone - because that work is not affected by refcounts.
>
> In summary, I like Pavel's idea and code changes here.  I
> would not object to the inverse flag to get back the
> "unmount everything" semantics.  I'm not really willing to
> lose those semantics.
>
>> It seems to me that there is a real chance[1] that the RPC calls would
>> not even be routable to an NFS server.  See, for example,
>> http://bugs.opensolaris.org/view_bug.do?bug_id=6476438.
>
> I believe there is a window during shutdown where we could
> usefully attempt an all-zones umountall.  I'd want to have
> that be after zone shutdown, but I think that's already the
> case.

The 'window during shutdown where we could usefully attempt an
all-zones umountall' no longer exists. The fix for "6675447 NFSv4 client
hangs on shutdown if server is down beforehand" added the '-l' flag
(limit actions to the local file systems) to the call in svc.startd:
   (void) system("/sbin/umountall -l");
With this fix, we no longer unmount NFS there.

Here is a summary of where we try to unmount the NFS mounts of non-global
zones during system shutdown, in chronological order:

1) stop method of system/zones, called from the global zone:

 ---> /lib/svc/method/svc-zones
     ---->  zlogin -S $zone /sbin/init 0
         -----> the zone's own instance of svc.startd calls the zone's
                stop method of nfs/client

2) stop method of nfs/client, called from the global zone:

   currently does cross-zone unmounting via umountall -F nfs,
   but this will be removed

3) svc.startd in the global zone; after it kills all the processes, it calls:

   (void) system("/sbin/umountall -l");

4)   vfs_unmountall()


#1 ... can unmount non-global zone? YES. Shutdown of zones can fail,
       but the list of zones that failed to shut down is printed.
#2 ... can unmount non-global zone? Currently YES; this code review
       proposes to change it to NO.
#3 ... can unmount non-global zone? NO. Removed by the fix for 6675447.
#4 ... can unmount non-global zone? NO. Never goes over the wire (OTW).
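
The same summary can be written down as data, which makes it easy to see
what is left once this review goes in. This is only an illustrative
sketch in Python: the Stage type, its field names and the helper function
are made up for this mail, and the entry for stage 2 assumes the change
proposed in this review has been applied.

```python
from typing import List, NamedTuple


class Stage(NamedTuple):
    order: int             # position in the shutdown sequence
    where: str             # component that attempts the unmount
    reaches_ngz_nfs: bool  # can it unmount NFS mounts of non-global zones?
    note: str


# Transcribed from the summary above. Stage 2 is recorded as it would be
# AFTER the change proposed in this review (today it still does the
# cross-zone unmount).
STAGES = [
    Stage(1, "system/zones stop method (svc-zones, zlogin -S, init 0)",
          True, "zone shutdown can fail; failed zones are listed"),
    Stage(2, "nfs/client stop method in the global zone",
          False, "cross-zone unmount to be removed by this review"),
    Stage(3, "svc.startd: umountall -l",
          False, "NFS unmount removed by the fix for 6675447"),
    Stage(4, "vfs_unmountall()",
          False, "never goes over the wire"),
]


def stages_reaching_ngz(stages: List[Stage]) -> List[int]:
    """Return the order numbers of stages that can still unmount NFS
    mounts belonging to non-global zones."""
    return [s.order for s in stages if s.reaches_ngz_nfs]


print(stages_reaching_ngz(STAGES))
```

In this model only stage 1 remains, so a zone that fails to shut down
would keep its NFS mounts unless something else takes over that job.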


===============


Let's make a decision.

REQUIREMENTS:

Rob:
a) the ability to do the 'all-zones unmount from global zone'
b) the 'all-zones unmount from global zone' should be part of the regular
   system shutdown

Mike:
c) wants the nfs/client stop method to act only on the current zone
d) wants umountall(1M) to act only on the current zone by default

The requirements have these priorities:

c)  - highest
a)
b)
d) - lowest

It looks like the most difficult requirement to implement is b), 'all-zones
unmount from global zone should be part of the regular system shutdown'.
We can add code to /lib/svc/method/svc-zones that, at the very end of the
stop method, unmounts all the non-global zone mounts.
This would require new semantics for the -Z option; see option 5) below.


OPTIONS:

1) default behavior: don't limit action(s) to the current zone
available options: none

2) default behavior: don't limit action(s) to the current zone
available options: -z ...limit action(s) to the current zone

3) default behavior: limit action(s) to the current zone
available options: none

4) default behavior: limit action(s) to the current zone
available options: -Z ...apply action(s) to all zones

5) default behavior: limit action(s) to the current zone
available options: -Z ...apply action(s) to all *non-global* zones
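
To make the difference between the five options concrete, here is a rough
model in Python of which mounts each one would act on. It is not the
umountall code, just a sketch: the Mount type, select_mounts() and the
sample mount points are invented for illustration, and it assumes the
caller runs in the global zone and can see every zone's mounts.

```python
from typing import List, NamedTuple

GLOBAL_ZONE = "global"


class Mount(NamedTuple):
    zone: str        # zone the mount belongs to (illustrative field)
    mountpoint: str


def select_mounts(mounts: List[Mount], current_zone: str,
                  option: int, flag: bool = False) -> List[Mount]:
    """Return the mounts acted on under OPTIONS 1-5.

    flag stands for -z (option 2) or -Z (options 4 and 5). Options 1 and 3
    have no flag, so it is ignored there.
    """
    current = [m for m in mounts if m.zone == current_zone]
    everything = list(mounts)
    non_global = [m for m in mounts if m.zone != GLOBAL_ZONE]

    if option == 1:
        return everything
    if option == 2:
        return current if flag else everything
    if option == 3:
        return current
    if option == 4:
        return everything if flag else current
    if option == 5:
        return non_global if flag else current
    raise ValueError(f"unknown option: {option}")


# Hypothetical mount table: one mount in the global zone, one in each of
# two non-global zones.
mounts = [
    Mount("global", "/mnt/a"),
    Mount("zone1", "/zones/zone1/root/mnt/b"),
    Mount("zone2", "/zones/zone2/root/mnt/c"),
]

# Option 5 as seen from the global zone, without and with -Z.
print([m.mountpoint for m in select_mounts(mounts, GLOBAL_ZONE, 5)])
print([m.mountpoint for m in select_mounts(mounts, GLOBAL_ZONE, 5, flag=True)])
```

Under option 5, the default call touches only the global zone's own mount,
and the -Z call touches only the mounts of zone1 and zone2, which is the
set the end of the svc-zones stop method would need for requirement b).
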


So I propose to implement 5). Rob, Mike, does that work for you?

Since we are introducing a new option, we must go through a PSARC case,
and I expect more opinions and discussion to come up there.

Thanks,
Pavel


