Thanks Pavel, looks sane to me now.

On Tue, 09 Jun 2009 17:32:33 +0200, pavel filipensky <Pavel.Filipensky at 
sun.com> wrote:

> Frank's comments are incorporated:
>
> 1) Only changes in the comments, bo change in the C code:
>
> http://cr.opensolaris.org/~pavelf/6500269
>
> 2) - mainly changes in the comments; only a small "int -> uint_t" change
> in the C code
>     - 1  change in the comments for 6507213
>
> http://cr.opensolaris.org/~pavelf/6707722
>
> 3) no changes
>
> http://cr.opensolaris.org/~pavelf/6721281
>
> --Pavel
>
> On 06/09/09 16:40, Frank Batschulat (Home) wrote:
>> On Tue, 09 Jun 2009 11:12:22 +0200, pavel filipensky <Pavel.Filipensky at 
>> sun.com> wrote:
>>
>>
>>> I have created separated webrevs for the bugs - to make the review more
>>> comfortable and allow the reviewers to review only a subset of the 3 CRs:
>>>
>>
>> Hi Pavel, first of all, if you expect outside reviewers you probably should
>> put the lot of analyse done in the bugs comments/evaluation section
>> to either the bugs description so it appears on bugs.opensolaris.org, or even
>> better, store a text file containing them at cr.opensolaris.org along
>> with the webrevs, for those 3 bugs below, it would have been really helpfull
>> for outside observers to better understand how you arrived here...
>>
>> 6500269 assertion failed: sp->mntinfo4_list == mi, file: 
>> ../../common/fs/nfs/nfs4_vfsops.c, line: 3258
>>
>>> http://cr.opensolaris.org/~pavelf/6500269     <<< I have added some
>>>
>>
>> overall looks good, comments:
>>
>> http://cr.opensolaris.org/~pavelf/6500269/usr/src/uts/common/nfs/nfs4_clnt.h.udiff.html
>>
>> 1)
>>   *      The mntinfo4_t::mi_recovlock protects the following fields:
>>   *              mi_srvsettime
>>   *
>> + * The mntinfo4_t::mi_recovlock protects also mi_srv.
>>
>> I'd prefer this to be instead:
>>
>>   *      The mntinfo4_t::mi_recovlock protects the following fields:
>> ! *              mi_srvsettime, mi_srv
>>
>> 2)
>> + * If changing mi_srv from one  server to different server, mi_recovlock
>> + * as RW_WRITER is needed - happens in recov_newserver() and 
>> recov_clientid().
>> + * If mi_srv is set for the first time - happens in nfs4_mount()
>> + * and nfs4_mountroot(), mi_recovlock is held as RW_READER.
>> + * This means that holding mi_recovlock as RW_READER is enough to protect
>> + * mi_srv from being changed from one server to another.
>>
>> this is a bit confusing...I'd prefer this to be changes like below, the 
>> functions may change
>> over time...and the latter part is redundant
>>
>> * Changing mi_srv from one nfs4_server_t to a different one requires
>> * holding the mi_recovlock as RW_WRITER.
>> * Exception: setting mi_srv the first time in mount/mounroot is done
>> * holding the mi_recovlock as RW_READER.
>>
>> http://cr.opensolaris.org/~pavelf/6500269/usr/src/uts/common/fs/nfs/nfs4_vfsops.c.udiff.html
>>
>> 3) find_nfs4_server_all()
>>
>> 3809 find_nfs4_server_all(mntinfo4_t *mi, int all)
>> 3810 {
>> 3811         nfs4_server_t *np = mi->mi_srv;
>> 3812
>> 3813         ASSERT(nfs_rw_lock_held(&mi->mi_recovlock, RW_READER) ||
>> 3814             nfs_rw_lock_held(&mi->mi_recovlock, RW_WRITER));
>> 3815
>> 3816         if (np && (np->s_thread_exit != NFS4_THREAD_EXIT || all)) {
>> 3817                 mutex_enter(&np->s_lock);
>> 3818                 np->s_refcnt++;
>> 3819                 return (np);
>> 3820         }
>> 3821         return (NULL);
>> 3822 }
>>
>> now, that looks suspicious, you use the nfs4_server_t 'np' prior having a 
>> lock on it, that is
>> different from the old find_nfs4_server_all() impl. as well. since we're also
>> no longer holding the nfs4_server_lst_lock I have doubts we're properly 
>> protected here.
>>
>> Btw, not your changes but still worth mentioning it:
>>
>> 3809 find_nfs4_server_all(mntinfo4_t *mi, int all)
>> 3810 {
>> 3811         nfs4_server_t *np = mi->mi_srv;
>> 3812
>> 3813         ASSERT(nfs_rw_lock_held(&mi->mi_recovlock, RW_READER) ||
>> 3814             nfs_rw_lock_held(&mi->mi_recovlock, RW_WRITER));
>>
>> btw, this assert is not sufficient in the context of find_nfs4_server_all()
>> it only assert that _someone_ is holding the lock:
>>
>>    4883 nfs_rw_lock_held(nfs_rwlock_t *l, krw_t rw)
>>    4884 {
>>    4885
>>    4886      if (rw == RW_READER)
>>    4887              return (l->count > 0);
>>    4888      ASSERT(rw == RW_WRITER);
>>    4889      return (l->count < 0);
>>    4890 }
>>
>> in the context of find_nfs4_server_all()/find_nfs4_server() we
>> really want to know that the _current caller_ is holding the lock as
>> called out in the comment:
>>
>>    3791  * The caller should be holding mi->mi_recovlock, and it should 
>> continue to
>>    3792  * hold the lock until done with the returned nfs4_server_t.  Once
>>    3793  * mi->mi_recovlock is released, there is no guarantee that the 
>> returned
>>    3794  * mi->nfs4_server_t will continue to correspond to mi.
>>    3795  */
>>    3796 nfs4_server_t *
>>    3797 find_nfs4_server(mntinfo4_t *mi)
>>
>> =================================
>>
>> 6707722 nfs4_start_fop() uses gethrestime_sec() with 1 second granularity to 
>> detect server change
>>
>>> http://cr.opensolaris.org/~pavelf/6707722
>>>
>>
>> looks good, but comments:
>>
>> http://cr.opensolaris.org/~pavelf/6707722/usr/src/uts/common/fs/nfs/nfs4_recovery.c.udiff.html
>>
>> +        int droplock_cnt;
>>
>> I'd prefer this to be a uint_t.
>>
>> http://cr.opensolaris.org/~pavelf/6707722/usr/src/uts/common/nfs/nfs4_clnt.h.udiff.html
>> +        int mi_srvset_cnt; /* increment whenever changing the nfs4_server_t 
>> */
>>
>> I'd prefer mi_srvset_cnt to be a uint_t.
>>
>> =========================================
>>
>> 6721281 panic: assertion failed: np != 0L, file: 
>> ../../common/fs/nfs/nfs4_callback.c, line: 1309
>>
>>> http://cr.opensolaris.org/~pavelf/6721281
>>>
>>
>> this looks good to me.
>>
>> =========================================
>>
>> Btw, since you touch nfs4_clnt.h anyways would ya mind fixing this along 
>> with it ?
>>
>> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6507213
>>
>> cheers
>> frankB
>>
>
> 



-- 
frankB

It is always possible to agglutinate multiple separate problems
into a single complex interdependent solution.
In most cases this is a bad idea.

Reply via email to