On Mar 12, 2009, at 12:28 PM, Dai Ngo wrote:
> Hi Jim,
>
> james wahlig wrote:
>>
>> I suppose I could run through the server code and see if there is a  
>> precedent already set, but thought maybe someone on the alias might  
>> know.  This is something that Spencer probably would have known off  
>> the top of his head, but I don't.
>>
>> Maybe there is a retry variable defined somewhere that we could  
>> reuse instead of creating a new one.
> I consulted with Jeff, and we did not find any existing configurable  
> variable to use so
> I created a new one.

Hi Jim,

I also thought that we'd have a "num retries" variable
defined somewhere.  I found nfs4_max_recov_error_retry which
is used by client recovery and set to 3.  The client has another
retry global for retrying a mount (set to 2).  My personal
favorite is recov_state.rs_num_retry_despite_err (client).  :-)

I didn't think it made sense to limit max retries for this bug
using one of the other retry-related vars I found.  So I thought
it would be okay for Dai to define a new one for this case.
I did mention that 10 felt "too big" to me (probably because I'm
a little polluted with client defaults of 2-3), but I didn't
insist on making it smaller because the fix is about not
retrying forever, and for that, 10 is as good as 3.

The more I think about it, the more I like no retries.  Maybe
I'm missing something, but why would the client drop/ignore
the first cb_null but process a subsequent cb_null?  I'm thinking
that if it doesn't reply to the first, then it will probably not
reply to subsequent cb_null calls.  It would be interesting to
know how many times our client fails to reply to cb_null.  I
suspect that we'd see the server either issue no retries or the
max number of retries and nothing in between.
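
As a rough sketch of the bounded-retry idea and of how the retry
counts could be tallied: every name below is made up for
illustration and none of it is the actual server code.  The probe
callback stands in for whatever really sends cb_null, and 10 is
just the value under discussion.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tunable; 10 is the value proposed in this thread. */
#define CB_NULL_MAX_RETRY 10

/* Hypothetical stand-in for the code that sends one cb_null and
 * reports whether the client replied. */
typedef bool (*cb_null_fn)(void *arg);

/*
 * Send cb_null once, then retry at most max_retry more times.
 * Returns the number of retries that were needed (0 means the
 * first call was answered), or -1 if the client never replied.
 */
static int
cb_null_probe(cb_null_fn probe, void *arg, int max_retry)
{
	for (int attempt = 0; attempt <= max_retry; attempt++) {
		if (probe(arg))
			return attempt;
	}
	return -1;
}

/*
 * Record one probe result.  tally[] must have max_retry + 2 slots:
 * tally[i] counts probes answered after i retries, and the last
 * slot counts probes that gave up.
 */
static void
cb_null_tally(int result, int max_retry, unsigned tally[])
{
	tally[result < 0 ? max_retry + 1 : result]++;
}

/* Two trivial fake clients for trying the loop out. */
static bool
client_always_replies(void *arg)
{
	(void)arg;
	return true;
}

static bool
client_never_replies(void *arg)
{
	/* arg points at an int that counts how many calls were sent */
	(*(int *)arg)++;
	return false;
}
```

With max_retry set to 0 this degenerates to the "no retries" case.
If the guess above is right, a tally collected this way would only
ever fill the first and last slots.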

Jeff

