Hi Tom,

Tom Haynes wrote:
> Dai Ngo wrote:
>> Hi,
>>
>> I'd like to have a code review for the change to fix CR 6831781.
>>
>> The problem was caused by the current handling of QFULL condition in
>> rpcmod by delaying 10 secs before returning an error to caller, see
>> related CR 6762222. <http://monaco.sfbay/detail.jsf?cr=6762222>
>>
>> The fix is to retry dispatching the RPC call, when the write queue is
>> full, at 1 second intervals until the RPC timeout expires, and then return
>> an error to the caller. Also, replace the "nfs server not responding..."
>> message with a "send queue full.." message to help the user identify the
>> problem better.
>>
>
> Dai,
>
> How often will this message spam the console?
>
> I understand we had an existing message going to the console, but if you are
> going from 1 delay of 10s to 10 delays of 1s, I have to wonder if that means
> 10 more messages?

Prior to the fix, whenever a QFULL condition occurred, rpcmod delayed 10 secs
(to allow the queue to clear) and then returned an error to the caller. The
caller (NFS) wrote an error message to the system log and then retried the call.

This fix modifies rpcmod to retry, at 1 sec intervals, until the timeout
specified in the RPC call expires. For TCP, the default timeout is 60 secs. If
the QFULL condition is not cleared after this retry period (60 secs), rpcmod
returns an error to the caller, which then writes an error message to the
system log. With this fix, the error message is almost never displayed, since
the QFULL condition is usually cleared in less than 5 secs (seen with vdbench,
with a few worst cases peaking at 20 secs).
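
To make the retry behavior described above concrete, here is a rough
user-space sketch. It is not the actual rpcmod change: send_with_retry(),
the status codes, and the fake_* helpers are all hypothetical names made up
for illustration, and the real code would use the kernel's own queue check
and delay primitives.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status codes; not the real kRPC/rpcmod values. */
enum send_status { SEND_OK = 0, SEND_QFULL_TIMEOUT = 1 };

/* Hypothetical hooks standing in for the real queue check and delay. */
typedef bool (*try_send_fn)(void *ctx);	/* true once the message is queued */
typedef void (*wait_fn)(unsigned secs);

/*
 * Try to send; while the queue is full, wait one interval and try again
 * until the call timeout has been used up. The seconds spent waiting are
 * reported through waited_out (may be NULL).
 */
enum send_status
send_with_retry(try_send_fn try_send, void *ctx, wait_fn wait_secs,
    unsigned timeout_secs, unsigned interval_secs, unsigned *waited_out)
{
	unsigned waited = 0;
	enum send_status status = SEND_QFULL_TIMEOUT;

	if (interval_secs == 0)
		interval_secs = 1;	/* avoid spinning without progress */

	for (;;) {
		if (try_send(ctx)) {
			status = SEND_OK;
			break;
		}
		if (waited >= timeout_secs)
			break;		/* timeout used up: report the error */
		wait_secs(interval_secs);
		waited += interval_secs;
	}

	if (waited_out != NULL)
		*waited_out = waited;
	return (status);
}

/* Stand-ins for experimenting: a queue that stays full for N attempts. */
struct fake_queue { unsigned full_attempts; };

bool
fake_try_send(void *ctx)
{
	struct fake_queue *q = ctx;

	if (q->full_attempts > 0) {
		q->full_attempts--;
		return (false);
	}
	return (true);
}

void
fake_wait(unsigned secs)
{
	(void)secs;	/* no real delay in the sketch */
}
```

With a 60 sec timeout and a 1 sec interval, a fake queue that stays full for
5 attempts lets the call succeed after 5 secs of waiting, so the caller never
sees an error. Only a queue that stays full for the whole 60 secs produces
the error return, and so the log message.
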
>
> Also, I think you need to do a 'hg reci' - the comment section on the webrev
> is showing up more than the bug and description.

Could you be more specific on this?

rasta.dainx[516] pwd
/export/home/dain/NFS_BUGS/6831781/onnv-clone
rasta.dainx[517] hg reci
abort: workspace has uncommitted changes
rasta.dainx[518]

What does 'hg reci' do?

Thanks,
-Dai

