On Sat, Oct 30, 2010 at 03:25:56PM +0300, Mikolaj Golub wrote:
> 
> On Thu, 28 Oct 2010 22:08:54 +0300 Mikolaj Golub wrote to Pawel Jakub Dawidek:
> 
>  PJD>> I looked at the code and the keepalive packets are sent from another
>  PJD>> thread. Could you try turning them off in primary.c and see if that
>  PJD>> helps?
> 
>  MG> At first I set RETRY_SLEEP to 1 sec to have more keepalive packets. The
>  MG> errors started to appear frequently:
> 
>  MG> Oct 28 21:35:53 bolek hastd[1709]: [storage] (secondary) Unable to receive request header: RPC version wrong.
>  MG> Oct 28 21:35:54 bolek hastd[1632]: [storage] (secondary) Worker process exited ungracefully (pid=1709, exitcode=75).
>  MG> Oct 28 21:36:12 bolek hastd[1722]: [storage] (secondary) Unable to receive request header: RPC version wrong.
>  MG> Oct 28 21:36:12 bolek hastd[1632]: [storage] (secondary) Worker process exited ungracefully (pid=1722, exitcode=75).
>  MG> ...
> 
>  MG> Now I have been running synchronization for more than half an hour with
>  MG> keepalive_send disabled and have not seen any errors.
> 
> So :-) What do you think about sending keepalives in remote_send_thread() to
> avoid this problem, and sending them only when a connection is idle (it looks
> like there is not much use in sending them all the time)? Something like the
> patch below (it works for me).

I like your patch, and of course I agree it is better to send keepalive
packets only when the connection is idle. The only thing I'd change is to
modify the QUEUE_TAKE1() macro to take an additional 'timeout' argument; if
we don't want it to time out, we pass 0. Could you modify your patch?

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
p...@freebsd.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
