Hi Pierre-Yves,

I guess we are both wrong...
I tried several timeout values, including 1000, before changing any
code. I just checked the unpatched relayd again and I get the same
results.

These web servers just serve the default Apache index page.
I can connect to them instantly from the load balancer (using lynx) or
from any other client machine I have tested.

So either the timeout value is not read/set correctly or it is something
else.
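
To make the "not read/set correctly" case concrete, here is a small
sketch of how a millisecond value such as "timeout 1000" would map onto
the struct timeval that check_tcp.c copies into tv, and what a
gettimeofday() call on that same struct does to it. The helper names
are hypothetical, and the millisecond-to-timeval conversion is an
assumption about how the configuration value is stored, not relayd's
actual parser.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper: turn a "timeout <ms>" value into a struct timeval. */
struct timeval
ms_to_timeval(long ms)
{
	struct timeval tv;

	assert(ms >= 0);
	tv.tv_sec = ms / 1000;
	tv.tv_usec = (ms % 1000) * 1000;
	return (tv);
}

/* Hypothetical helper: the same value back in milliseconds, for logging. */
long long
timeval_to_ms(const struct timeval *tv)
{
	return ((long long)tv->tv_sec * 1000 + (long long)tv->tv_usec / 1000);
}

/*
 * What a gettimeofday() call on the timeout does: the configured value
 * is replaced by the current wall-clock time (seconds since the Epoch).
 */
int
clobber_with_now(struct timeval *tv)
{
	return (gettimeofday(tv, NULL));
}
```

Logging timeval_to_ms() of the table timeout just before the connect
would show whether the configured value arrives intact. After
clobber_with_now() the "timeout" is the time since the Epoch, so the
check can effectively never time out.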

-Brad

> -----Original Message-----
> From: [EMAIL PROTECTED]
> Sent: Wed, 27 Feb 2008 11:53:03 +0100
> To: [EMAIL PROTECTED]
> Subject: Re: relayd http check connection failures; hoststated operates
> correctly
>
> Brad Arrington <[EMAIL PROTECTED]> wrote:
>> Hi,
>>
>> I ran into the same problem you did. I thought it was something I
>> was doing wrong until I read your email...
>>
>> Here is the fix I came up with.
>>
>> --- check_tcp.c-current Mon Feb 25 15:11:40 2008
>> +++ check_tcp.c Mon Feb 25 23:48:45 2008
>> @@ -82,6 +82,7 @@
>>         if (fcntl(s, F_SETFL, O_NONBLOCK) == -1)
>>                 goto bad;
>>
>> +       gettimeofday(&cte->table->conf.timeout, NULL);
>>         bcopy(&cte->table->conf.timeout, &tv, sizeof(tv));
>>         if (connect(s, (struct sockaddr *)&cte->host->conf.ss, len) == -1) {
>>                 if (errno != EINPROGRESS)
>>
>> I should check the return code of gettimeofday, but here it is
>> anyway... I submitted a bug report too.
>>
>> -Brad
>>
>
> Hi Brad,
>
> Your fix is wrong. You are running into a timeout, which happens
> because the default relayd configuration assumes you are in the same
> broadcast domain as your relayed host and uses a 200ms timeout.
>
> The error reporting is a bit confusing and should just mention that a
> timeout occurred; I will fix that. The gettimeofday call you added did
> fix your issue, but it is really wrong since it overwrites the timeout
> value you specify in the configuration file.
>
> A simple fix for you would be to specify:
>
> timeout 1000 # (or any appropriate timeout value for your application)
>
> in your configuration file.
>
>>> startup
>>> init_filter: filter init done
>>> tcp_write: connect timed out
>>> relay_privinit: adding relay www
>>> init_tables: created 0 tables
>>> hce_notify_done: aa.bb.cc.209 (tcp_write: connect failed)
>>> protocol 0: name http
>>> host aa.bb.cc.209, check http code (3ms), state unknown -> down,
>>> availability 0.00%
>>>         flags: 0x0004
>>> tcp_write: connect timed out
>
> The timeout is mentioned here.
>
>>>         type: hce_notify_done: aa.bb.cc.211 (tcp_write: connect
>>> failed) http
>
> And then a "connect failed" error follows, which might have confused you.
>
> pyr.
