Re: relayd and PUT

2018-01-19 Thread Rivo Nurges
Hi!

On Fri, 2018-01-05 at 00:12 +0100, Alexander Bluhm wrote:
> I have committed more regression tests that check the timeout with
> unidirectional traffic flow.  I could not find an error.  In theory
> when we have an idle timeout in one direction, relayd checks whether
> there is traffic flowing in the other direction.  The tests set the
> timeout to 2 seconds and send 5 bytes while sleeping one second
> between each byte.  The timeout does not trigger.
> So it seems that you encounter some corner case.  I need more
> information.

Yes, it's a bit harder to trigger. First, relayd currently opens the
server connection only after the first client request finishes. If the
first request is a long PUT, relayd buffers the PUT and the problem
doesn't appear. But that triggers another problem: depending on the
request size, you run out of memory. I have another patch for this. It
opens the relay>server socket earlier and makes the timeout problem even
easier to trigger. I will send it separately.

So to get the server connection opened and the timeout to happen, you
need to do some small GET (or whatever) request first and keep the
connection open. In my test case I use "GET /; PUT /largefile".
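
The request sequence above can be sketched roughly as follows. This is only an illustration, not code from the thread: `build_repro`, the host name and the body length are hypothetical. The sketch emits only the two request heads; a real client would read the GET response first and then send the PUT head plus `body_len` bytes of body on the same persistent connection, without chunked encoding.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format the heads of a small GET followed by a
 * large PUT for one persistent HTTP/1.1 connection.  Returns the
 * snprintf(3) result, i.e. the length that was (or would have been)
 * written, excluding the terminating NUL.
 */
static int
build_repro(char *buf, size_t len, const char *host,
    unsigned long long body_len)
{
	return snprintf(buf, len,
	    "GET / HTTP/1.1\r\n"
	    "Host: %s\r\n"
	    "\r\n"
	    "PUT /largefile HTTP/1.1\r\n"
	    "Host: %s\r\n"
	    "Content-Length: %llu\r\n"
	    "\r\n",
	    host, host, body_len);
}
```

The small GET is what gets the relay-to-server connection opened; the PUT body that follows is where the server side stays silent long enough for the read timeout to matter.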

> - Do you use http or https?

both have the problem

> - Do you use persistent connections?

yes

> - Do you use chunked encoding?

no

> - Does it only occur with http or also with plain tcp?

only http

> - Does disabling socket splicing help?

The problem happens when the libevent code is in use, that is, when
splicing is either disabled or not available (https).

> - Does it happen when the connect to the server is slow?
> 
> While testing I saw that with socket splicing the timeout is handled
> twice.  We get a wakeup from the idle splicing and from libevent
> timeout.  I think it is sufficient to only use the idle splicing
> if it is available.

I noticed it too, but it doesn't seem to make things worse.

> Does this diff help?

This diff doesn't change things.

Rivo

Re: relayd and PUT

2018-01-19 Thread Rivo Nurges
Hi!

Please ignore this.

Rivo

Re: relayd and PUT

2018-01-19 Thread Rivo Nurges

Re: relayd and PUT

2018-01-04 Thread Alexander Bluhm
On Wed, Dec 13, 2017 at 07:42:03AM +0100, Claudio Jeker wrote:
> On Wed, Dec 13, 2017 at 12:25:39AM +0000, Rivo Nurges wrote:
> > If you http PUT a "big" file through relayd, server<>relay read side
> > will eventually get an EVBUFFER_TIMEOUT. Nothing comes back from the
> > server until the PUT is done. I disabled server read timeouts for PUT
> > requests.
> 
> I have seen something similar and came to the conclusion that the timeout
> handling of relayd is not correct. As long as traffic is flowing the
> timeout should be reset (at least that is what every other implementation
> does). This is not really happening in relayd. I have seen this on GET
> requests that are huge (timeout hits in the middle of the transmission and
> kills the session).

I have committed more regression tests that check the timeout with
unidirectional traffic flow.  I could not find an error.  In theory
when we have an idle timeout in one direction, relayd checks whether
there is traffic flowing in the other direction.  The tests set the
timeout to 2 seconds and send 5 bytes while sleeping one second
between each byte.  The timeout does not trigger.
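
The idea behind that test can be modelled in a few lines. This is a simplified sketch under stated assumptions, not relayd code and not the actual regression test: `struct session_model`, `idle_expired` and `replay_one_way` are hypothetical names, time is counted in whole seconds, and the session is treated as idle only when neither direction has seen traffic within the timeout.

```c
#include <stdbool.h>

/* Hypothetical model: time of the last byte seen in each direction. */
struct session_model {
	long last_client;	/* seconds, last byte from the client */
	long last_server;	/* seconds, last byte from the server */
};

/* The session is idle only if BOTH directions have been quiet. */
static bool
idle_expired(const struct session_model *s, long now, long timeout)
{
	long last = s->last_client > s->last_server ?
	    s->last_client : s->last_server;

	return (now - last >= timeout);
}

/*
 * Replay one-way traffic: the client sends one byte every `interval`
 * seconds, the server never answers.  Returns true if the idle check
 * would have fired before all `nbytes` bytes were sent.
 */
static bool
replay_one_way(long timeout, long interval, int nbytes)
{
	struct session_model s = { 0, 0 };
	long now;
	int sent = 0;

	for (now = 1; sent < nbytes; now++) {
		if (idle_expired(&s, now, timeout))
			return true;
		if (now % interval == 0) {
			s.last_client = now;
			sent++;
		}
	}
	return false;
}
```

With the test's parameters, `replay_one_way(2, 1, 5)`, the check never fires, because the client-side activity keeps refreshing the idle clock even though the server direction is silent.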

So it seems that you encounter some corner case.  I need more
information.

- Do you use http or https?
- Do you use persistent connections?
- Do you use chunked encoding?
- Does it only occur with http or also with plain tcp?
- Does disabling socket splicing help?
- Does it happen when the connect to the server is slow?

While testing I saw that with socket splicing the timeout is handled
twice.  We get a wakeup from the idle splicing and from libevent
timeout.  I think it is sufficient to only use the idle splicing
if it is available.

Does this diff help?

bluhm

Index: relay.c
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/usr.sbin/relayd/relay.c,v
retrieving revision 1.237
diff -u -p -r1.237 relay.c
--- relay.c	27 Dec 2017 15:53:30 -0000	1.237
+++ relay.c	4 Jan 2018 22:44:20 -0000
@@ -733,16 +733,21 @@ relay_connected(int fd, short sig, void 
 	if ((rlay->rl_conf.flags & F_TLSCLIENT) && (out->tls != NULL))
 		relay_tls_connected(out);
 
-	bufferevent_settimeout(bev,
-	    rlay->rl_conf.timeout.tv_sec, rlay->rl_conf.timeout.tv_sec);
 	bufferevent_setwatermark(bev, EV_WRITE,
 		RELAY_MIN_PREFETCHED * proto->tcpbufsiz, 0);
 	bufferevent_enable(bev, EV_READ|EV_WRITE);
 	if (con->se_in.bev)
 		bufferevent_enable(con->se_in.bev, EV_READ);
 
-	if (relay_splice(&con->se_out) == -1)
+	switch (relay_splice(&con->se_out)) {
+	case 0:
+		bufferevent_settimeout(bev,
+		    rlay->rl_conf.timeout.tv_sec, rlay->rl_conf.timeout.tv_sec);
+		break;
+	case -1:
 		relay_close(con, strerror(errno));
+		break;
+	}
 }
 
 void
@@ -784,14 +789,19 @@ relay_input(struct rsession *con)
 	if ((rlay->rl_conf.flags & F_TLS) && con->se_in.tls != NULL)
 		relay_tls_connected(&con->se_in);
 
-	bufferevent_settimeout(con->se_in.bev,
-	    rlay->rl_conf.timeout.tv_sec, rlay->rl_conf.timeout.tv_sec);
 	bufferevent_setwatermark(con->se_in.bev, EV_WRITE,
 		RELAY_MIN_PREFETCHED * proto->tcpbufsiz, 0);
 	bufferevent_enable(con->se_in.bev, EV_READ|EV_WRITE);
 
-	if (relay_splice(&con->se_in) == -1)
+	switch (relay_splice(&con->se_in)) {
+	case 0:
+		bufferevent_settimeout(con->se_in.bev,
+		    rlay->rl_conf.timeout.tv_sec, rlay->rl_conf.timeout.tv_sec);
+		break;
+	case -1:
 		relay_close(con, strerror(errno));
+		break;
+	}
 }
 
 void



Re: relayd and PUT

2017-12-15 Thread Rivo Nurges
Hi!

On Wed, 2017-12-13 at 07:42 +0100, Claudio Jeker wrote:
> I have seen something similar and came to the conclusion that the
> timeout
> handling of relayd is not correct. As long as traffic is flowing the
> timeout should be reset (at least that is what every other
> implementation
> does). This is not really happening in relayd. I have seen this on
> GET
> requests that are huge (timeout hits in the middle of the transmission
> and
> kills the session).
> 
> Because of this I think the diff is a workaround and does not solve
> the
> real underlying problem.

My next try. It schedules a timer event to bump the bufferevent
timeouts. One possibility is to schedule it at a fixed interval, e.g.
once every second, but I chose to schedule the event at half of the
remaining bufferevent timeout.
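
The "half of the remaining timeout" rule can be sketched in isolation like this. It is a simplified illustration, not the patch itself: `next_check_delay` is a hypothetical helper, it works on whole seconds only, and the clamp for an already expired timeout is an added assumption.

```c
#include <sys/time.h>

/*
 * Hypothetical helper: given the configured idle timeout and the
 * number of seconds since traffic was last seen, return the delay
 * until the next timer run.  The delay is half of the remaining
 * timeout; when that rounds down to zero (or the timeout has already
 * passed), fall back to half a second so the timer never busy-loops.
 */
static struct timeval
next_check_delay(long timeout_sec, long idle_sec)
{
	struct timeval tv;

	tv.tv_sec = (timeout_sec - idle_sec) / 2;
	tv.tv_usec = 0;
	if (tv.tv_sec <= 0) {
		tv.tv_sec = 0;
		tv.tv_usec = 500000;
	}
	return tv;
}
```

Compared with a fixed one-second tick, this fires rarely while the session is fresh and more often as the deadline approaches.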

Rivo

Index: usr.sbin/relayd/relay.c
===================================================================
RCS file: /cvs/src/usr.sbin/relayd/relay.c,v
retrieving revision 1.236
diff -u -p -r1.236 relay.c
--- usr.sbin/relayd/relay.c	28 Nov 2017 01:51:47 -0000	1.236
+++ usr.sbin/relayd/relay.c	16 Dec 2017 00:36:31 -0000
@@ -69,6 +69,7 @@ int		 relay_socket_connect(struct sockad
 
 void		 relay_accept(int, short, void *);
 void		 relay_input(struct rsession *);
+void		 relay_timeout(int, short, void *);
 
 void		 relay_hash_addr(SIPHASH_CTX *, struct sockaddr_storage *, int);
 
@@ -662,6 +663,7 @@ relay_connected(int fd, short sig, void 
 	struct bufferevent	*bev;
 	struct ctl_relay_event	*out = &con->se_out;
 	socklen_t		 len;
+	struct timeval		 tv;
 	int			 error;
 
 	if (sig == EV_TIMEOUT) {
@@ -724,6 +726,14 @@ relay_connected(int fd, short sig, void 
 
 	bufferevent_settimeout(bev,
 	    rlay->rl_conf.timeout.tv_sec, rlay->rl_conf.timeout.tv_sec);
+
+	evtimer_set(&con->se_ev, relay_timeout, con);
+	timerclear(&tv);
+	tv.tv_sec = rlay->rl_conf.timeout.tv_sec / 2;
+	if (tv.tv_sec == 0)
+		tv.tv_usec = 500000;
+	evtimer_add(&con->se_ev, &tv);
+
 	bufferevent_setwatermark(bev, EV_WRITE,
 		RELAY_MIN_PREFETCHED * proto->tcpbufsiz, 0);
 	bufferevent_enable(bev, EV_READ|EV_WRITE);
@@ -955,6 +965,38 @@ relay_spliceadjust(struct ctl_relay_even
 		cre->splicelen = -1;
 
 	return (0);
+}
+
+void
+relay_timeout(int fd, short event, void *arg)
+{
+	struct rsession		*con = arg;
+	struct ctl_relay_event	*cre = &con->se_out;
+	struct relay		*rlay = con->se_relay;
+	struct timeval		 tv, tv_now;
+	time_t			 timeout;
+
+	timerclear(&tv);
+	getmonotime(&tv_now);
+	timersub(&tv_now, &con->se_tv_last, &tv);
+	timeout = rlay->rl_conf.timeout.tv_sec - tv.tv_sec;
+
+	DPRINTF("%s: session %d: last %llds ago, timeout %llds", __func__,
+	    con->se_id, tv.tv_sec, timeout);
+
+	if (timeout > 0){
+		bufferevent_settimeout(cre->bev, timeout, timeout);
+		bufferevent_settimeout(con->se_in.bev, timeout, timeout);
+	}
+
+	evtimer_set(&con->se_ev, relay_timeout, con);
+	tv.tv_sec = (rlay->rl_conf.timeout.tv_sec - tv.tv_sec) / 2;
+	if (tv.tv_sec == 0)
+		tv.tv_usec = 500000;
+	evtimer_add(&con->se_ev, &tv);
+
+	DPRINTF("%s: session %d: next after %llds", __func__, con->se_id,
+	    tv.tv_sec);
 }
 
 void

Re: relayd and PUT

2017-12-13 Thread Rivo Nurges
Hi!

Thanks for enlightening me, I’ll try to fix the real problem.

Rivo


Re: relayd and PUT

2017-12-12 Thread Claudio Jeker
On Wed, Dec 13, 2017 at 12:25:39AM +0000, Rivo Nurges wrote:
> Hi!
> 
> If you http PUT a "big" file through relayd, server<>relay read side
> will eventually get an EVBUFFER_TIMEOUT. Nothing comes back from the
> server until the PUT is done. I disabled server read timeouts for PUT
> requests.
> 
> While trying to fix the issue I managed to trigger another problem. For
> HTTP relays we open relay<>server connection only after the first
> request is completely read from the client. If http PUT is the
> first request and is big enough we will run out of memory and
> eventually out of swap. To avoid the issue I will open relay<>server
> connection earlier and let relayd start sending the stuff to the
> server.
> 
> And another one I don't know how to fix. If relayd fills all memory and
> swap with buffers, the kernel enters an infinite loop. relayd is in
> flt_noram state and pagedaemon constantly tries to free something
> without any luck. userland scheduling halts. bgp loses its peers but
> carp still happily sends its hellos...
> 

I have seen something similar and came to the conclusion that the timeout
handling of relayd is not correct. As long as traffic is flowing the
timeout should be reset (at least that is what every other implementation
does). This is not really happening in relayd. I have seen this on GET
requests that are huge (timeout hits in the middle of the transmission and
kills the session).

Because of this I think the diff is a workaround and does not solve the
real underlying problem.


-- 
:wq Claudio



Re: relayd and PUT

2017-12-12 Thread Rivo Nurges
Hi!

Without text mangling this time...

Rivo

Index: usr.sbin/relayd/relay.c
===================================================================
RCS file: /cvs/src/usr.sbin/relayd/relay.c,v
retrieving revision 1.236
diff -u -p -r1.236 relay.c
--- usr.sbin/relayd/relay.c	28 Nov 2017 01:51:47 -0000	1.236
+++ usr.sbin/relayd/relay.c	13 Dec 2017 00:05:33 -0000
@@ -723,7 +723,8 @@ relay_connected(int fd, short sig, void 
 		relay_tls_connected(out);
 
 	bufferevent_settimeout(bev,
-	    rlay->rl_conf.timeout.tv_sec, rlay->rl_conf.timeout.tv_sec);
+	    con->se_out.writeonly ? 0 : rlay->rl_conf.timeout.tv_sec,
+	    rlay->rl_conf.timeout.tv_sec);
 	bufferevent_setwatermark(bev, EV_WRITE,
 		RELAY_MIN_PREFETCHED * proto->tcpbufsiz, 0);
 	bufferevent_enable(bev, EV_READ|EV_WRITE);
Index: usr.sbin/relayd/relay_http.c
===================================================================
RCS file: /cvs/src/usr.sbin/relayd/relay_http.c,v
retrieving revision 1.70
diff -u -p -r1.70 relay_http.c
--- usr.sbin/relayd/relay_http.c	27 Nov 2017 16:25:50 -0000	1.70
+++ usr.sbin/relayd/relay_http.c	13 Dec 2017 00:05:33 -0000
@@ -439,6 +439,10 @@ relay_read_http(struct bufferevent *bev,
 		case HTTP_METHOD_OPTIONS:
 		case HTTP_METHOD_POST:
 		case HTTP_METHOD_PUT:
+			con->se_out.writeonly = 1;
+			if(cre->dst->state == STATE_CONNECTED)
+				bufferevent_settimeout(bev,
+				    0, rlay->rl_conf.timeout.tv_sec);
 		case HTTP_METHOD_RESPONSE:
 		/* WebDAV methods */
 		case HTTP_METHOD_PROPFIND:
@@ -569,6 +573,9 @@ relay_read_httpcontent(struct buffereven
 				goto fail;
 			cre->toread -= size;
 		}
+		if (cre->dst->writeonly && cre->dst->state != STATE_CONNECTED)
+			if (relay_connect(con) == -1)
+				goto fail;
 		DPRINTF("%s: done, size %lu, to read %lld", __func__,
 		    size, cre->toread);
 	}
Index: usr.sbin/relayd/relayd.h
===================================================================
RCS file: /cvs/src/usr.sbin/relayd/relayd.h,v
retrieving revision 1.248
diff -u -p -r1.248 relayd.h
--- usr.sbin/relayd/relayd.h	28 Nov 2017 18:25:53 -0000	1.248
+++ usr.sbin/relayd/relayd.h	13 Dec 2017 00:05:33 -0000
@@ -218,6 +218,7 @@ struct ctl_relay_event {
 	int			 line;
 	int			 done;
 	int			 timedout;
+	int			 writeonly;
 	enum relay_state	 state;
 	enum direction		 dir;
 


relayd and PUT

2017-12-12 Thread Rivo Nurges
Hi!

If you http PUT a "big" file through relayd, server<>relay read side
will eventually get an EVBUFFER_TIMEOUT. Nothing comes back from the
server until the PUT is done. I disabled server read timeouts for PUT
requests.

While trying to fix the issue I managed to trigger another problem. For
HTTP relays we open relay<>server connection only after the first
request is completely read from the client. If http PUT is the
first request and is big enough we will run out of memory and
eventually out of swap. To avoid the issue I will open relay<>server
connection earlier and let relayd start sending the stuff to the
server.

And another one I don't know how to fix. If relayd fills all memory and
swap with buffers, the kernel enters an infinite loop. relayd is in
flt_noram state and pagedaemon constantly tries to free something
without any luck. userland scheduling halts. bgp loses its peers but
carp still happily sends its hellos...

Rivo

Index: usr.sbin/relayd/relay.c
===================================================================
RCS file: /cvs/src/usr.sbin/relayd/relay.c,v
retrieving revision 1.236
diff -u -p -r1.236 relay.c
--- usr.sbin/relayd/relay.c	28 Nov 2017 01:51:47 -0000	1.
236
+++ usr.sbin/relayd/relay.c	13 Dec 2017 00:05:33 -0000
@@ -723,7 +723,8 @@ relay_connected(int fd, short sig, void 
 		relay_tls_connected(out);
 
 	bufferevent_settimeout(bev,
-	    rlay->rl_conf.timeout.tv_sec, rlay-
>rl_conf.timeout.tv_sec);
+	    con->se_out.writeonly ? 0 : rlay->rl_conf.timeout.tv_sec,
+	    rlay->rl_conf.timeout.tv_sec);
 	bufferevent_setwatermark(bev, EV_WRITE,
 		RELAY_MIN_PREFETCHED * proto->tcpbufsiz, 0);
 	bufferevent_enable(bev, EV_READ|EV_WRITE);
Index: usr.sbin/relayd/relay_http.c
===================================================================
RCS file: /cvs/src/usr.sbin/relayd/relay_http.c,v
retrieving revision 1.70
diff -u -p -r1.70 relay_http.c
--- usr.sbin/relayd/relay_http.c	27 Nov 2017 16:25:50 -0000	
1.70
+++ usr.sbin/relayd/relay_http.c	13 Dec 2017 00:05:33 -0000
@@ -439,6 +439,10 @@ relay_read_http(struct bufferevent *bev,
 		case HTTP_METHOD_OPTIONS:
 		case HTTP_METHOD_POST:
 		case HTTP_METHOD_PUT:
+			con->se_out.writeonly = 1;
+			if(cre->dst->state == STATE_CONNECTED)
+				bufferevent_settimeout(bev,
+				    0, rlay->rl_conf.timeout.tv_sec);
 		case HTTP_METHOD_RESPONSE:
 		/* WebDAV methods */
 		case HTTP_METHOD_PROPFIND:
@@ -569,6 +573,9 @@ relay_read_httpcontent(struct buffereven
 				goto fail;
 			cre->toread -= size;
 		}
+		if (cre->dst->writeonly && cre->dst->state !=
STATE_CONNECTED)
+			if (relay_connect(con) == -1)
+				goto fail;
 		DPRINTF("%s: done, size %lu, to read %lld", __func__,
 		    size, cre->toread);
 	}
Index: usr.sbin/relayd/relayd.h
===================================================================
RCS file: /cvs/src/usr.sbin/relayd/relayd.h,v
retrieving revision 1.248
diff -u -p -r1.248 relayd.h
--- usr.sbin/relayd/relayd.h	28 Nov 2017 18:25:53 -0000	1
.248
+++ usr.sbin/relayd/relayd.h	13 Dec 2017 00:05:33 -0000
@@ -218,6 +218,7 @@ struct ctl_relay_event {
 	int			 line;
 	int			 done;
 	int			 timedout;
+	int			 writeonly;
 	enum relay_state	 state;
 	enum direction		 dir;