Hi!

On Fri, 2018-01-05 at 00:12 +0100, Alexander Bluhm wrote:
> I have committed more regression tests that check the timeout with
> unidirectional traffic flow.  I could not find an error.  In theory
> when we have an idle timeout in one direction, relayd checks whether
> there is traffic flowing in the other direction.  The tests set the
> timeout to 2 seconds and send 5 bytes while sleeping one second
> between each byte.  The timeout does not trigger.
> So it seems that you encounter some corner case.  I need more
> information.

Yes, it's a bit harder to trigger. First, relayd currently opens the
server connection only after the first client request finishes. If the
first request is a long PUT, relayd buffers the PUT and the problem
doesn't appear. But that triggers another problem: depending on the
request size, you run out of memory. I have another patch for this. It
opens the relay>server socket earlier and makes the timeout problem
even easier to trigger. I will send it separately.

So to get the server connection opened and the timeout to happen, you
need to make some small GET (or whatever) request and keep the
connection open. In my test case I use "GET /; PUT /largefile".
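
For reference, a rough Python sketch of that test sequence. This is not
the client I actually use. The function names, the Host value, the chunk
size and the assumption that the server replies with a Content-Length
body (no chunked encoding) are all made up for illustration.

```python
import socket


def build_requests(host, put_size):
    """Build the small GET and the headers of the large PUT."""
    get = f"GET / HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")
    put = (
        f"PUT /largefile HTTP/1.1\r\nHost: {host}\r\n"
        f"Content-Length: {put_size}\r\n\r\n"
    ).encode("ascii")
    return get, put


def read_response(sock):
    """Read one response; assumes a Content-Length body, not chunked."""
    buf = b""
    while b"\r\n\r\n" not in buf:
        data = sock.recv(4096)
        if not data:
            raise ConnectionError("connection closed before headers ended")
        buf += data
    head, _, body = buf.partition(b"\r\n\r\n")
    length = 0
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    while len(body) < length:
        data = sock.recv(4096)
        if not data:
            raise ConnectionError("connection closed mid-body")
        body += data
    return head, body


def reproduce(host, port, put_size, chunk=65536):
    """GET / first, then PUT /largefile on the same persistent connection."""
    get, put = build_requests(host, put_size)
    with socket.create_connection((host, port)) as s:
        # The small request makes relayd open the server connection.
        s.sendall(get)
        read_response(s)
        # From here on traffic flows only from client to server.
        s.sendall(put)
        block = b"x" * chunk
        sent = 0
        while sent < put_size:
            n = min(chunk, put_size - sent)
            s.sendall(block[:n])
            sent += n
        return read_response(s)
```

Calling reproduce() against the relay address with a put_size large
enough that the upload outlasts the configured timeout should give the
same traffic pattern as my test case.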

> - Do you use http or https?

both have the problem

> - Do you use persistent connections?

yes

> - Do you use chunked encoding?

no

> - Does it only occur with http or also with plain tcp?

only http

> - Does disabling socket splicing help?

The problem happens when the libevent code is in use, that is, when
splicing is either disabled or not available (https).

> - Does it happen when the connect to the server is slow?
> 
> While testing I saw that with socket splicing the timeout is handled
> twice.  We get a wakeup from the idle splicing and from libevent
> timeout.  I think it is sufficient to only use the idle splicing
> if it is available.

I noticed it too, but it doesn't seem to make things worse.

> Does this diff help?

This diff doesn't change things.

Rivo
