On 01 May 2018 22:05, Dave Watson wrote:
It is reported that in some cases, write_space may be called in
do_tcp_sendpages, such that we recursively invoke do_tcp_sendpages again:

[  660.468802]  ? do_tcp_sendpages+0x8d/0x580
[  660.468826]  ? tls_push_sg+0x74/0x130 [tls]
[  660.468852]  ? tls_push_record+0x24a/0x390 [tls]
[  660.468880]  ? tls_write_space+0x6a/0x80 [tls]
...

tls_push_sg already does a loop over all sending sg's, so ignore
any tls_write_space notifications until we are done sending.
We then have to call the previous write_space to wake up
poll() waiters after we are done with the send loop.

Reported-by: Andre Tomt <an...@tomt.net>
Signed-off-by: Dave Watson <davejwat...@fb.com>

Unfortunately, this patch seems to have a bug: while it fixed the kernel crash, it causes some connections in my testbed to stall.

Making sure ctx->in_tcp_sendpages is also cleared before the return ret within the while(1) loop seems to fix it for me.


diff -Naurp a/net/tls/tls_main.c b/net/tls/tls_main.c
--- a/net/tls/tls_main.c        2018-05-06 02:23:41.638597066 +0200
+++ b/net/tls/tls_main.c        2018-05-06 01:59:14.378568139 +0200
@@ -135,6 +135,7 @@ retry:
                        offset -= sg->offset;
                        ctx->partially_sent_offset = offset;
                        ctx->partially_sent_record = (void *)sg;
+                       ctx->in_tcp_sendpages = false;
                        return ret;
                }
