Todd Lipcon created KUDU-2758:
---------------------------------

             Summary: TLS socket writes in 16kb chunks with intervening epoll/setsockopt syscalls
                 Key: KUDU-2758
                 URL: https://issues.apache.org/jira/browse/KUDU-2758
             Project: Kudu
          Issue Type: Bug
          Components: perf, rpc, security
            Reporter: Todd Lipcon


I noticed that krpc has the following syscall pattern:

{code}
 rpc reactor-231 23122 [002] 35488410.994309: syscalls:sys_enter_epoll_wait: epfd: 0x00000007, events: 0x02137520, maxevents: 0x00000040, timeout: 0x00000050
 rpc reactor-231 23122 [002] 35488410.994310: syscalls:sys_exit_epoll_wait: 0x1
 rpc reactor-231 23122 [002] 35488410.994313: syscalls:sys_enter_setsockopt: fd: 0x00000011, level: 0x00000006, optname: 0x00000003, optval: 0x7fc80910175c, optlen: 0x00000004
 rpc reactor-231 23122 [002] 35488410.994314: syscalls:sys_exit_setsockopt: 0x0
 rpc reactor-231 23122 [002] 35488410.994351: syscalls:sys_enter_write: fd: 0x00000011, buf: 0x7fc7e8059e93, count: 0x0000401d
 rpc reactor-231 23122 [002] 35488410.994370: syscalls:sys_exit_write: 0x401d
 rpc reactor-231 23122 [002] 35488410.994372: syscalls:sys_enter_setsockopt: fd: 0x00000011, level: 0x00000006, optname: 0x00000003, optval: 0x7fc80910175c, optlen: 0x00000004
 rpc reactor-231 23122 [002] 35488410.994378: syscalls:sys_exit_setsockopt: 0x0
{code}

This block of syscalls repeats in a pretty tight loop -- epoll_wait, CORK, 
write, UNCORK. The writes are always 0x401d (16413) bytes, just over 16 KiB. I 
found the following in the SSL_write(3) manpage:
{quote}
SSL_write() will only return with success, when the complete contents of buf of 
length num has been written. This default behaviour can be changed with the 
SSL_MODE_ENABLE_PARTIAL_WRITE option of SSL_CTX_set_mode(3). When this flag is 
set, SSL_write() will also return with success, when a partial write has been 
successfully completed. In this case the SSL_write() operation is considered 
completed. The bytes are sent and a new SSL_write() operation with a new buffer 
(with the already sent bytes removed) must be started. A partial write is 
performed with the size of a message block, which is 16kB for SSLv3/TLSv1.
{quote}
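
For reference, the raw numbers in the trace decode cleanly on Linux: level 0x6 is IPPROTO_TCP, optname 0x3 is TCP_CORK, and 0x401d is 16413 bytes, which is 29 bytes more than the 16384-byte TLS record payload limit. A minimal sketch of the cork toggle is below. SetCork and the k-prefixed constants are hypothetical names for illustration, not Kudu code, and reading the extra 29 bytes as per-record framing overhead is an assumption.

```cpp
#include <netinet/in.h>   // IPPROTO_TCP
#include <netinet/tcp.h>  // TCP_CORK (Linux-specific)
#include <sys/socket.h>   // setsockopt

// The trace shows level 0x6 and optname 0x3 on every setsockopt call.
static_assert(IPPROTO_TCP == 6, "level 0x6 in the trace");
static_assert(TCP_CORK == 3, "optname 0x3 in the trace");

// Size of each write() in the trace versus the TLS record payload limit.
constexpr int kTracedWriteSize = 0x401d;     // 16413 bytes
constexpr int kMaxTlsRecordPayload = 16384;  // 2^14 bytes
// Assumption: the remainder is TLS record framing (header plus cipher overhead).
constexpr int kPerRecordOverhead = kTracedWriteSize - kMaxTlsRecordPayload;

// Hypothetical helper: turns TCP_CORK on or off for a socket.
// Returns 0 on success, -1 with errno set on failure (same as setsockopt).
int SetCork(int fd, bool enabled) {
  int flag = enabled ? 1 : 0;  // 4-byte optval, matching optlen 0x4 in the trace
  return setsockopt(fd, IPPROTO_TCP, TCP_CORK, &flag, sizeof(flag));
}
```

In the traced pattern, one SetCork(fd, true) and one SetCork(fd, false) bracket every single 16 KiB record.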

It seems likely that we should loop over the writes before uncorking, 
continuing until we either hit a temporary socket error (e.g. EAGAIN) or run 
out of data to write.
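
A rough sketch of that loop, under stated assumptions: the cork toggle and the TLS write are passed in as hypothetical callbacks (CorkedWriterHooks, WriteAllCorked and WriteResult are illustrative names, not krpc's actual API), and write_some is assumed to return the number of bytes accepted, 0 when the socket would block, or a negative value on a hard error.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical hooks standing in for the real socket/TLS layer.
struct CorkedWriterHooks {
  // Enables (true) or disables (false) corking, e.g. via setsockopt(TCP_CORK).
  std::function<void(bool)> set_cork;
  // Attempts one write. Returns >0 for bytes accepted, 0 if the socket
  // would block, <0 on a hard error.
  std::function<std::ptrdiff_t(const char*, std::size_t)> write_some;
};

struct WriteResult {
  std::size_t bytes_written;
  bool hard_error;
};

// Corks once, writes until the buffer is drained or the socket pushes back,
// then uncorks once. The traced pattern instead corks and uncorks per chunk.
WriteResult WriteAllCorked(const CorkedWriterHooks& hooks,
                           const char* buf, std::size_t len) {
  assert(hooks.set_cork && hooks.write_some);
  WriteResult result{0, false};
  hooks.set_cork(true);
  while (result.bytes_written < len) {
    std::ptrdiff_t n = hooks.write_some(buf + result.bytes_written,
                                        len - result.bytes_written);
    if (n == 0) break;  // would block: return to the reactor and wait
    if (n < 0) {        // hard error: stop and report it
      result.hard_error = true;
      break;
    }
    result.bytes_written += static_cast<std::size_t>(n);
  }
  hooks.set_cork(false);  // always uncork so buffered data gets flushed
  return result;
}
```

With OpenSSL, write_some would presumably wrap SSL_write() with SSL_MODE_ENABLE_PARTIAL_WRITE set and map an SSL_get_error() result of SSL_ERROR_WANT_WRITE to a return value of 0. The caller would resume from bytes_written on the next writable event.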



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
