kpumuk opened a new pull request, #3317:
URL: https://github.com/apache/thrift/pull/3317

   <!-- Explain the changes in the pull request below: -->
     
   This PR fixes a TLS hang/timeout in Ruby Thrift with NonblockingServer + 
SSLServerSocket + FramedTransport (when the client socket has a non-nil 
timeout).
   
   The timed `Thrift::Socket#read` path calls `IO.select` and then 
`readpartial`. With `OpenSSL::SSL::SSLSocket`, `readpartial` can decrypt a full 
TLS record but return only the requested bytes, leaving the rest in the SSL 
internal buffer. The next `IO.select` waits on the raw fd, which cannot see 
those buffered SSL bytes, so the framed payload read can block until the 
timeout expires. This shows up especially with framed reads (4-byte frame size 
first, then frame body).
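
To illustrate why framed reads are affected, here is a minimal sketch of the two-step read, assuming the 4-byte big-endian length prefix used by the framed transport. `encode_frame` and `read_frame` are hypothetical helper names, and a `StringIO` stands in for the socket; this is not the actual `Thrift::FramedTransport` code.

```ruby
require 'stringio'

# Hypothetical helper: build one frame as a 4-byte big-endian length
# prefix followed by the payload bytes.
def encode_frame(payload)
  [payload.bytesize].pack('N') + payload.b
end

# Hypothetical helper: read one frame with two separate reads.
# On a TLS socket both pieces can arrive in a single TLS record, so the
# first (4-byte) read may already pull the body into the SSL buffer.
def read_frame(io)
  header = io.read(4)          # first read: frame size only
  size = header.unpack1('N')
  io.read(size)                # second read: frame body
end

io = StringIO.new(encode_frame('ping'))
puts read_frame(io)
```

If the second read first waits for the raw fd to become readable, it can stall even though the body has already been decrypted and buffered.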
   
   The solution is to switch timed `read`/`write` to `read_nonblock` / 
`write_nonblock` and only call `IO.select` after `IO::WaitReadable` / 
`IO::WaitWritable` is raised. This keeps timeouts intact and makes reads/writes 
SSL-buffer-aware. Note that on SSL sockets a read can raise `IO::WaitWritable` 
and a write can raise `IO::WaitReadable`, so both cases need to be handled.
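
The read side of this approach can be sketched roughly as follows. This is a simplified illustration, not the code from this PR: `read_with_deadline` and `ReadTimeout` are hypothetical names, and a pipe stands in for the SSL socket.

```ruby
# Hypothetical error class for this sketch only.
ReadTimeout = Class.new(StandardError)

# Hypothetical helper: try the read first, and only wait on the fd when
# the IO object reports that it would block.
def read_with_deadline(io, size, timeout)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  begin
    # Bytes already buffered above the fd (e.g. by an SSL layer) are
    # returned here without any IO.select call.
    io.read_nonblock(size)
  rescue IO::WaitReadable
    remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
    if remaining <= 0 || IO.select([io], nil, nil, remaining).nil?
      raise ReadTimeout, 'read timed out'
    end
    retry
  rescue IO::WaitWritable
    # An SSL socket may need to write before it can make progress on a read.
    remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
    if remaining <= 0 || IO.select(nil, [io], nil, remaining).nil?
      raise ReadTimeout, 'read timed out'
    end
    retry
  end
end

reader, writer = IO.pipe
writer.write('hello')
puts read_with_deadline(reader, 5, 1.0)
```

The deadline is computed once, outside the `begin` block, so each `retry` waits only for the time that is left rather than restarting the full timeout.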
   
   Additionally, timeout tracking now uses a monotonic clock to avoid drift or 
wall-clock adjustments affecting deadline checks.
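
A deadline based on the monotonic clock could look like the sketch below. The `Deadline` class is a hypothetical illustration and not part of Thrift; it relies only on `Process.clock_gettime` with `Process::CLOCK_MONOTONIC`.

```ruby
# Hypothetical helper class: tracks a deadline on the monotonic clock,
# which is not affected by changes to the system wall-clock time.
class Deadline
  def initialize(timeout)
    @expires_at = now + timeout
  end

  # Seconds left before the deadline, never negative.
  def remaining
    [@expires_at - now, 0.0].max
  end

  def expired?
    remaining <= 0
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

deadline = Deadline.new(0.5)
puts deadline.expired?
```

The `remaining` value can be passed directly as the timeout argument of `IO.select`, so repeated waits share one overall deadline.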
   
   ### With timeout specified
   
   ```
   $ THRIFT_TLS=true ruby benchmark/benchmark.rb
   Starting server...
   Spawning benchmark processes...
   Collecting output...
   #<Thread:0x0000ffff95aa1e10 
/thrift/lib/rb/lib/thrift/server/nonblocking_server.rb:122 run> terminated with 
exception (report_on_exception is true):
   /usr/local/lib/ruby/3.4.0/openssl/buffering.rb:80:in 
'OpenSSL::SSL::SSLSocket#sysread': SSL_read: unexpected eof while reading 
(OpenSSL::SSL::SSLError)
           from /usr/local/lib/ruby/3.4.0/openssl/buffering.rb:80:in 
'OpenSSL::Buffering#fill_rbuff'
           from /usr/local/lib/ruby/3.4.0/openssl/buffering.rb:339:in 
'OpenSSL::Buffering#eof?'
           from /thrift/lib/rb/lib/thrift/server/nonblocking_server.rb:156:in 
'block (2 levels) in Thrift::NonblockingServer::IOManager#run'
           from /thrift/lib/rb/lib/thrift/server/nonblocking_server.rb:154:in 
'Array#each'
           from /thrift/lib/rb/lib/thrift/server/nonblocking_server.rb:154:in 
'block in Thrift::NonblockingServer::IOManager#run'
           from <internal:kernel>:168:in 'Kernel#loop'
           from /thrift/lib/rb/lib/thrift/server/nonblocking_server.rb:149:in 
'Thrift::NonblockingServer::IOManager#run'
           from /thrift/lib/rb/lib/thrift/server/nonblocking_server.rb:124:in 
'block in Thrift::NonblockingServer::IOManager#spawn'
   Translating output...
   Analyzing output...
   
   Server class:        Thrift::NonblockingServer
   Server interpreter:  ruby
   Client interpreter:  ruby
   Protocol type:       binary
   Socket class:        Thrift::SSLSocket
   Number of processes: 40
   Clients per process: 5
   Calls per client:    50
   Using fastthread:    no
   
   Connection failures:                0
   Connection errors:                  200
   Average time per call:              NaN seconds
   Average time per client (50 calls): NaN seconds
   Total time for all calls:           0.0000 seconds
   Real time for benchmarking:         26.6815 seconds
   Shortest call time:                 0.0000 seconds
   Longest call time:                  0.0000 seconds
   Shortest client time (50 calls):    0.0000 seconds
   Longest client time (50 calls):     0.0000 seconds
   ```
   
   ### Before
   
   ```
   $ THRIFT_TLS=true ruby benchmark/benchmark.rb
   Starting server...
   Spawning benchmark processes...
   Collecting output...
   Translating output...
   Analyzing output...
   
   Server class:        Thrift::NonblockingServer
   Server interpreter:  ruby
   Client interpreter:  ruby
   Protocol type:       binary
   Socket class:        Thrift::SSLSocket
   Number of processes: 40
   Clients per process: 5
   Calls per client:    50
   Using fastthread:    no
   
   Connection failures:                0
   Connection errors:                  0
   Average time per call:              0.0037 seconds
   Average time per client (50 calls): 0.2267 seconds
   Total time for all calls:           36.5155 seconds
   Real time for benchmarking:         2.9521 seconds
   Shortest call time:                 0.0001 seconds
   Longest call time:                  0.0339 seconds
   Shortest client time (50 calls):    0.0653 seconds
   Longest client time (50 calls):     0.3258 seconds
   ```
   
   ### After
   
   ```
   $ THRIFT_TLS=true ruby benchmark/benchmark.rb
   Starting server...
   Spawning benchmark processes...
   Collecting output...
   Translating output...
   Analyzing output...
   
   Server class:        Thrift::NonblockingServer
   Server interpreter:  ruby
   Client interpreter:  ruby
   Protocol type:       binary
   Socket class:        Thrift::SSLSocket
   Number of processes: 40
   Clients per process: 5
   Calls per client:    50
   Using fastthread:    no
   
   Connection failures:                0
   Connection errors:                  0
   Average time per call:              0.0034 seconds
   Average time per client (50 calls): 0.2003 seconds
   Total time for all calls:           34.1452 seconds
   Real time for benchmarking:         2.8205 seconds
   Shortest call time:                 0.0001 seconds
   Longest call time:                  0.0280 seconds
   Shortest client time (50 calls):    0.0561 seconds
   Longest client time (50 calls):     0.3026 seconds
   ```
   
   <!-- We recommend you review the checklist/tips before submitting a pull 
request. -->
   
   - [ ] Did you create an [Apache 
Jira](https://issues.apache.org/jira/projects/THRIFT/issues/) ticket?  
([Request account here](https://selfserve.apache.org/jira-account.html), not 
required for trivial changes)
   - [ ] If a ticket exists: Does your pull request title follow the pattern 
"THRIFT-NNNN: describe my issue"?
   - [x] Did you squash your changes to a single commit?  (not required, but 
preferred)
   - [x] Did you do your best to avoid breaking changes?  If one was needed, 
did you label the Jira ticket with "Breaking-Change"?
   - [ ] If your change does not involve any code, include `[skip ci]` anywhere 
in the commit message to free up build resources.
   
   <!--
     The Contributing Guide at:
     https://github.com/apache/thrift/blob/master/CONTRIBUTING.md
     has more details and tips for committing properly.
   -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
