kpumuk opened a new pull request, #3317:
URL: https://github.com/apache/thrift/pull/3317
This PR fixes a TLS hang/timeout in Ruby Thrift with NonblockingServer +
SSLServerSocket + FramedTransport (when the client socket has a non-nil
timeout).
The timed `Thrift::Socket#read` path does `IO.select` and then
`readpartial`. With `OpenSSL::SSL::SSLSocket`, `readpartial` can decrypt a full
TLS record but return only the requested bytes, leaving
the rest in the SSL internal buffer. The next `IO.select` waits on the raw
fd, which cannot see those buffered SSL bytes, so the framed payload read can
block until timeout. This shows up especially with framed reads (4-byte frame
size first, then frame body).
The fix switches the timed `read`/`write` paths to `read_nonblock` /
`write_nonblock` and calls `IO.select` only after `IO::WaitReadable` /
`IO::WaitWritable` is raised. This keeps timeouts intact and makes reads and
writes aware of the SSL buffer. Note that on SSL sockets a read can raise
`IO::WaitWritable` and a write can raise `IO::WaitReadable`, so both cases must
be handled on both paths.
Additionally, timeout tracking now uses a monotonic clock, so wall-clock
adjustments and clock drift no longer affect deadline checks.
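A minimal sketch of the read pattern described above, for illustration only. The names `read_with_deadline`, `wait_until_ready`, and `ReadTimeout` are hypothetical and are not the identifiers used in this PR; the sketch assumes any IO-like object that supports `read_nonblock`, which includes `OpenSSL::SSL::SSLSocket`.

```ruby
# Hypothetical error class for this sketch (not part of Thrift).
class ReadTimeout < StandardError; end

# Block until +io+ is ready in the given direction or the deadline passes.
# The deadline is a CLOCK_MONOTONIC timestamp, so wall-clock changes do not
# shorten or extend the wait.
def wait_until_ready(io, direction, deadline)
  remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
  raise ReadTimeout, "deadline exceeded" if remaining <= 0

  ready = if direction == :read
            IO.select([io], nil, nil, remaining)
          else
            IO.select(nil, [io], nil, remaining)
          end
  raise ReadTimeout, "deadline exceeded" unless ready
end

# Read exactly +size+ bytes within +timeout+ seconds.
# read_nonblock is tried first, so bytes already buffered inside the socket
# object are returned without waiting on the raw fd. IO.select runs only
# when the socket reports that it would block.
def read_with_deadline(io, size, timeout)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  data = String.new
  while data.bytesize < size
    begin
      data << io.read_nonblock(size - data.bytesize)
    rescue IO::WaitReadable
      wait_until_ready(io, :read, deadline)
    rescue IO::WaitWritable
      # An SSL socket may need to write before it can continue reading.
      wait_until_ready(io, :write, deadline)
    end
  end
  data
end
```

With a framed transport, the 4-byte frame size and the frame body would be fetched by two consecutive `read_with_deadline` calls. A write path would mirror this with `write_nonblock`, rescuing the same two exceptions.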
### With timeout specified
```
$ THRIFT_TLS=true ruby benchmark/benchmark.rb
Starting server...
Spawning benchmark processes...
Collecting output...
#<Thread:0x0000ffff95aa1e10
/thrift/lib/rb/lib/thrift/server/nonblocking_server.rb:122 run> terminated with
exception (report_on_exception is true):
/usr/local/lib/ruby/3.4.0/openssl/buffering.rb:80:in
'OpenSSL::SSL::SSLSocket#sysread': SSL_read: unexpected eof while reading
(OpenSSL::SSL::SSLError)
from /usr/local/lib/ruby/3.4.0/openssl/buffering.rb:80:in
'OpenSSL::Buffering#fill_rbuff'
from /usr/local/lib/ruby/3.4.0/openssl/buffering.rb:339:in
'OpenSSL::Buffering#eof?'
from /thrift/lib/rb/lib/thrift/server/nonblocking_server.rb:156:in
'block (2 levels) in Thrift::NonblockingServer::IOManager#run'
from /thrift/lib/rb/lib/thrift/server/nonblocking_server.rb:154:in
'Array#each'
from /thrift/lib/rb/lib/thrift/server/nonblocking_server.rb:154:in
'block in Thrift::NonblockingServer::IOManager#run'
from <internal:kernel>:168:in 'Kernel#loop'
from /thrift/lib/rb/lib/thrift/server/nonblocking_server.rb:149:in
'Thrift::NonblockingServer::IOManager#run'
from /thrift/lib/rb/lib/thrift/server/nonblocking_server.rb:124:in
'block in Thrift::NonblockingServer::IOManager#spawn'
Translating output...
Analyzing output...
Server class: Thrift::NonblockingServer
Server interpreter: ruby
Client interpreter: ruby
Protocol type: binary
Socket class: Thrift::SSLSocket
Number of processes: 40
Clients per process: 5
Calls per client: 50
Using fastthread: no
Connection failures: 0
Connection errors: 200
Average time per call: NaN seconds
Average time per client (50 calls): NaN seconds
Total time for all calls: 0.0000 seconds
Real time for benchmarking: 26.6815 seconds
Shortest call time: 0.0000 seconds
Longest call time: 0.0000 seconds
Shortest client time (50 calls): 0.0000 seconds
Longest client time (50 calls): 0.0000 seconds
```
### Before
```
$ THRIFT_TLS=true ruby benchmark/benchmark.rb
Starting server...
Spawning benchmark processes...
Collecting output...
Translating output...
Analyzing output...
Server class: Thrift::NonblockingServer
Server interpreter: ruby
Client interpreter: ruby
Protocol type: binary
Socket class: Thrift::SSLSocket
Number of processes: 40
Clients per process: 5
Calls per client: 50
Using fastthread: no
Connection failures: 0
Connection errors: 0
Average time per call: 0.0037 seconds
Average time per client (50 calls): 0.2267 seconds
Total time for all calls: 36.5155 seconds
Real time for benchmarking: 2.9521 seconds
Shortest call time: 0.0001 seconds
Longest call time: 0.0339 seconds
Shortest client time (50 calls): 0.0653 seconds
Longest client time (50 calls): 0.3258 seconds
```
### After
```
$ THRIFT_TLS=true ruby benchmark/benchmark.rb
Starting server...
Spawning benchmark processes...
Collecting output...
Translating output...
Analyzing output...
Server class: Thrift::NonblockingServer
Server interpreter: ruby
Client interpreter: ruby
Protocol type: binary
Socket class: Thrift::SSLSocket
Number of processes: 40
Clients per process: 5
Calls per client: 50
Using fastthread: no
Connection failures: 0
Connection errors: 0
Average time per call: 0.0034 seconds
Average time per client (50 calls): 0.2003 seconds
Total time for all calls: 34.1452 seconds
Real time for benchmarking: 2.8205 seconds
Shortest call time: 0.0001 seconds
Longest call time: 0.0280 seconds
Shortest client time (50 calls): 0.0561 seconds
Longest client time (50 calls): 0.3026 seconds
```
- [ ] Did you create an [Apache
Jira](https://issues.apache.org/jira/projects/THRIFT/issues/) ticket?
([Request account here](https://selfserve.apache.org/jira-account.html), not
required for trivial changes)
- [ ] If a ticket exists: Does your pull request title follow the pattern
"THRIFT-NNNN: describe my issue"?
- [x] Did you squash your changes to a single commit? (not required, but
preferred)
- [x] Did you do your best to avoid breaking changes? If one was needed,
did you label the Jira ticket with "Breaking-Change"?
- [ ] If your change does not involve any code, include `[skip ci]` anywhere
in the commit message to free up build resources.