On 27.12.2018 20:36, Ben Pfaff wrote:
> On Wed, Dec 26, 2018 at 06:23:56PM +0300, Ilya Maximets wrote:
>> On some systems in case where remote is not responding, socket could
>> remain in SYN_SENT state for a really long time without errors waiting
>> for connection. This leads to situations where vconn connection hangs
>> for a few minutes waiting for connection to the DOWN remote.
>>
>> For example, this situation emulated by "refuse-connection" vconn
>> testcase. This leads to test failures because Alarm signal arrives much
>> faster than ETIMEDOUT from the socket:
>>
>> ./vconn.at:21: ovstest test-vconn refuse-connection tcp
>> Alarm clock
>> stderr:
>> |socket_util|INFO|0:127.0.0.1: listening on port 63812
>> |poll_loop|DBG|wakeup due to 0-ms timeout
>> |poll_loop|DBG|wakeup due to 10155-ms timeout
>> |fatal_signal|WARN|terminating with signal 14 (Alarm clock)
>> ./vconn.at:21: exit code was 142, expected 0
>> vconn.at:21: 535. tcp vconn - refuse connection (vconn.at:21): FAILED
>>
>> This patch allowes to specify timeout value for vconn blocking
>> connections. If the connection takes more time, socket will be closed
>> with ETIMEDOUT error code. Negative value could be used to wait
>> infinitely.
>>
>> Signed-off-by: Ilya Maximets <[email protected]>
>
> Same comments as patch 2.
>
> Are the timeouts only useful for the test cases? I wonder whether just
> calling alarm(10); at the beginning of the test programs would be just
> as helpful. On the other hand, it would make using a debugger on those
> programs harder.
I guess we already have alarms in all the test programs. The issue here
is that some test apps, like 'test_refuse_connection', treat a connection
failure as success. But on some systems, a connection to a wrong remote
hangs for a really long time and the alarm kills the test application.
In this case we can't say for sure whether the test failed or not, i.e.
whether it was the expected connection failure or some other random
issue that forced the application to hang.

The stream connection tests are even worse, because they try to
sequentially establish a connection to one of 3 different remotes while
only one of them is correct. The test will never try to connect to the
correct one if the blocking connection to a wrong port hangs for a few
minutes. It will simply be killed by the alarm.
