On Fri, 11 Feb 2022 18:04:35 GMT, Xue-Lei Andrew Fan <xue...@openjdk.org> wrote:
>> Hi @XueleiFan, >> Not quite sure What could be wrong with setting SO_TIMEOUT during the >> socketClose() operation. We still close the socket and SO_TIMEOUT will be >> restored just after skip() operation is completed. These changes could >> affect the parallel read of the handshake records (we hold appInput.readLock >> during the skip) but we still close the socket and interrupt connection in >> this case. >> If you still think these changes are incorrect, What do you think about the >> most first version of the patch: >> https://github.com/openjdk/jdk/pull/7432/commits/d1c2a4fe23916ff0f1c38150bc0ea1c9c38fe39b >> This version adds a new lock in the SSLSocketInputRecord and protects the >> parallel execution of the read and skip operations. > > If I understand the issue correctly, the problem is about that the skip > method cannot get the right length if the input stream could be accessed in > another thread. > > 1. get the available input stream bytes length; > 2. a third thread read the input stream, and the input stream get changed; > 3. skip up to the bytes length in step 1. > 4. the skip() method hangs on waiting for more bytes. > > I think we could focus on on address the synchronization problem, because the > problem could result in other weird behaviors we don't know yet. > > For the SO_TIMEOUT, I think it is a good workaround. But I'm not sure if it > will impact other behaviors or not. Besides, I know you are trying to use a > small timeout, but it is still blocked and it not easy to find a number fit > all situation that does not impact the performance. > > As I commented in the 1st version, I'm not sure of the locks logic in the > patch, which changes the behavior of SSLSocket. > > I may suggest to an input stream level synchronization. I know the update > could take a while, and may not be able to integrate in time. If you want > to fix the issue as soon as possible, I'm OK to move ahead with your current > direction, but please set the SO_TIMEOUT to as minimal as possible, for > example 1 ms; and have a comment that this is a temporary/workaround solution. Thank you again for your detailed response and comments. Your assumption about this issue is right and I think SO_TIMEOUT should be an acceptable solution. I've changed DEFAULT_SKIP_TIMEOUT to 1 ms and added comments about a temporary workaround. If you don't mind, I'd like to commit it asap because this patch should be backported to the early versions. No regressions were found on the sun/security/ssl tests. ------------- PR: https://git.openjdk.java.net/jdk/pull/7432