fishy commented on pull request #2293:
URL: https://github.com/apache/thrift/pull/2293#issuecomment-745709232


   OK I cannot reproduce this issue with go1.15.6 locally. Looking through 
travis log we were using go 1.10.8 to run that test, and I do can reproduce 
this issue with go1.10.8 almost 100% locally.
   
   My current understanding is that in the test we are relying on canceled 
context to make the http request fail: 
https://github.com/apache/thrift/blob/e872b350ab0cfd2402340b8ca82b02dc299db69e/lib/go/thrift/http_client_test.go#L129
   
   This works fine in go1.15.6. In go1.10.8 although it did make the `Flush` 
call next line fail, it doesn't really cause the write loop of the http client 
to stop correctly. So after you reset the `requestBuffer` inside `Flush`, the 
write loop of the http client still tries to read from `requestBuffer` in order 
to write the request, and that caused both data race and the test failure.
   
   I think the big, underlying problem is that when making the http request 
inside `Flush`, we passed the `requestBuffer`  to the http request 
(https://github.com/apache/thrift/blob/e872b350ab0cfd2402340b8ca82b02dc299db69e/lib/go/thrift/http_client.go#L208).
 This doesn't make a copy of the buffer anywhere, so the ownership of the 
buffer becomes muddy: `THttpClient` thinks it owns the data and can call its 
`Reset`, while the http request also thinks it owns the data and it's the only 
one reading/writing from it.
   
   I think the proper fix would be that we need to make a copy of the buffer 
before sending it to http request. It will hurt performance, though. @dcelasun 
what do you think?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to