phxnsharp commented on pull request #2395:
URL: https://github.com/apache/thrift/pull/2395#issuecomment-848701698


   > What I am missing is a test case that clearly shows the issue you are 
reporting.
   
   Note that I am sensitive to this issue due to a [previous 
experience](https://github.com/jaegertracing/jaeger-client-csharp/issues/181) I 
had with a similar problem in an unrelated tool. Because the thread pool is a 
shared resource with complex scheduling logic that depends on the hardware it 
runs on, this type of bug tends to surface intermittently, and usually far away 
from its actual cause. It took me forever to debug that problem, and since then 
I have always been on the lookout for improper long-lived tasks on the thread 
pool.
   
   I found this bug through some test cases for our own code. In that case I 
stand up mocks of both the in-process and out-of-process sides of a Thrift 
communication channel inside the test and exercise the channel itself. It was 
failing intermittently, and when I investigated I found these long-lived 
threads sitting in the thread pool.
   
   Note that in this case my tests are not useful to you, because they test our 
code one layer up. While debugging, I also found that 
NamedPipeClientStream.ConnectAsync is implemented incorrectly (I have not 
submitted that issue to Microsoft yet). Putting in a workaround for that one (I 
run the synchronous Connect() call on a task created with 
TaskCreationOptions.LongRunning) mostly, but not totally, fixes this specific 
unit-test problem for us. Note that there is no way to put a similar workaround 
in for TThreadPoolAsyncServer (short of creating a cut-and-paste clone of it), 
because the affected method is private and internal.
   
   I openly admit that I have no proof that these long-lived tasks are causing 
active problems. I still maintain that if they do not now, they will someday, 
in a subtle and difficult-to-debug way.
   
   The way I would expect to reproduce this is to start a TThreadPoolAsyncServer 
and then open many connections to it without closing the previous ones. My 
expectation is that the first few would connect quickly, but once the thread 
pool was exhausted, each subsequent connection would take on the order of a 
second while the starved pool waits to inject a new thread.
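   The underlying failure mode can also be shown without Thrift at all. A 
minimal sketch, assuming default pool settings (the exact delay will vary by 
runtime version and hardware): saturate the pool with blocking work, the way 
improper long-lived tasks do, and then time how long a trivial work item waits 
for the pool to inject a new thread:

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

class PoolStarvationDemo
{
    static void Main()
    {
        ThreadPool.GetMinThreads(out int workers, out int _);
        var block = new ManualResetEventSlim(false);

        // Occupy every readily available pool thread with blocking work,
        // the way a long-lived accept loop parked on the pool would.
        for (int i = 0; i < workers; i++)
            Task.Run(() => block.Wait());

        // A trivial work item now has to wait for the pool's (slow) thread
        // injection before it can run at all.
        var sw = Stopwatch.StartNew();
        Task.Run(() => { }).Wait();
        sw.Stop();

        Console.WriteLine($"Trivial task waited {sw.ElapsedMilliseconds} ms");
        block.Set(); // release the blocked tasks
    }
}
```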
   
   Note that the exact behavior will likely depend on whether you are using 
.NET Framework or .NET Core, on the specific version, and on the hardware you 
are running on. It is entirely possible that Microsoft has changed, or will 
change, how the thread pool deals with this situation. All the more reason, in 
my mind, to watch out for it.
   
   Sorry for the diatribe! Thank you for the thoughtful and timely dialog, as 
well as for the great project! We use Thrift for a lot of things these days, 
and it has been a valuable tool for us!
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

