chenBright opened a new pull request, #1814: URL: https://github.com/apache/incubator-brpc/pull/1814
Fix https://github.com/apache/incubator-brpc/issues/658.

Coredump scenario: after the upstream Controller::IssueRPC has selected a node, but before Socket::Write reaches KEEPWRITE_IN_BACKGROUND, the downstream exits. The upstream reads EOF in OnNewMessages, calls SetFailed on the Socket, and starts the HealthCheckThread. At this point the socket's reference count is 3 (Controller::IssueRPC, HealthCheckThread, SocketMap), and HealthCheckThread waits in WaitAndReset for the reference count to drop to 2 (the expected holders being HealthCheckThread and SocketMap). Then the NamingService removes the downstream node and deletes the corresponding socket from SocketMap, which decrements the reference count to 2. The two remaining holders are now HealthCheckThread and Controller::IssueRPC rather than the expected pair. HealthCheckThread, inside WaitAndReset, sets _ssl_state to SSL_UNKNOWN. Finally, the upstream Controller::IssueRPC reaches Socket::KeepWrite. Although the client has not enabled SSL, HealthCheckThread has set _ssl_state to SSL_UNKNOWN, so the program runs into the SSL write logic (null pointer), which ultimately causes the coredump.

Solution: add a flag that indicates whether the health check thread should stop. When SocketMap removes a socket, it first sets the flag to true and then releases its reference. After the HealthCheckThread's wait returns with the reference count at 2, it checks whether the flag is true. If so, it returns immediately and the health check thread exits. Otherwise, it continues with the health check.
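The ordering described in the solution (set the flag first, then release the reference, and check the flag once the wait returns) can be illustrated with a minimal sketch. This is not the code in this PR: SocketSketch, RemoveFromSocketMap, HealthCheckAfterWait, and the field names are hypothetical stand-ins for the relevant parts of Socket, SocketMap, and HealthCheckThread, and the wait itself is omitted.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the parts of a socket that matter here.
struct SocketSketch {
    std::atomic<int> nref{3};                    // IssueRPC + HealthCheckThread + SocketMap
    std::atomic<bool> stop_health_check{false};  // hypothetical flag name
    bool ssl_state_reset = false;                // models "_ssl_state was reset"
};

// What SocketMap would do when removing the socket:
// publish the stop flag BEFORE dropping its reference, so that any
// thread observing the decremented count also observes the flag.
void RemoveFromSocketMap(SocketSketch& s) {
    assert(s.nref.load() > 0);
    s.stop_health_check.store(true, std::memory_order_release);
    s.nref.fetch_sub(1, std::memory_order_acq_rel);
}

// What the health check thread would do once its wait for the expected
// reference count has returned. Returns true if the health check goes on
// (and the socket state is reset), false if the thread should exit
// without touching the socket.
bool HealthCheckAfterWait(SocketSketch& s) {
    if (s.stop_health_check.load(std::memory_order_acquire)) {
        return false;  // socket was removed from the map: leave its state alone
    }
    s.ssl_state_reset = true;  // safe: no concurrent writer is expected here
    return true;
}
```

With this ordering, a reference count of 2 that was reached because SocketMap dropped its reference is distinguishable from the expected case, so the state reset is skipped while Controller::IssueRPC may still be using the socket.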
