WellaceZ opened a new issue #1564: Provider停机过程中,consumer的新请求报The request is rejected和Failed to send request, local:not connected问题 URL: https://github.com/apache/servicecomb-java-chassis/issues/1564 【场景描述】: 服务A作为Provider,服务B作为Comsumer,对Provider其进行停机,则ServiceComb会等待现有线程(存在耗时任务)处理完毕。 停机过程: (1)Provider首先向注册中心发起了Unregister请求,而且Comsumer也收到的注册中心的Unregister通知。 (2)然后Provider等待现有线程(存在耗时任务)处理完毕,但是这个等待过程中,Comsumer的新请求却还是可以调用到Provider的接口,会报The request is rejected. Cannot process the request due to STATUS = STOPPING错误。 查看Provider的cse_run日志,打印了: > 2020-02-07 19:19:11.030 [vert.x-eventloop-thread-0] WARN - [SCBEngine.java:ensureStatusUp:332] - [] - The request is rejected. Cannot process the request due to STATUS = STOPPING 【问题1】Comsumer本地不是已经Unregister了正在停机的Provider实例了吗?为什么请求还是可以发送到那个Provider实例中去呢? (3)Provider现有线程处理完毕,进程停止。Comsumer的新请求在调用Provider接口时将会报 2020-02-07 19:19:29.246 [vert.x-eventloop-thread-1] ERROR - [RestClientInvocation.java:lambda$invoke$0:110] - [] - Failed to send request, local:not connected, remote:/10.52.183.58:8080. io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: no further information: /10.52.183.58:8080 ...... 【问题2】这个错误是什么原因导致的?为何Consumer报的错误不是java.lang.IllegalStateException: Probably invoke a service before it is registered, or no instance found for it, appId=**, name=**呢? 【Provider端的停机日志】: 2020-02-07 19:19:06.405 [Thread-60] INFO - [SCBEngine.java:destroy:274] - [] - ServiceComb is closing now... 2020-02-07 19:19:06.405 [Thread-60] INFO - [SCBEngine.java:safeTriggerEvent:165] - [] - BootListener com.huawei.paas.monitor.MonitorBootListener succeed to process BEFORE_CLOSE. 2020-02-07 19:19:06.406 [Thread-60] INFO - [SCBEngine.java:safeTriggerEvent:165] - [] - BootListener org.apache.servicecomb.core.provider.producer.ProducerProviderManager succeed to process BEFORE_CLOSE. 2020-02-07 19:19:06.410 [Thread-60] INFO - [SCBEngine.java:safeTriggerEvent:165] - [] - BootListener org.apache.servicecomb.metrics.core.MetricsBootListener succeed to process BEFORE_CLOSE. 2020-02-07 19:19:06.410 [Thread-60] INFO - [SCBEngine.java:safeTriggerEvent:165] - [] - BootListener org.apache.servicecomb.common.rest.RestEngineSchemaListener succeed to process BEFORE_CLOSE. 2020-02-07 19:19:06.410 [Thread-60] INFO - [SCBEngine.java:safeTriggerEvent:165] - [] - BootListener org.apache.servicecomb.AuthHandlerBoot succeed to process BEFORE_CLOSE. 2020-02-07 19:19:06.410 [Thread-60] INFO - [RemoteServiceRegistry.java:onShutdown:80] - [] - service center task is shutdown. >>>>Provider首先向注册中心发起了Unregister请求: 2020-02-07 19:19:06.443 [Thread-60] INFO - [AbstractServiceRegistry.java:unregisterInstance:220] - [] - Unregister microservice instance success. microserviceId=72af34d233d206a9a655206af41463d9280c2430 instanceId=45847166499b11eab6edfa163e6793f1 2020-02-07 19:19:06.443 [Thread-60] INFO - [VertxUtils.java:closeVertxByName:193] - [] - Closing vertx registry. 2020-02-07 19:19:06.458 [registry-vert.x-eventloop-thread-2] ERROR - [ServiceRegistryClientImpl.java:watchErrorHandler:761] - [] - watcher connect to service center server failed, microservice 72af34d233d206a9a655206af41463d9280c2430, connection is closed accidentally 2020-02-07 19:19:06.463 [registry-vert.x-eventloop-thread-2] INFO - [ServiceCenterTask.java:onExceptionEvent:97] - [] - read exception event, message is :connection is closed accidentally 2020-02-07 19:19:06.468 [registry-vert.x-eventloop-thread-24] INFO - [VertxUtils.java:lambda$closeVertxByName$2:204] - [] - Success to close vertx registry. >>>>Comsumer本地Unregister之后,还是会有新请求发送给Provider实例进行处理: 2020-02-07 19:19:11.030 [vert.x-eventloop-thread-0] WARN - [SCBEngine.java:ensureStatusUp:332] - [] - The request is rejected. Cannot process the request due to STATUS = STOPPING 2020-02-07 19:19:11.036 [vert.x-eventloop-thread-0] INFO - [SPIServiceUtils.java:loadSortedService:79] - [] - Found SPI service org.apache.servicecomb.swagger.invocation.exception.ExceptionToProducerResponseConverter, count=4. 2020-02-07 19:19:11.036 [vert.x-eventloop-thread-0] INFO - [SPIServiceUtils.java:loadSortedService:82] - [] - 0. org.apache.servicecomb.swagger.invocation.validator.ConstraintViolationExceptionToProducerResponseConverter. 2020-02-07 19:19:11.036 [vert.x-eventloop-thread-0] INFO - [SPIServiceUtils.java:loadSortedService:82] - [] - 1. com.huawei.nuwa.cse.web.springmvc.CommonExceptionConverter. 2020-02-07 19:19:11.036 [vert.x-eventloop-thread-0] INFO - [SPIServiceUtils.java:loadSortedService:82] - [] - 2. org.apache.servicecomb.swagger.invocation.exception.InvocationExceptionToProducerResponseConverter. 2020-02-07 19:19:11.036 [vert.x-eventloop-thread-0] INFO - [SPIServiceUtils.java:loadSortedService:82] - [] - 3. org.apache.servicecomb.swagger.invocation.exception.DefaultExceptionToProducerResponseConverter. 2020-02-07 19:19:20.474 [Thread-60] INFO - [VertxUtils.java:closeVertxByName:193] - [] - Closing vertx config-center. 2020-02-07 19:19:20.474 [Thread-60] INFO - [VertxUtils.java:closeVertxByName:197] - [] - Vertx config-center not exist. 2020-02-07 19:19:20.474 [Thread-60] INFO - [VertxUtils.java:closeVertxByName:193] - [] - Closing vertx transport. 2020-02-07 19:19:20.487 [vert.x-eventloop-thread-24] INFO - [VertxUtils.java:lambda$closeVertxByName$2:204] - [] - Success to close vertx transport. 2020-02-07 19:19:20.488 [Thread-60] INFO - [SCBEngine.java:safeTriggerEvent:165] - [] - BootListener com.huawei.paas.monitor.MonitorBootListener succeed to process AFTER_CLOSE. 2020-02-07 19:19:20.488 [Thread-60] INFO - [SCBEngine.java:safeTriggerEvent:165] - [] - BootListener org.apache.servicecomb.core.provider.producer.ProducerProviderManager succeed to process AFTER_CLOSE. 2020-02-07 19:19:20.488 [Thread-60] INFO - [SCBEngine.java:safeTriggerEvent:165] - [] - BootListener org.apache.servicecomb.metrics.core.MetricsBootListener succeed to process AFTER_CLOSE. 2020-02-07 19:19:20.488 [Thread-60] INFO - [SCBEngine.java:safeTriggerEvent:165] - [] - BootListener org.apache.servicecomb.common.rest.RestEngineSchemaListener succeed to process AFTER_CLOSE. 2020-02-07 19:19:20.488 [Thread-60] INFO - [SCBEngine.java:safeTriggerEvent:165] - [] - BootListener org.apache.servicecomb.AuthHandlerBoot succeed to process AFTER_CLOSE. 2020-02-07 19:19:20.488 [Thread-60] INFO - [SCBEngine.java:destroy:277] - [] - ServiceComb had closed 【Consumer端的运行日志】: 2020-02-07 19:17:38.154 [registry-vert.x-eventloop-thread-2] INFO - [MicroserviceWatchTask.java:onMicroserviceInstanceChanged:85] - [] - microservice Gray/HiCloudDataManagerService_zl/4.0.3 REGISTERED an instance 45847166499b11eab6edfa163e6793f1, [rest://10.52.183.58:8080]. 2020-02-07 19:17:38.186 [Service Center Task [java.util.concurrent.ThreadPoolExecutor$Worker@200aae61[State = -1, empty queue][63]]] INFO - [AbstractServiceRegistry.java:findServiceInstances:261] - [] - find instances[1] from service center success. service=Gray/HiCloudDataManagerService_zl/0.0.0+, old revision=97784746be9a2d00f08bc72fe1ce5ff8c7d07ffe, new revision=ad4dfcbb11c6bc0c6742c0640d7f2bdf9646e1b4 2020-02-07 19:17:38.186 [Service Center Task [java.util.concurrent.ThreadPoolExecutor$Worker@200aae61[State = -1, empty queue][63]]] INFO - [AbstractServiceRegistry.java:findServiceInstances:269] - [] - service id=72af34d233d206a9a655206af41463d9280c2430, instance id=45847166499b11eab6edfa163e6793f1, endpoints=[rest://10.52.183.58:8080] 2020-02-07 19:17:38.186 [Service Center Task [java.util.concurrent.ThreadPoolExecutor$Worker@200aae61[State = -1, empty queue][63]]] INFO - [MicroserviceVersionRule.java:printData:145] - [] - update instances, appId=Gray, microserviceName=HiCloudDataManagerService_zl, versionRule=0.0.0.0+, latestVersion=4.0.3.0, inputVersionCount=1, inputInstanceCount=1 0.instanceId=45847166499b11eab6edfa163e6793f1, status=UP, version=4.0.3.0, endpoints=[rest://10.52.183.58:8080], environment=development, framework.name=servicecomb-java-chassis, framework.version=ServiceComb:1.2.0.B007r7;CSE:2.3.70.4 2020-02-07 19:18:21.957 [registry-vert.x-eventloop-thread-0] WARN - [ServiceRegistryClientImpl.java:lambda$null$7:242] - [] - failed to findInstances: {"errorCode":"400012","errorMessage":"Micro-service does not exist","detail":"Consumer[404f50431e15cfd037c85fe113516928b23ec060][development/Gray/HiCloudPermissionServiceClient_Safe_zl/4.0.3] find provider[development/default/CseMonitoring/latest] failed, provider does not exist."} >>>>Consumer收到注册中心通知,本地也Unregister了Provider的停机实例: 2020-02-07 19:19:09.158 [registry-vert.x-eventloop-thread-2] INFO - [MicroserviceWatchTask.java:onMicroserviceInstanceChanged:93] - [] - microservice Gray/HiCloudDataManagerService_zl/4.0.3 UNREGISTERED an instance 45847166499b11eab6edfa163e6793f1, [rest://10.52.183.58:8080]. 2020-02-07 19:19:09.191 [Service Center Task [java.util.concurrent.ThreadPoolExecutor$Worker@6abeb7d6[State = -1, empty queue][61]]] INFO - [AbstractServiceRegistry.java:findServiceInstances:261] - [] - find instances[0] from service center success. service=Gray/HiCloudDataManagerService_zl/0.0.0+, old revision=ad4dfcbb11c6bc0c6742c0640d7f2bdf9646e1b4, new revision=97784746be9a2d00f08bc72fe1ce5ff8c7d07ffe 2020-02-07 19:19:09.192 [Service Center Task [java.util.concurrent.ThreadPoolExecutor$Worker@6abeb7d6[State = -1, empty queue][61]]] INFO - [MicroserviceVersionRule.java:printData:145] - [] - update instances, appId=Gray, microserviceName=HiCloudDataManagerService_zl, versionRule=0.0.0.0+, latestVersion=4.0.3.0, inputVersionCount=1, inputInstanceCount=1 0.instanceId=45847166499b11eab6edfa163e6793f1, status=UP, version=4.0.3.0, endpoints=[rest://10.52.183.58:8080], environment=development, framework.name=servicecomb-java-chassis, framework.version=ServiceComb:1.2.0.B007r7;CSE:2.3.70. >>>>此时——Provider正在等待现有线程(存在耗时任务)处理过程中,Comsumer的新请求却还是可以调用到Provider的接口,会报The request is rejected. Cannot process the request due to STATUS = STOPPING错误——【问题1】,Comsumer没有日志打印。 >>>>过了一会儿,Provider现有线程处理完毕,进程停止。Comsumer的新请求在调用Provider接口时将会报Failed to send request, local:not connected, remote:/10.52.183.58:8080错误——【问题2】,日志如下: 2020-02-07 19:19:29.246 [vert.x-eventloop-thread-1] ERROR - [RestClientInvocation.java:lambda$invoke$0:110] - [] - Failed to send request, local:not connected, remote:/10.52.183.58:8080. io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: no further information: /10.52.183.58:8080 Caused by: java.net.ConnectException: Connection refused: no further information at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_231] at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_231] at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330) ~[netty-transport-4.1.43.Final.jar!/:4.1.43.Final] at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334) [netty-transport-4.1.43.Final.jar!/:4.1.43.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:688) [netty-transport-4.1.43.Final.jar!/:4.1.43.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:635) [netty-transport-4.1.43.Final.jar!/:4.1.43.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:552) [netty-transport-4.1.43.Final.jar!/:4.1.43.Final] at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514) [netty-transport-4.1.43.Final.jar!/:4.1.43.Final] at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1050) [netty-common-4.1.43.Final.jar!/:4.1.43.Final] at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.43.Final.jar!/:4.1.43.Final] at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.43.Final.jar!/:4.1.43.Final] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_231] 2020-02-07 19:19:50.028 [LoadBalancerStatsTimer] INFO - [ServiceCombLoadBalancerStats.java:run:154] - [] - ping mark server 45847166499b11eab6edfa163e6793f1 failure. 2020-02-07 19:20:10.027 [LoadBalancerStatsTimer] INFO - [ServiceCombLoadBalancerStats.java:run:154] - [] - ping mark server 45847166499b11eab6edfa163e6793f1 failure. 2020-02-07 19:20:30.028 [LoadBalancerStatsTimer] INFO - [ServiceCombLoadBalancerStats.java:run:154] - [] - ping mark server 45847166499b11eab6edfa163e6793f1 failure. 2020-02-07 19:20:50.029 [LoadBalancerStatsTimer] INFO - [ServiceCombLoadBalancerStats.java:run:154] - [] - ping mark server 45847166499b11eab6edfa163e6793f1 failure.
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
