Ax1an commented on pull request #6863:
URL: https://github.com/apache/skywalking/pull/6863#issuecomment-834100555


   **Why does NPE1 happen?**
   
   
![image](https://user-images.githubusercontent.com/28091237/117406098-285e4800-af3f-11eb-91fa-c7589ca6e589.png)
   
   First, note that SkyWalking sets its dynamic fields when enhancing the 
`org.elasticsearch.client.transport.TransportClientNodesService.execute` method.
   
   Second, the exception occurred in the 
`org.apache.skywalking.apm.plugin.elasticsearch.v6.interceptor.AdapterActionFutureActionGetMethodsInterceptor` 
class, which enhances the 
`org.elasticsearch.action.support.AdapterActionFuture.actionGet` method.
   
   Seeing this exception, I first suspected that `isTrace(objInst)` returned 
false in `beforeMethod`, so `createLocalSpan` was never executed, and that the 
instance method `actionGet` of 
`org.elasticsearch.action.support.AdapterActionFuture` then threw an exception 
during its normal execution.
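
To make the suspected failure mode concrete, here is a minimal sketch. It is not the SkyWalking source: `TracingContextSketch` and `ActionGetInterceptorSketch` are hypothetical stand-ins, and the assumption that the exception hook uses the active span without re-checking `isTrace` is mine.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified stand-in for a tracing context; NOT SkyWalking's real ContextManager.
final class TracingContextSketch {
    private final Deque<String> spans = new ArrayDeque<>();

    void createLocalSpan(String operationName) {
        spans.push(operationName);
    }

    // Returns null when no span was ever created.
    String activeSpan() {
        return spans.peek();
    }
}

// Hypothetical interceptor that reuses the hook names from the discussion above.
final class ActionGetInterceptorSketch {
    private final TracingContextSketch context = new TracingContextSketch();

    // Stand-in for isTrace(objInst): true only when the dynamic field was set.
    static boolean isTrace(Object dynamicField) {
        return dynamicField != null;
    }

    void beforeMethod(Object dynamicField) {
        if (isTrace(dynamicField)) {
            context.createLocalSpan("Elasticsearch/actionGet");
        }
    }

    // Assumed behaviour: the exception hook touches the active span without re-checking isTrace.
    int handleMethodException(Throwable cause) {
        return context.activeSpan().length(); // NullPointerException if no span exists
    }

    public static void main(String[] args) {
        ActionGetInterceptorSketch interceptor = new ActionGetInterceptorSketch();
        interceptor.beforeMethod(null); // dynamic field was never set
        try {
            interceptor.handleMethodException(new IllegalStateException("connect failure"));
        } catch (NullPointerException e) {
            System.out.println("NPE: no span was created in beforeMethod");
        }
    }
}
```

In this model the NPE needs two things at once: an untraced call and an exception thrown by the intercepted method.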
   
   
![image](https://user-images.githubusercontent.com/28091237/117406150-4035cc00-af3f-11eb-927a-12929994d5f1.png)
   
   Using Arthas to watch the 
`org.elasticsearch.action.support.AdapterActionFuture.actionGet()` method, I 
captured the following exception.
   
   ```java
   watch org.elasticsearch.action.support.AdapterActionFuture actionGet "{throwExp}" -e -x 2 -n 2
   
   ts=2021-04-30 17:40:18; [cost=5.923075ms] result=@ArrayList[
       ConnectTransportException[[][10.111.233.68:9300] general node connection failure]; nested: IllegalStateException[Received message from unsupported version: [6.3.2] minimal compatible version is: [6.8.0]];
        at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener$1.onFailure(TcpTransport.java:957)
        at org.elasticsearch.transport.TransportHandshaker$HandshakeResponseHandler.handleResponse(TransportHandshaker.java:138)
        at org.elasticsearch.transport.TransportHandshaker$HandshakeResponseHandler.handleResponse(TransportHandshaker.java:115)
        at org.elasticsearch.transport.InboundHandler$1.doRun(InboundHandler.java:224)
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
        at org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutorService.execute(EsExecutors.java:193)
        at org.elasticsearch.transport.InboundHandler.handleResponse(InboundHandler.java:216)
        at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:141)
        at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:105)
        at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:660)
        at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:62)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:359)
        ... 32 more
   ,
   ]
   ```
   
   After asking the colleagues involved, I found that they were calling a 
version 6.3.2 ES cluster with the version 7.2.1 ES SDK.
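
The handshake failure in the log comes down to a version comparison. The sketch below only illustrates that comparison with the two versions from the message; `VersionCheckSketch` and its methods are hypothetical helpers, not Elasticsearch code.

```java
// Illustration only: compares dotted numeric versions such as "6.3.2" and "6.8.0".
final class VersionCheckSketch {

    // Returns a negative, zero, or positive value like Integer.compare.
    static int compare(String left, String right) {
        String[] a = left.split("\\.");
        String[] b = right.split("\\.");
        for (int i = 0; i < Math.max(a.length, b.length); i++) {
            int ai = i < a.length ? Integer.parseInt(a[i]) : 0;
            int bi = i < b.length ? Integer.parseInt(b[i]) : 0;
            if (ai != bi) {
                return Integer.compare(ai, bi);
            }
        }
        return 0;
    }

    // A remote node is accepted only if it is at least the minimal compatible version.
    static boolean isSupported(String remoteVersion, String minimalCompatibleVersion) {
        return compare(remoteVersion, minimalCompatibleVersion) >= 0;
    }

    public static void main(String[] args) {
        // Values taken from the exception message above.
        System.out.println(isSupported("6.3.2", "6.8.0")); // prints false
    }
}
```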
   
   Why does `isTrace(objInst)` return false in the `beforeMethod` method?
   
   I found that the SkyWalking dynamic fields are set when the 
`org.elasticsearch.client.transport.TransportClientNodesService.execute()` 
method is executed.
   
   
![image](https://user-images.githubusercontent.com/28091237/117406197-53489c00-af3f-11eb-8038-7fbe23af0e46.png)
   
   However, the 
`org.elasticsearch.client.transport.TransportClientNodesService.execute()` 
method is not always executed before the 
`org.elasticsearch.action.support.AdapterActionFuture.actionGet()` method.
   
   The NPE occurs in the following execution path:
   
   
![image](https://user-images.githubusercontent.com/28091237/117406251-65c2d580-af3f-11eb-8221-40d479473fe2.png)
   
   `ScheduledNodeSampler` periodically sniffs cluster nodes and does not 
execute the 
`org.elasticsearch.client.transport.TransportClientNodesService.execute()` 
method.
   
   The NPE does not occur in the following execution path:
   
   
![image](https://user-images.githubusercontent.com/28091237/117406295-75dab500-af3f-11eb-9460-62cfdd1dbceb.png)
   
   Looking at the `ActionRequestBuilder` source code, you can see that the 
`org.elasticsearch.action.support.AdapterActionFuture.actionGet()` method is 
executed after the 
`org.elasticsearch.client.transport.TransportClientNodesService.execute()` 
method.
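
The difference between the two paths can be modelled in a few lines. This is a sketch under assumptions: `FutureSketch` and `PathSketch` are hypothetical, and I assume for illustration that the dynamic field ends up on the future object that `actionGet()` is later called on.

```java
// Hypothetical future carrying a stand-in for the SkyWalking dynamic field.
final class FutureSketch {
    Object dynamicField; // null until the enhanced execute() has run
}

final class PathSketch {

    // Stand-in for the enhanced TransportClientNodesService.execute(): sets the field.
    static void execute(FutureSketch future) {
        future.dynamicField = "set-by-execute";
    }

    // Stand-in for isTrace(objInst).
    static boolean isTrace(FutureSketch future) {
        return future.dynamicField != null;
    }

    // Request builder path: execute() runs first, then actionGet() sees the field.
    static boolean requestBuilderPath() {
        FutureSketch future = new FutureSketch();
        execute(future);
        return isTrace(future);
    }

    // Node sampler path: actionGet() is reached without execute() having run.
    static boolean nodeSamplerPath() {
        FutureSketch future = new FutureSketch();
        return isTrace(future);
    }

    public static void main(String[] args) {
        System.out.println("request builder path traced: " + requestBuilderPath());
        System.out.println("node sampler path traced: " + nodeSamplerPath());
    }
}
```

Only the sampler path reaches `actionGet()` with `isTrace` false, which matches the observation that the NPE appears on that path alone.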
   
   So this exception has nothing to do with SkyWalking. It is caused by 
developers operating ES with an incompatible version of the SDK.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

