Tian Jiang created IOTDB-5883:
---------------------------------

             Summary: Refactor redirection and dispatching target
                 Key: IOTDB-5883
                 URL: https://issues.apache.org/jira/browse/IOTDB-5883
             Project: Apache IoTDB
          Issue Type: Improvement
          Components: Core/Cluster
            Reporter: Tian Jiang
            Assignee: Tian Jiang
             Fix For: master branch
         Attachments: image-2023-05-16-12-22-29-563.png, 
image-2023-05-16-12-42-50-397.png

The current redirection mechanism has the following issues:

1. A redirection produced by a lower level (e.g., the consensus layer) is 
overwritten by QueryExecution. Even if the TSStatus from the lower level is 
already REDIRECTION_RECOMMEND, QueryExecution still recalculates the 
redirection. Worse, the recalculated redirection may point to the wrong node 
(see the second issue for an explanation), even though the client may already 
be sending requests to the right node.
 !image-2023-05-16-12-22-29-563.png|thumbnail! 

2. The dispatching target and redirection target can be stale. For each 
FragmentInstance, both targets are derived from the PartitionCache: the very 
first node in the associated ReplicaSet is chosen as the dispatching target 
and the redirection target. 
However, because the PartitionCache is not updated after a leadership change, 
the first node in a ReplicaSet may not be the leader/primary/master node.
As a result, the FragmentInstance may be dispatched or redirected to a 
non-leader node, which incurs a further redirection.

Solutions:

1. QueryExecution will calculate the redirection only when the TSStatus from 
the lower level is REDIRECTION_RECOMMEND and does not include a redirection 
node. Such a situation should be rare, since most REDIRECTION_RECOMMEND 
statuses returned by the lower level include a redirection node.

2. In each ReplicaSet, an optional preferred location is recorded. When the 
preferred location is set, it will be chosen as the dispatching target and 
redirection target.
When REDIRECTION_RECOMMEND is returned from the lower level and a redirection 
node is included, the preferred location of the ReplicaSet will be updated to 
that node.
Furthermore, if the node that generates the MPP plan is in the ReplicaSet, the 
FragmentInstance will not be dispatched to another node. This is because the 
consensus layer is more likely than the PartitionCache to know who the current 
leader is. Consequently, a consensus-layer redirection is more accurate than 
an MPP-level redirection.
 !image-2023-05-16-12-42-50-397.png|thumbnail! 
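
The two solutions above can be sketched roughly as follows. This is a minimal, 
illustrative sketch and not the actual IoTDB code: the class RedirectionSketch, 
the simplified ReplicaSet and Status types, the use of plain strings for nodes, 
and the method names dispatchTarget, redirectionTarget and resolveRedirection 
are all hypothetical stand-ins chosen for this example. Ignoring a preferred 
location that is not a member of the ReplicaSet is likewise an assumption.

```java
import java.util.List;
import java.util.Optional;

final class RedirectionSketch {

  /** Hypothetical replica set: ordered nodes plus an optional preferred location. */
  static final class ReplicaSet {
    private final List<String> nodes;
    private String preferred; // null until a lower-level redirection names a node

    ReplicaSet(List<String> nodes) {
      if (nodes.isEmpty()) {
        throw new IllegalArgumentException("a replica set needs at least one node");
      }
      this.nodes = List.copyOf(nodes);
    }

    /** Assumption: a node outside the replica set is ignored. */
    void setPreferred(String node) {
      if (nodes.contains(node)) {
        preferred = node;
      }
    }

    /** Preferred location if set, otherwise the first node (the old behavior). */
    String redirectionTarget() {
      return preferred != null ? preferred : nodes.get(0);
    }

    /** Stay on the planning node if it is a replica; otherwise use the redirection target. */
    String dispatchTarget(String localNode) {
      return nodes.contains(localNode) ? localNode : redirectionTarget();
    }
  }

  /** Hypothetical stand-in for a TSStatus: a redirect flag and an optional redirect node. */
  static final class Status {
    final boolean redirectionRecommended;
    final Optional<String> redirectNode;

    Status(boolean redirectionRecommended, Optional<String> redirectNode) {
      this.redirectionRecommended = redirectionRecommended;
      this.redirectNode = redirectNode;
    }
  }

  /**
   * Solution 1: keep the lower-level redirection when it names a node, and
   * recalculate only when REDIRECTION_RECOMMEND arrives without one.
   * Solution 2: remember the named node as the preferred location.
   */
  static Optional<String> resolveRedirection(Status lower, ReplicaSet replicaSet) {
    if (!lower.redirectionRecommended) {
      return Optional.empty(); // nothing to redirect
    }
    if (lower.redirectNode.isPresent()) {
      replicaSet.setPreferred(lower.redirectNode.get());
      return lower.redirectNode; // do not overwrite the lower-level decision
    }
    return Optional.of(replicaSet.redirectionTarget()); // rare fallback
  }

  public static void main(String[] args) {
    ReplicaSet rs = new ReplicaSet(List.of("node-1", "node-2", "node-3"));
    // Before any redirection is observed, a non-member planner dispatches to the first node.
    System.out.println(rs.dispatchTarget("node-9"));
    // A lower-level redirection that names node-3 is kept and remembered.
    System.out.println(resolveRedirection(new Status(true, Optional.of("node-3")), rs));
    // Later dispatches from a non-member planner now use the preferred location.
    System.out.println(rs.dispatchTarget("node-9"));
  }
}
```

In this sketch the preferred location lives on the ReplicaSet itself, so a 
single lower-level redirection corrects all later dispatching and redirection 
decisions for that ReplicaSet without waiting for the PartitionCache to be 
refreshed.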



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
