[ https://issues.apache.org/jira/browse/CLOUDSTACK-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koushik Das updated CLOUDSTACK-4944:
------------------------------------

    Fix Version/s:     (was: 4.3.0)
                   Future

> Command sequence logic in agent code may lead to errors in clustered MS setup
> -----------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-4944
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4944
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public (Anyone can view this level - this is the default.)
>          Components: Management Server
>    Affects Versions: pre-4.0.0, 4.1.0, 4.2.0
>            Reporter: Koushik Das
>            Assignee: Koushik Das
>             Fix For: Future
>
>
> I was looking at the command sequencing logic in the agent code 
> (AgentAttache.java). Each agent maintains a sequence number that is 
> initialised using the following logic:
>    private static final Random s_rand = new Random(System.currentTimeMillis());
>    _nextSequence = s_rand.nextInt(Short.MAX_VALUE) << 48;
> For every command processed by the agent, the sequence is incremented by 1. 
> If commands are to be executed in sequence, they are queued up based on 
> this sequence number:
>    protected synchronized void addRequest(Request req) {
>        int index = findRequest(req);
>        assert (index < 0) : "How can we get index again? " + index + ":" + req.toString();
>        _requests.add(-index - 1, req);
>    }
> The above works fine with a single MS. In a clustered MS setup things 
> change slightly.
> A command can originate at any MS and, based on the ownership of the agent, 
> it is forwarded to the owning MS, which then handles the command. Command 
> sequences, however, are local to the individual agents on each MS, so the 
> originating MS agent tags the request with its own sequence. The request is 
> forwarded to the owning MS and, if the 'executeInSequence' flag is set, it 
> is added to the list at the position given by that sequence number.
> Here lies the problem: commands are inserted by sequence number, not in the 
> order in which they arrive, and the sequence of a forwarded command is 
> different from the local sequence. If the starting sequence of forwarded 
> commands is much lower than that of locally generated commands, local 
> commands may be starved by a steady arrival of forwarded commands. The same 
> can happen the other way round. Also, if the starting sequences for an 
> agent on the local and peer MS are not spread far apart, the ranges may 
> overlap and a new request will override the old one.
> I am not sure whether anyone has encountered issues due to this. The 
> correct approach appears to be a FIFO queue rather than an insert based on 
> the sequence number as in the above code.
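
The ordering problem described above can be illustrated with a small
standalone sketch. This is not the CloudStack code: Req, addBySequence and
the sequence bases 5000 and 100 are hypothetical, and it assumes that
findRequest behaves like a binary search on the sequence number, which the
"-index - 1" idiom suggests. It feeds the same four arrivals into a
sequence-ordered list and into a plain FIFO queue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Queue;

public class QueueOrderSketch {

    // Hypothetical stand-in for a request: only a sequence number and a label.
    static final class Req {
        final long seq;
        final String label;

        Req(long seq, String label) {
            this.seq = seq;
            this.label = label;
        }
    }

    // Sequence-ordered insert modelled on the quoted addRequest():
    // binary-search for the sequence, then insert at the insertion point.
    // Throwing on a duplicate sequence is a choice made for this sketch only.
    static void addBySequence(List<Req> requests, Req req) {
        int index = Collections.binarySearch(
                requests, req, Comparator.comparingLong((Req r) -> r.seq));
        if (index >= 0) {
            throw new IllegalStateException("duplicate sequence " + req.seq);
        }
        requests.add(-index - 1, req);
    }

    // Arrival order at the owning MS. Local commands use an assumed base of
    // 5000, commands forwarded from a peer MS use an assumed base of 100.
    static List<Req> arrivals() {
        return Arrays.asList(
                new Req(5000, "local-1"),
                new Req(100, "fwd-1"),
                new Req(5001, "local-2"),
                new Req(101, "fwd-2"));
    }

    static List<String> labels(Iterable<Req> reqs) {
        List<String> out = new ArrayList<>();
        for (Req r : reqs) {
            out.add(r.label);
        }
        return out;
    }

    // Execution order when requests are kept sorted by sequence number.
    public static List<String> sequenceOrder() {
        List<Req> requests = new ArrayList<>();
        for (Req r : arrivals()) {
            addBySequence(requests, r);
        }
        return labels(requests);
    }

    // Execution order when requests are simply appended to a FIFO queue.
    public static List<String> fifoOrder() {
        Queue<Req> queue = new ArrayDeque<>();
        for (Req r : arrivals()) {
            queue.add(r);
        }
        return labels(queue);
    }

    public static void main(String[] args) {
        System.out.println("by sequence: " + sequenceOrder());
        System.out.println("fifo:        " + fifoOrder());
    }
}
```

With the sequence-ordered list, both forwarded requests end up ahead of the
local request that arrived first, so a steady stream of low-sequence
forwarded commands would keep pushing local ones back. The FIFO queue keeps
arrival order regardless of which MS assigned the sequence.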



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
