[
https://issues.apache.org/jira/browse/CLOUDSTACK-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Koushik Das updated CLOUDSTACK-4944:
------------------------------------
Fix Version/s: (was: 4.3.0)
Future
> Command sequence logic in agent code may lead to errors in clustered MS setup
> -----------------------------------------------------------------------------
>
> Key: CLOUDSTACK-4944
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4944
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the
> default.)
> Components: Management Server
> Affects Versions: pre-4.0.0, 4.1.0, 4.2.0
> Reporter: Koushik Das
> Assignee: Koushik Das
> Fix For: Future
>
>
> I was looking at the command sequencing logic in the agent code
> (AgentAttache.java). Each agent maintains a sequence that is initialised
> based on the following logic:
>
>     private static final Random s_rand = new Random(System.currentTimeMillis());
>     _nextSequence = s_rand.nextInt(Short.MAX_VALUE) << 48;
>
> For every command that the agent processes, the sequence is incremented by
> 1. Commands that must be executed in sequence are queued in order of this
> sequence number:
>
>     protected synchronized void addRequest(Request req) {
>         int index = findRequest(req);
>         assert (index < 0) : "How can we get index again? " + index + ":" + req.toString();
>         _requests.add(-index - 1, req);
>     }
>
> The above works fine with a single MS. In a clustered MS setup things change
> slightly.
> A command can originate at any MS and, based on the ownership of the agent,
> is forwarded to the owning MS, which then handles it. Command sequences are
> local to the individual agent objects in each MS, so the originating MS's
> agent tags the request with its own sequence. The request is forwarded to
> the owning MS and, if the 'executeInSequence' flag is set, is added to the
> list at a position determined by that sequence number. Here lies the
> problem: commands are not inserted in the order in which they arrive but in
> order of sequence number, and a forwarded command's sequence is unrelated to
> the local sequence. If the starting sequence of forwarded commands is much
> lower than that of locally generated commands, local commands can be starved
> by a steady arrival of forwarded commands. The same can happen the other way
> round. Also, if the starting sequences for an agent in the local and peer MS
> are not spread far apart, the ranges may overlap and a new request will
> override the old one.
> I am not sure whether anyone has encountered issues due to this. The correct
> approach looks to be a FIFO queue model rather than the sequence-based add
> in the above code.
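
The ordering problem described above can be reproduced with a small
standalone sketch. This is not CloudStack code: the class
SequenceOrderingSketch, the Req type, the helper names and the starting
sequence values are all hypothetical, and Collections.binarySearch stands in
for findRequest on the assumption that it locates an insertion point by
sequence number. The sketch casts to long before shifting so that the
starting values occupy the high bits. It also rejects duplicate sequences
rather than modelling the override case.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;

public class SequenceOrderingSketch {

    /** Hypothetical stand-in for a request: only the sequence and a label matter here. */
    static final class Req implements Comparable<Req> {
        final long seq;
        final String label;

        Req(long seq, String label) {
            this.seq = seq;
            this.label = label;
        }

        @Override
        public int compareTo(Req other) {
            return Long.compare(seq, other.seq);
        }
    }

    /** Sequence-ordered insert: the position depends on the sequence number, not on arrival. */
    static void addBySequence(List<Req> requests, Req req) {
        int index = Collections.binarySearch(requests, req);
        if (index >= 0) {
            // Two management servers picked overlapping sequence ranges.
            throw new IllegalStateException("duplicate sequence " + req.seq);
        }
        requests.add(-index - 1, req);
    }

    /** Execution order when requests are inserted by sequence number. */
    static List<String> sortedOrder(Req[] arrivals) {
        List<Req> requests = new ArrayList<>();
        for (Req r : arrivals) {
            addBySequence(requests, r);
        }
        return labels(requests);
    }

    /** Execution order with a plain FIFO queue: arrival order is preserved. */
    static List<String> fifoOrder(Req[] arrivals) {
        Queue<Req> queue = new ArrayDeque<>();
        for (Req r : arrivals) {
            queue.add(r);
        }
        return labels(queue);
    }

    static List<String> labels(Iterable<Req> reqs) {
        List<String> out = new ArrayList<>();
        for (Req r : reqs) {
            out.add(r.label);
        }
        return out;
    }

    /** Arrival order at the owning MS: local and forwarded commands interleaved. */
    static Req[] sampleArrivals() {
        long localStart = 20L << 48;     // assumed starting sequence on the owning MS
        long forwardedStart = 10L << 48; // assumed (lower) starting sequence on the peer MS
        return new Req[] {
            new Req(localStart, "local-1"),
            new Req(forwardedStart, "fwd-1"),
            new Req(localStart + 1, "local-2"),
            new Req(forwardedStart + 1, "fwd-2"),
        };
    }

    public static void main(String[] args) {
        System.out.println(sortedOrder(sampleArrivals())); // [fwd-1, fwd-2, local-1, local-2]
        System.out.println(fifoOrder(sampleArrivals()));   // [local-1, fwd-1, local-2, fwd-2]
    }
}
```

With the sequence-ordered insert, every forwarded command lands ahead of the
local ones even though local-1 arrived first, which is the starvation pattern
under a steady stream of forwarded commands. The FIFO variant keeps arrival
order regardless of which MS assigned the sequence.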
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)