Aled Sage created BROOKLYN-603:
----------------------------------
Summary: SoftwareProcess.restart(restartMachine: true) fails: keypair does not exist
Key: BROOKLYN-603
URL: https://issues.apache.org/jira/browse/BROOKLYN-603
Project: Brooklyn
Issue Type: Bug
Affects Versions: 1.0.0-M1
Reporter: Aled Sage
For a {{VanillaSoftwareProcess}} entity, calling {{restart(restartMachine: true)}} fails with the error:
{noformat}
2018-09-28T01:13:35.382Z :
{"timeMillis":1538097215363,"thread":"brooklyn-execmanager-bKgJW2xZ-51","level":"DEBUG","loggerName":"org.apache.brooklyn.util.core.task.BasicExecutionManager","message":"Exception running task Task[Cross-context execution: Invoking effector restart on Docker Entity with parameters {restartMachine=true}]@O8Dz2VIj (rethrowing): org.apache.brooklyn.core.mgmt.internal.EffectorUtils$EffectorCallPropagatedRuntimeException: Error invoking restart at VanillaSoftwareProcessImpl{id=d0398d4n58}: Failed to get VM after 3 attempts. - First cause is org.jclouds.rest.ResourceNotFoundException: The key pair 'jclouds#qa-docker-ent-d0398d4n58#4ef' does not exist (listed in primary trace); plus 2 more (e.g. the last is org.jclouds.rest.ResourceNotFoundException: The key pair 'jclouds#qa-docker-ent-d0398d4n58#4ef' does not exist): ResourceNotFoundException: The key pair 'jclouds#qa-docker-ent-d0398d4n58#4ef' does not exist","endOfBatch":false,"loggerFqcn":"org.ops4j.pax.logging.slf4j.Slf4jLogger","threadId":208,"threadPriority":5}
{noformat}
The series of events is:
# start the entity:
## call jclouds to provision the VM
## jclouds creates the keyPair and other incidental resources, and provisions the VM
# the keyPair is deleted (by our 'cloud cleaner')
# restart the entity:
## call jclouds to stop the VM
### jclouds stops the VM, and also attempts to delete incidental resources (e.g. keyPairs)
## call jclouds to create the VM
### jclouds creates a new keyPair
### jclouds decides to use the old keyPair, because that name is still in the {{credentialsMap}}
### VM creation fails: the old keyPair does not exist
At step 3.1.1, in {{org.jclouds.ec2.compute.EC2ComputeService.deleteKeyPair}},
jclouds does not find the keypair (it has already been deleted). It therefore
never calls
{{credentialsMap.remove(new RegionAndName(region, keyPair.getKeyName()))}}
(where {{credentialsMap}} is a {{ConcurrentMap<RegionAndName, KeyPair>}}), so
the name of the non-existent keypair stays in the {{credentialsMap}}.
At step 3.2.2, jclouds leaves the newly created second keyPair behind, unused,
and instead tries to use the stale key name from the map. The code is at
{{org.jclouds.ec2.compute.strategy.CreateKeyPairAndSecurityGroupsAsNeededAndReturnRunOptions.createOrImportKeyPair}}:
{noformat}
// base EC2 driver currently does not support key import
protected String createOrImportKeyPair(String region, String group, TemplateOptions options) {
   RegionAndName regionAndGroup = new RegionAndName(region, group);
   KeyPair keyPair = makeKeyPair.apply(new RegionAndName(region, group));
   // make sure that we don't request multiple keys simultaneously
   // if there is already a keypair for the group specified, use it
   // otherwise create a new keypair and key it under the group and also the regular keyname
   KeyPair origValue = credentialsMap.putIfAbsent(regionAndGroup, keyPair);
   if (origValue != null) {
      return origValue.getKeyName();
   }
{noformat}
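To make the sequence above easier to follow, here is a minimal, self-contained sketch that replays it against a simplified model. It is not jclouds code: the {{cloudKeyPairs}} set, the string-keyed map, the region {{us-east-1}} and the second key name ending in {{#9ab}} are all assumptions made for illustration. Only the {{putIfAbsent}} behaviour and the "remove from the map only if the key pair was found" behaviour are taken from the description above.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Simplified stand-in for the behaviour described in this issue; NOT jclouds code.
class StaleKeyPairDemo {
    // Hypothetical model of the cloud: names of key pairs that currently exist.
    final Set<String> cloudKeyPairs = new HashSet<>();
    // Hypothetical model of credentialsMap: "region/group" -> key pair name.
    final ConcurrentMap<String, String> credentialsMap = new ConcurrentHashMap<>();

    // Like createOrImportKeyPair: always creates a key pair, then prefers a cached one.
    String createOrImportKeyPair(String region, String group, String newKeyName) {
        cloudKeyPairs.add(newKeyName);
        String orig = credentialsMap.putIfAbsent(region + "/" + group, newKeyName);
        return orig != null ? orig : newKeyName;
    }

    // Like the described deleteKeyPair: the map entry is removed only when
    // the key pair was actually found (and deleted) in the cloud.
    void deleteKeyPair(String region, String group) {
        String mapKey = region + "/" + group;
        String cached = credentialsMap.get(mapKey);
        if (cached != null && cloudKeyPairs.remove(cached)) {
            credentialsMap.remove(mapKey);
        }
    }

    // Replays steps 1 to 3; returns true if the key chosen for the new VM exists.
    static boolean restartUsesExistingKey() {
        StaleKeyPairDemo d = new StaleKeyPairDemo();
        String region = "us-east-1"; // assumed region, not from the issue
        String group = "qa-docker-ent-d0398d4n58";
        d.createOrImportKeyPair(region, group, "jclouds#" + group + "#4ef"); // step 1
        d.cloudKeyPairs.clear();        // step 2: the cloud cleaner deletes the key pair
        d.deleteKeyPair(region, group); // step 3.1.1: nothing found, map left untouched
        // step 3.2: a second key pair is created (hypothetical name), but the stale name wins
        String chosen = d.createOrImportKeyPair(region, group, "jclouds#" + group + "#9ab");
        return d.cloudKeyPairs.contains(chosen);
    }

    public static void main(String[] args) {
        System.out.println("restart uses existing key: " + restartUsesExistingKey());
    }
}
```

In this model the restart picks the {{#4ef}} name, which no longer exists, while the freshly created key pair is left behind unused.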
There are a number of improvements we could make:
1. Fix jclouds, so it clears out the non-existent keypair name from the
credentialsMap as part of {{cleanUpIncidentalResources}}.
2. When the entity stops the VM and then creates a new VM, pass in a different
group name (e.g. with an incremented suffix - currently the group name is
generated from the app/entity's name + id).
3. Discourage use of {{restart(restartMachine: true)}}. The VM's IP address
may well change: is the entity really implemented to support this? A user of
this parameter might expect the VM simply to reboot, in which case the current
behaviour is dangerous.
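
As a rough illustration of options 1 and 2, the sketch below uses a simplified model rather than real jclouds or Brooklyn code. The class, the method names {{cleanUpKeyPair}} and {{groupNameFor}}, the string-keyed map and the {{-N}} suffix format are all hypothetical; an actual fix would have to fit into {{cleanUpIncidentalResources}} and Brooklyn's group-name generation respectively.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch of options 1 and 2 against a simplified model; NOT jclouds or Brooklyn code.
class KeyPairFixSketch {
    // Option 1 (hypothetical): forget the cached entry even when the cloud
    // no longer has the key pair, so a later putIfAbsent stores the new one.
    static void cleanUpKeyPair(Set<String> cloudKeyPairs,
                               ConcurrentMap<String, String> credentialsMap,
                               String region, String group) {
        String cached = credentialsMap.remove(region + "/" + group); // unconditional
        if (cached != null) {
            cloudKeyPairs.remove(cached); // harmless no-op if it was already deleted
        }
    }

    // Option 2 (hypothetical): a different group name for each re-provision,
    // so a stale map entry for the previous group is never consulted.
    static String groupNameFor(String baseGroup, int reprovisionCount) {
        return reprovisionCount == 0 ? baseGroup : baseGroup + "-" + reprovisionCount;
    }

    public static void main(String[] args) {
        Set<String> cloud = new HashSet<>(); // key pair already removed by the cloud cleaner
        ConcurrentMap<String, String> map = new ConcurrentHashMap<>();
        String group = "qa-docker-ent-d0398d4n58";
        map.put("us-east-1/" + group, "jclouds#" + group + "#4ef"); // stale entry

        cleanUpKeyPair(cloud, map, "us-east-1", group);
        System.out.println("stale entry cleared: " + map.isEmpty());
        System.out.println("next group name: " + groupNameFor(group, 1));
    }
}
```

Option 1 addresses the cause inside jclouds. Option 2 only avoids the stale entry from the Brooklyn side and would leave the old map entry in place.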