Thanks Evan! We should look into finding a way to reliably reproduce this
behaviour so that we can fix it.

On Thu, Feb 23, 2012 at 1:56 PM, Evan Pollan <[email protected]> wrote:

> Andrei,
>
> I snipped out a portion of the whirr stdout/stderr and just updated
> https://issues.apache.org/jira/browse/WHIRR-488, rather than file a
> jclouds bug.  Seems more reasonable to sort out potentially pathological
> whirr behavior, then file a jclouds defect once that's sorted out.
>
> In any case, here's a good example of what happens:  some nodes are
> successfully started, then there are a whole series of jclouds errors
> complaining about not being able to resolve a compatible AMI (even though
> nodes using the specified AMI have already been created...):
>
> Nodes started: [[id=us-east-1/i-381f245d, providerId=i-381f245d,
>> group=pageviews-cluster, name=null, location=[id=us-east-1c, scope=ZONE,
>> description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA],
>> metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null,
>> family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true,
>> description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml],
>> state=RUNNING, loginPort=22, hostname=ip-10-40-65-170,
>> privateAddresses=[10.40.65.170], publicAddresses=[50.19.189.37],
>> hardware=[id=m1.xlarge, providerId=m1.xlarge, name=null,
>> processors=[[cores=4.0, speed=2.0]], ram=15360, volumes=[[id=null,
>> type=LOCAL, size=10.0, device=/dev/sda1, durable=false,
>> isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb,
>> durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0,
>> device=/dev/sdc, durable=false, isBootDevice=false], [id=null,
>> type=LOCAL, size=420.0, device=/dev/sdd, durable=false,
>> isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde,
>> durable=false, isBootDevice=false]],
>> supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()),
>> tags=[]], loginUser=ubuntu, userMetadata={}, tags=[]]]
>> Dying because - java.net.SocketTimeoutException: Read timed out
>> Dying because - java.net.SocketTimeoutException: Read timed out
>> Starting 6 node(s) with roles [hadoop-datanode, hadoop-tasktracker]
>> Unexpected error while starting 15 nodes, minimum 12 nodes for
>> [hadoop-datanode, hadoop-tasktracker] of cluster pageviews-cluster
>> java.util.concurrent.ExecutionException:
>> java.util.NoSuchElementException: no image matched predicate:
>> And(locationEqualsOrChildOf(us-east-1),And(osFamily(ubuntu),osDescription(ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml),osVersion(10.04),os64Bit(true),osArch(paravirtual)),imageVersion(20101020),imageDescription(ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml))
>>         at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252)
>>         at java.util.concurrent.FutureTask.get(FutureTask.java:111)
>>         at org.apache.whirr.compute.StartupProcess.waitForOutcomes(StartupProcess.java:129)
>>         at org.apache.whirr.compute.StartupProcess.call(StartupProcess.java:82)
>>         at org.apache.whirr.compute.StartupProcess.call(StartupProcess.java:40)
>>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>         at java.lang.Thread.run(Thread.java:636)
>> Caused by: java.util.NoSuchElementException: no image matched predicate:
>> And(locationEqualsOrChildOf(us-east-1),And(osFamily(ubuntu),osDescription(ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml),osVersion(10.04),os64Bit(true),osArch(paravirtual)),imageVersion(20101020),imageDescription(ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml))
>>         at org.jclouds.compute.domain.internal.TemplateBuilderImpl.throwNoSuchElementExceptionAfterLoggingImageIds(TemplateBuilderImpl.java:620)
>>         at org.jclouds.compute.domain.internal.TemplateBuilderImpl.build(TemplateBuilderImpl.java:608)
>>         at org.jclouds.ec2.compute.strategy.EC2CreateNodesInGroupThenAddToSet.execute(EC2CreateNodesInGroupThenAddToSet.java:135)
>>         at org.jclouds.compute.internal.BaseComputeService.createNodesInGroup(BaseComputeService.java:199)
>>         at org.jclouds.aws.ec2.compute.AWSEC2ComputeService.createNodesInGroup(AWSEC2ComputeService.java:130)
>>         at org.apache.whirr.compute.NodeStarter.call(NodeStarter.java:55)
>
>
> After a bunch of these, whirr decides there were too many failures, and
> tries to destroy any nodes it's created.  Then, it just hangs.
>
