Hi Andrei

Andrei Savu wrote:
Paul -

I think you are hitting an upper bound on the size of the clusters that can be started with Whirr right now.

I was trying to start a Hadoop cluster of 20 datanodes|tasktrackers.

What is the current upper bound?

One possible workaround you can try is to enable lazy image fetching in jclouds:
http://www.jclouds.org/documentation/userguide/using-ec2

It's not clear to me how I can do that with Whirr, and I am not even sure
that is the root cause of the problem.
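
(For reference, a possible, unverified sketch: jclouds exposes the EC2 image
listing through a query property, and if Whirr passes jclouds.-prefixed
properties from the recipe through to the compute context, something along
these lines might narrow or skip the image listing. Both the property name and
the pass-through behaviour are assumptions to check against the page above.)

----
# Assumption: jclouds.ec2.ami-query is the EC2 image query property and Whirr
# forwards jclouds.* keys from the recipe to jclouds; verify before relying on it.
# Restrict the AMI listing to a single owner (the owner id below is a placeholder):
jclouds.ec2.ami-query=owner-id=<ami-owner-id>;state=available;image-type=machine
----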

I have created a new JIRA issue so that we can add this automatically when the image-id is known:
https://issues.apache.org/jira/browse/WHIRR-416

I am looking forward to seeing if this fixes my problem and increases the
number of nodes one can use in Hadoop clusters started via Whirr.

What if you start a smaller cluster but with more powerful machines?

That's an option, but not a good one in the context of MapReduce, is it? :-)
m1.large instances are powerful (and expensive) enough for what I want to do.

Paolo


Cheers,

-- Andrei Savu

On Fri, Oct 28, 2011 at 6:32 PM, Paolo Castagna <[email protected]> wrote:

    Hi,
    it's me again; I am trying to use Apache Whirr 0.6.0-incubating
    to start a 20-node Hadoop cluster on Amazon EC2.

    Here is my recipe:

    ----
    whirr.cluster-name=hadoop
    whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,20 hadoop-datanode+hadoop-tasktracker
    whirr.instance-templates-max-percent-failures=100 hadoop-namenode+hadoop-jobtracker,50 hadoop-datanode+hadoop-tasktracker
    whirr.max-startup-retries=1
    whirr.provider=aws-ec2
    whirr.identity=${env:AWS_ACCESS_KEY_ID_LIVE}
    whirr.credential=${env:AWS_SECRET_ACCESS_KEY_LIVE}
    whirr.hardware-id=m1.large
    whirr.image-id=eu-west-1/ami-ee0e3c9a
    whirr.location-id=eu-west-1
    whirr.private-key-file=${sys:user.home}/.ssh/whirr
    whirr.public-key-file=${whirr.private-key-file}.pub
    whirr.hadoop.version=0.20.204.0
    whirr.hadoop.tarball.url=http://archive.apache.org/dist/hadoop/core/hadoop-${whirr.hadoop.version}/hadoop-${whirr.hadoop.version}.tar.gz
    ----
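
    (For context, a recipe like this is launched with the Whirr CLI; assuming the
    file above is saved as hadoop.properties, the command looks like this:)

    ----
    % bin/whirr launch-cluster --config hadoop.properties
    ----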

    I see a lot of these errors:

    org.jclouds.aws.AWSResponseException: request POST https://ec2.eu-west-1.amazonaws.com/ HTTP/1.1 failed with code 503, error: AWSError{requestId='b361f3f6-73f1-4348-964a-31265ec70eeb', requestToken='null', code='RequestLimitExceeded', message='Request limit exceeded.', context='{Response=, Errors=}'}
            at org.jclouds.aws.handlers.ParseAWSErrorFromXmlContent.handleError(ParseAWSErrorFromXmlContent.java:74)
            at org.jclouds.http.handlers.DelegatingErrorHandler.handleError(DelegatingErrorHandler.java:71)
            at org.jclouds.http.internal.BaseHttpCommandExecutorService$HttpResponseCallable.shouldContinue(BaseHttpCommandExecutorService.java:200)
            at org.jclouds.http.internal.BaseHttpCommandExecutorService$HttpResponseCallable.call(BaseHttpCommandExecutorService.java:165)
            at org.jclouds.http.internal.BaseHttpCommandExecutorService$HttpResponseCallable.call(BaseHttpCommandExecutorService.java:134)
            at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
            at java.util.concurrent.FutureTask.run(FutureTask.java:138)
            at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
            at java.lang.Thread.run(Thread.java:662)


    I have more than 20 slots available on this Amazon account.

    Is Whirr sending requests to Amazon too fast?

    How can I solve this problem?
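
    (RequestLimitExceeded is the error EC2 returns when API calls are throttled, so
    the requests do seem to exceed the account's request limit. A small, unverified
    sketch of knobs that might soften it: raise the startup retries already present
    in the recipe above, and, assuming jclouds.-prefixed properties are passed
    through from the recipe, raise the jclouds-level retry count as well.)

    ----
    # whirr.max-startup-retries is already in the recipe above, currently set to 1;
    # retrying failed startups more than once may help ride out throttling.
    whirr.max-startup-retries=3

    # Assumption to verify: jclouds.max-retries controls how many times jclouds
    # retries a failed (e.g. 503/throttled) request, and Whirr forwards jclouds.* keys.
    jclouds.max-retries=10
    ----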

    Regards,
    Paolo


