Hi David,

What type of job is it? Can the shell scripts call you back when they're done, or is there some out-of-band event that you would have to subscribe to?
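
If the scripts can call back, a small listener on the client side is enough to block until they do, with no polling loop. The following is only a minimal Python sketch under stated assumptions: `CompletionListener` is a hypothetical name, any POST to any path counts as "done", and the cluster nodes are assumed to be able to reach the listening host (often not true behind NAT or a firewall).

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


class CompletionListener:
    """Blocks a caller until a remote script sends an HTTP POST to us.

    Hypothetical helper for illustration; not part of Whirr or jclouds.
    """

    def __init__(self, host="127.0.0.1", port=0):
        self.done = threading.Event()
        self.finished_at = None  # local wall-clock time of the callback
        listener = self

        class Handler(BaseHTTPRequestHandler):
            def do_POST(self):
                # Any POST, on any path, is treated as the completion signal.
                listener.finished_at = time.time()
                listener.done.set()
                self.send_response(204)
                self.end_headers()

            def log_message(self, *args):
                pass  # keep the sketch quiet

        # port=0 asks the OS for a free port; read it back afterwards.
        self.server = HTTPServer((host, port), Handler)
        self.port = self.server.server_address[1]
        threading.Thread(target=self.server.serve_forever, daemon=True).start()

    def wait(self, timeout=None):
        """Block until the callback arrives; return False on timeout."""
        fired = self.done.wait(timeout)
        self.server.shutdown()
        self.server.server_close()
        return fired
```

With this in place, the last line of the remote script could be something like `curl -X POST http://<client-host>:<port>/done`, and the client would call `wait()` before tearing the cluster down. To accept callbacks from other machines the listener would need `host="0.0.0.0"` or a routable address instead of the loopback default.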

I'm also interested in the wrapping mechanism. You could write bash scripts to wrap whirr start/stop and watch for the trigger in between, but scraping the IP addresses and configuring credentials will get tedious. +1 to something more elegant ... would a JVM language script be interesting? I am thinking of a simple management layer with embedded Whirr to start/stop, but also able to monitor your processes in between, finding out about them programmatically via Whirr.
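
The start / run / stop wrapping described above can be sketched in a few lines. This is a Python illustration rather than the JVM layer suggested here, and it makes assumptions: `run_with_cluster` is a hypothetical helper, the three commands are supplied by the caller as argv lists, and the job command is assumed to block until the job exits (for example an ssh invocation that runs the script remotely).

```python
import subprocess
import time


def run_with_cluster(launch_cmd, job_cmd, destroy_cmd):
    """Launch a cluster, run one job to completion, then always tear down.

    Each argument is an argv list chosen by the caller; this sketch knows
    nothing about what the commands actually do.
    Returns (job_exit_code, job_seconds).
    """
    # check=True raises CalledProcessError if the launch step fails,
    # in which case there is nothing to tear down.
    subprocess.run(launch_cmd, check=True)
    try:
        started = time.monotonic()
        # subprocess.run blocks until the job process exits, so the
        # wrapper itself needs no polling loop.
        job = subprocess.run(job_cmd)
        elapsed = time.monotonic() - started
        return job.returncode, elapsed
    finally:
        # Runs even if the job step raised, so the cluster is not left up.
        subprocess.run(destroy_cmd, check=False)
```

Assuming the Whirr CLI is on the path, `launch_cmd` and `destroy_cmd` could be along the lines of `["whirr", "launch-cluster", "--config", "my.properties"]` and `["whirr", "destroy-cluster", "--config", "my.properties"]`. Discovering the node addresses for the job command is the tedious part mentioned above and is not covered by this sketch.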

--A


On 22/09/2011 15:29, Andrei Savu wrote:
Sorry for the confusion :) When I see job I think about Hadoop.

For arbitrary scripts, I think jclouds provides some ways of doing this,
as you already know. To check with low latency whether a script is
still running, I think you need some sort of server-side daemon,
but I can't recommend one.

-- Andrei

On Thu, Sep 22, 2011 at 3:24 PM, David Alves wrote:
As I said, the thing is I'm NOT using Hadoop :)
I'm just running generic scripts/ssh commands.

-david

On Sep 22, 2011, at 5:20 PM, Andrei Savu wrote:

I don't know that much about how to manage jobs in Hadoop using the
API. Maybe Tom can provide a good answer to this. I completely
understand the elegance part :)

-- Andrei Savu

On Thu, Sep 22, 2011 at 3:17 PM, David Alves wrote:
First, there is the question of accuracy: as I said, I am collecting metrics
that I'd like to be as accurate as possible.
Second, there is the matter of elegance: I always like to avoid polling
whenever possible.

That being said, I don't want to embark on some odyssey just to avoid polling,
so if it really is too much trouble I am OK with letting it go.
Anyhow, even with polling, is there something already implemented that enables
it in generic cases?
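
On the accuracy concern, one way to keep polling from skewing the metric is to have the job record its own finish time, so the poll interval only delays teardown and does not affect the measurement. The following is a minimal Python sketch under assumptions: `wait_for_marker` is a hypothetical helper, the marker is a local path (on a remote node the check would have to go over ssh or similar), and the job's last step is assumed to write its finish time, in seconds since the epoch, to a temporary file and then rename it into place.

```python
import os
import time


def wait_for_marker(path, interval=1.0, timeout=None):
    """Poll until `path` exists, then return the timestamp stored in it.

    Assumes the writer creates the marker atomically (write to a
    temporary name, then rename), so a visible file is always complete.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while not os.path.exists(path):
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError(f"no marker at {path}")
        time.sleep(interval)
    with open(path) as f:
        # The value comes from the job itself, not from when we noticed it.
        return float(f.read().strip())
```

The returned value is only as good as the clock on the machine that wrote it, so comparing it with a start time taken on a different machine assumes the clocks are reasonably synchronized.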

thanks
-david

On Sep 22, 2011, at 5:09 PM, Andrei Savu wrote:

Why is it so important to avoid polling? The cost is low, and almost
any job runs for at least a few minutes.

-- Andrei

On Thu, Sep 22, 2011 at 3:07 PM, David Alves wrote:
Hi Andrei

        I know…
        The thing is that that code uses the Hadoop JobClient class's runJob()
method, which actually polls for progress.
        I am not using Hadoop (in hindsight, using the word "job" might have
been a mistake), and I was wondering whether there is already a way to do that
for generic cases (e.g., scripts or Java programs).
        In particular, as I'm collecting accurate metrics, I'd like a
technique that is not based on polling.
        Even if there is none, I can always try to code it, so all ideas are
welcome.

thanks
david


On Sep 22, 2011, at 4:52 PM, Andrei Savu wrote:

This is exactly what the example code is doing (and the Hadoop
integration test). The job-running code blocks while the job is
executing.

-- Andrei Savu / andreisavu.ro

On Thu, Sep 22, 2011 at 2:03 PM, David Alves wrote:
Hi All

        I need to launch a cluster, run a job, and terminate the cluster as
soon as the job is finished.
        Is there any "nice" way to do this, or do you have any suggestions?
        Off the top of my head I can imagine some quick and dirty solutions
(like creating a file whenever the task is completed and polling for its
existence from the whirr handler), but I'd like to do it without polling if
possible. Any ideas?

thanks
-david



