Hi David,
What type of job is it? Can the shell scripts call you back when
they're done or is there some out-of-band event that would have to be
subscribed to?
I'm also interested in the wrapping mechanism. You could write bash
scripts to wrap whirr start/stop and watch for the trigger in between,
but scraping the IP addresses and configuring credentials will get
tedious. +1 to something more elegant ... would a JVM language script
be interesting? Am thinking a simple management layer with embedded
Whirr to start/stop but also able to monitor your processes in between,
finding out about them programmatically via Whirr.
--A
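The wrapping idea above can be sketched without a polling loop if the job is started as a child process: the wrapper blocks until that process exits, then tears the cluster down. This is a minimal Python sketch, assuming the launch, job, and teardown steps are each available as a command line. The name `run_with_cluster` and its three arguments are hypothetical and not part of Whirr.

```python
import subprocess
import sys
import time


def run_with_cluster(launch_cmd, job_cmd, destroy_cmd):
    """Launch a cluster, run one job, and always tear the cluster down.

    Each argument is an argument list for subprocess.run. They are
    placeholders (assumptions) for whatever launches the cluster, runs
    the script (for example over ssh), and destroys the cluster.
    Returns (job exit code, job wall-clock seconds).
    """
    subprocess.run(launch_cmd, check=True)  # raise if the launch fails
    try:
        started = time.monotonic()
        # Blocks until the job process exits; no polling loop in this code.
        result = subprocess.run(job_cmd)
        elapsed = time.monotonic() - started
    finally:
        # Runs even if the job step raises, so the cluster is not left up.
        subprocess.run(destroy_cmd, check=True)
    return result.returncode, elapsed


if __name__ == "__main__":
    # Stand-in commands so the sketch runs anywhere Python is installed.
    noop = [sys.executable, "-c", "pass"]
    job = [sys.executable, "-c", "import sys; sys.exit(3)"]
    code, seconds = run_with_cluster(noop, job, noop)
    print(code, seconds)
```

If the job command is an ssh invocation, the call returns when the remote script exits, and ssh reports the remote command's exit status as its own.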
On 22/09/2011 15:29, Andrei Savu wrote:
Sorry for the confusion :) When I see job I think about Hadoop.
For arbitrary scripts I think jclouds provides some ways of doing this
as you already know. To check with low latency whether a script is
still running, I think you need some sort of server-side daemon,
but I can't recommend one.
-- Andrei
On Thu, Sep 22, 2011 at 3:24 PM, David Alves<[email protected]> wrote:
As I said the thing is I'm NOT using hadoop :)
I'm just running generic scripts/ssh commands.
-david
On Sep 22, 2011, at 5:20 PM, Andrei Savu wrote:
I don't know that much about how to manage jobs in Hadoop using the
API. Maybe Tom can provide a good answer to this. I completely
understand the elegance part :)
-- Andrei Savu
On Thu, Sep 22, 2011 at 3:17 PM, David Alves<[email protected]> wrote:
First, there is the question of accuracy: as I said, I am collecting metrics that
I'd like to be as accurate as possible.
Second there is the matter of elegance. I always like to avoid polls whenever
possible.
That being said, I don't want to embark on some odyssey just to avoid polling, so
if it really is too much trouble I am OK with letting it go.
Anyhow, even with polling, is there something already implemented that enables it in
generic cases?
thanks
-david
On Sep 22, 2011, at 5:09 PM, Andrei Savu wrote:
Why is it so important to avoid polling? The cost is low and almost
any job runs for at least a few minutes.
-- Andrei
On Thu, Sep 22, 2011 at 3:07 PM, David Alves<[email protected]> wrote:
Hi Andrei
I know…
The thing is that that code uses the Hadoop JobClient class's runJob()
method, which actually polls for progress.
I am not using hadoop (in hindsight using the word "job" might have
been a mistake) and I was wondering if there is already a way to do that for generic
cases (e.g., scripts or java programs).
In particular as I'm collecting accurate metrics I'd like a non poll
based technique.
Even if there is none I can always try and code it, so all ideas are
welcome.
thanks
david
On Sep 22, 2011, at 4:52 PM, Andrei Savu wrote:
This is exactly what the example code is doing (and the hadoop
integration test). The job running code is blocking while the job is
executing.
-- Andrei Savu / andreisavu.ro
On Thu, Sep 22, 2011 at 2:03 PM, David Alves<[email protected]> wrote:
Hi All
I need to launch a cluster, run a job, and terminate the cluster as soon as
the job is finished.
Is there any "nice" way to do this, or do you have any suggestions?
Off the top of my head I can imagine some quick and dirty solutions
(like creating a file whenever the task is completed and polling for its
existence from the whirr handler) but I'd like to do it without polling if
possible. Any ideas?
thanks
-david
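For comparison, the quick-and-dirty variant described above (create a file when the task completes and poll for its existence) might look like the following Python sketch. It checks a local path only; checking a file on a remote node would need an ssh or similar call in place of `os.path.exists`. The function name, interval, and timeout are illustrative assumptions.

```python
import os
import time


def wait_for_marker(path, interval=1.0, timeout=60.0):
    """Poll until `path` exists.

    Returns True once the marker file is seen, or False if `timeout`
    seconds pass first. `interval` is the sleep between checks.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The delay between the task finishing and the wrapper noticing is bounded by roughly `interval`, which is the accuracy cost of polling raised in the thread.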