Hi Jeff,

On Fri, Sep 2, 2011 at 01:04, Jeff Forcier <[email protected]> wrote:

> Next up is testing and tweaking Morgan's multiprocessing/parallelism
> work to get 1.3 out the door, which should be a nice boost to Fabric's
> ability to operate in larger environments.
>

I'm very interested in this particular feature and would like to help get it
merged.

In addition to the parallel capabilities I'd also love to see changes that
make it easier to use fabric as a library go in, because (at least for me)
they are pretty tightly coupled with the parallel feature.

Wrt the parallel behaviour of fabric I was curious about a few things:

== Failure ==

The current failure handling of fabric (halt on failure) does not map very
well onto parallelism. Even though a task might fail on a host and stop, all
other hosts are still busy executing the same task. In addition to that, the
current code in Morgan's (goosemo's) branch does not (yet) allow for
inspection of return values of tasks executed in parallel.
Because of the queuesize limit there is also a possible scenario where
execution of a task on a host fails at an early stage, leaving the target
hosts in a rather hard to recover (or interpret) state.

For example, with 40 hosts and a queuesize of 10, suppose that execution of
a task fails on host 13. If the code stops executing there and then, the user
is left with a set of 20 hosts that did execute the task, of which one or
more failed to execute it successfully, and another set of 20 hosts that did
not execute the task at all.

I would prefer a scenario where fabric would always execute a command marked
for parallel execution on all hosts and report status afterwards. The
execution of any subsequent commands could be halted or not depending on the
value of warn_only.
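
To make that concrete, here is a minimal sketch of the behaviour I have in
mind. It is not code from Morgan's branch: run_on_all_hosts, may_continue and
demo_task are hypothetical names made up for illustration, the (ok, value)
result format is an assumption, and a plain thread pool stands in for the
multiprocessing job queue to keep the example short.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_all_hosts(task, hosts, pool_size=10):
    """Run task(host) on every host; never stop early on a failure.

    Returns a dict mapping host -> (ok, value), where value is the task's
    return value on success or the exception it raised on failure.
    """
    def guarded(host):
        try:
            return host, (True, task(host))
        except Exception as exc:  # record the failure instead of halting
            return host, (False, exc)

    # pool_size plays the role of the queue size: at most that many hosts
    # are worked on at the same time, but every host is eventually attempted.
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        return dict(pool.map(guarded, hosts))


def may_continue(results, warn_only=False):
    """Report failures, then decide whether subsequent commands may run."""
    failed = sorted(h for h, (ok, _) in results.items() if not ok)
    for host in failed:
        print("task failed on %s: %r" % (host, results[host][1]))
    return warn_only or not failed


def demo_task(host):
    # Hypothetical task: pretend host13 is broken.
    if host == "host13":
        raise RuntimeError("disk full")
    return "ok"


hosts = ["host%d" % i for i in range(1, 41)]
results = run_on_all_hosts(demo_task, hosts)
```

With this shape, all 40 hosts get a result entry, and the caller decides
afterwards (via warn_only) whether the next command should run at all.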

I was wondering what your take on this was.

== Interactivity ==

Interactive behaviour (password prompts etc.) has similar problems. It is
very complex to ask users for input on parallel tasks. In most cases where
users want to execute tasks in parallel, interactivity is meaningless or
unwanted.
Interactivity would require a parallel executing process to reliably grab
stdin and wait for input, which sorta defeats the concept of parallelism.
Even if it were possible, typing a response to 100+ password prompts is
not my idea of a robust execution :)

I would prefer to suppress all interaction in parallel execution mode and
make the task fail instead.
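
Roughly what I mean, as a sketch only. Nothing here is existing fabric API:
InteractionNotAllowed and prompt_for are hypothetical names, and the
"parallel" flag is an assumption about how the mode would be passed around.

```python
class InteractionNotAllowed(Exception):
    """Raised when a task running in parallel would need user input."""


def prompt_for(text, parallel, ask=input):
    """Ask the user for input, unless we are running in parallel mode."""
    if parallel:
        # Fail the task rather than block on stdin inside a worker.
        raise InteractionNotAllowed(
            "refusing to prompt in parallel mode: %s" % text)
    return ask(text)
```

The failure then shows up in the per-host status like any other error,
instead of leaving a worker hanging on a prompt nobody will answer.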

> Adding logging support would be useful for this new functionality, but
> I may put that out as a followup feature release, if parallel
> execution works "well enough" without logging that folks would benefit
> from having it released "early".
>

== Output ==

Output and, consequently, logging (which seems slated to replace the default
print to sys.stdout/sys.stderr in fabric) would also require some thought
when executing in parallel. Currently, when a task is executed in parallel,
the output of the task on the various hosts is interleaved, making it hard
for users to interpret.

In addition any output of tasks is fairly hard to handle if you
want/need/plan on using fabric as a library.

Switching to logging will not solve this problem; interleaved output will
still happen in a situation where all output is pushed into a single logging
stream.
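
One possible direction, sketched under assumptions: buffer output per host
and only emit it once the task on that host is done. This is not how fabric
works today; run_buffered, demo_task and the emit callback are hypothetical,
and a thread pool again stands in for the multiprocessing queue.

```python
from concurrent.futures import ThreadPoolExecutor


def run_buffered(task, hosts, pool_size=10):
    """Run task(host, emit) on each host, collecting output per host."""
    def worker(host):
        lines = []
        # The task writes through emit() instead of printing directly,
        # so its output lands in a buffer private to this host.
        task(host, lines.append)
        return host, lines

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        return dict(pool.map(worker, hosts))


def demo_task(host, emit):
    # Hypothetical task producing two lines of output.
    emit("starting deploy")
    emit("deploy finished")


output = run_buffered(demo_task, ["web1", "web2", "web3"])

# Emit the output grouped by host rather than interleaved.
for host in sorted(output):
    for line in output[host]:
        print("[%s] %s" % (host, line))
```

The same per-host buffers would also be something a library user could
inspect directly instead of scraping stdout.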

Changing to parallel execution might be a good time to rethink the entire
output handling in fabric for tasks.
I noticed that return values from tasks are not captured in the current
codebase (is this correct?), which also makes it fairly hard to use fabric
as a library.

== Background ==

To give you some idea of what I'm using (or planning to use) fabric for:
I build maintenance and deployment scripting for a company with a large
server farm of 3000+ nodes. Tasks built with fabric will typically execute on
groups of hosts sized around 50 - 300 with a max of approx. 1500 hosts.

As you can imagine, parallel execution is a required feature here and
interactive command execution is a no-go area.
Our main use case for fabric is to use it as a library in scripts and
orchestration services; deploying software would be a good example of
scripting (and is also where the current focus is).

I'm hoping to use fabric because the task model makes it really easy for an
entire team of system engineers and developers to develop tasks in isolation
from the complexity that comes with scale. The task developer can focus on the
single machine scenario, which makes it relatively simple to develop a new
task, while fabric does the heavy-lifting with regard to ssh connections,
sudo, parallelism, output/return value capturing etc.

Fabric is pretty awesome already, adding parallelism will make it really
really awesome IMHO.

== Code ==

Although in a very embryonic state, I took the work that Morgan did on the
parallel queue and converted it to a queue that is capable of capturing the
return values of tasks executed in parallel.

I added two methods to tasks.py to allow a library user to call execute in
sequential and parallel fashion.

You can take a look at it here:
https://github.com/ramonvanalteren/fabric/tree/multiprocessing-lib


Pff this turned into a far longer mail than originally intended.
I'd love to discuss this further on this list or over IRC, you can find me
on #fabric as Ramonster.

Regards,

Ramon van Alteren

mail: ramon@vanalteren
IRC: Ramonster
_______________________________________________
Fab-user mailing list
[email protected]
https://lists.nongnu.org/mailman/listinfo/fab-user