Re: [Launchpad-dev] anyone else finding ec2land things disappearing w/out warning?

Robert Collins Tue, 26 Oct 2010 03:46:36 -0700

On Tue, Oct 26, 2010 at 4:24 AM, Maris Fogels
<[email protected]> wrote:
>
> I do not think an email alert will catch hung testrunners because an email
> implementation will probably not send granular enough messages about what the
> runner is doing.  Instead, I would consider installing a beacon in ec2 test
> that sends HTTP POSTs to a central CGI script.  The beacon would report the
> start, stop, report, and shutdown events for each run.  Auditing the logs
> would catch hung, disappeared, or otherwise AWOL runners.  (BTW, web.py is
> awesome for building such small web apps, and it is already on devpad for
> this purpose.  Hint hint ;)
>
> It is really difficult to gather facts about a randomly occurring error in a
> randomly run process initiated by 30 developers on a globally distributed
> team.  I really think that automated data gathering makes sense as the next
> step.


I agree that email at the end won't help with debugging silent fails.

Perhaps:
 - capture stdout and stderr to two files on disk
 - change the @ script to shutdown to:
   - combine the stdout and stderr files and send the combined file to
     a central place
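
A rough sketch of the capture-and-combine step, in case it helps the
discussion. This is not what ec2 test does today; the function names, the
file paths, and the "==== stdout ====" separators are all made up for
illustration.

```python
import subprocess
from pathlib import Path


def run_and_capture(cmd, out_path, err_path):
    """Run cmd, writing its stdout and stderr to two separate files on disk.

    Returns the exit code of the command. (Hypothetical helper.)
    """
    with open(out_path, "wb") as out, open(err_path, "wb") as err:
        return subprocess.call(cmd, stdout=out, stderr=err)


def combine_logs(out_path, err_path, combined_path):
    """Concatenate the two capture files into one labelled report file."""
    with open(combined_path, "wb") as combined:
        for label, path in (("stdout", out_path), ("stderr", err_path)):
            # The separator format is an assumption, chosen for readability.
            combined.write(("==== %s ====\n" % label).encode("ascii"))
            data = Path(path).read_bytes()
            combined.write(data)
            # Keep each section newline-terminated so the labels line up.
            if data and not data.endswith(b"\n"):
                combined.write(b"\n")
    return combined_path
```

The shutdown script would then only need to call combine_logs() and hand
the resulting file to whatever transport we pick.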

Email would be a fine transport medium this way. We don't need little
chunks; we just need something before the end.
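
To make the email idea concrete, here is a minimal sketch that wraps a
combined log file in a message and sends it. The subject format, the
function names, and the default SMTP host are assumptions for illustration
only; nothing here reflects an existing script.

```python
import smtplib
from email.message import EmailMessage
from pathlib import Path


def build_report(combined_path, sender, recipient, run_id):
    """Wrap the combined log file in a plain-text email message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    # Hypothetical subject format; run_id is whatever identifies the run.
    msg["Subject"] = "ec2 test log: %s" % run_id
    # Replace undecodable bytes so a garbled log never blocks the report.
    text = Path(combined_path).read_text(encoding="utf-8", errors="replace")
    msg.set_content(text)
    return msg


def send_report(msg, host="localhost"):
    """Send the message via an SMTP server (the host is an assumption)."""
    with smtplib.SMTP(host) as smtp:
        smtp.send_message(msg)
```

Building the message separately from sending it means the report could
also be written to disk if the SMTP server is unreachable at shutdown.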

Fixing the buffering of the progress output (there's a bug) would make
this very granular.

-Rob

_______________________________________________
Mailing list: https://launchpad.net/~launchpad-dev
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~launchpad-dev
More help   : https://help.launchpad.net/ListHelp
