On Thu, Dec 31, 2009 at 11:45 AM, Barbie <[email protected]> wrote:

> I currently poll the NNTP server every hour for updates. I think the RSS
> feed would be fine to use to know what to update, if it was updated
> frequently enough.
I think we can get it to be much, much more frequent.

> However, the bigger interest for me is whether I can grab the full
> reports in a bundle. A daily DB snapshot would be too infrequent for
> updates, but would be fine to ensure that everything was captured at the
> end of each day. Would requesting each report in turn be too much of a
> drain?

Metabase is supposed to support that kind of batch behavior, though we've not tested it. (It might not be implemented yet, even.)

> I'm currently working on supplying all the current reports via the
> existing Reports site; this will then relieve the NNTP archive on
> perl.org. Once I've got it running live in the next week or so, there
> will be a definitive URL template that can be used to display the SMTP-
> or HTTP-submitted reports. This can be used in the RSS feed to relieve
> any direct burden on the Metabase itself.

I'm a bit concerned that we're pulling in separate directions. Or at least, if we're all touching separate parts of the elephant, I haven't seen the whole elephant yet. I think I need to find some time to jot down or sketch out what I see as the new architecture and let people react to what fits and what doesn't.

My goal is to get enough of the CT2.0/Metabase infrastructure up and running that we can stop slinging around raw reports and start referencing Metabase report fact objects. Part of the goal of CT2.0 is to have reports as structured data, after all.

Here's a very quick overview of what I'm envisioning:

* Existing clients use Test::Reporter::Transport::Metabase to send report objects to a master CT2.0 Metabase server on an Amazon EC2 virtual server (or a load-balanced cluster of servers if necessary). (There's a rough sketch of the client side further down.)

* The addition of a report to the master CT2.0 Metabase is syndicated in a way that allows interested parties to update databases, post to IRC, whatever. The syndicated data could just be the "indexed" data about the report (dist name, grade, platform, perl, etc.) and the GUID that references the full report fact object in the CT2.0 Metabase. (The GUID replaces the NNTP ID, but in a way that lets us convert existing NNTP IDs to GUIDs and vice versa.) The point is that it comes pre-parsed rather than as a chunk of text. (A sketch of what such a record might look like is also below.)

* Some authorized site mirrors the actual report text locally to a cheaper-to-operate server and serves it up via a standardized link/template, much like nntp.perl.org does now. Reports could be put into a local metabase or stored in whatever format makes sense to the administrator. Worst case, they stay stored in S3 and the data is fetched directly as needed. (Expensive, but easy to implement.)

The framework for all the Metabase stuff already exists. The Metabase backend on Amazon needs to be implemented. (Not hard, just takes some time.) The syndication service needs to be written. (Also not hard, but easier to do after the backend design is done.)

I'm pretty confident in my ability to do the first part of that. The second I could do, but I could use a volunteer to help write the syndication service. The third part -- substituting for nntp.perl.org -- I'd also like a volunteer for.

And, of course, given the syndication service, do we (a) just wire that up to the existing stats DB for downstream consumers, or (b) make it easy for downstream consumers to update their own DBs from the syndication feed?
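To make the first bullet a bit more concrete, here's roughly what the client side looks like with Test::Reporter and the Metabase transport. Treat it as a sketch only: the URI is a placeholder for wherever the master CT2.0 Metabase ends up living, and the id_file path is just an example location for a tester's Metabase profile.

  use Test::Reporter;

  my $reporter = Test::Reporter->new(
      grade          => 'pass',
      distribution   => 'Foo-Bar-1.23',
      comments       => 'automated smoke test output goes here',
      transport      => 'Metabase',
      transport_args => [
          # placeholder -- the real master server address is still TBD
          uri     => 'https://metabase.example.org/api/v1/',
          # the tester's Metabase profile/credentials file
          id_file => '/home/smoker/.metabase/metabase_id.json',
      ],
  );

  $reporter->send;

The idea is that the smoker tools would carry the transport settings in their existing config, so testers shouldn't have to change much beyond that.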
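For the second bullet, the sort of pre-parsed record I imagine the syndication feed carrying would be something along these lines. Every field name here is made up for illustration; none of this is designed yet.

  my $syndicated_record = {
      # placeholder GUID -- replaces the NNTP ID as the canonical reference
      guid     => 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx',
      dist     => 'Foo-Bar',
      version  => '1.23',
      grade    => 'pass',
      platform => 'x86_64-linux',
      perl     => '5.10.1',
  };

A consumer could key its own database on the GUID and only fetch the full report fact from the Metabase (or from a mirror like the Reports site URL template) when it actually needs the raw text.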
And, as I've said, in the *worst* case, if the syndication service doesn't get done in time, it would be pretty trivial to regenerate the stats database on an Amazon EC2 instance/cluster as often as necessary (EC2 has high bandwidth to S3 at no data-transfer cost) and then let stats.cpantesters.org just rsync it every so often. That's not the best solution, but I'm trying to chart a path with a lot of options for design redundancy.

I'll write this up further and try to describe the components and broad tasks.

-- David
