Okay, the general vibe I got from the responses is:
* Report metadata isn't useful enough without the report text
* Data for old distributions matters for the people maintaining systems that
use them
I think the best path forward might be to only do these things:
1. Archive the full report text for reports older than 5 years
2. Keep the full report text in the database for only 5 years
3. Keep metadata and statistics in the database forever
Then we can build a site that can read those old reports from the archive
files. Since the most common use-case for a visitor (and correct me if I'm
wrong) is to go look up the reports for a specific distribution on a specific
Perl/platform, no functionality is lost. With some more development, we can
even make filtering / searching of the archived reports possible.
At the moment, keeping metadata forever should not be a huge issue: If I fix
the metadata to remove some duplicate data and normalize it a bit better, I can
even make it smaller.
The full data retention policy then becomes the following decision tree:
* Reports (full report data)
    * Reports submitted >5 years ago
        * Release on CPAN
            - Report archived
        * Release not on CPAN
            - Report archived
    * Reports submitted <5 years ago
        * Release on CPAN
            + Report available
        * Release not on CPAN
            + Report available
* Metadata (release, Perl version, Perl architecture, OS name/version, test
  reporter, date/time, pass/fail status)
    * Reports submitted >5 years ago
        * Release on CPAN
            + Metadata available
        * Release not on CPAN
            + Metadata available
    * Reports submitted <5 years ago
        * Release on CPAN
            + Metadata available
        * Release not on CPAN
            + Metadata available
* Statistics (release, pass/fail count)
    * Reports submitted >5 years ago
        * Release on CPAN
            + Statistics available
        * Release not on CPAN
            + Statistics available
    * Reports submitted <5 years ago
        * Release on CPAN
            + Statistics available
        * Release not on CPAN
            + Statistics available
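
To make the tree concrete, here is a rough sketch of the same policy as a
function. It is only an illustration, not the actual scripts: the name
retention_status, the returned keys, and the approximation of "5 years" as
1825 days are all assumptions of mine. Since both the "on CPAN" and "not on
CPAN" branches give the same outcome in the tree, the sketch does not take
that as an input.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "5 years" is approximated as 5 * 365 days.
RETENTION = timedelta(days=5 * 365)


def retention_status(submitted, now=None):
    """Return where each kind of data lives for a report submitted at `submitted`.

    Hypothetical helper: only the full report text depends on the report's
    age; metadata and statistics are kept in the database forever.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    is_old = (now - submitted) > RETENTION
    return {
        "report": "archived" if is_old else "available",
        "metadata": "available",
        "statistics": "available",
    }
```

With a fixed "now" of 2024-01-01, a report submitted in 2017 would come back
with its report text "archived", and one submitted in 2022 would come back as
"available"; metadata and statistics are "available" in both cases.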
I'll start planning out the scripts needed to achieve this, and when I'm ready
to do something, I'll make an announcement and give some time for additional
comments.
Thanks,
Doug Bell