> On Jul 6, 2017, at 6:07 AM, Kent Fredric <kentfred...@gmail.com> wrote:
>
> On 6 July 2017 at 17:37, Doug Bell <d...@preaction.me> wrote:
>> To me, that makes it a non-solution:
>> Forcing everyone using the data to figure out which data is invalid is too
>> high a burden. If we simply hide the data from any and all APIs that CPAN
>> Testers has, that is, to me, worse than deleting it, since it requires
>> changing potentially every query in the CPAN Testers site and every index in
>> the database, which would be disruptive, and it ensures the data is actually
>> worthless, since there's no actual way to read it. So if we're going to make
>> it worthless, why not just delete it?
>
> I think it would be more useful to do it the other way around.
>
> Add a mechanism for authors to flag individual reports, as either
> "invalid" (that is, was never relevant) or "archive" (was once
> relevant, but now just confuses people who see it)
>
> And then do nothing with that data other than make it accessible from
> some future API we add.

There presently is a way to flag reports as "invalid". This indeed may
be the mechanism by which the CPAN Testers website hides Test-Simple's
reports and the other websites do not. The website for flagging reports
(https://admin.cpantesters.org) may currently be broken, but it remains
that, in this Test-Simple case, it is data that cannot be relevant.

> Then, the default behaviour of retaining those reports forever is
> preserved, but any tools that want to hide/filter data can do so (and
> we'll encourage said tools to make it obvious they're hiding data when
> they're hiding data, how much data they're hiding, and providing
> mechanisms to disable data hiding )
>
> This way the author isn't dictating what the user sees, only creating
> a suggestion for how they might view the data, and its up to the
> author/consumer as to if they trust those suggestions or not.

There are also summary data tables that combine the data into simple
counts (the "release" API that MetaCPAN uses). It's impossible to say
which part of those counts are from flagged reports, so CPAN Testers
just gives them the count. My guess is that it does not contain flagged
reports, but I'd have to go read the queries that build these tables to
be sure.

--

Tangentially, it seems MetaCPAN solves this problem by ordering by
release date instead of by release version:
https://metacpan.org/pod/Test::Simple

I could try to include the distribution's release date in all the APIs,
and document that it is appropriate to order by {distribution date,
distribution version}, but that's still a trap laid out for consumers of
the data which can only be avoided by carefully reading the
documentation and understanding that version numbers don't always go up.

Doug Bell
d...@preaction.me
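
To make the ordering trap above concrete, here is a minimal Python sketch. The record layout, field names, version strings, and dates are all made up for illustration; they are not the CPAN Testers API schema or any real distribution's history. It only shows that "highest version" and "most recent release" can disagree when an old development release carries a higher version number.

```python
from datetime import date

# Hypothetical release records for a fictional distribution.
# Field names and values are illustrative assumptions only.
releases = [
    {"version": "2.00_01", "released": date(2011, 1, 1)},  # old dev release
    {"version": "1.001", "released": date(2014, 1, 1)},
    {"version": "1.302", "released": date(2017, 1, 1)},
]


def version_key(version):
    """Naive numeric key: drop the underscore and compare as a float."""
    return float(version.replace("_", ""))


# Ordering by version alone puts the old dev release last ("latest").
by_version = sorted(releases, key=lambda r: version_key(r["version"]))

# Ordering by {release date, version}: date decides, version breaks ties.
by_date = sorted(
    releases,
    key=lambda r: (r["released"], version_key(r["version"])),
)

latest_by_version = by_version[-1]["version"]
latest_by_date = by_date[-1]["version"]

print("latest by version:", latest_by_version)
print("latest by date:   ", latest_by_date)
```

A consumer who sorts only on the version field would treat the 2011 development release as current, which is the kind of mistake that documentation alone is unlikely to prevent.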