Disk space is not a concern, and as long as ServerCentral is donating the database server, it likely won't be: 80,000,000 records are taking up about 500G, and we've got twice that available on SSD and 8x that available on NAS. That was just hardware we had lying around and could toss together quickly (not downplaying ServerCentral's contribution: the database had crashed again and we needed help).
The toolchain team supports versions of Perl that p5p do not, and both teams support those versions for their own completely valid reasons. I do not intend to delete data for any version of Perl unless I can get basically unanimous consensus that the data is 100% worthless, and those opinions will have to come from some very, very convincing sources (like, get me two pumpkings and a supermajority of PAUSE admins and I might start entertaining the notion). Downstream systems like cpXXXan use data for those very old versions of Perl, for example, and anyone still using those old versions is invited to run CPAN Testers to help others running those versions.

I shouldn't have mentioned partitioning data, as I can't even replicate the performance issues I was having. Barbie's latest changes to some website updating processes likely improved overall database performance, since there are fewer queries being run.

The main concern is deleting data that is simply not valid, like Test-Simple v2. The primary maintainer of the distribution asked if I could delete the data, gave me a couple of valid usability concerns about having this data, and explained that those versions of the dist have been deleted and the testers data has no useful purpose at all to him.

CPAN Testers data isn't just for the distribution authors, of course. It's also for the users of the distribution. But in the case of Test-Simple v2, there can be no users: it's not on CPAN, and to my knowledge (from the current maintainer) those releases were simply broken. "Broken" is not necessarily a valid reason to delete data from CPAN Testers (seeing which versions are broken is a primary reason for CPAN Testers), but taken together with all the other reasons to delete these versions of Test-Simple, it makes me think deletion is the most expedient and least disruptive option.
I mention disruptive because, comparing the data on the CPAN Testers website to the data on the Matrix, I noticed that the CPAN Testers website does _not_ show Test-Simple v2. That means the website has a way to hide data that the Matrix does not also apply. To me, that makes hiding a non-solution: forcing everyone using the data to figure out which data is invalid is too high a burden. If we simply hide the data from any and all APIs that CPAN Testers has, that is, to me, worse than deleting it: it requires changing potentially every query in the CPAN Testers site and every index in the database, which would be disruptive, and it ensures the data is actually worthless, since there's no actual way to read it. So if we're going to make it worthless, why not just delete it?

So, given all that, is there anything that could make that data useful enough to offset the usability concerns?

Doug Bell
d...@preaction.me

> On Jul 5, 2017, at 11:18 PM, Ron Savage <r...@savage.net.au> wrote:
>
> Hi Karen
>
> Obviously I suggested as one mechanism to reduce the size of the disk space
> used.
>
> As for versions, perhaps zap 5.10.0 and older?
>
> And yes, I'm aware there may be older versions running in the wild, but the
> point is to manage the CPAN (Tester) infrastructure within the capabilities
> of the volunteers, not to support everything endlessly.
>
> On 06/07/17 14:10, Karen Etheridge wrote:
>> I'm not sure why you'd suggest it then? What Perl versions would *you*
>> suggest be removed from cpantesters?
>>
>> I would contend that all results are valuable. Helping preserve
>> backwards compatibility is one of the great things that the cpantesters
>> network brings us.
>>
>> On Wed, Jul 5, 2017 at 8:15 PM, Ron Savage <r...@savage.net.au> wrote:
>>
>>     Hi Karen
>>
>>     I paused over that statement, but I'm sure there are persons with
>>     the Perl 5 porters group, for example, who may be able to specify
>>     same, in a way I'm not qualified to do :-).
>>
>>     On 06/07/17 11:50, Karen Etheridge wrote:
>>
>>         Define "versions of Perl which are no longer supported"?
>>
>>         On Wed, Jul 5, 2017 at 6:40 PM, Ron Savage <r...@savage.net.au> wrote:
>>
>>             Hi Doug
>>
>>             How much change would result, and how difficult would it be,
>>             if you deleted data pertaining to version of Perl which are
>>             no longer supported?
>>
>>             On 06/07/17 10:03, Doug Bell wrote:
>>
>>                 I've gotten a few requests to remove data from CPAN
>>                 Testers. I don't know if that has ever been done, and
>>                 I'm not sure if I'd like to start doing it, but there
>>                 are some situations that I do not think can be fixed
>>                 any other way:
>>
>>                 Chad Granum (Exodist) wants to see the latest releases
>>                 of Test-Simple first, but their version numbers are
>>                 lower than some previously-released versions (and we
>>                 cannot, to my knowledge, be the arbiter of those kind
>>                 of problems). This results in issues like
>>                 http://matrix.cpantesters.org/?dist=Test-Simple and
>>                 http://beta.cpantesters.org/chart.html?dist=Test-Simple
>>                 where these invalid versions are shown first.
>>
>>                 There's also simply a _lot_ of data on CPAN Testers,
>>                 some of which is for distributions that are no longer
>>                 installable on CPAN (distributions available only on
>>                 Backpan). This slows down a lot of otherwise simple
>>                 procedures. If there was a way to move this data to
>>                 another place, a lot of the normal operation of the
>>                 site would be faster. However, partitioning the site
>>                 like this would likely require a bunch of changes to
>>                 downstream systems (which would have to decide which
>>                 side of the partition they want). This doesn't sound
>>                 like an insurmountable problem: The APIs default to
>>                 "all" and you can opt-in to the faster "cpan"
>>                 partition.
>>
>>                 We could just fix the Test-Simple issue (and other
>>                 similar issues) by deleting the data that will never
>>                 be relevant again: Test-Simple 2.0 was never released
>>                 and only got development versions. Then we could
>>                 discuss/develop longer-term solutions for the other
>>                 issues.
>>
>>                 Does anyone have any other thoughts/opinions?
>>
>>                 Doug Bell
>>                 d...@preaction.me
>>
>>             --
>>             Ron Savage - savage.net.au
>>
>>     --
>>     Ron Savage - savage.net.au
>>
>
> --
> Ron Savage - savage.net.au