Re: Popularity contest for Fedora
On Sun, 27 Dec 2020 at 17:52, Matthew Miller wrote:
> On Sun, Dec 27, 2020 at 07:44:57PM +0100, clime wrote:
> > I think we can simply parse server-side access logs to count package
> > downloads, no?
>
> We can for our primary server, but most people get updates from mirrors
> which we don't run directly. The central mirrorlist (from which I get the
> dnf count data) just redirects people to those mirrors. Even if we could
> get package download counts from the mirrors, they're heavily skewed by:
>
> * public mirrors pulling the whole thing
> * people pulling the whole thing for a private mirror
> * ci and build systems (like, running mock)
> * mysterious bots downloading stuff for whatever reason
> * proxies and caching

There are a couple of other issues that make the logs hard to interpret, even on our primary servers. Nothing in the logs indicates whether a package is being installed, updated, or pulled in as a dependency. This means that any stats will mostly show which packages get updated the most during a release, or which have a lot of sub-packages that might get pulled in. Mirroring also adds noise: a client may get some of its packages from one mirror and then mostly dependencies from a second mirror.

Finally, CI and build systems swamp all other downloads from mirrors these days. Depending on how they are set up, some seem to do a ```yum install *``` before operating. My guess is that at least 60% of all traffic is CI these days. (I expect that this is also the case for a lot of other distributions.)

Packages with lots of updates sound like they might be getting more interest, but a lot of upstreams do 2-week sprint releases, which means there are lots of regular updates.

All in all, what you get by looking at a mirror's data is a 'reverse popularity contest'. Packages like the kernel, glibc, firefox, and every dependency which gets an update sit on top. Packages at the bottom may be the ones being asked for, but they may also be dependencies which aren't pulled in a lot or don't see an update.

In the end I think popcon might be better, BUT it is also hard to set up in these days of trolls and GDPR. [Heck, smolt had almost more trolls in it than regular data by the end of it... so many people set up PDP-11 and VAX as their hardware running Fedora.]

> and probably more. Popcon and smolt are better because it's actual
> individual system data. On the other hand, they're worse as mentioned
> because opt-in doesn't give a realistic picture.

--
Stephen J Smoogen.
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Re: Popularity contest for Fedora
On Sun, Dec 27, 2020 at 07:44:57PM +0100, clime wrote:
> I think we can simply parse server-side access logs to count package
> downloads, no?

We can for our primary server, but most people get updates from mirrors which we don't run directly. The central mirrorlist (from which I get the dnf count data) just redirects people to those mirrors. Even if we could get package download counts from the mirrors, they're heavily skewed by:

* public mirrors pulling the whole thing
* people pulling the whole thing for a private mirror
* ci and build systems (like, running mock)
* mysterious bots downloading stuff for whatever reason
* proxies and caching

and probably more. Popcon and smolt are better because it's actual individual system data. On the other hand, they're worse as mentioned because opt-in doesn't give a realistic picture.

--
Matthew Miller
Fedora Project Leader
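As a rough illustration of why the first few sources of skew listed above are hard to remove, here is a minimal sketch of one possible heuristic: drop any client that fetched an implausibly large number of distinct packages (a likely mirror or CI system) before counting. Everything here is an assumption made for illustration: the input shape (pairs of client address and package name), the function name `count_excluding_bulk_clients`, and the `max_distinct` threshold are all hypothetical, and nothing like this is known to be in use by Fedora. It does nothing about proxies and caching, and it still cannot tell an install from an update or a dependency.

```python
from collections import Counter, defaultdict


def count_excluding_bulk_clients(events, max_distinct=500):
    """Count packages per client, ignoring clients that look like bulk fetchers.

    events: iterable of (client, package_name) pairs (hypothetical input shape).
    max_distinct: arbitrary cut-off; a client that fetched more distinct
                  packages than this is treated as a mirror/CI system.
    Returns (counts, bulk_clients).
    """
    # Collect the set of distinct packages each client downloaded.
    per_client = defaultdict(set)
    for client, package in events:
        per_client[client].add(package)

    # Clients above the threshold are assumed to be mirrors or build systems.
    bulk = {c for c, pkgs in per_client.items() if len(pkgs) > max_distinct}

    # Each remaining client counts at most once per package.
    counts = Counter()
    for client, pkgs in per_client.items():
        if client not in bulk:
            counts.update(pkgs)
    return counts, bulk
```

Any fixed threshold will misclassify some clients: a small CI job stays under it, and many real users hidden behind one caching proxy look like a single client.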
Re: Popularity contest for Fedora
> I think we can simply parse server-side access logs to count package
> downloads, no?

That ignores the effect of caching proxies, which are prevalent in academic and corporate environments.
Re: Popularity contest for Fedora
On 27.12.2020 19:44, clime wrote:
> I think we can simply parse server-side access logs to count package
> downloads, no?

On every third-party mirror?

--
Sincerely,
Vitaly Zaitsev (vit...@easycoding.org)
Re: Popularity contest for Fedora
On Sun, 27 Dec 2020 at 17:41, Gary Buhrmaster wrote:
> On Sun, Dec 27, 2020 at 3:12 PM Matthew Miller wrote:
> > It's been talked about before but no one has done it.
>
> There was also smolt, which collected some system information (and
> could be extended to collect more). However, not only did the upstream
> die, follow-on proposals never took off, and also opened the entire
> can-of-worms regarding an opt-in data collection mechanism (and it was
> agreed by most it had to be opt-in) not being able to provide useful
> data to actually make good decisions on. It is also true that many wish
> we did have sufficiently good data in order to make good decisions.
> Rock, meet hard place.

I think we can simply parse server-side access logs to count package downloads, no? It probably won't be very precise, but it could be enough to give us a basic idea...

clime
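For concreteness, a minimal sketch of what such log parsing might look like. It assumes the server writes Apache/nginx-style "combined" access logs and that package files follow the usual `name-version-release.arch.rpm` naming; the function names `rpm_package_name` and `count_rpm_downloads` are made up for this example, and the sample paths in the tests are invented. It counts successful GET requests per package name and nothing more, so the result is raw download counts, not installs.

```python
import re
from collections import Counter

# Assumption: "combined"-style log lines; only the request line and the
# status code are used, e.g. ... "GET /path/foo-1.0-1.fc33.noarch.rpm HTTP/1.1" 200 ...
REQUEST = re.compile(r'"GET (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3}) ')


def rpm_package_name(filename):
    """Return the package name from a name-version-release.arch.rpm file name."""
    stem = filename[: -len(".rpm")]
    # Version and release cannot contain '-', so the name is everything
    # before the last two hyphen-separated fields.
    return stem.rsplit("-", 2)[0]


def count_rpm_downloads(lines):
    """Count successful (HTTP 200) .rpm downloads per package name."""
    counts = Counter()
    for line in lines:
        match = REQUEST.search(line)
        if match is None or match.group("status") != "200":
            continue  # skip non-GET, malformed, partial, and failed requests
        path = match.group("path").split("?", 1)[0]  # drop any query string
        filename = path.rsplit("/", 1)[-1]
        if filename.endswith(".rpm"):
            counts[rpm_package_name(filename)] += 1
    return counts
```

Usage would be along the lines of `count_rpm_downloads(open("access.log"))`, where `access.log` is a placeholder path; `counts.most_common(20)` then lists the most-downloaded package names.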
Re: Popularity contest for Fedora
On Sun, Dec 27, 2020 at 3:12 PM Matthew Miller wrote:
> It's been talked about before but no one has done it.

There was also smolt, which collected some system information (and could be extended to collect more). However, not only did the upstream die, but follow-on proposals never took off, and they also opened the entire can of worms of an opt-in data collection mechanism (and most agreed it had to be opt-in) not being able to provide data useful enough to base good decisions on. It is also true that many wish we did have sufficiently good data to make good decisions. Rock, meet hard place.
Re: Popularity contest for Fedora
On Sat, Dec 26, 2020 at 05:33:39PM -0600, Ron Olson wrote:
> Has anything like this been considered for Fedora? It would actually
> be kind of nice to see installation statistics of my packages, if
> only to determine if I’m the only one using them. :)

It's been talked about before but no one has done it.

--
Matthew Miller
Fedora Project Leader
Re: Popularity contest for Fedora
On 27.12.2020 00:33, Ron Olson wrote:
> Has anything like this been considered for Fedora? It would actually
> be kind of nice to see installation statistics of my packages, if
> only to determine if I’m the only one using them. :)

Telemetry and user tracking are evil.

--
Sincerely,
Vitaly Zaitsev (vit...@easycoding.org)