Re: [gentoo-dev] [gentoostats continued] Collected data and justification for it
On Thu, 07 May 2020 09:29:36 +0200 Michał Górny wrote:
> For example, if OCaml bindings on some package are broken and require
> a lot of work, I would find useful to know how likely it is that anyone
> is using it. Or if a lot of people are enabling 'frobnicate' flag,
> I could consider employing USE defaults.

For normal reporting, I'd suggest that "counts of users" have some default presentation that encourages people to think of the data as incomplete.

For example, instead of "0", it might print "<10", or say "10: +/- 10".

Or rank results in terms of relative numbers: "low", "high", etc.

Or maybe incorporate time bounds with the information: "0 this month".

Because even the people participating may not be participating frequently enough for all the niche things to turn up in every sample.

Just working out a good way to calculate what the "error bars" should be is the hard part.
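As a rough illustration of the presentation ideas above, here is a minimal Python sketch. The function names, the floor of 10, the share thresholds, and the square-root margin (which assumes the count behaves roughly like a Poisson variable) are all assumptions made for this example, not anything gentoostats implements; a real error model would need more thought, as noted above.

```python
import math


def present_count(count: int, floor: int = 10) -> str:
    """Render a raw user count so it reads as an estimate, not a census."""
    if count < 0:
        raise ValueError("count must be non-negative")
    if count < floor:
        # Small counts (including 0) are only reported as an upper bound.
        return f"<{floor}"
    # Assumption: treat the count as Poisson-distributed, so one standard
    # deviation is about sqrt(count). Illustrative only.
    margin = math.ceil(math.sqrt(count))
    return f"{count}: +/- {margin}"


def rank(count: int, total: int) -> str:
    """Map a count to a coarse relative label based on its share of reports."""
    if total <= 0:
        return "unknown"
    share = count / total
    if share < 0.01:  # hypothetical threshold
        return "low"
    if share < 0.10:  # hypothetical threshold
        return "medium"
    return "high"


if __name__ == "__main__":
    for n in (0, 9, 10, 100):
        print(n, "->", present_count(n), "/", rank(n, 1000))
```

The coarse labels deliberately throw information away; that is the point of the suggestion, since a bare "0" invites the reader to conclude nobody uses the thing.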
Re: [gentoo-dev] [gentoostats continued] Collected data and justification for it
On Sat, 09 May 2020 15:22:52 +0200 Gerion Entrup wrote:
> I'm not sure, if Portage is capable of this, but a distinction in USE
> flags needed to fulfil some dependency of another package and USE flags
> actively activated by the user could be useful.

Presently impossible. Portage implements the former by pushing that information into the user's configuration, either via "plz sir, set this use flag in your config" or by auto-tweaking the config to assert "I wanted this, you now want it". After that happens, the information as to /who/ specified that want is lost.

At very best you can make some inferences based on the comments that get injected, but that's nowhere near 100% reliable, especially in a turn-key approach. Alternatively, you could assert that if a flag is specified in the configuration *and* something depends on that flag being set, then it's the dependent that wants it, not the user, but that really isn't true on a regular basis.

For instance:

  USE="X" (global) -> install Foo (w/ USE="X")
  Foo depends on Bar[X?]

So is "Bar:X" required by the user, or by "Foo", or both? And does the answer to that question depend in any way on whether Bar (or Foo) declares IUSE="+X" or IUSE="X"?
Re: [gentoo-dev] [gentoostats continued] Collected data and justification for it
On Fri, 8 May 2020 09:49:18 +0200 Jaco Kroon wrote:
> So we do need the full list of packages installed, filtered to ::gentoo,
> but there needs to be an indicated whether it's installed because it's
> in @world, as a dep of something in @world (which is possibly not in
> ::gentoo), or is some form of no-longer needed dep.

A dedicated report of orphans that are installed, from ::gentoo, would probably help here.

Because you can't directly assume orphans are "user wanted", but they *can* be.

That's why it's important during --depclean to read the output and re-add to @world any packages you need kept. If you never depclean, then you get to skip that step.

( And I have *many* times added -1 to the installation of something I wanted, out of habit, because I do so much manual hacking via emerge -1 that adding it is an impulse! )
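The three-way distinction discussed above (in @world, dependency of something in @world, orphan) can be sketched as a reachability walk over the dependency graph. This is a minimal Python illustration, not how Portage computes it: the function name and the plain dict/set inputs are assumptions for the example, and it ignores USE-conditional dependencies, slots, sets, and the ::gentoo filter.

```python
from collections import deque


def classify_installed(installed, world, deps):
    """Label each installed package as 'world', 'dependency' or 'orphan'.

    installed: set of installed package names
    world:     set of package names listed in @world
    deps:      dict mapping a package name to its direct dependencies
    """
    labels = {}
    queue = deque()

    # Everything the user selected explicitly is a root of the walk.
    for pkg in world:
        if pkg in installed:
            labels[pkg] = "world"
            queue.append(pkg)

    # Breadth-first walk: anything reachable from a root is a dependency.
    while queue:
        pkg = queue.popleft()
        for dep in deps.get(pkg, ()):
            if dep in installed and dep not in labels:
                labels[dep] = "dependency"
                queue.append(dep)

    # Whatever was never reached is what --depclean would offer to remove.
    for pkg in installed:
        labels.setdefault(pkg, "orphan")
    return labels


if __name__ == "__main__":
    # Made-up dependency edges, for illustration only.
    installed = {"app-editors/vim", "dev-lang/python", "app-misc/oldtool"}
    world = {"app-editors/vim"}
    deps = {"app-editors/vim": ["dev-lang/python"]}
    for name, label in sorted(classify_installed(installed, world, deps).items()):
        print(f"{label:10} {name}")
```

Note that the "orphan" label says nothing about intent: as pointed out above, a package installed with -1 out of habit ends up in that bucket even though the user wants it.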
Re: [gentoo-dev] [gentoostats continued] Collected data and justification for it
> I think we shouldn't collect any data unless we have a good plan on how
> we'd be able to use it. In this thread, I'd like to collect ideas
> on what data to collect and how it could realistically be used.

5) CFLAGS and possibly related variables

6) "active" version of slotted system packages (gcc, binutils, python, ???)

I see this as interesting for toolchain maintenance, but also interesting in general since we are a source-based distro.

* How many users are running LTO? Doing profiling? Building generic (-march=x86-64) packages? Using -Os or -O3, -funroll-loops (just kidding)?
* How quick is gcc / binutils / ... adoption?
* clang usage?

--
Andreas K. Hüttel
dilfri...@gentoo.org
Gentoo Linux developer (council, qa, toolchain, base-system, perl, libreoffice)
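To make the CFLAGS questions above concrete, here is a small Python sketch of how a collected CFLAGS string could be reduced to a few countable features before aggregation. The function name and the choice of features are assumptions for illustration; only the flag spellings (-O*, -march=, -flto, -fprofile-generate, -fprofile-use) are real GCC options.

```python
import shlex


def summarize_cflags(cflags: str) -> dict:
    """Reduce a CFLAGS string to a few features worth counting."""
    summary = {"opt_level": None, "march": None, "lto": False, "profiling": False}
    for tok in shlex.split(cflags):
        if tok.startswith("-O"):
            # A bare -O means -O1; when several -O flags are given,
            # the last one wins, as it does for the compiler.
            summary["opt_level"] = tok[2:] or "1"
        elif tok.startswith("-march="):
            summary["march"] = tok[len("-march="):]
        elif tok.startswith("-flto"):
            # Matches -flto, -flto=auto, -flto=4 and so on, but not -fno-lto.
            summary["lto"] = True
        elif tok.startswith(("-fprofile-generate", "-fprofile-use")):
            summary["profiling"] = True
    return summary


if __name__ == "__main__":
    print(summarize_cflags("-O2 -march=native -pipe -flto=auto"))
    print(summarize_cflags("-Os -march=x86-64"))
```

Reducing the string on the client side would also avoid shipping the full CFLAGS value, which may matter for the privacy side of the discussion.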
Re: [gentoo-dev] [gentoostats continued] Collected data and justification for it
On Thursday, 7 May 2020, 09:29:36 CEST, Michał Górny wrote:
> I'm going to start with the data and uses I can think of. Please reply
> with other things you can think of.
>
>
> 1) list of selected packages (@world)
>
> We would use this to determine the popularity of individual packages,
> plus by scanning their dependencies we would be able to make combined
> statistics for direct usage + dependencies of other selected packages.
> This would allow us to judge which packages need more of our attention.
>
> For example, as we port Python packages to Python 3.8 the packages with
> more declared users would be ported first.

You may want to collect packages installed per set, too. I mainly do what Hans mentioned in his mail to this thread, but with sets. For example, I have a KDE-PIM set that installs only the subset of the KDE PIM suite that I need. (I also use this as a common workaround for not-yet-declared runtime dependencies and for suggestions made by pkg_postinst.)

Retrieval of these packages would be straightforward: look at world_sets and collect all packages that are installed by each set.

> 2) USE flags on installed packages (disabled/default/enabled)
>
> This would allow us to determine which flags users are most likely to
> actually rely on. This could determine tested flag combinations,
> defaults, and required level of support for individual flags.
>
> For example, if OCaml bindings on some package are broken and require
> a lot of work, I would find useful to know how likely it is that anyone
> is using it. Or if a lot of people are enabling 'frobnicate' flag,
> I could consider employing USE defaults.

I'm not sure whether Portage is capable of this, but a distinction between USE flags needed to fulfil some dependency of another package and USE flags actively enabled by the user could be useful.

Dependency USE flags should be treated with a higher priority in my opinion, since they enable the installation of another package (tree), while USE flags that enable a certain feature that is not used elsewhere are more "nice to have".

Best,
Gerion
Re: [gentoo-dev] [gentoostats continued] Collected data and justification for it
Hi,

On 2020/05/08 08:17, Hans de Graaff wrote:
> On Thu, 2020-05-07 at 09:29 +0200, Michał Górny wrote:
>>
>> 1) list of selected packages (@world)
>>
>> We would use this to determine the popularity of individual packages,
>> plus by scanning their dependencies we would be able to make combined
>> statistics for direct usage + dependencies of other selected
>> packages.
>> This would allow us to judge which packages need more of our
>> attention.
> At work we install a lot of dependencies through a few company-specific
> virtual packages, e.g. company/developer for all stuff useful for our
> developers. These packages would then be missed in the statistics. I'm
> not sure how prevalent this is and to what extend it wills skew the
> statistics.

You raise a valid point. I don't think the company/developer package itself is relevant. The fact that some/package::gentoo is installed as a dependency of company/developer may carry some relevance.

So we do need the full list of installed packages, filtered to ::gentoo, but there needs to be an indication of whether each one is installed because it's in @world, as a dep of something in @world (which is possibly not in ::gentoo), or is some form of no-longer-needed dep.

Otherwise I agree with Michał on the four items to be collected.

I do still think that the ability to define additional information sets would be useful for building more invasive functionality, not necessarily supported by Gentoo. For example, an organization could define a set that grabs a certain amount of hardware details, which could help with inventory management.

Kind Regards,
Jaco
Re: [gentoo-dev] [gentoostats continued] Collected data and justification for it
On Thu, 2020-05-07 at 09:29 +0200, Michał Górny wrote:
>
> 1) list of selected packages (@world)
>
> We would use this to determine the popularity of individual packages,
> plus by scanning their dependencies we would be able to make combined
> statistics for direct usage + dependencies of other selected
> packages.
> This would allow us to judge which packages need more of our
> attention.

At work we install a lot of dependencies through a few company-specific virtual packages, e.g. company/developer for all stuff useful for our developers. These packages would then be missed in the statistics. I'm not sure how prevalent this is and to what extent it will skew the statistics.

Hans
[gentoo-dev] [gentoostats continued] Collected data and justification for it
Hi,

The previous thread covered a few topics; in this one I'd like to focus on the data collected.

So far people have indicated a few different kinds of data they'd find useful. However, I don't think enough attention has been put on explaining why they need the data and how they'd use it.

I think we shouldn't collect any data unless we have a good plan on how we'd be able to use it. In this thread, I'd like to collect ideas on what data to collect and how it could realistically be used.

I'm going to start with the data and uses I can think of. Please reply with other things you can think of.


1) list of selected packages (@world)

We would use this to determine the popularity of individual packages, plus by scanning their dependencies we would be able to make combined statistics for direct usage + dependencies of other selected packages. This would allow us to judge which packages need more of our attention.

For example, as we port Python packages to Python 3.8, the packages with more declared users would be ported first.


2) USE flags on installed packages (disabled/default/enabled)

This would allow us to determine which flags users are most likely to actually rely on. This could determine tested flag combinations, defaults, and required level of support for individual flags.

For example, if OCaml bindings on some package are broken and require a lot of work, I would find it useful to know how likely it is that anyone is using them. Or if a lot of people are enabling the 'frobnicate' flag, I could consider employing USE defaults.


3) System profile

This would primarily allow us to establish how the transition to new profiles proceeds and could influence the decision on prolonging the support for old ones. As a side effect, we'd have stats on how popular different architectures are.

For example, it would help us see whether people are moving from amd64 17.0 to 17.1.


4) Arch - installed package correlation

This one could be considered a bit invasive, but it would help us determine how important it is to keep particular arch keywords on a package.

For example, package A breaks on SPARC. Fixing it would require significant effort. If we know it has users on SPARC, we're more likely to put in that effort; otherwise, we may just drop SPARC keywords and move on.


That's all the really useful stuff I can think of right now. What's your angle?

--
Best regards,
Michał Górny
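The four items above could travel together in one small report per machine. The following Python sketch shows one possible shape; the class name, field names, and JSON encoding are assumptions made for illustration and do not describe any actual gentoostats wire format. Item 4 needs no field of its own here, since the arch and the package list arrive in the same report.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List

# The three states named in item 2.
VALID_STATES = {"disabled", "default", "enabled"}


@dataclass
class StatsReport:
    profile: str                                   # item 3, e.g. a profile path
    arch: str                                      # item 4, correlated with packages
    world: List[str] = field(default_factory=list)  # item 1
    # item 2: package name -> {flag name -> state}
    use_flags: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def validate(self) -> None:
        """Reject flag states outside disabled/default/enabled."""
        for package, flags in self.use_flags.items():
            for flag, state in flags.items():
                if state not in VALID_STATES:
                    raise ValueError(
                        f"{package}: bad state {state!r} for flag {flag!r}"
                    )

    def to_json(self) -> str:
        self.validate()
        return json.dumps(asdict(self), sort_keys=True)


if __name__ == "__main__":
    # "app-misc/example" and "frobnicate" are placeholders, not real data.
    report = StatsReport(
        profile="default/linux/amd64/17.1",
        arch="amd64",
        world=["app-misc/example"],
        use_flags={"app-misc/example": {"frobnicate": "enabled"}},
    )
    print(report.to_json())
```

Keeping the report this small makes it easy to show users exactly what would be submitted before they agree to send anything.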
Re: [gentoo-dev] Gentoostats
Andreas K. Hüttel wrote:
> And it would be extremely useful how many of our maintainer-needed
> packages are actually still compiled once per year. (Or if any one
> single person even uses KDE on ppc64.)
>
> Gentoostats is a typical stillbirth of the Gentoo Google Summer of
> Soon-Obsolete Code. Would I be happy if someone were to revive and
> actually deploy it (the last point is important!)? YES!

Actually, there is something in use already which would allow you to find out which packages are compiled when. It is a community website called GenTwoo:

http://gentwoo.elisp.net/

Not all of the information is visible there, and there could be some improvements of course, but it exists.

Best regards,
Chí-Thanh Christopher Nguyễn
Re: [gentoo-dev] Gentoostats
On Sun, 24 Jan 2016 16:59:57 +0100 "Andreas K. Hüttel" wrote:
> On Sunday 24 January 2016 16:50:46 Göktürk Yüksek wrote:
> >
> > I don't want to go off-topic here too much but this is more than a
> > missing tools issue. There are privacy concerns regarding the
> > collection of such information. I recall this proposed idea from
> > Google Summer of Code:
> >
> > https://wiki.gentoo.org/wiki/Google_Summer_of_Code/2012/Ideas#Package_statistics_reporting_tool
>
> This has been debated to death. As long as noone is forced to use it,
> privacy concerns shouldnt be a problem. And it would be extremely
> useful how many of our maintainer-needed packages are actually still
> compiled once per year. (Or if any one single person even uses KDE on
> ppc64.)

You'd probably get much more reliable stats on package usage by gathering distfile download stats from the mirrors and mapping those to packages.
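The mirror-based alternative suggested above boils down to a join between per-distfile fetch counts and a distfile-to-package mapping. Below is a minimal Python sketch under stated assumptions: the function name is invented, and both inputs are plain dicts that would in practice have to be derived from mirror logs and from the Manifest files in the tree. It simply sums fetches, so a package with several distfiles is over-counted relative to one with a single tarball.

```python
from collections import Counter


def downloads_per_package(fetch_counts, distfile_to_package):
    """Aggregate per-distfile fetch counts into per-package totals.

    fetch_counts:        dict mapping distfile name -> number of fetches
    distfile_to_package: dict mapping distfile name -> package name
    Returns (totals, unmapped): fetches per package, and fetches of
    distfiles that no known package claims.
    """
    totals = Counter()
    unmapped = Counter()
    for distfile, count in fetch_counts.items():
        package = distfile_to_package.get(distfile)
        if package is None:
            unmapped[distfile] += count
        else:
            totals[package] += count
    return totals, unmapped


if __name__ == "__main__":
    # All names and numbers below are made up for illustration.
    fetches = {"foo-1.0.tar.gz": 120, "foo-1.1.tar.gz": 30, "stray.zip": 2}
    mapping = {
        "foo-1.0.tar.gz": "app-misc/foo",
        "foo-1.1.tar.gz": "app-misc/foo",
    }
    totals, unmapped = downloads_per_package(fetches, mapping)
    print(totals.most_common())
    print(dict(unmapped))
```

This measures fetches rather than installed systems, so it answers a slightly different question from the client-side reports discussed in the rest of the thread.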
Re: [gentoo-dev] Gentoostats
On Sun, Jan 24, 2016 at 4:59 PM, Andreas K. Hüttel wrote:
> Gentoostats is a typical stillbirth of the Gentoo Google Summer of Soon-
> Obsolete Code. Would I be happy if someone were to revive and actually deploy
> it (the last point is important!)? YES!

When I last looked into it, I couldn't actually access Gentoo infra to deploy it on. If that were possible, I wouldn't mind taking a look at what can be done here.

Cheers,

Dirkjan
[gentoo-dev] Gentoostats
On Sunday 24 January 2016 16:50:46 Göktürk Yüksek wrote:
>
> I don't want to go off-topic here too much but this is more than a
> missing tools issue. There are privacy concerns regarding the
> collection of such information. I recall this proposed idea from
> Google Summer of Code:
>
> https://wiki.gentoo.org/wiki/Google_Summer_of_Code/2012/Ideas#Package_statistics_reporting_tool

This has been debated to death. As long as no one is forced to use it, privacy concerns shouldn't be a problem. And it would be extremely useful to know how many of our maintainer-needed packages are actually still compiled once per year. (Or if any one single person even uses KDE on ppc64.)

Gentoostats is a typical stillbirth of the Gentoo Google Summer of Soon-Obsolete Code. Would I be happy if someone were to revive and actually deploy it (the last point is important!)? YES!

--
Andreas K. Hüttel
Gentoo Linux developer (council, perl, libreoffice)
dilfri...@gentoo.org
http://www.akhuettel.de/
Re: [gentoo-dev] Gentoostats, SoC 2011
On Mon, Aug 29, 2011 at 9:53 PM, Jorge Manuel B. S. Vicetto wrote:
> The point I was addressing is the suggestion that the above should be
> possible and the idea that any single developer is "entitled" to do so.

It's a moot point, because no one (that I see) claimed or is claiming to be entitled to that.

In fact, Alec said

> We did post to -dev, hence this thread. The point is that we don't
> need any 'official opinion' to do anything; and I don't want to set
> that precedent. If you have specific concerns about actions we plan
> to take (which by the way, we are not planning an opt-out solution.
> If we plan to do an opt-out solution, we will again have a thread on
> -dev) then let us know

He's not saying that no official opinion would be needed if they were doing an opt-out. He's saying that they don't need an official opinion *since* they aren't doing some sort of opt-out system.

Not your fault, but this whole thread regarding the merits/legality/privacy of opt-out is completely irrelevant to the original topic.

Matt
Re: [gentoo-dev] Gentoostats, SoC 2011
On 29-08-2011 21:23, Donnie Berkholz wrote:
> On 03:18 Fri 26 Aug , Jorge Manuel B. S. Vicetto wrote:
>> I've picked this message as I want to address one point in this
>> thread that was focused on this sub-thread. I disagree with the
>> idea that adding an application to the Gentoo tree that collects
>> data from users and sends it to a central (or distributed) system
>> is the same as adding any other application to the tree. Having
>> the ability to add ebuilds to the tree is part of what you gain by
>> getting gentoo-x86 access. Issues with significant users privacy
>> concerns and substantial changes like adding packages to the tree
>> that collect data from users and compile it,
>
> Like, oh, any package with a built-in bug reporting system?

How many of those are part of the system set or get installed automatically on one's system without any intervention? Furthermore, how many of them are or will be programmed to send data automatically, without prior action of the user and possibly without trace?

The point I was addressing is the suggestion that the above should be possible and the idea that any single developer is "entitled" to do so.

--
Regards,

Jorge Vicetto (jmbsvicetto) - jmbsvicetto at gentoo dot org
Gentoo-forums / Userrel / Devrel / KDE / Elections / RelEng
Re: [gentoo-dev] Gentoostats, SoC 2011
On 03:18 Fri 26 Aug , Jorge Manuel B. S. Vicetto wrote:
> I've picked this message as I want to address one point in this thread
> that was focused on this sub-thread. I disagree with the idea that
> adding an application to the Gentoo tree that collects data from users
> and sends it to a central (or distributed) system is the same as
> adding any other application to the tree. Having the ability to add
> ebuilds to the tree is part of what you gain by getting gentoo-x86
> access. Issues with significant users privacy concerns and substantial
> changes like adding packages to the tree that collect data from users
> and compile it,

Like, oh, any package with a built-in bug reporting system?

--
Thanks,
Donnie

Donnie Berkholz
Council Member / Sr. Developer
Gentoo Linux
Blog: http://dberkholz.com
Re: [gentoo-dev] Gentoostats, SoC 2011
Jorge Manuel B. S. Vicetto wrote:
> I've picked this message as I want to address one point in this thread
> that was focused on this sub-thread. I disagree with the idea that
> adding an application to the Gentoo tree that collects data from users
> and sends it to a central (or distributed) system is the same as
> adding any other application to the tree. Having the ability to add
> ebuilds to the tree is part of what you gain by getting gentoo-x86
> access. Issues with significant users privacy concerns and substantial
> changes like adding packages to the tree that collect data from users
> and compile it, should not be at the discretion of individual
> developers but be subject of global policies that should take into
> account the legal ramifications (trustees) and reflect the developers
> desire and goals (council).
>
> --
> Regards,
>
> Jorge Vicetto (jmbsvicetto) - jmbsvicetto at gentoo dot org
> Gentoo-forums / Userrel / Devrel / KDE / Elections / RelEng

Just picking a message to reply to at random here. Sorry Jorge, I thought common sense would kick in way before now.

As a user, if ANY distro starts collecting data about me without my consent, I would be looking for something else to use. For people to even think that users want someone snooping on them is rather presumptuous.

I have to agree about the legal problems as well. Doing this without the user's consent is going to lead to a huge legal mess. It would also taint Gentoo and Linux in general if this were to happen. Anyone who thinks it won't needs to talk to a lawyer and some common folks really soon. As a user, if this was done without my consent, saying I would be pissed would be too mild a term, but it is one I am willing to use on a public forum.

As an example, I have DirecTv. It has no connection other than the satellite cable. No telephone or anything. I don't want them snooping on what I watch on TV either. I also don't care to have Gentoo collecting data on what I use, or any other data either. If I wanted that, I could just use M$ stuff. I would expect such things from them and the huge EULA they have.

Back to my hole.

Dale

:-) :-)
Re: [gentoo-dev] Gentoostats, SoC 2011
On 25-08-2011 14:35, Alec Warner wrote:
> On Thu, Aug 25, 2011 at 5:20 AM, Rich Freeman
> wrote:
>> The big issue with opt-out is privacy law - especially in Europe
>> (that's leaving aside just being up-front with users). We'd end
>> up having to have EULAs or such and perhaps a number of other
>> legal controls, and I don't think that is a direction that we want
>> to go in. I'm just not seeing the upside - better to just figure
>> out good ways to use data that is easy and safe to obtain first.
>>
>> Earlier somebody suggested that this decision wasn't really in the
>> domain of the Council/Trustees. I'm not sure I agree here - any
>> kind of opt-out data collection is something that has potential
>> legal ramifications as well as huge reputation concerns for the
>> distro (the software is distributed from Foundation-owned hardware
>> utilizing a Foundation-owned domain name and the data goes back to
>> Foundation-owned hardware - I'm sure any lawyer could make a case
>> for this). Just because there isn't a policy written down
>> somewhere doesn't mean that we can't use common sense. Devs
>> certainly don't need to run everything past the Council, but if you
>> want to do something high-profile post it on -dev, and if there is
>> an uproar look for an official second opinion before doing it.
>
> We did post to -dev, hence this thread. The point is that we don't
> need any 'official opinion' to do anything; and I don't want to set
> that precedent. If you have specific concerns about actions we plan
> to take (which by the way, we are not planning an opt-out solution.
> If we plan to do an opt-out solution, we will again have a thread on
> -dev) then let us know. If you have specific legal concerns about
> the application, data retention, encryption, logs, backups, onerous
> european privacy laws, and other such questions you should raise
> those concerns now.

I've picked this message as I want to address one point in this thread that was focused on this sub-thread.

I disagree with the idea that adding an application to the Gentoo tree that collects data from users and sends it to a central (or distributed) system is the same as adding any other application to the tree. Having the ability to add ebuilds to the tree is part of what you gain by getting gentoo-x86 access. Issues with significant user privacy concerns, and substantial changes like adding packages to the tree that collect data from users and compile it, should not be at the discretion of individual developers. They should be subject to global policies that take into account the legal ramifications (trustees) and reflect the developers' desires and goals (council).

--
Regards,

Jorge Vicetto (jmbsvicetto) - jmbsvicetto at gentoo dot org
Gentoo-forums / Userrel / Devrel / KDE / Elections / RelEng
Re: [gentoo-dev] Gentoostats, SoC 2011
On Thu, Aug 25, 2011 at 10:35 AM, Alec Warner wrote:
> We did post to -dev, hence this thread.

My post was intended to be general in applicability, and not critical of the particular instance of this issue being discussed.

I would generally suggest that implementing this as a package and not as a function built into portage would tend to make more sense to me (do we really want portage to do EVERYTHING?). However, I don't think that anybody needs anybody's blessing in particular to take one course or the other there.

And, in the Gentoo tradition of everybody-does-whatever-they-want-to, there is nothing wrong with one set of devs doing it one way and another set doing it another way, so that we end up with two data repositories with somewhat redundant data and can start another discussion on -dev about what the differences in the datasets mean. That is, until eventually devs get bored and, after enough bugs pile up, one or both of the collection mechanisms gets treecleaned. Then in five years somebody can build a new one. :)

If I had strong concerns with anything that seemed likely to get adopted I'd voice them.

Rich
Re: [gentoo-dev] Gentoostats, SoC 2011
On Thu, Aug 25, 2011 at 5:20 AM, Rich Freeman wrote:
> On Thu, Aug 25, 2011 at 6:48 AM, Roy Bamford wrote:
>> It has to be opt-in as opt out would be a dangerous precendent to set.
>>
>> I don't see any harm is a gentle reminder message from emerge, provided
>> that the reminder can be turned off too, if the user really does not
>> want to opt in. Thats no worse than being nagged about unread news.
>
> I tend to agree, the more I think about it.
>
> The simplest solution (which doesn't require any portage mods/etc), is
> to simply make this a package that installs the appropriate logic in
> cron.daily, and we send out a news item encouraging users to install
> it voluntarily. If the user does nothing, they don't get the package.
>
> If somebody can come up with really good reason that we should be more
> aggressive in promoting it, then we can promote it more aggressively.
> That /might/ go as far as a forced opt-in/out decision. However, the
> more I think about it the more I'm concerned with pure opt-out by
> default.

Why is the thread bikeshedding an opt-out that we aren't even considering doing right now?

> The big issue with opt-out is privacy law - especially in Europe
> (that's leaving aside just being up-front with users). We'd end up
> having to have EULAs or such and perhaps a number of other legal
> controls, and I don't think that is a direction that we want to go in.
> I'm just not seeing the upside - better to just figure out good ways
> to use data that is easy and safe to obtain first.
>
> Earlier somebody suggested that this decision wasn't really in the
> domain of the Council/Trustees. I'm not sure I agree here - any kind
> of opt-out data collection is something that has potential legal
> ramifications as well as huge reputation concerns for the distro (the
> software is distributed from Foundation-owned hardware utilizing a
> Foundation-owned domain name and the data goes back to
> Foundation-owned hardware - I'm sure any lawyer could make a case for
> this). Just because there isn't a policy written down somewhere
> doesn't mean that we can't use common sense. Devs certainly don't
> need to run everything past the Council, but if you want to do
> something high-profile post it on -dev, and if there is an uproar look
> for an official second opinion before doing it.

We did post to -dev, hence this thread. The point is that we don't need any 'official opinion' to do anything, and I don't want to set that precedent. If you have specific concerns about actions we plan to take, then let us know. (By the way, we are not planning an opt-out solution. If we do plan one, we will again have a thread on -dev.) If you have specific legal concerns about the application, data retention, encryption, logs, backups, onerous European privacy laws, and other such questions, you should raise those concerns now.
Re: [gentoo-dev] Gentoostats, SoC 2011
On Thu, Aug 25, 2011 at 6:48 AM, Roy Bamford wrote:
> It has to be opt-in as opt out would be a dangerous precendent to set.
>
> I don't see any harm is a gentle reminder message from emerge, provided
> that the reminder can be turned off too, if the user really does not
> want to opt in. Thats no worse than being nagged about unread news.

I tend to agree, the more I think about it.

The simplest solution (which doesn't require any portage mods/etc) is to simply make this a package that installs the appropriate logic in cron.daily, and we send out a news item encouraging users to install it voluntarily. If the user does nothing, they don't get the package.

If somebody can come up with a really good reason that we should be more aggressive in promoting it, then we can promote it more aggressively. That /might/ go as far as a forced opt-in/out decision. However, the more I think about it the more I'm concerned with pure opt-out by default.

The big issue with opt-out is privacy law - especially in Europe (that's leaving aside just being up-front with users). We'd end up having to have EULAs or such and perhaps a number of other legal controls, and I don't think that is a direction that we want to go in. I'm just not seeing the upside - better to just figure out good ways to use data that is easy and safe to obtain first.

Earlier somebody suggested that this decision wasn't really in the domain of the Council/Trustees. I'm not sure I agree here - any kind of opt-out data collection is something that has potential legal ramifications as well as huge reputation concerns for the distro (the software is distributed from Foundation-owned hardware utilizing a Foundation-owned domain name and the data goes back to Foundation-owned hardware - I'm sure any lawyer could make a case for this). Just because there isn't a policy written down somewhere doesn't mean that we can't use common sense. Devs certainly don't need to run everything past the Council, but if you want to do something high-profile, post it on -dev, and if there is an uproar, look for an official second opinion before doing it.

Rich
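For the opt-in cron.daily approach described above, the key property is that the job does nothing unless the user has said yes explicitly. Here is a minimal Python sketch of such a gate. The function name, the `[stats]` section, and the `enabled` key are hypothetical, chosen for this example only; no existing gentoostats configuration format is implied.

```python
import configparser


def is_opted_in(config_text: str) -> bool:
    """Return True only if the config explicitly enables reporting.

    Anything else (empty file, missing section or key, unparsable or
    non-boolean value) is treated as "no".
    """
    parser = configparser.ConfigParser()
    try:
        parser.read_string(config_text)
        # fallback=False covers a missing section or a missing key.
        return parser.getboolean("stats", "enabled", fallback=False)
    except (configparser.Error, ValueError):
        # Parse errors and values such as "maybe" both mean: do not send.
        return False


if __name__ == "__main__":
    for text in ("", "[stats]\nenabled = yes\n", "[stats]\nenabled = maybe\n"):
        print(repr(text), "->", is_opted_in(text))
```

Every failure path falls back to "do not send", which is the opposite of the opt-out fallback that drew objections in this thread.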
Re: [gentoo-dev] Gentoostats, SoC 2011
On 2011.08.24 11:48, Patrick Lauer wrote:
[snip]
> If you sneakily add something to cron.daily by default you can get
> pretty nice coverage. But I guess anyone trying that in Gentooland
> will meet some rather unpleasant resistance :)

This app, and whether it is opt-in or opt-out, will set a precedent for any future apps that want automatic user feedback in Gentoo.

It has to be opt-in as opt out would be a dangerous precedent to set.

I don't see any harm in a gentle reminder message from emerge, provided that the reminder can be turned off too, if the user really does not want to opt in. That's no worse than being nagged about unread news.

--
Regards,
Roy Bamford (Neddyseagoon)
a member of elections gentoo-ops forum-mods trustees
Re: [gentoo-dev] Gentoostats, SoC 2011
On Wed, 24 Aug 2011 13:03:44 +0200 "Andreas K. Huettel" wrote:
> On Wednesday 24 August 2011, 12:48:35, Patrick Lauer wrote:
> >
> > If you sneakily add something to cron.daily by default you can get
> > pretty nice coverage. But I guess anyone trying that in Gentooland
> > will meet some rather unpleasant resistance :)
> >
>
> Of course, we could place it in some blatantly obvious way into a
> default configuration, together with a big fat message what it does
> and how to quickly disable it.
>
> We'd get better coverage in an opt-out system than in an opt-in
> system.

And a larger number of angry users who missed the warning and now have to pay for additional GPRS transfer or so. And when people use GPRS rarely, they usually don't think about random apps that use the connection in the background.

> (First idea- package is pulled in by a default-on useflag and
> installs itself into cron.daily. BEFORE it runs the first time it
> outputs said message and asks for permission to proceed (which cannot
> be done in the cron job obviously but we'd find a way).)

And what if it can't ask for that? Assuming you're talking about 'opt-out', I guess the fallback would be 'yes'. We don't want to end up like Windows, where you go AFK for five minutes and then discover the system has rebooted.

--
Best regards,
Michał Górny
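The "what if it can't ask" concern can be made concrete with a small sketch. This is not taken from the gentoostats client; `ask_consent` is a hypothetical helper, and the prompt text is made up. It only illustrates one possible answer to the question above: when no terminal is attached (cron, unattended installs), treat the missing answer as "no" rather than "yes".

```python
import sys


def ask_consent(stdin=sys.stdin, stdout=sys.stdout):
    """Return True only when a human explicitly types 'yes'.

    Hypothetical helper for illustration; not part of gentoostats.
    """
    # No terminal attached (cron job, scripted install): nobody can
    # answer, so fall back to "no" instead of assuming consent.
    if not stdin.isatty():
        return False
    stdout.write("Submit anonymous package statistics? [yes/NO] ")
    stdout.flush()
    answer = stdin.readline().strip().lower()
    # Empty input, EOF, or anything other than a literal "yes" is a refusal.
    return answer == "yes"
```

With this shape, a user who walks away from the keyboard or runs an automated install ends up opted out, which is the behaviour the objection above is asking for.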
Re: [gentoo-dev] Gentoostats, SoC 2011
On Wed, Aug 24, 2011 at 5:05 AM, Rich Freeman wrote:
> On Wed, Aug 24, 2011 at 7:45 AM, Thomas Kahle wrote:
>> Sorry, but NO. If you want you can make a big noise message that asks
>> users to install the cron-job but opt-out is not an option here.
>
> Well, that's up to the Council/Trustees ultimately, but opinions (and
> better still reasoning) are welcome since both would no-doubt want to
> reflect the will of the community (and whatever is legal in the
> jurisdictions that matter).

It doesn't take a council vote nor a trustees vote to add a package to everyone's machine. In the end I'd recommend just looking at the opt-in numbers. Is the data useful from opt-in users? If the answer is no, then we can always think up other ways to get more users. Will auto-installs be on the list of ideas? You bet ;) But I think we are putting the cart before the horse.

>
> One option that many distros employ is a forced opt-in/out decision.
> During the install process they simply ask the user, and they have to
> hit either yes or no to continue. The reason most people don't opt-in
> is that they don't think about it, and this forces the issue.
>
> The Gentoo analogue would be to put something in make.conf or whatever
> that must be set one way or another. Maybe have an opt-in use flag
> and an opt-out use flag and if you don't set either emerge just dies
> with a notice or something. No doubt somebody could come up with a
> more elegant solution.

The stage3 tarball doesn't even come with a dhcp client, so I don't really see how installing a stats client makes sense from the standpoint of 'only what is necessary.' For many people, that is an important part of Gentoo (cf. python3...)

Making emerge die unless you make a decision will probably break a bunch of shit (plenty of people have automatic installs in some fashion). We would have to use an existing methodology to avoid breaking them (PROPERTIES=interactive?)

>
> Maybe another line of discussion that could inform the debate is what
> the value of this information is? For a company, knowing what
> packages are popular helps them to allocate resources. Gentoo is a
> volunteer effort and devs allocate their effort based on personal
> preference, though perhaps some would care about package popularity to
> an extent. So, we might not benefit to the same degree from this kind
> of information, since we can't crack the whip and force people to fix
> some broken package that is popular.

I think at present we don't know the information's value; that is part of why considering opt-out is premature ;)

>
> Rich
Re: [gentoo-dev] Gentoostats, SoC 2011
On Wed, Aug 24, 2011 at 7:45 AM, Thomas Kahle wrote:
> Sorry, but NO. If you want you can make a big noise message that asks
> users to install the cron-job but opt-out is not an option here.

Well, that's up to the Council/Trustees ultimately, but opinions (and better still reasoning) are welcome since both would no-doubt want to reflect the will of the community (and whatever is legal in the jurisdictions that matter).

One option that many distros employ is a forced opt-in/out decision. During the install process they simply ask the user, and they have to hit either yes or no to continue. The reason most people don't opt-in is that they don't think about it, and this forces the issue.

The Gentoo analogue would be to put something in make.conf or whatever that must be set one way or another. Maybe have an opt-in use flag and an opt-out use flag and if you don't set either emerge just dies with a notice or something. No doubt somebody could come up with a more elegant solution.

Maybe another line of discussion that could inform the debate is what the value of this information is? For a company, knowing what packages are popular helps them to allocate resources. Gentoo is a volunteer effort and devs allocate their effort based on personal preference, though perhaps some would care about package popularity to an extent. So, we might not benefit to the same degree from this kind of information, since we can't crack the whip and force people to fix some broken package that is popular.

Rich
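The forced opt-in/out idea amounts to a three-state setting: yes, no, or not yet decided, where only the last one stops the process. A minimal sketch of that logic follows. The key name `GENTOOSTATS_SUBMIT` and the `read_consent` function are invented for illustration; nothing here reflects how portage or make.conf actually handle such a setting.

```python
from enum import Enum


class Consent(Enum):
    OPT_IN = "yes"
    OPT_OUT = "no"


class UndecidedError(Exception):
    """Raised when the user has not chosen either way."""


def read_consent(settings):
    """Map a settings dict to an explicit decision, or refuse to continue."""
    # "GENTOOSTATS_SUBMIT" is a made-up key, used only for this sketch.
    raw = settings.get("GENTOOSTATS_SUBMIT")
    if raw is None:
        # Neither choice was made: stop with a notice instead of guessing.
        raise UndecidedError("Set GENTOOSTATS_SUBMIT to 'yes' or 'no' to continue.")
    try:
        return Consent(raw.strip().lower())
    except ValueError:
        raise UndecidedError(
            f"Unrecognised value {raw!r}; expected 'yes' or 'no'."
        ) from None
```

For example, `read_consent({"GENTOOSTATS_SUBMIT": "no"})` returns `Consent.OPT_OUT`, while an empty dict raises `UndecidedError`. The point of the design is that there is no default: the absence of a choice is an error, not an implicit answer.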
Re: [gentoo-dev] Gentoostats, SoC 2011
I am a user and I am OK with opt-out if the standard data that is transferred is completely anonymized, so that no sensitive data is sent. If the user wants to register his/her machine's packages, more data is transferred.

Thanks,
Mario

2011/8/24 Thomas Kahle :
> On 13:03 Wed 24 Aug 2011, Andreas K. Huettel wrote:
>> On Wednesday 24 August 2011, 12:48:35, Patrick Lauer wrote:
>> >
>> > If you sneakily add something to cron.daily by default you can get
>> > pretty nice coverage. But I guess anyone trying that in Gentooland will
>> > meet some rather unpleasant resistance :)
>> >
>>
>> Of course, we could place it in some blatantly obvious way into a default
>> configuration, together with a big fat message what it does and how to
>> quickly disable it.
>>
>> We'd get better coverage in an opt-out system than in an opt-in system.
>>
>> (First idea- package is pulled in by a default-on useflag and installs
>> itself into cron.daily. BEFORE it runs the first time it outputs said
>> message and asks for permission to proceed (which cannot be done in the cron
>> job obviously but we'd find a way).)
>
> Sorry, but NO. If you want you can make a big noise message that asks
> users to install the cron-job but opt-out is not an option here.
>
> --
> Thomas Kahle
> http://dev.gentoo.org/~tomka/
Re: [gentoo-dev] Gentoostats, SoC 2011
On 13:03 Wed 24 Aug 2011, Andreas K. Huettel wrote:
> On Wednesday 24 August 2011, 12:48:35, Patrick Lauer wrote:
> >
> > If you sneakily add something to cron.daily by default you can get
> > pretty nice coverage. But I guess anyone trying that in Gentooland will
> > meet some rather unpleasant resistance :)
> >
>
> Of course, we could place it in some blatantly obvious way into a default
> configuration, together with a big fat message what it does and how to
> quickly disable it.
>
> We'd get better coverage in an opt-out system than in an opt-in system.
>
> (First idea- package is pulled in by a default-on useflag and installs itself
> into cron.daily. BEFORE it runs the first time it outputs said message and
> asks for permission to proceed (which cannot be done in the cron job
> obviously but we'd find a way).)

Sorry, but NO. If you want you can make a big noise message that asks users to install the cron-job but opt-out is not an option here.

--
Thomas Kahle
http://dev.gentoo.org/~tomka/
Re: [gentoo-dev] Gentoostats, SoC 2011
On 12:48 Wed 24 Aug 2011, Patrick Lauer wrote:
> On 08/24/11 12:31, Thomas Kahle wrote:
> > Hi,
> >
> > On 18:16 Tue 23 Aug 2011, Andreas K. Huettel wrote:
> >> there is one important aspect of your program that really needs to be
> >> documented (and comments in the code are not enough):
> >>
> >> What data exactly is the client sending to the server?!
> >>
> >> What you need is basically an easy-to-find file / web page / ... where
> >> this is explained concise and in simple words. As long as that does
> >> not exist, your program will not find much acceptance.
> >
> > You may look at the files README and FAQ for Ubuntu's popularity
> > contest: http://popcon.ubuntu.com/
> >
> > If we could get their turnout rates, that'd be great.
>
> If you sneakily add something to cron.daily by default you can get
> pretty nice coverage. But I guess anyone trying that in Gentooland will
> meet some rather unpleasant resistance :)

Oh yeah... when I used Ubuntu last 11/06 it would still ask you on install.

@Vikraman: I guess you see how *important* it is to be completely open and explain everything the program does. On Gentoo it should of course be opt-in, instead of opt-out.

--
Thomas Kahle
http://dev.gentoo.org/~tomka/
Re: [gentoo-dev] Gentoostats, SoC 2011
On Wed, Aug 24, 2011 at 6:48 AM, Patrick Lauer wrote:
> If you sneakily add something to cron.daily by default you can get
> pretty nice coverage. But I guess anyone trying that in Gentooland will
> meet some rather unpleasant resistance :)

Well, we could always broadcast the news widely (lists, forums, eselect news, and so on).

I'd also make it controllable via use flag. Put the client and the cron.daily file in a package, and then make that a use-dependency of something everybody has (the profile, if profiles support this (don't think they do), and if not, pick something that correlates well with people who would benefit from this feature). Users can opt-out via use flag.

You can also start out with it being opt-in (use flag off by default in profiles), and then turn it on later (with notice/etc). The key is to not be sneaky about it.

Rich
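The "off by default, on only when the user says so" variant can also be enforced at run time by the daily job itself, independent of how the package got installed. The sketch below is a run-time analogue of that idea and is not the USE-flag mechanism described above. The marker path and both function names are hypothetical and are not known to match anything the gentoostats package ships.

```python
from pathlib import Path

# Hypothetical location; chosen only for this example.
OPT_IN_MARKER = Path("/etc/gentoostats/opt-in")


def should_submit(marker=OPT_IN_MARKER):
    """Submit only when the admin has deliberately created the marker file."""
    return marker.is_file()


def run_daily(submit, marker=OPT_IN_MARKER):
    """Entry point a cron.daily script might call.

    `submit` is whatever callable performs the upload. Returns True if
    it was invoked, False if the machine has not opted in.
    """
    if not should_submit(marker):
        # Not opted in: send nothing and stay quiet.
        return False
    submit()
    return True
```

Under these assumptions, installing the package alone sends nothing; the upload only happens after someone creates the marker file, and removing the file turns it off again.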
Re: [gentoo-dev] Gentoostats, SoC 2011
On Wednesday 24 August 2011, 12:48:35, Patrick Lauer wrote:
>
> If you sneakily add something to cron.daily by default you can get
> pretty nice coverage. But I guess anyone trying that in Gentooland will
> meet some rather unpleasant resistance :)
>

Of course, we could place it in some blatantly obvious way into a default configuration, together with a big fat message what it does and how to quickly disable it.

We'd get better coverage in an opt-out system than in an opt-in system.

(First idea- package is pulled in by a default-on useflag and installs itself into cron.daily. BEFORE it runs the first time it outputs said message and asks for permission to proceed (which cannot be done in the cron job obviously but we'd find a way).)

--
Andreas K. Huettel
Gentoo Linux developer - kde, sci, arm, tex
dilfri...@gentoo.org
http://www.akhuettel.de/
Re: [gentoo-dev] Gentoostats, SoC 2011
On 08/24/11 12:31, Thomas Kahle wrote:
> Hi,
>
> On 18:16 Tue 23 Aug 2011, Andreas K. Huettel wrote:
>> there is one important aspect of your program that really needs to be
>> documented (and comments in the code are not enough):
>>
>> What data exactly is the client sending to the server?!
>>
>> What you need is basically an easy-to-find file / web page / ... where
>> this is explained concise and in simple words. As long as that does
>> not exist, your program will not find much acceptance.
>
> You may look at the files README and FAQ for Ubuntu's popularity
> contest: http://popcon.ubuntu.com/
>
> If we could get their turnout rates, that'd be great.

If you sneakily add something to cron.daily by default you can get pretty nice coverage. But I guess anyone trying that in Gentooland will meet some rather unpleasant resistance :)
Re: [gentoo-dev] Gentoostats, SoC 2011
Hi,

On 18:16 Tue 23 Aug 2011, Andreas K. Huettel wrote:
> there is one important aspect of your program that really needs to be
> documented (and comments in the code are not enough):
>
> What data exactly is the client sending to the server?!
>
> What you need is basically an easy-to-find file / web page / ... where
> this is explained concise and in simple words. As long as that does
> not exist, your program will not find much acceptance.

You may look at the files README and FAQ for Ubuntu's popularity contest: http://popcon.ubuntu.com/

If we could get their turnout rates, that'd be great.

> Apart from that, I like the entire project, and am curious about its
> results.

+1

It has come up several times that getting usage statistics would motivate developers.

Cheers,
Thomas

--
Thomas Kahle
http://dev.gentoo.org/~tomka/
Re: [gentoo-dev] Gentoostats, SoC 2011
Hi Vikram,

there is one important aspect of your program that really needs to be documented (and comments in the code are not enough):

What data exactly is the client sending to the server?!

What you need is basically an easy-to-find file / web page / ... where this is explained concisely and in simple words. As long as that does not exist, your program will not find much acceptance.

Apart from that, I like the entire project, and am curious about its results.

Best,
Andreas

On Monday 22 August 2011, 23:20:30, Vikraman wrote:
> Hi all,
>
> Gentoostats[0] is a GSoC 2011 project to collect package statistics from
> gentoo machines. Please check it out. Bug reports and feature suggestions
> are welcome.
>
> To submit your stats, use the app-portage/gentoostats ebuild from betagarden
> overlay[1].
>
> [0] https://soc.dev.gentoo.org/gentoostats/
> [1] https://soc.dev.gentoo.org/gentoostats/about
>

--
Andreas K. Huettel
Gentoo Linux developer - kde, sci, arm, tex
dilfri...@gentoo.org
http://www.akhuettel.de/
[gentoo-dev] Gentoostats, SoC 2011
Hi all,

Gentoostats[0] is a GSoC 2011 project to collect package statistics from Gentoo machines. Please check it out. Bug reports and feature suggestions are welcome.

To submit your stats, use the app-portage/gentoostats ebuild from the betagarden overlay[1].

[0] https://soc.dev.gentoo.org/gentoostats/
[1] https://soc.dev.gentoo.org/gentoostats/about

--
Vikraman