Re: [gentoo-dev] [gentoostats continued] Collected data and justification for it

2020-05-09 Thread Kent Fredric
On Thu, 07 May 2020 09:29:36 +0200
Michał Górny  wrote:

> For example, if OCaml bindings on some package are broken and require
> a lot of work, I would find useful to know how likely it is that anyone
> is using it.  Or if a lot of people are enabling 'frobnicate' flag,
> I could consider employing USE defaults.

For normal reporting, I'd suggest "counts of users" have some default
presentation that encourages people to think of the data as incomplete.

For example, instead of "0", it might print "<10", or say, "10: +/- 10"

Or rank results in terms of relative numbers, "low", "high", etc.

Or maybe incorporate time bounds with the information:

   "0 this month"

Because even the people participating may not be participating
frequently for all the niche things to turn up in every sample.

Just working out a good way to calculate what the "error bars" should
be is the hard part.
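
One way to sketch the error-bar idea: treat the reports as a random sample, put a Wilson score interval around the share of reporting hosts that have a package, and never print a bare small count. The function names and the floor of 10 are illustrative assumptions, and a Wilson interval only covers sampling noise, not the self-selection of who chooses to report.

```python
import math
from typing import Tuple


def wilson_interval(hits: int, sample: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for the share of reporting hosts with a package."""
    if sample == 0:
        return (0.0, 1.0)  # no data at all: anything is possible
    p = hits / sample
    denom = 1 + z * z / sample
    centre = (p + z * z / (2 * sample)) / denom
    half = z * math.sqrt(p * (1 - p) / sample + z * z / (4 * sample * sample)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


def present_count(hits: int, floor: int = 10) -> str:
    """Show small counts as '<floor' so that '0' is never read as 'nobody'."""
    return f"<{floor}" if hits < floor else str(hits)


low, high = wilson_interval(0, 500)
print(present_count(0), f"(share of reporters: {low:.2%} to {high:.2%})")
```

Even with zero hits in 500 reports the upper bound stays above zero, which is the point of not printing a plain "0".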




Re: [gentoo-dev] [gentoostats continued] Collected data and justification for it

2020-05-09 Thread Kent Fredric
On Sat, 09 May 2020 15:22:52 +0200
Gerion Entrup  wrote:

> I'm not sure, if Portage is capable of this, but a distinction in USE
> flags needed to fulfil some dependency of another package and USE flags
> actively activated by the user could be useful.

Presently impossible. Portage implements the former either by asking
the user ("plz sir, set this use flag in your config") or by
auto-tweaking the config to assert "I wanted this, you now want it".

After that happens, the information as to /who/ specified that want is
lost.

At very best you can make some inferences based on the comments
that get injected, but that's nowhere near 100% reliable, especially
with the turn-key approach. Alternatively, you could assert that if a
flag is specified in the config *and* something depends on that flag
being set, then it's the dependent that wants it, not the user. But
that really isn't true on a regular basis.

For instance, uh, USE="X" (global) -> install Foo (w/ USE="X")

Foo depends on Bar[X?]

So is "Bar:X" required by the user, or by "Foo", or both?

And does the answer to that question depend in any way on whether Bar (or
Foo) declares IUSE="+X" or IUSE="X" ?
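
The ambiguity can be made concrete with a toy model. This is only a sketch: the function and data structures below are hypothetical and are not Portage APIs. It simply lists every party that could be "the reason" a flag is enabled.

```python
def flag_claimants(pkg, flag, global_use, rdeps_requiring):
    """Return everyone who could be credited with enabling `flag` on `pkg`.

    global_use: flags the user set globally (hypothetical input).
    rdeps_requiring: {(pkg, flag): [parents needing pkg[flag]]} (hypothetical).
    """
    claimants = set()
    if flag in global_use:
        claimants.add("user (global USE)")
    for parent in rdeps_requiring.get((pkg, flag), ()):
        claimants.add(f"dependency of {parent}")
    return claimants


global_use = {"X"}
# Foo depends on Bar[X?]; Foo is built with USE="X", so it needs Bar[X].
rdeps_requiring = {("Bar", "X"): ["Foo"]}

print(sorted(flag_claimants("Bar", "X", global_use, rdeps_requiring)))
# ['dependency of Foo', 'user (global USE)']
```

Both claimants come back, and nothing in the recorded state says which one the user actually cared about.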






Re: [gentoo-dev] [gentoostats continued] Collected data and justification for it

2020-05-09 Thread Kent Fredric
On Fri, 8 May 2020 09:49:18 +0200
Jaco Kroon  wrote:

> So we do need the full list of packages installed, filtered to ::gentoo,
> but there needs to be an indication of whether it's installed because it's
> in @world, as a dep of something in @world (which is possibly not in
> ::gentoo), or is some form of no-longer needed dep.

A dedicated report of orphans that are installed, from ::gentoo, would
probably help here.

Because you can't directly assume orphans are "user wanted", but they
*can* be.

That's why it's important during --depclean to read the output and
re-add any to @world you need kept.

If you never depclean, then you get to skip that step.

( And I have *many* times added -1 to installation of something I
wanted, out of habit, because I do so much manual hacking via emerge -1
that adding it is an impulse! )




Re: [gentoo-dev] [gentoostats continued] Collected data and justification for it

2020-05-09 Thread Andreas K . Hüttel
> 
> I think we shouldn't collect any data unless we have a good plan on how
> we'd be able to use it.  In this thread, I'd like to collect ideas
> on what data to collect and how it could realistically be used.
> 

5) CFLAGS and possibly related variables
6) "active" version of slotted system packages (gcc, binutils, python, ???)

I see this as interesting for the toolchain maintenance, but also interesting 
in general since we are a source-based distro.

* How many users are running LTO? doing Profiling? building generic
(-march=x86-64) packages? using -Os or -O3, -funroll-loops (just kidding)
* How quick is gcc / binutils / ... adoption?
* clang usage?
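
A rough sketch of how a collected CFLAGS string could be reduced to the few facts asked about above (optimisation level, -march, LTO). The function name and the returned keys are illustrative assumptions; only well-known GCC/Clang option spellings are matched.

```python
import shlex


def summarise_cflags(cflags: str) -> dict:
    """Reduce a CFLAGS string to optimisation level, -march value and LTO use."""
    opt, march, lto = None, None, False
    for tok in shlex.split(cflags):
        if tok.startswith("-O"):
            opt = tok  # the last -O option on the command line wins
        elif tok.startswith("-march="):
            march = tok.split("=", 1)[1]
        elif tok.startswith("-flto"):
            lto = True  # covers -flto, -flto=auto, -flto=thin, ...
        elif tok == "-fno-lto":
            lto = False
    return {"opt": opt, "march": march, "lto": lto}


print(summarise_cflags("-O2 -march=native -pipe -flto=auto"))
```

Reporting only such a summary, rather than the raw string, would also keep unusual per-host flags out of the dataset.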


-- 
Andreas K. Hüttel
dilfri...@gentoo.org
Gentoo Linux developer 
(council, qa, toolchain, base-system, perl, libreoffice)



Re: [gentoo-dev] [gentoostats continued] Collected data and justification for it

2020-05-09 Thread Gerion Entrup
On Thursday, 7 May 2020, 09:29:36 CEST, Michał Górny wrote:
> I'm going to start with the data and uses I can think of.  Please reply
> with other things you can think of.
> 
> 
> 1) list of selected packages (@world)
> 
> We would use this to determine the popularity of individual packages,
> plus by scanning their dependencies we would be able to make combined
> statistics for direct usage + dependencies of other selected packages. 
> This would allow us to judge which packages need more of our attention.
> 
> For example, as we port Python packages to Python 3.8 the packages with
> more declared users would be ported first.

You may want to collect packages installed per sets, too.
I mainly do what Hans mentioned in his mail to this thread but with sets.
For example I have a KDE-PIM set that installs only my needed subset of
the KDE PIM suite. (I also use this as a common workaround for not yet
declared runtime dependencies / suggestions made by pkg_postinst.)

Retrieving these packages would be straightforward: look at world_sets
and collect all packages that are installed by the set.
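
A minimal sketch of that retrieval, assuming the usual layout: a world_sets file with one "@name" entry per line (normally /var/lib/portage/world_sets) and user-defined sets as plain files with one atom per line (normally under /etc/portage/sets). The function name is made up for illustration, and the sketch returns the atoms listed in each set; it does not resolve them to installed versions or handle built-in sets.

```python
from pathlib import Path


def packages_from_world_sets(world_sets: Path, sets_dir: Path) -> dict:
    """Map each selected user set to the atoms listed in its set file."""
    result = {}
    for line in world_sets.read_text().splitlines():
        name = line.strip()
        if not name.startswith("@"):
            continue
        set_file = sets_dir / name[1:]
        if not set_file.is_file():
            continue  # e.g. a built-in set; not handled in this sketch
        result[name] = [
            entry.strip()
            for entry in set_file.read_text().splitlines()
            if entry.strip() and not entry.lstrip().startswith("#")
        ]
    return result


if __name__ == "__main__":
    ws = Path("/var/lib/portage/world_sets")
    if ws.is_file():
        print(packages_from_world_sets(ws, Path("/etc/portage/sets")))
```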

 
> 2) USE flags on installed packages (disabled/default/enabled)
> 
> This would allow us to determine which flags users are most likely to
> actually rely on.  This could determine tested flag combinations,
> defaults, and required level of support for individual flags.
> 
> For example, if OCaml bindings on some package are broken and require
> a lot of work, I would find useful to know how likely it is that anyone
> is using it.  Or if a lot of people are enabling 'frobnicate' flag,
> I could consider employing USE defaults.

I'm not sure, if Portage is capable of this, but a distinction in USE
flags needed to fulfil some dependency of another package and USE flags
actively activated by the user could be useful.

Dependency use flags should be treated with a higher priority in my
opinion, since they enable the installation of another package (tree),
while use flags that enable a certain feature that is not used elsewhere
are more "nice to have".


Best
Gerion




Re: [gentoo-dev] [gentoostats continued] Collected data and justification for it

2020-05-08 Thread Jaco Kroon
Hi,

On 2020/05/08 08:17, Hans de Graaff wrote:
> On Thu, 2020-05-07 at 09:29 +0200, Michał Górny wrote:
>>
>> 1) list of selected packages (@world)
>>
>> We would use this to determine the popularity of individual packages,
>> plus by scanning their dependencies we would be able to make combined
>> statistics for direct usage + dependencies of other selected
>> packages. 
>> This would allow us to judge which packages need more of our
>> attention.
> At work we install a lot of dependencies through a few company-specific 
> virtual packages, e.g. company/developer for all stuff useful for our
> developers. These packages would then be missed in the statistics. I'm
> not sure how prevalent this is and to what extent it will skew the
> statistics.

You raise a valid point.

The company/developer package itself I don't think is relevant.

The fact that some/package::gentoo is installed as a dependency of
company/developer may carry some relevance.

So we do need the full list of packages installed, filtered to ::gentoo,
but there needs to be an indication of whether it's installed because it's
in @world, as a dep of something in @world (which is possibly not in
::gentoo), or is some form of no-longer needed dep.
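
The three-way split can be sketched on a toy dependency graph. Everything here is a hypothetical model (plain dicts and sets, not Portage's own data structures): walk the graph from @world, then label each installed ::gentoo package.

```python
from collections import deque


def classify_installed(installed, world, deps, repo="gentoo"):
    """Label packages from `repo` as 'world', 'dependency' or 'orphan'.

    installed: {package: repository}, world: set of selected packages,
    deps: {package: [packages it depends on]}. All inputs are illustrative.
    """
    reachable = set()
    queue = deque(p for p in world if p in installed)
    while queue:
        pkg = queue.popleft()
        if pkg in reachable:
            continue
        reachable.add(pkg)
        queue.extend(d for d in deps.get(pkg, ()) if d in installed)

    report = {}
    for pkg, origin in installed.items():
        if origin != repo:
            continue  # e.g. company/developer itself is not reported
        if pkg in world:
            report[pkg] = "world"
        elif pkg in reachable:
            report[pkg] = "dependency"
        else:
            report[pkg] = "orphan"
    return report


installed = {
    "company/developer": "company",
    "dev-lang/ruby": "gentoo",
    "app-misc/leftover": "gentoo",
}
print(classify_installed(installed, {"company/developer"},
                         {"company/developer": ["dev-lang/ruby"]}))
```

In this example dev-lang/ruby is reported as a dependency even though the package pulling it in lives outside ::gentoo, and app-misc/leftover shows up as an orphan.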

Otherwise I agree with Michał on the four items to be taken.

I do still think that the ability to define additional information sets
would be useful for building more invasive functionality sets, not
necessarily supported by Gentoo.  For an organization if they can define
a set that grabs a certain amount of hardware details for example that
could help with inventory management.

Kind Regards,
Jaco




Re: [gentoo-dev] [gentoostats continued] Collected data and justification for it

2020-05-08 Thread Hans de Graaff
On Thu, 2020-05-07 at 09:29 +0200, Michał Górny wrote:
> 
> 
> 1) list of selected packages (@world)
> 
> We would use this to determine the popularity of individual packages,
> plus by scanning their dependencies we would be able to make combined
> statistics for direct usage + dependencies of other selected
> packages. 
> This would allow us to judge which packages need more of our
> attention.

At work we install a lot of dependencies through a few company-specific 
virtual packages, e.g. company/developer for all stuff useful for our
developers. These packages would then be missed in the statistics. I'm
not sure how prevalent this is and to what extent it will skew the
statistics.

Hans




Re: [gentoo-dev] Gentoostats

2016-01-24 Thread Dirkjan Ochtman
On Sun, Jan 24, 2016 at 4:59 PM, Andreas K. Hüttel  wrote:
> Gentoostats is a typical stillbirth of the Gentoo Google Summer of
> Soon-Obsolete Code. Would I be happy if someone were to revive and
> actually deploy it (the last point is important!)? YES!

When I last looked into it, I couldn't actually access Gentoo infra to
deploy it on. If that would be possible, I wouldn't mind taking a look
at what can be done here.

Cheers,

Dirkjan



Re: [gentoo-dev] Gentoostats

2016-01-24 Thread Alexis Ballier
On Sun, 24 Jan 2016 16:59:57 +0100
"Andreas K. Hüttel"  wrote:

> On Sunday 24 January 2016 16:50:46 Göktürk Yüksek wrote:
> > 
> > I don't want to go off-topic here too much but this is more than a
> > missing tools issue. There are privacy concerns regarding the
> > collection of such information. I recall this proposed idea from
> > Google Summer of Code:
> > 
> > https://wiki.gentoo.org/wiki/Google_Summer_of_Code/2012/Ideas#Package_statistics_reporting_tool
> 
> This has been debated to death. As long as no one is forced to use it,
> privacy concerns shouldn't be a problem. And it would be extremely
> useful to know how many of our maintainer-needed packages are actually
> still compiled once per year. (Or if any one single person even uses
> KDE on ppc64.)

you'd probably get much more reliable stats on package usage by
gathering distfiles d/l stats from mirrors and mapping that to packages
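
As a sketch of that mapping, assume two inputs that would have to be built separately: per-distfile download counts aggregated from mirror logs, and an index from distfile name to the packages that reference it (for example derived from Manifest files). The function name and the sample data are hypothetical.

```python
from collections import Counter


def package_hits(download_counts, distfile_to_packages):
    """Credit each distfile download to every package that uses that distfile."""
    hits = Counter()
    for distfile, count in download_counts.items():
        for pkg in distfile_to_packages.get(distfile, ()):
            hits[pkg] += count
    return hits


downloads = {"foo-1.0.tar.gz": 120, "shared-data-3.tar.xz": 40}
index = {
    "foo-1.0.tar.gz": ["app-misc/foo"],
    "shared-data-3.tar.xz": ["app-misc/foo", "app-misc/bar"],
}
print(package_hits(downloads, index))
```

A distfile shared between packages is credited to all of them, and local distfile caches or proxies hide repeat installs, so the result is an estimate of fetches, not of installed systems.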



Re: [gentoo-dev] Gentoostats

2016-01-24 Thread Chí-Thanh Christopher Nguyễn

Andreas K. Hüttel wrote:

> And it would be extremely useful to know how many of our
> maintainer-needed packages are actually still compiled once per year.
> (Or if any one single person even uses KDE on ppc64.)
>
> Gentoostats is a typical stillbirth of the Gentoo Google Summer of
> Soon-Obsolete Code. Would I be happy if someone were to revive and
> actually deploy it (the last point is important!)? YES!

Actually there is something in use already which would allow you to find
out which packages are compiled when. It is a community website called GenTwoo:

http://gentwoo.elisp.net/

Not all information is visible, and there could be some improvements
of course, but it exists.

Best regards,
Chí-Thanh Christopher Nguyễn






Re: [gentoo-dev] Gentoostats, SoC 2011

2011-08-29 Thread Donnie Berkholz
On 03:18 Fri 26 Aug , Jorge Manuel B. S. Vicetto wrote:
> I've picked this message as I want to address one point in this thread
> that was focused on this sub-thread. I disagree with the idea that
> adding an application to the Gentoo tree that collects data from users
> and sends it to a central (or distributed) system is the same as
> adding any other application to the tree. Having the ability to add
> ebuilds to the tree is part of what you gain by getting gentoo-x86
> access. Issues with significant users privacy concerns and substantial
> changes like adding packages to the tree that collect data from users
> and compile it,

Like, oh, any package with a built-in bug reporting system?

-- 
Thanks,
Donnie

Donnie Berkholz
Council Member / Sr. Developer
Gentoo Linux
Blog: http://dberkholz.com




Re: [gentoo-dev] Gentoostats, SoC 2011

2011-08-29 Thread Jorge Manuel B. S. Vicetto
On 29-08-2011 21:23, Donnie Berkholz wrote:
> On 03:18 Fri 26 Aug , Jorge Manuel B. S. Vicetto wrote:
>> I've picked this message as I want to address one point in this
>> thread that was focused on this sub-thread. I disagree with the
>> idea that adding an application to the Gentoo tree that collects
>> data from users and sends it to a central (or distributed) system
>> is the same as adding any other application to the tree. Having
>> the ability to add ebuilds to the tree is part of what you gain by
>> getting gentoo-x86 access. Issues with significant users privacy
>> concerns and substantial changes like adding packages to the tree
>> that collect data from users and compile it,
>
> Like, oh, any package with a built-in bug reporting system?

How many of those are part of the system set or get installed
automatically on one's system without any intervention? Furthermore, how
many of them are or will be programmed to send data automatically,
without prior action of the user and possibly without trace?
The point I was addressing is the suggestion that the above should be
possible and the idea that any single developer is entitled to do so.

-- 
Regards,

Jorge Vicetto (jmbsvicetto) - jmbsvicetto at gentoo dot org
Gentoo- forums / Userrel / Devrel / KDE / Elections / RelEng



Re: [gentoo-dev] Gentoostats, SoC 2011

2011-08-29 Thread Matt Turner
On Mon, Aug 29, 2011 at 9:53 PM, Jorge Manuel B. S. Vicetto
jmbsvice...@gentoo.org wrote:
> The point I was addressing is the suggestion that the above should be
> possible and the idea that any single developer is entitled to do so.

It's a moot point, because no one (that I see) claimed or is claiming
to be entitled to that. In fact, Alec said

> We did post to -dev, hence this thread. The point is that we don't
> need any 'official opinion' to do anything; and I don't want to set
> that precedent. If you have specific concerns about actions we plan
> to take (which by the way, we are not planning an opt-out solution.
> If we plan to do an opt-out solution, we will again have a thread on
> -dev) then let us know

He's not saying that no official opinion would be needed if they were
doing an opt-out. He's saying that they don't need an official opinion
*since* they aren't doing some sort of opt-out system.

Not your fault, but this whole thread regarding the
merits/legality/privacy of opt-out is completely irrelevant to the
original topic.

Matt



Re: [gentoo-dev] Gentoostats, SoC 2011

2011-08-26 Thread Dale

Jorge Manuel B. S. Vicetto wrote:


> I've picked this message as I want to address one point in this thread
> that was focused on this sub-thread.
> I disagree with the idea that adding an application to the Gentoo tree
> that collects data from users and sends it to a central (or distributed)
> system is the same as adding any other application to the tree.
> Having the ability to add ebuilds to the tree is part of what you gain
> by getting gentoo-x86 access. Issues with significant users privacy
> concerns and substantial changes like adding packages to the tree that
> collect data from users and compile it, should not be at the discretion
> of individual developers but be subject of global policies that should
> take into account the legal ramifications (trustees) and reflect the
> developers desire and goals (council).
>
> --
> Regards,
>
> Jorge Vicetto (jmbsvicetto) - jmbsvicetto at gentoo dot org
> Gentoo- forums / Userrel / Devrel / KDE / Elections / RelEng
   


Just picking a message to reply to at random here.  Sorry Jorge, I 
thought common sense would kick in way before now.


As a user, if ANY distro starts collecting data about me without my 
consent, I would be looking for something else to use.  For people to 
even think that users want someone snooping on them is rather presumptuous.


I have to also agree with the legal problems as well.  Doing this 
without the user's consent is going to lead to a huge legal mess. It 
would also taint Gentoo and Linux in general if this were to happen.  
Anyone who thinks it won't needs to talk to a lawyer and some common 
folks really soon.


As a user, if this was done without my consent, saying I would be pissed 
would be too mild a term but one I am willing to use on a public forum.  
As an example, I have DirecTv.  It has no connection other than the 
satellite cable.  No telephone or anything.  I don't want them snooping 
on what I watch on TV either.  I also don't care to have Gentoo 
collecting data on what I use or other data either.  If I wanted that, I 
could just use M$ stuff.  I would expect such things from them and the 
huge EULA they have.


Back to my hole.

Dale

:-)  :-)



Re: [gentoo-dev] Gentoostats, SoC 2011

2011-08-25 Thread Michał Górny
On Wed, 24 Aug 2011 13:03:44 +0200
Andreas K. Huettel dilfri...@gentoo.org wrote:

> On Wednesday, 24 August 2011, 12:48:35, Patrick Lauer wrote:
>>
>> If you sneakily add something to cron.daily by default you can get
>> pretty nice coverage. But I guess anyone trying that in Gentooland
>> will meet some rather unpleasant resistance :)
>>
>
> Of course, we could place it in some blatantly obvious way into a
> default configuration, together with a big fat message what it does
> and how to quickly disable it.
>
> We'd get better coverage in an opt-out system than in an opt-in
> system.

And a larger number of angry users who missed the warning and now
have to pay for additional GPRS transfer or so. And when people use
GPRS rarely, they usually don't think about random apps that use
the connection in the background.

> (First idea- package is pulled in by a default-on useflag and
> installs itself into cron.daily. BEFORE it runs the first time it
> outputs said message and asks for permission to proceed (which cannot
> be done in the cron job obviously but we'd find a way).)

And what if it can't ask for that? Assuming you're talking about
'opt-out', I guess the fallback would be to 'yes'. We don't want to end
up like Windows, where you get AFK for five minutes and then discover
the system has rebooted.

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] Gentoostats, SoC 2011

2011-08-25 Thread Roy Bamford
On 2011.08.24 11:48, Patrick Lauer wrote:
[snip]
 
> If you sneakily add something to cron.daily by default you can get
> pretty nice coverage. But I guess anyone trying that in Gentooland
> will meet some rather unpleasant resistance :)

This app, and whether it's opt-in or opt-out, will set a precedent for any 
future apps that want automatic user feedback in Gentoo.

It has to be opt-in, as opt-out would be a dangerous precedent to set.

I don't see any harm in a gentle reminder message from emerge, provided 
that the reminder can be turned off too, if the user really does not 
want to opt in. That's no worse than being nagged about unread news.

-- 
Regards,

Roy Bamford
(Neddyseagoon) a member of
elections
gentoo-ops
forum-mods
trustees




Re: [gentoo-dev] Gentoostats, SoC 2011

2011-08-25 Thread Rich Freeman
On Thu, Aug 25, 2011 at 6:48 AM, Roy Bamford neddyseag...@gentoo.org wrote:
> It has to be opt-in as opt out would be a dangerous precedent to set.
>
> I don't see any harm in a gentle reminder message from emerge, provided
> that the reminder can be turned off too, if the user really does not
> want to opt in. That's no worse than being nagged about unread news.

I tend to agree, the more I think about it.

The simplest solution (which doesn't require any portage mods/etc), is
to simply make this a package that installs the appropriate logic in
cron.daily, and we send out a news item encouraging users to install
it voluntarily.  If the user does nothing, they don't get the package.
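
The "if the user does nothing, nothing is sent" behaviour can be sketched as a guard the cron job would run before doing anything else. The section name, key name and config format below are hypothetical, chosen only for illustration; the point is that a missing or empty config means disabled.

```python
import configparser


def reporting_enabled(config_text: str) -> bool:
    """Return True only if the user has explicitly switched reporting on."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # fallback=False: a missing section or key counts as "not opted in"
    return parser.getboolean("gentoostats", "enabled", fallback=False)


print(reporting_enabled(""))                              # False
print(reporting_enabled("[gentoostats]\nenabled = yes"))  # True
```

A real cron script would exit silently when this returns False, so installing the package alone never causes data to be sent.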

If somebody can come up with a really good reason that we should be more
aggressive in promoting it, then we can promote it more aggressively.
That /might/ go as far as a forced opt-in/out decision.  However, the
more I think about it the more I'm concerned with pure opt-out by
default.

The big issue with opt-out is privacy law - especially in Europe
(that's leaving aside just being up-front with users).  We'd end up
having to have EULAs or such and perhaps a number of other legal
controls, and I don't think that is a direction that we want to go in.
 I'm just not seeing the upside - better to just figure out good ways
to use data that is easy and safe to obtain first.

Earlier somebody suggested that this decision wasn't really in the
domain of the Council/Trustees.  I'm not sure I agree here - any kind
of opt-out data collection is something that has potential legal
ramifications as well as huge reputation concerns for the distro (the
software is distributed from Foundation-owned hardware utilizing a
Foundation-owned domain name and the data goes back to
Foundation-owned hardware - I'm sure any lawyer could make a case for
this).  Just because there isn't a policy written down somewhere
doesn't mean that we can't use common sense.  Devs certainly don't
need to run everything past the Council, but if you want to do
something high-profile post it on -dev, and if there is an uproar look
for an official second opinion before doing it.

Rich



Re: [gentoo-dev] Gentoostats, SoC 2011

2011-08-25 Thread Alec Warner
On Thu, Aug 25, 2011 at 5:20 AM, Rich Freeman ri...@gentoo.org wrote:
> On Thu, Aug 25, 2011 at 6:48 AM, Roy Bamford neddyseag...@gentoo.org wrote:
>> It has to be opt-in as opt out would be a dangerous precedent to set.
>>
>> I don't see any harm in a gentle reminder message from emerge, provided
>> that the reminder can be turned off too, if the user really does not
>> want to opt in. That's no worse than being nagged about unread news.
>
> I tend to agree, the more I think about it.
>
> The simplest solution (which doesn't require any portage mods/etc), is
> to simply make this a package that installs the appropriate logic in
> cron.daily, and we send out a news item encouraging users to install
> it voluntarily.  If the user does nothing, they don't get the package.
>
> If somebody can come up with a really good reason that we should be more
> aggressive in promoting it, then we can promote it more aggressively.
> That /might/ go as far as a forced opt-in/out decision.  However, the
> more I think about it the more I'm concerned with pure opt-out by
> default.

Why is the thread bikeshedding an opt-out that we aren't even
considering doing right now?


> The big issue with opt-out is privacy law - especially in Europe
> (that's leaving aside just being up-front with users).  We'd end up
> having to have EULAs or such and perhaps a number of other legal
> controls, and I don't think that is a direction that we want to go in.
> I'm just not seeing the upside - better to just figure out good ways
> to use data that is easy and safe to obtain first.
>
> Earlier somebody suggested that this decision wasn't really in the
> domain of the Council/Trustees.  I'm not sure I agree here - any kind
> of opt-out data collection is something that has potential legal
> ramifications as well as huge reputation concerns for the distro (the
> software is distributed from Foundation-owned hardware utilizing a
> Foundation-owned domain name and the data goes back to
> Foundation-owned hardware - I'm sure any lawyer could make a case for
> this).  Just because there isn't a policy written down somewhere
> doesn't mean that we can't use common sense.  Devs certainly don't
> need to run everything past the Council, but if you want to do
> something high-profile post it on -dev, and if there is an uproar look
> for an official second opinion before doing it.

We did post to -dev, hence this thread. The point is that we don't
need any 'official opinion' to do anything; and I don't want to set
that precedent. If you have specific concerns about actions we plan to
take (which by the way, we are not planning an opt-out solution. If we
plan to do an opt-out solution, we will again have a thread on -dev)
then let us know. If you have specific legal concerns about the
application, data retention, encryption, logs, backups, onerous
European privacy laws, and other such questions you should raise those
concerns now.


> Rich





Re: [gentoo-dev] Gentoostats, SoC 2011

2011-08-25 Thread Rich Freeman
On Thu, Aug 25, 2011 at 10:35 AM, Alec Warner anta...@gentoo.org wrote:
> We did post to -dev, hence this thread.

My post was intended to be general in applicability, and not critical
of the particular instance of this issue being discussed.

I would generally suggest that implementing this as a package and not
as a function built-into portage would tend to make more sense to me
(do we really want portage to do EVERYTHING?).  However, I don't think
that anybody needs anybody's blessing in particular to take one course
or the other there.  And, in the Gentoo tradition of
everybody-does-whatever-they-want-to, there is nothing wrong with one
set of devs doing it one way and another set doing it another way so
that we end up with two data repositories with somewhat redundant data
so that we can start another discussion on -dev about what the
differences in the datasets mean.  That is, until eventually devs get
bored and after enough bugs pile up one or both of the collection
mechanisms gets treecleaned.  Then in five years somebody can build a
new one. :)

If I had strong concerns with anything that seemed likely to get
adopted I'd voice them.

Rich



Re: [gentoo-dev] Gentoostats, SoC 2011

2011-08-25 Thread Jorge Manuel B. S. Vicetto
On 25-08-2011 14:35, Alec Warner wrote:
> On Thu, Aug 25, 2011 at 5:20 AM, Rich Freeman ri...@gentoo.org
> wrote:
[snip]
>> The big issue with opt-out is privacy law - especially in Europe
>> (that's leaving aside just being up-front with users).  We'd end
>> up having to have EULAs or such and perhaps a number of other
>> legal controls, and I don't think that is a direction that we want
>> to go in. I'm just not seeing the upside - better to just figure
>> out good ways to use data that is easy and safe to obtain first.
>>
>> Earlier somebody suggested that this decision wasn't really in the
>> domain of the Council/Trustees.  I'm not sure I agree here - any
>> kind of opt-out data collection is something that has potential
>> legal ramifications as well as huge reputation concerns for the
>> distro (the software is distributed from Foundation-owned hardware
>> utilizing a Foundation-owned domain name and the data goes back to
>> Foundation-owned hardware - I'm sure any lawyer could make a case
>> for this).  Just because there isn't a policy written down
>> somewhere doesn't mean that we can't use common sense.  Devs
>> certainly don't need to run everything past the Council, but if you
>> want to do something high-profile post it on -dev, and if there is
>> an uproar look for an official second opinion before doing it.
>
> We did post to -dev, hence this thread. The point is that we don't
> need any 'official opinion' to do anything; and I don't want to set
> that precedent. If you have specific concerns about actions we plan
> to take (which by the way, we are not planning an opt-out solution.
> If we plan to do an opt-out solution, we will again have a thread on
> -dev) then let us know. If you have specific legal concerns about
> the application, data retention, encryption, logs, backups, onerous
> european privacy laws, and other such questions you should raise
> those concerns now.

I've picked this message as I want to address one point in this thread
that was focused on this sub-thread.
I disagree with the idea that adding an application to the Gentoo tree
that collects data from users and sends it to a central (or distributed)
system is the same as adding any other application to the tree.
Having the ability to add ebuilds to the tree is part of what you gain
by getting gentoo-x86 access. Issues with significant user privacy
concerns, and substantial changes like adding packages to the tree that
collect data from users and compile it, should not be at the discretion
of individual developers but be subject to global policies that should
take into account the legal ramifications (trustees) and reflect the
developers' desires and goals (council).

-- 
Regards,

Jorge Vicetto (jmbsvicetto) - jmbsvicetto at gentoo dot org
Gentoo- forums / Userrel / Devrel / KDE / Elections / RelEng



Re: [gentoo-dev] Gentoostats, SoC 2011

2011-08-24 Thread Thomas Kahle
Hi,

On 18:16 Tue 23 Aug 2011, Andreas K. Huettel wrote:
> there is one important aspect of your program that really needs to be
> documented (and comments in the code are not enough):
>
> What data exactly is the client sending to the server?!
>
> What you need is basically an easy-to-find file / web page / ... where
> this is explained concisely and in simple words. As long as that does
> not exist, your program will not find much acceptance.


You may look at the files README and FAQ for Ubuntu's popularity
contest: http://popcon.ubuntu.com/

If we could get their turnout rates, that'd be great.

> Apart from that, I like the entire project, and am curious about its
> results.

+1 

It has come up several times that getting usage statistics would
motivate developers.

Cheers,
Thomas



-- 
Thomas Kahle
http://dev.gentoo.org/~tomka/




Re: [gentoo-dev] Gentoostats, SoC 2011

2011-08-24 Thread Patrick Lauer
On 08/24/11 12:31, Thomas Kahle wrote:
> Hi,
>
> On 18:16 Tue 23 Aug 2011, Andreas K. Huettel wrote:
>> there is one important aspect of your program that really needs to be
>> documented (and comments in the code are not enough):
>>
>> What data exactly is the client sending to the server?!
>>
>> What you need is basically an easy-to-find file / web page / ... where
>> this is explained concisely and in simple words. As long as that does
>> not exist, your program will not find much acceptance.
>
> You may look at the files README and FAQ for Ubuntu's popularity
> contest: http://popcon.ubuntu.com/
>
> If we could get their turnout rates, that'd be great.

If you sneakily add something to cron.daily by default you can get
pretty nice coverage. But I guess anyone trying that in Gentooland will
meet some rather unpleasant resistance :)




Re: [gentoo-dev] Gentoostats, SoC 2011

2011-08-24 Thread Andreas K. Huettel
On Wednesday 24 August 2011, 12:48:35, Patrick Lauer wrote:
> If you sneakily add something to cron.daily by default you can get
> pretty nice coverage. But I guess anyone trying that in Gentooland will
> meet some rather unpleasant resistance :)
 

Of course, we could place it in some blatantly obvious way into a default
configuration, together with a big fat message saying what it does and how to
quickly disable it.

We'd get better coverage in an opt-out system than in an opt-in system.

(First idea: package is pulled in by a default-on useflag and installs itself
into cron.daily. BEFORE it runs the first time it outputs said message and asks
for permission to proceed (which cannot be done in the cron job obviously but
we'd find a way).)
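
As a rough illustration of the "ask before the first run" idea above, here is
a minimal Python sketch. The consent file path, the function names and the
submit callback are all made up for the example; nothing here is taken from
the actual gentoostats client. The cron job itself never prompts: it only
checks whether some other, interactive step has already recorded an explicit
yes.

```python
import os


def may_submit(consent_path):
    """Return True only if an explicit "yes" has been recorded."""
    try:
        with open(consent_path, encoding="utf-8") as handle:
            return handle.read().strip() == "yes"
    except FileNotFoundError:
        # No recorded answer yet: treat it as "no" and stay silent.
        return False


def record_consent(consent_path, answer):
    """Store the user's answer; meant to be called from an interactive step."""
    os.makedirs(os.path.dirname(consent_path) or ".", exist_ok=True)
    with open(consent_path, "w", encoding="utf-8") as handle:
        handle.write("yes\n" if answer else "no\n")


def cron_job(consent_path, submit):
    """Body of the daily job: call submit() only after consent was given."""
    if not may_submit(consent_path):
        return False
    submit()
    return True
```

With this shape a missing consent file behaves like a "no", so installing the
package alone would never cause data to be sent.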

-- 
Andreas K. Huettel
Gentoo Linux developer - kde, sci, arm, tex
dilfri...@gentoo.org
http://www.akhuettel.de/



Re: [gentoo-dev] Gentoostats, SoC 2011

2011-08-24 Thread Rich Freeman
On Wed, Aug 24, 2011 at 6:48 AM, Patrick Lauer patr...@gentoo.org wrote:
> If you sneakily add something to cron.daily by default you can get
> pretty nice coverage. But I guess anyone trying that in Gentooland will
> meet some rather unpleasant resistance :)

Well, we could always broadcast the news widely (lists, forums,
eselect news, and so on).

I'd also make it controllable via use flag.  Put the client and the
cron.daily file in a package, and then make that a use-dependency of
something everybody has: the profile, if profiles support this (I don't
think they do), or otherwise something that correlates well with
people who would benefit from this feature.

Users can opt-out via use flag.

You can also start out with it being opt-in (use flag off by default
in profiles), and then turn it on later (with notice/etc).

The key is to not be sneaky about it.

Rich



Re: [gentoo-dev] Gentoostats, SoC 2011

2011-08-24 Thread Thomas Kahle
On 12:48 Wed 24 Aug 2011, Patrick Lauer wrote:
> On 08/24/11 12:31, Thomas Kahle wrote:
>> Hi,
>>
>> On 18:16 Tue 23 Aug 2011, Andreas K. Huettel wrote:
>>> there is one important aspect of your program that really needs to be
>>> documented (and comments in the code are not enough):
>>>
>>> What data exactly is the client sending to the server?!
>>>
>>> What you need is basically an easy-to-find file / web page / ... where
>>> this is explained concisely and in simple words. As long as that does
>>> not exist, your program will not find much acceptance.
>>
>> You may look at the files README and FAQ for Ubuntu's popularity
>> contest: http://popcon.ubuntu.com/
>>
>> If we could get their turnout rates, that'd be great.
>
> If you sneakily add something to cron.daily by default you can get
> pretty nice coverage. But I guess anyone trying that in Gentooland will
> meet some rather unpleasant resistance :)

Oh yeah... when I used Ubuntu last 11/06 it would still ask you on
install.

@Vikraman: I guess you see how *important* it is to be completely open
and explain everything the program does.  On Gentoo it should of course
be opt-in, instead of opt-out.


-- 
Thomas Kahle
http://dev.gentoo.org/~tomka/


signature.asc
Description: Digital signature


Re: [gentoo-dev] Gentoostats, SoC 2011

2011-08-24 Thread Thomas Kahle
On 13:03 Wed 24 Aug 2011, Andreas K. Huettel wrote:
> On Wednesday 24 August 2011, 12:48:35, Patrick Lauer wrote:
>> If you sneakily add something to cron.daily by default you can get
>> pretty nice coverage. But I guess anyone trying that in Gentooland will
>> meet some rather unpleasant resistance :)
>
> Of course, we could place it in some blatantly obvious way into a default
> configuration, together with a big fat message saying what it does and how
> to quickly disable it.
>
> We'd get better coverage in an opt-out system than in an opt-in system.
>
> (First idea: package is pulled in by a default-on useflag and installs itself
> into cron.daily. BEFORE it runs the first time it outputs said message and
> asks for permission to proceed (which cannot be done in the cron job
> obviously but we'd find a way).)

Sorry, but NO.  If you want you can make a big noise message that asks
users to install the cron-job but opt-out is not an option here.



-- 
Thomas Kahle
http://dev.gentoo.org/~tomka/


signature.asc
Description: Digital signature


Re: [gentoo-dev] Gentoostats, SoC 2011

2011-08-24 Thread Mario Fetka
I am a user and I am OK with opt-out if the standard data that is transferred
is completely anonymized, so no sensitive data.

And if the user wants to register his/her machine's packages, more data is
transferred.

thx
Mario

2011/8/24 Thomas Kahle to...@gentoo.org:
> On 13:03 Wed 24 Aug 2011, Andreas K. Huettel wrote:
>> On Wednesday 24 August 2011, 12:48:35, Patrick Lauer wrote:
>>> If you sneakily add something to cron.daily by default you can get
>>> pretty nice coverage. But I guess anyone trying that in Gentooland will
>>> meet some rather unpleasant resistance :)
>>
>> Of course, we could place it in some blatantly obvious way into a default
>> configuration, together with a big fat message saying what it does and how
>> to quickly disable it.
>>
>> We'd get better coverage in an opt-out system than in an opt-in system.
>>
>> (First idea: package is pulled in by a default-on useflag and installs
>> itself into cron.daily. BEFORE it runs the first time it outputs said
>> message and asks for permission to proceed (which cannot be done in the cron
>> job obviously but we'd find a way).)
>
> Sorry, but NO.  If you want you can make a big noise message that asks
> users to install the cron-job but opt-out is not an option here.
>
> --
> Thomas Kahle
> http://dev.gentoo.org/~tomka/




Re: [gentoo-dev] Gentoostats, SoC 2011

2011-08-24 Thread Rich Freeman
On Wed, Aug 24, 2011 at 7:45 AM, Thomas Kahle to...@gentoo.org wrote:
> Sorry, but NO.  If you want you can make a big noise message that asks
> users to install the cron-job but opt-out is not an option here.

Well, that's up to the Council/Trustees ultimately, but opinions (and
better still reasoning) are welcome since both would no doubt want to
reflect the will of the community (and whatever is legal in the
jurisdictions that matter).

One option that many distros employ is a forced opt-in/out decision.
During the install process they simply ask the user, and they have to
hit either yes or no to continue.  The reason most people don't opt-in
is that they don't think about it, and this forces the issue.

The Gentoo analogue would be to put something in make.conf or whatever
that must be set one way or another.  Maybe have an opt-in use flag
and an opt-out use flag and if you don't set either emerge just dies
with a notice or something.  No doubt somebody could come up with a
more elegant solution.
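
To make the "set one or the other, or emerge dies" idea concrete, here is a
small Python sketch of the check. The flag names stats-optin and stats-optout
and the UndecidedError exception are invented for the example; this is not how
Portage implements anything. In an ebuild, the REQUIRED_USE exactly-one-of
operator (^^) might express a similar constraint.

```python
class UndecidedError(Exception):
    """Raised when the user has not made exactly one explicit choice."""


def stats_decision(use_flags, opt_in="stats-optin", opt_out="stats-optout"):
    """Return True for opt-in, False for opt-out, and fail loudly otherwise."""
    chosen = {opt_in, opt_out} & set(use_flags)
    if len(chosen) != 1:
        # Neither flag, or both at once: refuse to guess what the user wants.
        raise UndecidedError(
            f"Set exactly one of '{opt_in}' or '{opt_out}' to continue."
        )
    return opt_in in chosen
```

Setting both flags is treated the same as setting neither, so the user always
has to give one unambiguous answer.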

Maybe another line of discussion that could inform the debate is what
the value of this information is?  For a company, knowing what
packages are popular helps them to allocate resources.  Gentoo is a
volunteer effort and devs allocate their effort based on personal
preference, though perhaps some would care about package popularity to
an extent.  So, we might not benefit to the same degree from this kind
of information, since we can't crack the whip and force people to fix
some broken package that is popular.

Rich



Re: [gentoo-dev] Gentoostats, SoC 2011

2011-08-24 Thread Alec Warner
On Wed, Aug 24, 2011 at 5:05 AM, Rich Freeman ri...@gentoo.org wrote:
> On Wed, Aug 24, 2011 at 7:45 AM, Thomas Kahle to...@gentoo.org wrote:
>> Sorry, but NO.  If you want you can make a big noise message that asks
>> users to install the cron-job but opt-out is not an option here.
>
> Well, that's up to the Council/Trustees ultimately, but opinions (and
> better still reasoning) are welcome since both would no doubt want to
> reflect the will of the community (and whatever is legal in the
> jurisdictions that matter).

It doesn't take a council vote nor a trustees vote to add a package to
everyone's machine.

In the end I'd recommend just looking at the opt-in numbers. Is the
data useful from opt-in users?
If the answer is no, then we can always think up other ways to get
more users. Will auto-installs be on the list of ideas? You bet ;) But
I think we are putting the cart before the horse.


> One option that many distros employ is a forced opt-in/out decision.
> During the install process they simply ask the user, and they have to
> hit either yes or no to continue.  The reason most people don't opt-in
> is that they don't think about it, and this forces the issue.
>
> The Gentoo analogue would be to put something in make.conf or whatever
> that must be set one way or another.  Maybe have an opt-in use flag
> and an opt-out use flag and if you don't set either emerge just dies
> with a notice or something.  No doubt somebody could come up with a
> more elegant solution.

The stage3 tarball doesn't even come with a dhcp client; so I don't
really see how installing a stats client makes sense from the
standpoint of 'only what is necessary.' For many people, that is an
important part of Gentoo (cf. python3...)

Making emerge die unless you make a decision will probably break a
bunch of shit (plenty of people have automatic installs in some
fashion.) We would have to use an existing methodology to avoid
breaking them (PROPERTIES=interactive?)
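
One way to avoid breaking unattended installs, sketched in Python: only ask
when a terminal is attached, and otherwise fall back to "not opted in" instead
of dying. The function names and the recorded/interactive/ask parameters are
hypothetical, and this is not what PROPERTIES=interactive does; it only
illustrates the fallback.

```python
import sys


def resolve_choice(recorded, interactive, ask):
    """Decide whether stats may be sent.

    recorded: True/False if the user already answered, None if not.
    interactive: whether a human can be prompted right now.
    ask: callable that prompts and returns True or False.
    """
    if recorded is not None:
        return recorded
    if not interactive:
        # Automated install: never block, and never opt in silently.
        return False
    return ask()


def is_interactive():
    """Treat the run as interactive only when stdin is a terminal."""
    return sys.stdin.isatty()


def prompt_user():
    """Ask on the terminal; anything other than "y" counts as no."""
    return input("Submit package statistics? [y/N] ").strip().lower() == "y"
```

A caller would use something like resolve_choice(None, is_interactive(),
prompt_user), so a scripted install simply continues without submitting
anything.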


> Maybe another line of discussion that could inform the debate is what
> the value of this information is?  For a company, knowing what
> packages are popular helps them to allocate resources.  Gentoo is a
> volunteer effort and devs allocate their effort based on personal
> preference, though perhaps some would care about package popularity to
> an extent.  So, we might not benefit to the same degree from this kind
> of information, since we can't crack the whip and force people to fix
> some broken package that is popular.

I think at present we don't know the information's value; that is part
of why considering opt-out is premature ;)


> Rich





Re: [gentoo-dev] Gentoostats, SoC 2011

2011-08-23 Thread Andreas K. Huettel

Hi Vikram, 

there is one important aspect of your program that really needs to be 
documented (and comments in the code are not enough):

What data exactly is the client sending to the server?!

What you need is basically an easy-to-find file / web page / ... where this is
explained concisely and in simple words. As long as that does not exist, your
program will not find much acceptance.

Apart from that, I like the entire project, and am curious about its results.

Best, 
Andreas


On Monday 22 August 2011, 23:20:30, Vikraman wrote:
> Hi all,
>
> Gentoostats[0] is a GSoC 2011 project to collect package statistics from
> Gentoo machines. Please check it out. Bug reports and feature suggestions
> are welcome.
>
> To submit your stats, use the app-portage/gentoostats ebuild from betagarden
> overlay[1].
>
> [0] https://soc.dev.gentoo.org/gentoostats/
> [1] https://soc.dev.gentoo.org/gentoostats/about
 


-- 
Andreas K. Huettel
Gentoo Linux developer - kde, sci, arm, tex
dilfri...@gentoo.org
http://www.akhuettel.de/