Re: percentage of popcon submitters

2009-01-18 Thread Aurelien Jarno
On Thu, Jan 15, 2009 at 10:00:04PM +0100, markus schnalke wrote:
 Hoi,
 
 I know it is not possible to _know_ what percentage of all users
 submit popcon stats. But I want to ask for guesses,
 because more opinions are likely to improve the result.
 
 My current guess is between 1/3 and 2/3.
 
 What do you think?
 

My guess is 42.

-- 
  .''`.  Aurelien Jarno | GPG: 1024D/F1BCDB73
 : :' :  Debian developer   | Electrical Engineer
 `. `'   aure...@debian.org | aurel...@aurel32.net
   `-people.debian.org/~aurel32 | www.aurel32.net


-- 
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Re: percentage of popcon submitters

2009-01-18 Thread Raphael Geissert

Russ Allbery wrote:
[...]
 what packages on your servers are missing security patches, basically

popularity-contest doesn't submit package versions, so it is not *that* easy to
know whether security updates have been installed or not.

As for security, popularity-contest could:
* randomly change the recent value of a random number of packages
* submit via https (or ftp+ssl), and/or even encrypt the data with gpg
* have some sort of apt-pinning so that it is possible to indicate that the data
corresponding to a given package(s) or repository(ies) should NOT be sent,
thereby preventing the "I know when you went on VAC because your
xfoo-bar-custom package is marked as old" information leak.
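
A minimal sketch of the first and third ideas, assuming a report is just a mapping from package name to a recent/old flag. The function name `sanitize_report`, its parameters, and the sample package list are hypothetical; this is not popularity-contest's actual code or report format.

```python
import random

def sanitize_report(report, withheld=(), max_flips=3, rng=None):
    """Drop withheld packages, then flip the 'recent' flag of a random subset.

    report maps package name -> True if recently used, False if old.
    """
    if rng is None:
        rng = random.Random()
    # Packages the admin marked as private never leave the machine.
    kept = {pkg: recent for pkg, recent in report.items() if pkg not in withheld}
    # Flip a random number of entries (possibly zero), never more than exist.
    count = rng.randint(0, min(max_flips, len(kept)))
    flipped = set(rng.sample(sorted(kept), count))
    return {pkg: (not recent) if pkg in flipped else recent
            for pkg, recent in kept.items()}

# Hypothetical sample report.
report = {"bash": True, "vim": True, "xfoo-bar-custom": False, "nessus": False}
print(sanitize_report(report, withheld={"xfoo-bar-custom"}))
```

The noise makes any single "old" or "recent" flag deniable, at the cost of slightly less accurate aggregate statistics.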

With those security measures I believe there's a slight chance that a few more
people (or institutions) will install popularity-contest.

Cheers,
Raphael Geissert





Re: percentage of popcon submitters

2009-01-18 Thread Raphael Geissert
Franklin PIAT wrote:
[...]
 * Some systems are improperly configured and can't submit
   their popcon (missing http proxy ; smtp server is wrong or blocked
   by their ISP). Especially when people are travelling.

Or there's no internet connection when popcon runs and tries to submit via http
so it falls back to mail, but mail is local-delivery only.

Cheers,
Raphael Geissert





Re: percentage of popcon submitters

2009-01-17 Thread Junichi Uekawa
At Thu, 15 Jan 2009 22:00:04 +0100,
markus schnalke wrote:
 
 Hoi,
 
 I know it is not possible to _know_ what percentage of all users
 submit popcon stats. But I want to ask for guesses,
 because more opinions are likely to improve the result.
 
 My current guess is between 1/3 and 2/3.
 
 What do you think?
 

I used to try to track down that number using apt-listbugs.  From
popcon, I know the number of users who install apt-listbugs and use
popcon.  The total number of apt-listbugs users can be approximated by
the unique IPs that access the Debian BTS.  The ratio of the two
approximates the fraction of users who submit popcon data.
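
The approach described here comes down to one ratio, sketched below. The function name and both input numbers are placeholders for illustration only; they are not measured popcon or BTS figures.

```python
def popcon_submission_rate(popcon_listbugs_installs, unique_bts_ips):
    """Estimate the fraction of apt-listbugs users who also submit to popcon."""
    if unique_bts_ips <= 0:
        raise ValueError("need at least one observed BTS client")
    return popcon_listbugs_installs / unique_bts_ips

# Placeholder inputs only; neither number comes from real popcon or BTS data.
rate = popcon_submission_rate(popcon_listbugs_installs=500, unique_bts_ips=2000)
print(f"{rate:.1%}")
```

The result only generalizes if apt-listbugs users opt in to popcon at roughly the same rate as everyone else, which is itself an assumption.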

I think I posted that figure a while ago somewhere.


regards,
junichi
-- 
dan...@{netfort.gr.jp,debian.org}





Re: percentage of popcon submitters

2009-01-17 Thread Andrei Popescu
On Sat,17.Jan.09, 01:05:47, Kjeldgaard Morten wrote:

 On 16/01/2009, at 18.27, Johannes Wiedersich wrote:

 Did you think about thousands of computers having 'private ips' with
 some nat translation and/or local proxy? (I'm thinking of computer
 labs, companies, etc. not just the odd home user). Essentially all of
 the computers at our department share the same public IP.

 Hundreds of machines accessing proxies, and thousands having private IPs. 
 Are these numbers something you know or are you just throwing them around? 
 Otherwise they can of course be accounted for in the total estimate ;-)

I know of at least two ISPs in Romania (one of them being the biggest) 
who will connect you via NAT and I hear this practice is not unheard of 
in other countries. You won't see more than a few IPs coming from 
there...

Regards,
Andrei
-- 
If you can't explain it simply, you don't understand it well enough.
(Albert Einstein)




Re: percentage of popcon submitters

2009-01-17 Thread Simon Josefsson
Bernd Eckenfels e...@lina.inka.de writes:

 In article 87d4enbfqd@mocca.josefsson.org you wrote:
 It would establish an upper bound of well-administrated debian machines,
 I think.

 It is a lower bound, since I guess there are more cases where more than one
 machine is updated. The case that you download without need or as a
 duplicate (With multiple IPs) is very low.

I've realized it is not a lower bound either, because some download
s.d.o packages to temporary chroots and pbuilds etc which should
probably not be counted as a debian machine.  Still, trying to get
numbers on the various statistics we can easily get at may improve
estimates and allow us to follow the trends.

/Simon





Re: percentage of popcon submitters

2009-01-16 Thread Petter Reinholdtsen

[Markus Schnalke]
 I know it is not possible to _know_ what percentage of all users
 submit popcon stats. But I want to ask for
 guesses, because more opinions are likely to improve the result.

A while back, someone with access to the download logs for
security.debian.org tried to estimate the number of machines
downloading security fixes for Debian, based on the assumption that
no-one is using a mirror for security fixes.  I am unable to find
those results using Google right now, but would recommend trying to
get hold of those numbers to get a new lower bound on the number of
Debian installations.

As for the answer to your question, I have no idea. :) The amount of
contributors to popcon went up a lot when the popcon question started
to appear in the default installer in Etch, but I have no idea how
many actually replied yes to that question. :)

Happy hacking,
-- 
Petter Reinholdtsen





Re: percentage of popcon submitters

2009-01-16 Thread Neil Williams
On Fri, 16 Jan 2009 08:45:12 +0100
Kjeldgaard Morten m...@bioxray.au.dk wrote:

 
  Thanks. Unless you setup some experimental method, any argument  
  should reduce
  to handwaving or extension of various particular examples..
 
 Surely, it must be possible to get an estimate of the number of  
 downloads of important packages and security updates? I know these  
 downloads also are requested from mirror sites, but at least for the  
 official mirror sites their relative activity must be known?

How do you map the number of downloads to the number of users or
machines? I have dozens of chroots that I use for multiple reasons.
Now, maybe I should use an apt proxy but most of these are
cross-building chroots so that doesn't help as the proxy will have
amd64 packages and I need arm or armel etc.

Then you have the problem of people who maintain local mirrors (often
quite short lived ones).

It's just more handwaving - unless you want to count every chroot and
every local mirror (per architecture) as a separate user.

There is no way of accurate counting unless access to the files is
restricted to a known number of download methods that all require user
intervention to proceed, at which point Debian would not be free.

The LinuxCounter method is completely arbitrary - the figures on the
site are guesswork and cannot be used in any other calculations.
LinuxCounter tries to extrapolate from 180,000 to 29,000,000 without
any real basis for such a leap of faith other than "we guessed 18
million some time ago and we have an x% increase in our counter figures
since then, so increase 18 million by x%". I'm not knocking their
figures, just reiterating what is on the Linux Counter website -
reliable figures just do not exist and trying to create them usually
results in restricting the freedoms that attract users in the first
place.

http://counter.li.org/estimates.php

Linux Counter is no more or less reliable than popcon - both are wild
guesses from different perspectives. popcon is a wild underestimate,
counter could as easily be over as under. Nobody knows and in a very
real sense, nobody could ever know with any accuracy.

popcon is what we have, it is an indicator with known deficiencies that
always need to be taken into account when using popcon data as a factor
in any packaging decision but popcon, overall, is just more handwaving.
I find it amusing that we post popcon % figures to two decimal places
when the real error margins are completely unknown but it reflects the
size of Debian - if popcon didn't use decimal places, a vast number of
packages in widespread usage would have a popcon % of zero.

I could say that LinuxCounter is out by 30% or 70% or 150% and there
would be no reason to consider my guesses as more or less reliable than
the ones from LinuxCounter. The whole thing is a complete unknown.

-- 


Neil Williams
=
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/





Re: percentage of popcon submitters

2009-01-16 Thread markus schnalke
[2009-01-16 10:09] Neil Williams codeh...@debian.org
 
 The whole thing is a complete unknown.

Of course you're right. But it's the best we have.

Instead of leaving it with ``we simply don't know'', I prefer to
estimate on the (unsure) data sources that are available.



For my case, I received valuable comments and approaches that improved
my estimations, so I thank everyone who contributed!


FYI:
I now assume one third of all Debian users submit their stats, with
the remark that one third is probably a high guess as it means there
would be only about 230 thousand Debian installations in total. But
according to counter.li.org between 490 thousand and 12 million Debian
users can be estimated. I later reduce the resulting number by one
third to respect users with multiple installations.
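
The arithmetic behind this estimate can be sketched as follows. The submission count used here is an assumption, back-derived from the figure of about 230 thousand installations at a one-third submission rate; the function names are hypothetical.

```python
from fractions import Fraction

def estimate_installations(submissions, submit_fraction):
    """Scale the popcon submission count up by the assumed submitting fraction."""
    return submissions / submit_fraction

def estimate_users(installations, multi_install_share=Fraction(1, 3)):
    """Reduce the installation count to allow for users with several machines."""
    return installations * (1 - multi_install_share)

# Assumed submission count, back-derived from the ~230 thousand figure above.
submissions = 76_667
installations = estimate_installations(submissions, Fraction(1, 3))
users = estimate_users(installations)
print(int(installations), int(users))
```

Exact fractions are used so that the two one-third factors do not pick up floating-point rounding. A smaller assumed fraction scales the installation estimate up proportionally.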

I think my final result is not too large, which is what I want to avoid.
If it turns out to be only 1/10 of the true number, I have no problem with that.



thanks again

meillo




Re: percentage of popcon submitters

2009-01-16 Thread Romain Beauxis
Le Friday 16 January 2009 11:51:50 markus schnalke, vous avez écrit :
 [2009-01-16 10:09] Neil Williams codeh...@debian.org

  The whole thing is a complete unknown.

 Of course you're right. But it's the best we have.

 Instead of leaving it with ``we simply don't know'', I prefer to
 estimate on the (unsure) data sources that are available.

Not that I like to be polemic, but this sentence doesn't mean anything.

If the answer is "we don't know", then we don't know. The problem is that you 
don't give any grounds for your claims, hence it is far worse to give any 
estimate.

The only serious analysis was the one made by Bernd Eckenfels, which ended 
with 1%. I don't really believe this can be used as it is before another 
contradictory analysis can be done.

Of course, you surely do not need such serious considerations for your precise 
issue, but claiming that any ungrounded estimation is better than nothing 
made me tilt :)


Romain





Re: percentage of popcon submitters

2009-01-16 Thread markus schnalke
[2009-01-16 12:06] Romain Beauxis to...@rastageeks.org
 Le Friday 16 January 2009 11:51:50 markus schnalke, vous avez écrit :
  [2009-01-16 10:09] Neil Williams codeh...@debian.org
 
   The whole thing is a complete unknown.
 
  Of course you're right. But it's the best we have.
 
  Instead of leaving it with ``we simply don't know'', I prefer to
  estimate on the (unsure) data sources that are available.
 
 Not that I like to be polemic, but this sentence doesn't mean anything.
 
 If the answer is "we don't know", then we don't know. The problem is that you 
 don't give any grounds for your claims, hence it is far worse to give any 
 estimate.

IMO that's not right if you want to act. Acting without any plan is
worse than acting on an estimate (one based on the best
information available).

Imagine navigating in permanent fog. I think it is better to head
in the most probable direction than to do nothing because
you don't know. Of course this again depends on external factors
like dangerous ground ... thus no answer fits every case.

In the end it's a philosophical question anyway. ;-)


 Of course, you surely do not need such serious considerations for your 
 precise 
 issue

correct ;-)


meillo




Re: percentage of popcon submitters

2009-01-16 Thread Simon Josefsson
Neil Williams codeh...@debian.org writes:

 On Fri, 16 Jan 2009 08:45:12 +0100
 Kjeldgaard Morten m...@bioxray.au.dk wrote:

 
  Thanks. Unless you setup some experimental method, any argument  
  should reduce
  to handwaving or extension of various particular examples..
 
 Surely, it must be possible to get an estimate of the number of  
 downloads of important packages and security updates? I know these  
 downloads also are requested from mirror sites, but at least for the  
 official mirror sites their relative activity must be known?

 How do you map the number of downloads to the number of users or
 machines?

It would establish an upper bound of well-administrated debian machines,
I think.

 I have dozens of chroots that I use for multiple reasons.

Good point.  I wonder how much these contribute to the overall
statistics though.  Alternatively, one could argue relatively convincingly
that a chroot with a complete debian system should be counted as another
debian installation.  Compare with virtual machines, which are rather
similar to a chroot installation on a normal PC.

/Simon





Re: percentage of popcon submitters

2009-01-16 Thread Franklin PIAT
Hi

Noah Slater wrote:
 On Thu, Jan 15, 2009 at 10:00:04PM +0100, markus schnalke wrote:
 I know it is not possible to _know_ what percentage of all users
 submit popcon stats. But I want to ask for guesses,
 because more opinions are likely to improve the result.

 [..] is like this famous old problem: Nobody was
 permitted to see the Emperor of China, and the question was, What is the
 length of the Emperor of China's nose? [..]

I am writing a paper on popcon (ranking package usage), which isn't
finished yet.
However I have already identified the following biases:

popcon bias
-----------
* popularity-contest isn't part of any task (but prompted by D-I,
  since Etch). Therefore it is unlikely to be active on systems
  installed using debootstrap (xen-create-image, etc.).
* Votes are purged after 23 days of inactivity.
* Popcon statistics combine Old-Stable, Stable, Testing and
  Unstable... and maybe some Debian derivatives.

user/sysadmin bias
------------------
* Installing popularity-contest is optional.
* Some systems are improperly configured and can't submit
  their popcon (missing http proxy; smtp server is wrong or blocked
  by their ISP). Especially when people are travelling.
* People deploying Debian in large companies usually disable
  popcon.
* Security concerns (Do you think that people who install security
  audit tools, like nessus, will submit popcon?).
* ISPs and hosting companies disable popcon by default.

Franklin






Re: percentage of popcon submitters

2009-01-16 Thread Simon Josefsson
James Vega james...@debian.org writes:

 On Thu, Jan 15, 2009 at 4:55 PM, markus schnalke mei...@marmaro.de wrote:
 [2009-01-15 22:37] Michael Goetze mgoe...@mgoetze.net

 before wild speculation ensues, you might want to specify what you
 really want to know: the percentage of people installing debian systems
 who use popcon (always/sometimes), or the percentage of installed
 machines that submit popcon data?

 Seems my wording was unclear.

 I want to know the percentage of installed machines that submit popcon
 data.

 That requires knowing the number of computers that have Debian installed
 which, as has been discussed various times in the past on this list, is
 difficult to determine.

How about numbers for security.debian.org downloads?  That will measure
the number of well-administrated debian machines (except those
well-administrated machines that use other mirrors).

/Simon





Re: percentage of popcon submitters

2009-01-16 Thread Neil Williams
On Fri, 16 Jan 2009 13:24:58 +0100
Simon Josefsson si...@josefsson.org wrote:

 Neil Williams codeh...@debian.org writes:
 
  Surely, it must be possible to get an estimate of the number of  
  downloads of important packages and security updates? I know these  
  downloads also are requested from mirror sites, but at least for the  
  official mirror sites their relative activity must be known?
 
  How do you map the number of downloads to the number of users or
  machines?
 
 It would establish an upper bound of well-administrated debian machines,
 I think.

No, merely the number of installations which is not the same, clearly.

Chroots can be entirely temporary. I regularly hammer the mirrors to
create test chroots that last a matter of minutes. (Usually in a different
architecture each time, hence a proxy isn't much help.)

It's not just chroots either - don't forget issues of local mirrors.
Download measurements cannot take account of whether the downloaded
file is actually installed or merely copied into another repository.

  I have dozens of chroots that I use for multiple reasons.
 
 Good point.  I wonder how much these contribute to the overall
 statistics though. 

Currently, these are all ignored for popcon but would register in any
download measurements - repeatedly.

 Alternatively, one could argue relatively convincingly
 that a chroot with a complete debian system should be counted as another
 debian installation.  Compare with virtual machines, which are rather
 similar to a chroot installation on a normal PC.

In that case, I'm probably responsible for thousands of 'installations'
and those DDs involved in D-I must have an inconceivable number of
installs. Frans? Any idea how many installs you've clocked up by
that measure?

Out of all those thousands of chroots and dozens of local mirrors that
I've created (and subsequently removed) just since Etch, I've only
actually got 6 real machines that use Debian. Six machines that
actually boot, six that have real users and need maintenance - only
four that are regularly powered on.

IMHO, a chroot is not an installation - even if the chroot contains the
whole of GPE or the whole of GNOME, it's a chroot. It doesn't have real
users, it doesn't boot, it is a test environment only. If you want to
count well maintained Debian machines you have to exclude all chroots
and all local mirrors.

How is it worth recording data from a debug install that lasts only a
few seconds after completing the install and which is instantly
replaced by yet another test?

How is it worth recording any data from downloads that merely result in
yet another copy of the original mirror?

-- 


Neil Williams
=
http://www.data-freedom.org/
http://www.linux.codehelp.co.uk/
http://e-mail.is-not-s.ms/





Re: percentage of popcon submitters

2009-01-16 Thread Neil Williams
On Fri, 16 Jan 2009 13:21:29 +
Neil Williams codeh...@debian.org wrote:

 In that case, I'm probably responsible for thousands of 'installations'

OK, that's an exaggeration but it's certainly hundreds since Etch.

-- 


Neil Williams
=
http://www.data-freedom.org/
http://www.linux.codehelp.co.uk/
http://e-mail.is-not-s.ms/





Re: percentage of popcon submitters

2009-01-16 Thread Luciano Bello
El Vie 16 Ene 2009, Simon Josefsson escribió:
 How about numbers for security.debian.org downloads?  That will measure
 the number of well-administrated debian machines (except those
 well-administrated machines that use other mirrors).

well-administrated *etch* machines.

luciano





Re: percentage of popcon submitters

2009-01-16 Thread Adeodato Simó
* Luciano Bello [Fri, 16 Jan 2009 11:37:39 -0200]:

 El Vie 16 Ene 2009, Simon Josefsson escribió:
  How about numbers for security.debian.org downloads?  That will measure
  the number of well-administrated debian machines (except those
  well-administrated machines that use other mirrors).

 well-administrated *etch* machines.

Hey, Lenny too.

-- 
Adeodato Simó dato at net.com.org.es
Debian Developer  adeodato at debian.org
 
There is no man so good who, were he to submit all his thoughts to the
laws, would not deserve hanging ten times in his life.
-- Michel de Montaigne





Re: percentage of popcon submitters

2009-01-16 Thread Simon Josefsson
Neil Williams codeh...@debian.org writes:

 On Fri, 16 Jan 2009 13:24:58 +0100
 Simon Josefsson si...@josefsson.org wrote:

 Neil Williams codeh...@debian.org writes:
 
  Surely, it must be possible to get an estimate of the number of  
  downloads of important packages and security updates? I know these  
  downloads also are requested from mirror sites, but at least for the  
  official mirror sites their relative activity must be known?
 
  How do you map the number of downloads to the number of users or
  machines?
 
 It would establish an upper bound of well-administrated debian machines,
 I think.

 No, merely the number of installations which is not the same, clearly.

 Chroots can be entirely temporary. I regularly hammer the mirrors to
 create test chroots that last a matter of minutes. (Usually in a different
 architecture each time, hence a proxy isn't much help.)

 It's not just chroots either - don't forget issues of local mirrors.
 Download measurements cannot take account of whether the downloaded
 file is actually installed or merely copied into another repository.

It would still provide an upper bound, but the local mirror exception
is a good point.  So the number derived from security.debian.org
statistics would be 'an upper bound on the number of well-administrated
debian installations that do not use local security mirrors'.  I assume
this number is correlated to the number of real debian installations
(although I'm not sure we have a good definition of real here?).

Merely the number of distinct IP addresses downloading a particular
popular update from security.debian.org at least once would be
interesting.
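
Counting distinct addresses for one update could look roughly like the sketch below. It assumes a log layout in which the client address is the first whitespace-separated field of each line; the actual security.debian.org log format is not known here, and the file names and addresses in the sample are made up.

```python
def distinct_downloaders(log_lines, filename):
    """Count distinct client addresses that requested filename at least once.

    Assumes the client address is the first whitespace-separated field of
    each line, as in the common web server log layout.
    """
    clients = set()
    for line in log_lines:
        fields = line.split()
        if fields and filename in line:
            clients.add(fields[0])
    return len(clients)

# Made-up sample lines using documentation-only addresses.
sample = [
    '192.0.2.10 - - [16/Jan/2009:13:00:01 +0000] "GET /pool/example_1.0-1_i386.deb HTTP/1.1" 200 1024',
    '192.0.2.10 - - [16/Jan/2009:13:05:09 +0000] "GET /pool/example_1.0-1_i386.deb HTTP/1.1" 200 1024',
    '198.51.100.7 - - [16/Jan/2009:14:11:30 +0000] "GET /pool/example_1.0-1_i386.deb HTTP/1.1" 200 1024',
    '203.0.113.5 - - [16/Jan/2009:14:12:00 +0000] "GET /pool/other_2.0-1_i386.deb HTTP/1.1" 200 2048',
]
print(distinct_downloaders(sample, "example_1.0-1_i386.deb"))
```

A set is used so that repeated downloads from one address, such as rebuilt chroots, count only once.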

/Simon





Re: percentage of popcon submitters

2009-01-16 Thread George Danchev
On Friday 16 January 2009 15:42:38 Neil Williams wrote:
 On Fri, 16 Jan 2009 13:21:29 +

 Neil Williams codeh...@debian.org wrote:
  In that case, I'm probably responsible for thousands of 'installations'

 OK, that's an exaggeration but it's certainly hundreds since Etch.

This is true, but I imagine that this could be alleviated by taking into 
account some popular file modification time, like /etc/debian_version? If 
found to be young enough, then we can count that as a throwaway we don't 
care about, otherwise it is a valid entry, no matter how and for what reasons 
the installation has been created.
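
A rough sketch of that age check, assuming a 30-day threshold (the thread names no figure) and hypothetical function names. The pure helper takes timestamps directly so that it can be tried without touching the filesystem.

```python
import os
import time

SECONDS_PER_DAY = 86400

def is_throwaway(mtime, now, min_age_days=30):
    """True if the marker file is younger than min_age_days."""
    return (now - mtime) / SECONDS_PER_DAY < min_age_days

def installation_is_throwaway(path="/etc/debian_version", min_age_days=30):
    """Apply the age test to a real file; raises OSError if it is missing."""
    return is_throwaway(os.stat(path).st_mtime, time.time(), min_age_days)
```

Note that a package upgrade which rewrites the file would reset its modification time, so a long-lived machine could briefly look like a fresh one.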

Still, I don't see a good way to measure non-internetworked users (imagine 
here evil authorities, secret labs, mysterious scientists performing evil 
calculations, etc;-) and those who just answered no to popcon's question. 
We can easily ignore the former (since they are in the 'No Such Entity' 
subset anyway;-), but the latter matters, and are still adding uncertainty to 
the equation.

It is most likely that the more we beat that, the more unknown quantities will 
be brought to the scene, hence why bother ;-)

-- 
pub 4096R/0E4BD0AB 2003-03-18 people.fccf.net/danchev/key pgp.mit.edu





Re: percentage of popcon submitters

2009-01-16 Thread Johannes Wiedersich

Simon Josefsson wrote:
 Merely the number of distinct IP addresses downloading a particular
 popular update from security.debian.org at least once would be
 interesting.

Did you think about thousands of computers having 'private ips' with
some nat translation and/or local proxy? (I'm thinking of computer
labs, companies, etc. not just the odd home user). Essentially all of
the computers at our department share the same public IP.

Johannes






Re: percentage of popcon submitters

2009-01-16 Thread Russ Allbery
Petter Reinholdtsen p...@hungry.com writes:

 A while back, someone with access to the download logs for
 security.debian.org tried to estimate the number of machines downloading
 security fixes for Debian, based on the assumption that no-one is using
 a mirror for security fixes.  I am unable to find those results using
 Google right now, but would recommend trying to get hold of those
 numbers to get a new lower bound on the number of Debian installations.

It's worth bearing in mind that that's a bad assumption, too.  We use a
local security mirror in full knowledge that it's not recommended, but we
watch it closely and will manually sync if need be.  We do this because we
have systems on IP addresses that are not routable to the Internet and
need an on-campus source for all package updates.  Having a local mirror
is easier for our purposes than using a proxy.

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/





Re: percentage of popcon submitters

2009-01-16 Thread Kjeldgaard Morten

On 16/01/2009, at 11.09, Neil Williams wrote:


How do you map the number of downloads to the number of users or
machines? I have dozens of chroots that I use for multiple reasons.
Now, maybe I should use an apt proxy but most of these are
cross-building chroots so that doesn't help as the proxy will have
amd64 packages and I need arm or armel etc.



Chroots have that effect, but the number of users using chroots is  
negligible compared to the total number of users. They would  
artificially boost the number of users, whereas other effects -- for  
example sites having a local repo -- would tend to lower it. Besides,  
many users of chroots use apt-cacher like me, which would not add to  
the count at all.


I agree that it's not possible to get an accurate number, but the OP  
specifically wants an estimate. An estimate of the number of downloads  
would make it possible to estimate the fraction of sites that  
install popcon so a corrective factor could be applied to the popcon  
statistics.


Cheers,
Morten







Re: percentage of popcon submitters

2009-01-16 Thread The Fungi
On Fri, Jan 16, 2009 at 11:18:14AM -0800, Russ Allbery wrote:
 It's worth bearing in mind that that's a bad assumption, too. We
 use a local security mirror in full knowledge that it's not
 recommended, but we watch it closely and will manually sync if
 need be. We do this because we have systems on IP addresses that
 are not routable to the Internet and need an on-campus source for
 all package updates. Having a local mirror is easier for our
 purposes than using a proxy.

Same here, though with a caching Debian package proxy instead of an
actual mirror. Nonetheless, s.d.o only sees one download of a given
security update even though it's actually being retrieved by
hundreds of machines.

For that matter, we also have a separate network for facility
security systems which is not connected to the Internet, and once a
month we rsync a detachable drive on an Internet-connected machine,
then sneaker-net it back to the system acting as a security update
repository for the other hosts.
-- 
{ IRL(Jeremy_Stanley); PGP(9E8DFF2E4F5995F8FEADDC5829ABF7441FB84657);
SMTP(fu...@yuggoth.org); IRC(fu...@irc.yuggoth.org#ccl); ICQ(114362511);
AIM(dreadazathoth); YAHOO(crawlingchaoslabs); FINGER(fu...@yuggoth.org);
MUD(fu...@katarsis.mudpy.org:6669); WWW(http://fungi.yuggoth.org/); }





Re: percentage of popcon submitters

2009-01-16 Thread Simon Josefsson
Johannes Wiedersich johan...@physik.blm.tu-muenchen.de writes:

 Simon Josefsson wrote:
 Merely the number of distinct IP addresses downloading a particular
 popular update from security.debian.org at least once would be
 interesting.

 Did you think about thousands of computers having 'private ips' with
 some nat translation and/or local proxy? (I'm thinking of computer
 labs, companies, etc. not just the odd home user). Essentially all of
 the computers at our department share the same public IP.

Right, there are many reasons why such a number wouldn't be perfect, but
I still believe it would be interesting to know.  Especially if you plot
the trend in a graph to watch yearly changes.  If you get 10 such
indicator variables that likely are somehow correlated to the number of
machines (virtual or not) running debian plotted in a graph, and watch
the trends, that is likely to be the best measure we are likely to ever
get.  Or are there any better ideas on how to get a good estimate?

/Simon





Re: percentage of popcon submitters

2009-01-16 Thread Kjeldgaard Morten


On 16/01/2009, at 23.25, The Fungi wrote:


Same here, though with a caching Debian package proxy instead of an
actual mirror. Nonetheless, s.d.o only sees one download of a given
security update even though it's actually being retrieved by
hundreds of machines.



On 16/01/2009, at 18.27, Johannes Wiedersich wrote:


Did you think about thousands of computers having 'private ips' with
some nat translation and/or local proxy? (I'm thinking of computer
labs, companies, etc. not just the odd home user). Essentially all of
the computers at our department share the same public IP.


Hundreds of machines accessing proxies, and thousands having private  
IPs. Are these numbers something you know or are you just throwing  
them around? Otherwise they can of course be accounted for in the  
total estimate ;-)


Cheers,
Morten





Re: percentage of popcon submitters

2009-01-16 Thread Bernd Eckenfels
In article 200901161206.13302.to...@rastageeks.org you wrote:
 If the answer is we don't know, then we don't know. Problem is that you 
 don't give any ground to your claims, hence it is far worse to give any 
 estimation.

But if you say "we see security downloads from x unique IPs for every new
update", then you have a lower bound.
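
As a rough illustration of that lower bound, here is a minimal Python sketch that counts the distinct client addresses fetching one update. The Apache-style log format, the sample lines, the package file name and the function name are all assumptions made up for this example; nothing here reflects how security.debian.org actually logs requests.

```python
def unique_client_ips(log_lines, path_fragment):
    """Return the set of client IPs that requested a path containing path_fragment."""
    ips = set()
    for line in log_lines:
        fields = line.split()
        # Assumed layout: ip ident user [time zone] "METHOD path proto" status size
        if len(fields) < 7:
            continue  # skip malformed lines
        client_ip, request_path = fields[0], fields[6]
        if path_fragment in request_path:
            ips.add(client_ip)
    return ips


# Hypothetical sample log; the addresses come from the documentation ranges.
SAMPLE_LOG = [
    '192.0.2.10 - - [16/Jan/2009:10:00:00 +0100] "GET /pool/updates/main/f/foo/foo_1.0-1+lenny1_i386.deb HTTP/1.1" 200 51234',
    '192.0.2.10 - - [16/Jan/2009:10:05:00 +0100] "GET /pool/updates/main/f/foo/foo_1.0-1+lenny1_i386.deb HTTP/1.1" 200 51234',
    '198.51.100.7 - - [16/Jan/2009:11:00:00 +0100] "GET /pool/updates/main/f/foo/foo_1.0-1+lenny1_i386.deb HTTP/1.1" 200 51234',
    '203.0.113.5 - - [16/Jan/2009:11:30:00 +0100] "GET /pool/updates/main/b/bar/bar_2.0-3_i386.deb HTTP/1.1" 200 999',
]

if __name__ == "__main__":
    ips = unique_client_ips(SAMPLE_LOG, "foo_1.0-1+lenny1")
    print(len(ips), "distinct addresses fetched the update")
```

Every NAT gateway or caching proxy shows up as a single address here, however many machines sit behind it, which is why the count can only serve as a floor.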

 The only serious analysis was the one made by Bernd Eckenfels, which ended 
 with 1%. I don't really believe this can be used as it is before another 
 contradictory analysis can be done.

Well, it's not too serious, since Linuxcounter itself only estimates the
29 million users.

Gruss
Bernd





Re: percentage of popcon submitters

2009-01-16 Thread Bernd Eckenfels
In article 87d4enbfqd@mocca.josefsson.org you wrote:
 It would establish an upper bound of well-administered Debian machines,
 I think.

It is a lower bound, since I guess there are more cases where more than one
machine is updated from a single download. Cases where you download without
need, or as a duplicate (with multiple IPs), are very rare.

Gruss
Bernd





Re: percentage of popcon submitters

2009-01-16 Thread The Fungi
On Sat, Jan 17, 2009 at 01:05:47AM +0100, Kjeldgaard Morten wrote:
 Hundreds of machines accessing proxies, and thousands having
 private IPs. Are these numbers something you know or are you just
 throwing them around? Otherwise they can of course be accounted
 for in the total estimate ;-)

I could count mine with wc -l (I have to maintain a list to kick off
aptitude runs on them all). As for Russ, I'm sure he could at least
analyze his mirror's access logs for a more precise value if he
wanted to, just as someone with access to the s.d.o logs might. The
bigger issue is that there are almost certainly enough sites like
ours to make them a drop in the bucket, skewing the measure downward
fairly significantly.

Economies of scale, security, bandwidth costs, IPv4 runout, et
cetera assure that the more Debian machines running at a given
location, the less likely it is that you'll be able to count them
externally without popcon or something similar installed (and as has
been discussed, these are the same ones who are less likely to
install it site-wide anyway). The most I expect you can hope for
here is a relatively reasonable lower bound on actively-updated
systems. If "there are probably at least X number of Debian systems
on the Internet, but maybe many more" isn't a sufficiently useful
statement for you, you might be out of luck.
-- 
{ IRL(Jeremy_Stanley); PGP(9E8DFF2E4F5995F8FEADDC5829ABF7441FB84657);
SMTP(fu...@yuggoth.org); IRC(fu...@irc.yuggoth.org#ccl); ICQ(114362511);
AIM(dreadazathoth); YAHOO(crawlingchaoslabs); FINGER(fu...@yuggoth.org);
MUD(fu...@katarsis.mudpy.org:6669); WWW(http://fungi.yuggoth.org/); }





Re: percentage of popcon submitters

2009-01-15 Thread Michael Goetze

Hi,


I know it is not possible to _know_ the real percentage of users who
submit popcon stats, out of all users. But I want to ask for guesses,
because more opinions will likely improve the result.

My current guess is between 1/3 and 2/3.

What do you think?


Before wild speculation ensues, you might want to specify what you 
really want to know: the percentage of people installing debian systems 
who use popcon (always/sometimes), or the percentage of installed 
machines that submit popcon data?


For example, I'm pretty sure any hosting company offering Debian on 
dedicated servers will disable popcon by default...


Regards,
Michael





Re: percentage of popcon submitters

2009-01-15 Thread markus schnalke
[2009-01-15 22:37] Michael Goetze mgoe...@mgoetze.net
 
 Before wild speculation ensues, you might want to specify what you 
 really want to know: the percentage of people installing debian systems 
 who use popcon (always/sometimes), or the percentage of installed 
 machines that submit popcon data?

Seems my wording was unclear.

I want to know the percentage of installed machines that submit popcon
data.


Maybe some words about the background of this question:
I want to estimate the number of users of some software. So I look at
Popcon, which tells me the number of installations of the package. Now
I need a good multiplier (the one I am asking about) to obtain an
estimated number of users within Debian. I'll do the same for Ubuntu's
popcon and add guessed numbers for usage on other GNU and Unix systems.
That should lead to a fairly good estimate.


 For example, I'm pretty sure any hosting company offering Debian on 
 dedicated servers will disable popcon by default...

And there are a number of systems without a network connection; they will
not submit either.


meillo




Re: percentage of popcon submitters

2009-01-15 Thread James Vega
On Thu, Jan 15, 2009 at 4:55 PM, markus schnalke mei...@marmaro.de wrote:
 [2009-01-15 22:37] Michael Goetze mgoe...@mgoetze.net

 Before wild speculation ensues, you might want to specify what you
 really want to know: the percentage of people installing debian systems
 who use popcon (always/sometimes), or the percentage of installed
 machines that submit popcon data?

 Seems my wording was unclear.

 I want to know the percentage of installed machines that submit popcon
 data.

That requires knowing the number of computers that have Debian installed
which, as has been discussed various times in the past on this list, is
difficult to determine.

-- 
James
GPG Key: 1024D/61326D40 2003-09-02 James Vega james...@debian.org





Re: percentage of popcon submitters

2009-01-15 Thread Michael Goetze

James Vega wrote:

On Thu, Jan 15, 2009 at 4:55 PM, markus schnalke mei...@marmaro.de wrote:

I want to know the percentage of installed machines that submit popcon
data.


That requires knowing the number of computers that have Debian installed
which, as has been discussed various times in the past on this list, is
difficult to determine.


And even then it might not help answer your question, for instance if 
you have a desktop application the percentage of popcon submitters might 
be higher than average among your users, whereas if you have some 
software mainly useful on classified military machines, the percentage 
of popcon submitters might be lower than average among your users.






Re: percentage of popcon submitters

2009-01-15 Thread Patrick Matthäi
Michael Goetze wrote:
 Hi,
 
 I know it is not possible to _know_ the real percentage of users who
 submit popcon stats, out of all users. But I want to ask for guesses,
 because more opinions will likely improve the result.

 My current guess is between 1/3 and 2/3.

 What do you think?
 
 Before wild speculation ensues, you might want to specify what you
 really want to know: the percentage of people installing debian systems
 who use popcon (always/sometimes), or the percentage of installed
 machines that submit popcon data?
 
 For example, I'm pretty sure any hosting company offering Debian on
 dedicated servers will disable popcon by default...
 

Hello,

At first I deactivated popcon on every machine myself.
Later I installed and activated it on every server, and some time
after that on my desktop as well.

So, going by my own case, I do not think it is really deactivated on
dedicated servers in most cases.
But yeah, it may differ from person to person :)


-- 
/*
Mit freundlichem Gruß / With kind regards,
Patrick Matthäi

E-Mail: patrick.matth...@web.de

Comment:
Always if we think we are right,
we were maybe wrong.
*/





Re: percentage of popcon submitters

2009-01-15 Thread Luciano Bello
On Thu, 15 Jan 2009, markus schnalke wrote:
 My current guess is between 1/3 and 2/3.

That means there are between 78055/(2/3) = 117,082.5 and 78055/(1/3) = 234,165
Debian installations. That doesn't look like a big number... I think there are
more of us.

Maybe your estimation is too high.
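
For anyone who wants to redo that arithmetic with other guesses, a small Python sketch follows. The function name is made up for this example; the only input taken from the thread is the 78055 submissions figure, and exact fractions are used so that 1/3 does not pick up floating-point noise.

```python
from fractions import Fraction

POPCON_SUBMISSIONS = 78055  # figure quoted in this message


def implied_installations(submissions, submit_fraction):
    """Total installations implied if submit_fraction of all machines report to popcon."""
    if not 0 < submit_fraction <= 1:
        raise ValueError("submit_fraction must be in (0, 1]")
    return Fraction(submissions) / Fraction(submit_fraction)


if __name__ == "__main__":
    for guess in (Fraction(1, 3), Fraction(2, 3)):
        total = implied_installations(POPCON_SUBMISSIONS, guess)
        print(f"{guess}: about {float(total):,.1f} installations")
```

The smaller the assumed fraction of submitters, the larger the implied installed base, so a guess that yields an implausibly small total is a hint that the fraction was set too high.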

luciano





Re: percentage of popcon submitters

2009-01-15 Thread Bernd Eckenfels
In article 20090115210004.gv21...@serveme.schnalke.local you wrote:
 My current guess is between 1/3 and 2/3.

Machines or Users?

According to Linuxcounter there are an estimated 29,000,000 users, and Debian has
18.36% of them, which comes to roughly 5 million Debian users. Popcon lists 78k
submissions, which is less than 2%.
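
Spelled out as a small Python sketch, the estimate looks like this. The constants are the figures quoted in this message ("78k" is taken as 78,000); the function names are made up for the example, and the result is only as good as the Linuxcounter numbers themselves.

```python
LINUXCOUNTER_USERS = 29_000_000   # Linuxcounter estimate of all Linux users
DEBIAN_SHARE_BASIS_POINTS = 1836  # 18.36 %, kept as an integer to avoid float rounding
POPCON_SUBMISSIONS = 78_000       # "78k", rounded as in the message


def debian_users(total_users, share_basis_points):
    """Estimated Debian users, given a share expressed in hundredths of a percent."""
    return total_users * share_basis_points // 10_000


def submitter_percentage(submissions, users):
    """Popcon submissions as a percentage of the estimated user base."""
    return 100.0 * submissions / users


if __name__ == "__main__":
    users = debian_users(LINUXCOUNTER_USERS, DEBIAN_SHARE_BASIS_POINTS)
    pct = submitter_percentage(POPCON_SUBMISSIONS, users)
    print(f"{users:,} estimated Debian users, {pct:.2f}% submitting to popcon")
```

Note that this compares machines (popcon submissions) against users (Linuxcounter), so it mixes two different units and is only an order-of-magnitude check.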

Gruss
Bernd





Re: percentage of popcon submitters

2009-01-15 Thread Russ Allbery
markus schnalke mei...@marmaro.de writes:

 I know it is not possible to _know_ the real percentage of users who
 submit popcon stats, out of all users. But I want to ask for guesses, because
 more opinions will likely improve the result.

 My current guess is between 1/3 and 2/3.

 What do you think?

We run Debian or Ubuntu on around 500 machines and only submit popcon
results from about four of them.

It's hard to talk institutional computer security departments into the
idea that the minor risk of the information released by popcon (what
packages on your servers are missing security patches, basically) is worth
it for just the warm fuzzy feelings of contributing.  I suspect that
popcon is not running on most large institutional Debian installations.

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/





Re: percentage of popcon submitters

2009-01-15 Thread Romain Beauxis
On Thursday 15 January 2009 23:25:02, Bernd Eckenfels wrote:
 In article 20090115210004.gv21...@serveme.schnalke.local you wrote:
  My current guess is between 1/3 and 2/3.

 Machines or Users?

 According to Linuxcounter there are an estimated 29,000,000 users, and Debian
 has 18.36% of them, which comes to roughly 5 million Debian users. Popcon lists
 78k submissions, which is less than 2%.

Thanks. Unless you set up some experimental method, any argument will reduce 
to handwaving or generalization from particular examples.

The big question (and the big troll) hidden behind this question is the 
total number of installed Debian systems.

Since this value is discussed again and again, I don't think there is any 
broadly accepted counting method, hence I don't think the original question 
can be answered...

Or, if you really want to troll, let's switch to the total number of installed 
Debian systems, since this is equivalent, but the de^C^Cbattle should be more 
fun... :-)


Romain





Re: percentage of popcon submitters

2009-01-15 Thread Noah Slater
On Thu, Jan 15, 2009 at 10:00:04PM +0100, markus schnalke wrote:
 I know it is not possible to _know_ the real percentage of users who
 submit popcon stats, out of all users. But I want to ask for guesses,
 because more opinions will likely improve the result.

   This question of trying to figure out whether a book is good or bad by
  looking at it carefully or by taking the reports of a lot of people who
  looked at it carelessly is like this famous old problem: Nobody was
  permitted to see the Emperor of China, and the question was, What is the
  length of the Emperor of China's nose? To find out, you go all over the
  country asking people what they think the length of the Emperor of China's
  nose is, and you average it. And that would be very accurate because you
  averaged so many people. But it's no way to find anything out; when you have
  a very wide range of people who contribute without looking carefully at it,
  you don't improve your knowledge of the situation by averaging.

   -- Richard P. Feynman, Surely You're Joking, Mr. Feynman!

-- 
Noah Slater, http://tumbolia.org/nslater





Re: percentage of popcon submitters

2009-01-15 Thread markus schnalke
[2009-01-16 05:59] Noah Slater nsla...@tumbolia.org
 On Thu, Jan 15, 2009 at 10:00:04PM +0100, markus schnalke wrote:
  I know it is not possible to _know_ the real percentage of users who
  submit popcon stats, out of all users. But I want to ask for guesses,
  because more opinions will likely improve the result.
 
This question of trying to figure out whether a book is good or bad by
   looking at it carefully or by taking the reports of a lot of people who
   looked at it carelessly is like this famous old problem: Nobody was
   permitted to see the Emperor of China, and the question was, What is the
   length of the Emperor of China's nose? To find out, you go all over the
   country asking people what they think the length of the Emperor of China's
   nose is, and you average it. And that would be very accurate because you
   averaged so many people. But it's no way to find anything out; when you have
   a very wide range of people who contribute without looking carefully at it,
   you don't improve your knowledge of the situation by averaging.
 
    -- Richard P. Feynman, Surely You're Joking, Mr. Feynman!

Good point, but one may refer to the Delphi method:
http://en.wikipedia.org/wiki/Delphi_method

However, the answers I received actually helped me. Not because they
were estimates, but because they were comments on what to keep in
mind.

In any case, I believe more opinions do improve results, not by
supplying numbers one can average, but by showing how others see the
situation. This widens one's own view.


meillo




Re: percentage of popcon submitters

2009-01-15 Thread markus schnalke
[2009-01-15 23:25] Bernd Eckenfels e...@lina.inka.de
 In article 20090115210004.gv21...@serveme.schnalke.local you wrote:
  My current guess is between 1/3 and 2/3.
 
 Machines or Users?

Popcon focuses on machines. In the end I want users. But any number
would be good.


 According to Linuxcounter there are an estimated 29,000,000 users, and Debian has
 18.36% of them, which comes to roughly 5 million Debian users. Popcon lists 78k
 submissions, which is less than 2%.

That is really a good approach. Thanks for that!

It seems quite certain that the popcon submitters are less than 1/3
of all Debian users.

Luciano Bello's calculation pointed in a similar direction.


meillo




Re: percentage of popcon submitters

2009-01-15 Thread Kjeldgaard Morten


Thanks. Unless you set up some experimental method, any argument will
reduce to handwaving or generalization from particular examples.


Surely, it must be possible to get an estimate of the number of  
downloads of important packages and security updates? I know these  
downloads also are requested from mirror sites, but at least for the  
official mirror sites their relative activity must be known?


Cheers,
Morten

--
Morten Kjeldgaard m...@ubuntu.com
Ubuntu MOTU Developer
GPG Key ID: 404825E7

