Re: Editions vs. Spins (was: Re: F30: System-Wide Change proposal: DNF UUID)

2019-01-14 Thread John Harris
On Monday, January 14, 2019 6:48:47 PM EST Matthew Miller wrote:
> Merging Core and Extras into one thing was absolutely the
> right thing to do for the project, but not having a unique name for the
> resulting OS was a mistake and leads to this. Ah well.

In your opinion, is the purpose of the Fedora Project something other than the 
creation and maintenance of the distribution known as Fedora?


-- 
John M. Harris, Jr. 
Splentity
https://splentity.com/

___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org


Re: Editions vs. Spins (was: Re: F30: System-Wide Change proposal: DNF UUID)

2019-01-14 Thread Matthew Miller
On Mon, Jan 14, 2019 at 09:12:51PM +0100, Kevin Kofler wrote:
> > It's not an artificial distinction. Editions are particular solutions
> > targeting particular key use cases identified by the Fedora Board (and now
> > Council). This is different from a desktop Spin, which is focused on
> > delivering that particular technology, or from Labs, which are focused on
> > more niche use cases.
> This is a political/marketing distinction and not a technical one.

Yes.


> > Fedora is a Project. That Project makes an operating system platform and
> > various operating system and platform solutions.
> 
> Oh no, not the KDE rebranding fiasco here too!
> 
> Almost everyone still calls "KDE Plasma" just "KDE", despite all the 
> insistence that "KDE" is not a particular piece of software (anymore), but a 
> community. Trying to do the same to the "Fedora" brand is going to flop 
> exactly the same way.

This is not new. In Mo's blog post about the history of the Fedora logo,
there are separate logos for "Fedora Project" and for "Fedora Core", the
OS deliverable. Merging Core and Extras into one thing was absolutely the
right thing to do for the project, but not having a unique name for the
resulting OS was a mistake and leads to this. Ah well.

I'm not going to go out of my way to crusade about this by tracking down
people who Say It Wrong On The Internet, but I think as a project we can at
least attempt to be internally consistent, and I think there are huge
benefits in making sure Fedora (the project) isn't tied to one particular
output.


-- 
Matthew Miller

Fedora Project Leader


Re: Editions vs. Spins (was: Re: F30: System-Wide Change proposal: DNF UUID)

2019-01-14 Thread Dan Book
On Mon, Jan 14, 2019 at 6:20 PM  wrote:

> On Mon, Jan 14, 2019 at 12:05 PM, John Harris wrote:
> > The easiest way to make any of the Spins more accessible, for them to
> > have any chance comparable to the prominent advertising of Workstation
> > and similar options, would be to make them more prominent on the
> > "getfedora" index. This would also have a huge effect on SEO.
>
> So the reason spins are not very visible -- and ought to stay not very
> visible -- is that they don't get the same level of attention as the
> main products, and we don't really want anybody to download those
> unless they know in advance what they are doing. In particular, we
> really don't want Fedora to be judged by the quality of its spins and
> labs. There are a lot of them, and it's just not plausible to keep up
> with quality control for every one.
>
>
I agree that the spins in general could use more QA attention to fit this
role. But it is volunteer work: the QA team can only manage so many
editions. Only two people, including myself, even touch the Cinnamon spin,
and one person manages almost the entire stack of software it uses.
Personally, I am quite disincentivized to spend time on a spin which is
given so little fanfare by the project. The only reason I continue is that
I believe it is objectively a better desktop environment than the ones
Fedora pushes. Because it is so invisible, fewer people realize it exists,
and because of that, nearly nobody shows up to assist. It is a chicken and
egg problem.

-Dan


Re: Editions vs. Spins (was: Re: F30: System-Wide Change proposal: DNF UUID)

2019-01-14 Thread mcatanzaro
On Mon, Jan 14, 2019 at 12:05 PM, John Harris wrote:
> The easiest way to make any of the Spins more accessible, for them to
> have any chance comparable to the prominent advertising of Workstation
> and similar options, would be to make them more prominent on the
> "getfedora" index. This would also have a huge effect on SEO.


So the reason spins are not very visible -- and ought to stay not very 
visible -- is that they don't get the same level of attention as the 
main products, and we don't really want anybody to download those 
unless they know in advance what they are doing. In particular, we 
really don't want Fedora to be judged by the quality of its spins and 
labs. There are a lot of them, and it's just not plausible to keep up 
with quality control for every one.


The Plasma spin is perhaps an exception here. I could totally see that 
one being elevated to the level of Fedora product: "Fedora Plasma" or 
something like that. I wouldn't really mind having two desktop 
products, myself. We just can't create Fedora products for every single 
desktop out there, or the download page is going to become way too hard 
to navigate, and users will become less likely to wind up with the 
versions of Fedora that we want to promote. So if we promote KDE to a 
product, I'd say we'd have to draw the line there, and I'd argue that 
would make sense due to KDE's outsized importance to the Fedora 
community relative to other spins, and the QA it already receives 
(especially its blocker bug eligibility). I assume we fear branding 
difficulties if we have multiple UIs for Fedora? Perhaps it'd be a huge 
mistake. But the potential benefits of attracting more KDE users and 
developers to Fedora might well outweigh the cost! It's at least worth 
seriously considering.


Michael


Re: Editions vs. Spins (was: Re: F30: System-Wide Change proposal: DNF UUID)

2019-01-14 Thread Kevin Kofler
Matthew Miller wrote:
> It's not an artificial distinction. Editions are particular solutions
> targeting particular key use cases identified by the Fedora Board (and now
> Council). This is different from a desktop Spin, which is focused on
> delivering that particular technology, or from Labs, which are focused on
> more niche use cases.

This is a political/marketing distinction and not a technical one.

For Editions vs. Spins, the Editions are in practice all focusing on a 
particular technology: Workstation on GNOME, Server on server software, and 
Atomic/Silverblue on atomic updates. Workstation in particular attempts to 
simultaneously cater to very different "key use cases": web developers, 
gamers using proprietary graphics drivers, etc., so it is pretty much a 
general-purpose deliverable and not optimized for any particular use case; 
the only set point (not up for discussion) that I can see is that it is 
based on GNOME.

For Editions vs. Labs, the distinction between a "key" use case and a 
"niche" use case is purely subjective. (The only objective distinction that 
I see is that the Labs are actually much more tuned to their use cases than 
the Editions, which use them mostly as an alibi.) An ordering by decreasing 
download count would suffice to make the distinction between "key" and 
"niche" purely objective (and without having to draw a clear line where 
"key" ends and "niche" starts).

I can see the point of the distinction between Spins and Labs (at least as a 
terminology – the processes are essentially the same for both anyway), but 
Editions claim to be use-case-centric like Labs while really being 
technology-centric like Spins. So the marketing is pretty deceptive.

> Since this is an offshoot of a thread about metrics, I want to emphasize
> that by all the metrics we have, this has been *very* successful. Fedora
> numbers were flat-to-decreasing when we started this, and now they're
> steeply up and growing.

But the setup I propose has never been tried. The pre-"Fedora.Next" 
iterations of the Fedora download page were also heavily biased towards 
GNOME (or "Desktop" as the GNOME-based deliverable used to be called). So 
you do not have any usable metrics for comparison.

>> So if that is your concern, the solution would be to define some minimum
>> formal requirements for a Spin to be listed on the get.fp.o front page.
>> But then those requirements should also apply to the 3 "Editions": if
>> they don't fit the criteria, they should be kicked out as well. (I could
>> see that possibly happening for Server or Atomic/Silverblue at some
>> point. The Fedora user base is clearly desktop-centric. But I am NOT
>> saying that they should necessarily be delisted, just that they should be
>> held to the same maintenance standards as the Spins.)
> 
> There *are* "some minimum formal requirements". An Edition is a Fedora
> solution made by a formal Fedora Working Group in response to a strategic
> use case identified by the community through the Fedora Council.

That is not a formal requirement, it's a subjective committee decision. (See 
also what happened when the KDE SIG tried to create a science-centered 
Edition based on KDE Plasma, capitalizing on the many scientific KDE 
(kdeedu) and Qt applications and on the work done by the KDE Scientific and 
KDE Astronomy Labs. The Board/Council was just not interested for purely 
political reasons.)

> The WG needs formal membership, needs to meet regularly, and needs to have
> a regularly-refreshed requirements document.

These are reasonable criteria for being listed (though I'd also add some 
technical usability criteria, to make sure that the WG is actually producing 
a usable deliverable), but they should be the same for all 
Spins/Labs/Editions independently of whether the Council subjectively 
believes that that particular work deserves being an "Edition" or not.

> I really, really, strongly encourage the team behind each spin to
> advertise more prominently. The Council is even willing to allocate funds
> as necessary to help do that.

No amount of advertising we can do is going to be as prominent as the 
getfedora download page. All users are driven to that page.

The only option would be to completely rebrand the Spin to an independent 
Remix with its own name and domain (so searches for the new name would go 
directly to the new domain and not to getfedora), but even then, it would be 
very tough to even come close to the brand recognition Fedora has.

> Fedora is a Project. That Project makes an operating system platform and
> various operating system and platform solutions.

Oh no, not the KDE rebranding fiasco here too!

Almost everyone still calls "KDE Plasma" just "KDE", despite all the 
insistence that "KDE" is not a particular piece of software (anymore), but a 
community. Trying to do the same to the "Fedora" brand is going to flop 
exactly the same way.

> Your "choose your Fedora adventure" page is 

Re: Editions vs. Spins (was: Re: F30: System-Wide Change proposal: DNF UUID)

2019-01-14 Thread John Harris
On Monday, January 14, 2019 12:56:30 PM EST Matthew Miller wrote:
> I think it's better to not focus so much on the central page or on the
> "getfedora" brochure site, and to instead make the page for each particular
> solution more useful and more discoverable.

The easiest way to make any of the Spins more accessible, for them to have any 
chance comparable to the prominent advertising of Workstation and similar 
options, would be to make them more prominent on the "getfedora" index. This 
would also have a huge effect on SEO.

Right now, in DuckDuckGo:

"download fedora" returns: https://getfedora.org/en/workstation/download/ 
first, and https://getfedora.org/ next.

"get fedora" returns https://getfedora.org/ first, and
https://getfedora.org/en/workstation/download/ next.

-- 
John M. Harris, Jr. 
Splentity
https://splentity.com/

Sent using on-screen keyboard on an X200 tablet, please excuse my brevity.



Re: F30: System-Wide Change proposal: DNF UUID

2019-01-14 Thread Matthew Miller
On Sun, Jan 13, 2019 at 02:15:19PM -0500, Stephen John Smoogen wrote:
> > Then can we change the title of the thread?
> Nico, you know this better than me. This is email not a forum. People can
> rename threads but depending on the email software it will just look like a
> completely different thread. I think a rename has been done, but people
> keep responding on this thread.

I have a draft update to the change 
https://fedoraproject.org/wiki/Changes/DNF_Better_Counting
which I'm waiting to hear back from the DNF team on. Once I do, Ben will
post that as a new thread with a new title.


-- 
Matthew Miller

Fedora Project Leader


Re: Editions vs. Spins (was: Re: F30: System-Wide Change proposal: DNF UUID)

2019-01-14 Thread Matthew Miller
On Sat, Jan 12, 2019 at 07:51:34PM +0100, Kevin Kofler wrote:
> I think John's statement was pretty clear: The artificial distinction 
> between "Editions" and "Spins" needs to go away.

It's not an artificial distinction. Editions are particular solutions
targeting particular key use cases identified by the Fedora Board (and now
Council). This is different from a desktop Spin, which is focused on
delivering that particular technology, or from Labs, which are focused on
more niche use cases.

Since this is an offshoot of a thread about metrics, I want to emphasize
that by all the metrics we have, this has been *very* successful. Fedora
numbers were flat-to-decreasing when we started this, and now they're
steeply up and growing.


> So if that is your concern, the solution would be to define some minimum 
> formal requirements for a Spin to be listed on the get.fp.o front page. But 
> then those requirements should also apply to the 3 "Editions": if they don't 
> fit the criteria, they should be kicked out as well. (I could see that 
> possibly happening for Server or Atomic/Silverblue at some point. The 
> Fedora user base is clearly desktop-centric. But I am NOT saying that they 
> should necessarily be delisted, just that they should be held to the same 
> maintenance standards as the Spins.)

There *are* "some minimum formal requirements". An Edition is a Fedora
solution made by a formal Fedora Working Group in response to a strategic
use case identified by the community through the Fedora Council. The WG
needs formal membership, needs to meet regularly, and needs to have a
regularly-refreshed requirements document.

> That said, I am pretty sure that if the Spins were more prominently 
> advertised, they would be more likely to attract helping hands. As it stands 
> now, users not yet familiar with Fedora might not even realize that the 
> Spins even exist.

I really, really, strongly encourage the team behind each spin to advertise
more prominently. The Council is even willing to allocate funds as necessary
to help do that.


> Fedora is an operating system that you can use, share, distribute, and
> modify as you like, all completely for free. More information.

Fedora is a Project. That Project makes an operating system platform and
various operating system and platform solutions.


[...]

Your "choose your Fedora adventure" page is interesting, but not new. We
talked about this with the design team and they're really not in favor of
that as the primary user experience for people who don't know what they
want. It can be overwhelming and potentially full of traps. 

I think it's better to not focus so much on the central page or on the
"getfedora" brochure site, and to instead make the page for each particular
solution more useful and more discoverable.


-- 
Matthew Miller

Fedora Project Leader


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-13 Thread Stephen John Smoogen
On Sun, 13 Jan 2019 at 12:47, Nico Kadel-Garcia  wrote:

> On Sun, Jan 13, 2019 at 11:54 AM Stephen John Smoogen 
> wrote:
> >
> >
> >
> > On Sat, 12 Jan 2019 at 22:25, Nico Kadel-Garcia 
> wrote:
> >>
> >> On Fri, Jan 11, 2019 at 4:37 PM Roberto Ragusa 
> wrote:
> >> >
> >> > On 1/8/19 4:22 PM, Lennart Poettering wrote:
> >> >
> >> > > If all you want to do is count, then it should be entirely sufficient
> >> > > to do it like this:
> >> > >
> >> > >GET /metalink?repo=fedora-28=x86_64==1 HTTP/1.1
> >> > >
> >> > > the first time within each one-week window and a simple
> >> > >
> >> > >GET /metalink?repo=fedora-28=x86_64= HTTP/1.1
> >> > >
> >> > > all other times.
> >> >
> >> > As an additional improvement, is it really needed to count every machine?
> >> > We can subsample a lot, and only let some specific machines to show
> >> > up for counting.
> >>
> >> The difficulty is not the counting. Requiring safe counting and
> >> aggregation by the server is a requirement that no server or
> >> intermediate server or proxy needs to follow, and would require
> >> configuration or filtering control of a server that is outside of
> >> client hands. It's not legally or technologically mandated. The great
> >> use for the data is tracking hosts, metadata that is saleable and
> >> likely to help provide a new form of tracking information.
> >>
> >> Writing this into the dnf behavior is typical, but it's not
> >> beneficial to the clients. It's beneficial to the mirrors, who are
> >> likely to sell the data. While it may be that infamous problem, a
> >> "Simple Matter Of Programming(tm)" to sanitize the data, there are
> >> strong motivations to collect it and sell it, and I'd expect various
> >> mirrors to start doing so within moments of the activation of the
> >> feature.
> >
> >
> > 1. The mirrors do not see this.
>
> If it's not available to the mirrors, then anyone who hardcodes a
> mirror's URL into the local "baseurl" settings is not going to be
> counted this way, and we're back at the "we don't know how many
> clients there are" problem. If only the "mirrorlist" hosts see the
> UUID, "countme" or any other identical client ID.
>
>
Since you seem to have avoided reading the emails where this was detailed,
here is the simplest version of the countme proposal. [Please see Lennart's
email and replies for the unshortened version.]

Once per time period (day, week, or month), an update request would just add
countme=1 to its query.

There is no more client ID. There is no data other than that. We would just
count all the countme=1 requests and get an idea of what was going on. It
isn't an exact number, but it puts some amount of solid-ness in the fuzzy
cloud. The more complicated version, which mattdm wants, is that countme
gets incremented each week after install. Nothing else. No data from
/etc/machine-id, no data from /var/yum/uuid, etc.
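
To make the simple version concrete, here is a rough Python sketch of both ends. It is only an illustration under stated assumptions, not DNF's or MirrorManager's actual code: the function names are hypothetical, the one-week window is just one of the periods mentioned above, and the `arch` parameter name is a guess, because the request lines quoted in this thread lost part of their query string in the archive.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed window; the thread mentions day, week, or month as options.
COUNT_WINDOW = timedelta(weeks=1)


def metalink_path(repo: str, arch: str,
                  last_counted: Optional[datetime], now: datetime) -> str:
    """Client side (hypothetical): add countme=1 only once per window.

    No UUID or machine-id is sent; the only extra data is the flag itself.
    """
    params = {"repo": repo, "arch": arch}  # "arch" is an assumed name
    if last_counted is None or now - last_counted >= COUNT_WINDOW:
        params["countme"] = "1"
    return "/metalink?" + urlencode(params)


def count_reports(paths: Iterable[str]) -> int:
    """Server side (hypothetical): tally the requests carrying countme=1."""
    total = 0
    for path in paths:
        query = parse_qs(urlsplit(path).query)
        if query.get("countme") == ["1"]:
            total += 1
    return total
```

The client only has to remember when it last sent the flag; the server keeps a running total and never needs to tell one client from another.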




> > 2. We aren't talking about UUIDs anymore and just a countme variable
> being sent periodically. If a countme is going to be too much data to send,
> then clients are probably already sending way too much data already.
>
> Then can we change the title of the thread?
>
>
Nico, you know this better than me. This is email not a forum. People can
rename threads but depending on the email software it will just look like a
completely different thread. I think a rename has been done, but people
keep responding on this thread.


>

-- 
Stephen J Smoogen.


Re: Editions vs. Spins (was: Re: F30: System-Wide Change proposal: DNF UUID)

2019-01-13 Thread Zbigniew Jędrzejewski-Szmek
On Sat, Jan 12, 2019 at 07:51:34PM +0100, Kevin Kofler wrote:
> Stephen John Smoogen wrote:
> > Side note, I was at a loss of what you were getting at. There were several
> > ways it could be interpreted and has been used by people in the past to
> > mean different things.
> 
> I think John's statement was pretty clear: The artificial distinction 
> between "Editions" and "Spins" needs to go away.
> 
> > The problem is that there is an inherent conflict of resources here. When
> > we put everything on the download pages, everyone including the spin
> > owners say it was too confusing.

The issue of which spins/editions are promoted is orthogonal to the
issue of counting. After all, counting just reflects the actual
frequency of installations, not the reasons for it.

But counting may provide a fresh look at this issue. We'll have much
better data on which spins/editions are used. If it turns out that KDE is
more popular than previous statistics showed, or that KDE has a higher
retention rate (the number of short-lived installations is low
suggesting that users "like it if they see it"), this would be a
strong argument to make the KDE spin more visible.
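
As a rough illustration of the retention idea, the sketch below assumes the weekly-increment variant of countme discussed elsewhere in this thread, and further assumes that each counted report can be attributed to a spin or edition. Neither the variant label nor the four-week threshold comes from the proposal; they are placeholders for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def retention_by_variant(reports: Iterable[Tuple[str, int]],
                         min_weeks: int = 4) -> Dict[str, float]:
    """Share of counted systems per variant that are at least min_weeks old.

    reports: (variant, weeks_since_install) pairs; both fields are
    hypothetical and only stand in for whatever the real data would carry.
    """
    totals: Dict[str, int] = defaultdict(int)
    retained: Dict[str, int] = defaultdict(int)
    for variant, weeks in reports:
        totals[variant] += 1
        if weeks >= min_weeks:
            retained[variant] += 1
    return {variant: retained[variant] / totals[variant] for variant in totals}
```

A high share for one variant would suggest few short-lived installations of it, which is the kind of evidence described above.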

Zbyszek


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-13 Thread Nico Kadel-Garcia
On Sun, Jan 13, 2019 at 11:54 AM Stephen John Smoogen  wrote:
>
>
>
> On Sat, 12 Jan 2019 at 22:25, Nico Kadel-Garcia  wrote:
>>
>> On Fri, Jan 11, 2019 at 4:37 PM Roberto Ragusa  wrote:
>> >
>> > On 1/8/19 4:22 PM, Lennart Poettering wrote:
>> >
>> > > If all you want to do is count, then it should be entirely sufficient
>> > > to do it like this:
>> > >
>> > >GET /metalink?repo=fedora-28=x86_64==1 
>> > > HTTP/1.1
>> > >
>> > > the first time within each one-week window and a simple
>> > >
>> > >GET /metalink?repo=fedora-28=x86_64= HTTP/1.1
>> > >
>> > > all other times.
>> >
>> > As an additional improvement, is it really needed to count every machine?
>> > We can subsample a lot, and only let some specific machines to show
>> > up for counting.
>>
>> The difficulty is not the counting. Requiring safe counting and
>> aggregation by the server is a requirement that no server or
>> intermediate server or proxy needs to follow, and would require
>> configuration or filtering control of a server that is outside of
>> client hands. It's not legally or technologically mandated. The great
>> use for the data is tracking hosts, metadata that is saleable and
>> likely to help provide a new form of tracking information.
>>
>> Writing this into the dnf behavior is typical, but it's not
>> beneficial to the clients. It's beneficial to the mirrors, who are
>> likely to sell the data. While it may be that infamous problem, a
>> "Simple Matter Of Programming(tm)" to sanitize the data, there are
>> strong motivations to collect it and sell it, and I'd expect various
>> mirrors to start doing so within moments of the activation of the
>> feature.
>
>
> 1. The mirrors do not see this.

If it's not available to the mirrors, then anyone who hardcodes a
mirror's URL into the local "baseurl" settings is not going to be
counted this way, and we're back at the "we don't know how many
clients there are" problem. If only the "mirrorlist" hosts see the
UUID, "countme" or any other identical client ID.

> 2. We aren't talking about UUIDs anymore and just a countme variable being 
> sent periodically. If a countme is going to be too much data to send, then 
> clients are probably already sending way too much data already.

Then can we change the title of the thread?

If the "countme" variable is unique and sent only to the host
providing the mirrorlist, it's tracking data. That host becomes
responsible for anonymization, and it is *too late* unless the data is
encrypted at the client, say with the GPG key of the relevant
repository, and that starts requiring GPG private keys on the host
providing the mirrorlist. If it's going across the wire, even with
SSL, man-in-the-middle is an old, old problem.

Whether or not the mirrorlist back-end software is promised to be
sanitized, it's tracking data. Sadly, I've been through this in other
venues. The data was considered "safe" because it was "anonymized". Except
that the original web traffic was tappable, along with IP addresses and
unique client information. A subpoena, a Patriot Act request, or even a
foreign worker with an H-1B visa reporting back to foreign
intelligence or a technology competitor could obtain a great deal of
trackable data.

Am I paranoid? Yes. Am I paranoid *enough*? I'm not so sure. We've
seen assembly of pseudonymous data and metadata throughout the history
of intelligence work. Demanding it, and handling it safely, is often
an exercise in people claiming "no one would do that!", "no one would
bother to investigate that", and people misusing it as a matter of
course. I'd suggest it's not even worth the effort to demand or to
collect with such concerns.

Nico Kadel-Garcia 


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-13 Thread Stephen John Smoogen
On Sat, 12 Jan 2019 at 22:25, Nico Kadel-Garcia  wrote:

> On Fri, Jan 11, 2019 at 4:37 PM Roberto Ragusa 
> wrote:
> >
> > On 1/8/19 4:22 PM, Lennart Poettering wrote:
> >
> > > If all you want to do is count, then it should be entirely sufficient
> > > to do it like this:
> > >
> > >GET /metalink?repo=fedora-28=x86_64==1 HTTP/1.1
> > >
> > > the first time within each one-week window and a simple
> > >
> > >GET /metalink?repo=fedora-28=x86_64= HTTP/1.1
> > >
> > > all other times.
> >
> > As an additional improvement, is it really needed to count every machine?
> > We can subsample a lot, and only let some specific machines to show
> > up for counting.
>
> The difficulty is not the counting. Requiring safe counting and
> aggregation by the server is a requirement that no server or
> intermediate server or proxy needs to follow, and would require
> configuration or filtering control of a server that is outside of
> client hands. It's not legally or technologically mandated. The great
> use for the data is tracking hosts, metadata that is saleable and
> likely to help provide a new form of tracking information.
>
> Writing this into the dnf behavior is typical, but it's not
> beneficial to the clients. It's beneficial to the mirrors, who are
> likely to sell the data. While it may be that infamous problem, a
> "Simple Matter Of Programming(tm)" to sanitize the data, there are
> strong motivations to collect it and sell it, and I'd expect various
> mirrors to start doing so within moments of the activation of the
> feature.
>

1. The mirrors do not see this.
2. We aren't talking about UUIDs anymore, just a countme variable being
sent periodically. If a countme is too much data to send, then clients are
probably already sending way too much data.





-- 
Stephen J Smoogen.


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-12 Thread Samuel Sieb

On 1/12/19 7:24 PM, Nico Kadel-Garcia wrote:

> Writing this into the dnf behavior is typical, but it's not
> beneficial to the clients. It's beneficial to the mirrors, who are
> likely to sell the data. While it may be that infamous problem, a
> "Simple Matter Of Programming(tm)" to sanitize the data, there are
> strong motivations to collect it and sell it, and I'd expect various
> mirrors to start doing so within moments of the activation of the
> feature.


Except that you've missed the point that's been made several times that 
the mirrors do not see this information ever.  It's only the mirror 
managers that would see it and those are not managed by the public.



Re: F30: System-Wide Change proposal: DNF UUID

2019-01-12 Thread Nico Kadel-Garcia
On Fri, Jan 11, 2019 at 4:37 PM Roberto Ragusa  wrote:
>
> On 1/8/19 4:22 PM, Lennart Poettering wrote:
>
> > If all you want to do is count, then it should be entirely sufficient
> > to do it like this:
> >
> >    GET /metalink?repo=fedora-28=x86_64==1 HTTP/1.1
> >
> > the first time within each one-week window and a simple
> >
> >GET /metalink?repo=fedora-28=x86_64= HTTP/1.1
> >
> > all other times.
>
> As an additional improvement, is it really needed to count every machine?
> We can subsample a lot, and only let some specific machines to show
> up for counting.

The difficulty is not the counting. Requiring safe counting and
aggregation by the server is a requirement that no server or
intermediate server or proxy needs to follow, and would require
configuration or filtering control of a server that is outside of
client hands. It's not legally or technologically mandated. The great
use for the data is tracking hosts, metadata that is saleable and
likely to help provide a new form of tracking information.

Writing this into the dnf behavior is typical, but it's not
beneficial to the clients. It's beneficial to the mirrors, who are
likely to sell the data. While it may be that infamous problem, a
"Simple Matter Of Programming(tm)" to sanitize the data, there are
strong motivations to collect it and sell it, and I'd expect various
mirrors to start doing so within moments of the activation of the
feature.


Re: Editions vs. Spins (was: Re: F30: System-Wide Change proposal: DNF UUID)

2019-01-12 Thread Kevin Kofler
Stephen John Smoogen wrote:
> Side note, I was at a loss of what you were getting at. There were several
> ways it could be interpreted and has been used by people in the past to
> mean different things.

I think John's statement was pretty clear: The artificial distinction 
between "Editions" and "Spins" needs to go away.

> The problem is that there is an inherent conflict of resources here. When
> we put everything on the download pages, everyone including the spin
> owners say it was too confusing.

What Spin owners have you asked? While I cannot officially speak for the KDE 
SIG, I am almost certain (as a former member and a current passive follower 
of the KDE SIG) that the KDE SIG has never said such a thing. To the best of 
my knowledge, the KDE SIG's proposals to improve the download page have 
always suggested listing ALL Spins, not just KDE/Plasma and 
Workstation/Desktop/GNOME. (My suggestion has always been to order them by 
decreasing download counts.)

> But choosing which things get put on a special page or not ends up getting
> the opposite "You're oppressing me" or "Oh its ok if you drop everyone but
> MY spin".  The opposite catch-22 is that the spin may only have 1-2 to
> handle issues but they aren't getting more or less people because they
> don't have more than 1-2 people on it. This leads to multiple spins only
> getting looked at in the beta where someone sees "oh it won't get in the
> next release.. ok I will see if I can get time to fix things"

So if that is your concern, the solution would be to define some minimum 
formal requirements for a Spin to be listed on the get.fp.o front page. But 
then those requirements should also apply to the 3 "Editions": if they don't 
fit the criteria, they should be kicked out as well. (I could see that 
possibly happening for Server or Atomic/Silverblue at some point. The 
Fedora user base is clearly desktop-centric. But I am NOT saying that they 
should necessarily be delisted, just that they should be held to the same 
maintenance standards as the Spins.)

That said, I am pretty sure that if the Spins were more prominently 
advertised, they would be more likely to attract helping hands. As it stands 
now, users not yet familiar with Fedora might not even realize that the 
Spins even exist.

> It is a complicated problem and doing the basic hand-waving of "it is
> because Fedora markets specifically GNOME they suck" just makes people
> pissed off and entrenched versus coming up with a workable solution.

I would propose this mockup (mix of HTML and ASCII art, sorry – each '#' 
sign stands for a nice colored icon, e.g., a notebook icon, an upstream 
desktop project logo, etc.):

# fedora Welcome to Fedora, a GNU/Linux distribution entirely composed 
of free and open source software, downloadable at no cost.

Fedora is an operating system that you can use, share, distribute, and 
modify as you like, all completely for free. More information.

What hardware (physical or virtual) do you want to install Fedora on?
                    |               |
# Desktop           |  # Server     |  # Container
# Notebook/Laptop   |  # VPS        |  # Docker
# Workstation       |  # Server VM  |  # Kubernetes
                    |               |

Desktop, Notebook/Laptop, Workstation

Fedora for the workstation: One operating system, many faces.

You can select between several different desktop workspace environments with 
different looks, feels, and user experiences, while always being able to use 
the full set of applications included with or shipped by third parties for 
Fedora. More information.

# GNOME – The default desktop environment in Fedora, recommended for new
  users. Download now (x86_64 ISO image)
# KDE Plasma – [description] Download now (x86_64 ISO image)
# Xfce – [description] Download now (x86_64 ISO image)
[all other Spins – the whole list should be ordered by decreasing download 
counts]

Fedora also offers convenient Labs for some niche use cases, to save you the 
trouble of manually installing your niche applications on one of the above 
Spins:
# Astronomy (based on: # KDE Plasma) – [description] Download now
   (x86_64 ISO image)
# Design Suite (based on: # GNOME) – [description] Download now
 (x86_64 ISO image)
[all other Labs – the whole list should be ordered by decreasing download 
counts]

Server, VPS, Server VM

Fedora for the server: […]

# Server – [description] Download now (x86_64 ISO image)

Container, Docker, Kubernetes

Fedora for containers: […]

# Silverblue – [description] Download now (x86_64 […] image)

[end mockup]

This mockup can easily be extended with more columns in the hardware table 
(and correspondingly linked page sections), e.g., a fourth column for ARM 
mobile devices.

Kevin Kofler

Re: F30: System-Wide Change proposal: DNF UUID

2019-01-12 Thread Stephen John Smoogen
On Sat, 12 Jan 2019 at 04:37, John Harris wrote:

> On Saturday, January 12, 2019 2:27:33 AM EST Adam Williamson wrote:
> > Just as a note, Workstation isn't a spin, it's a Fedora Edition:
> >
> > https://fedoraproject.org/wiki/Editions
> >
> > framing it as if it's "just another spin" is a bit off. Its prominence
> > is quite intentional and the whole Fedora.next / editions thing was
> > precisely about picking some specific 'flavors' of Fedora and giving
> > them prominence over the others.
>
> Really, the issue there is specifically that it isn't "just another spin",
> but I'm sure you knew that's what I was getting at.


Side note, I was at a loss of what you were getting at. There were several
ways it could be interpreted and has been used by people in the past to
mean different things.


> Fedora's aggressive marketing
> of specifically GNOME, while hiding other Spins, would be an interesting
> factor in review of metrics of spins.
>

The problem is that there is an inherent conflict of resources here. When
we put everything on the download pages, everyone including the spin owners
say it was too confusing. But choosing which things get put on a special
page or not ends up getting the opposite "You're oppressing me" or "Oh its
ok if you drop everyone but MY spin".  The opposite catch-22 is that the
spin may only have 1-2 to handle issues but they aren't getting more or
less people because they don't have more than 1-2 people on it. This leads
to multiple spins only getting looked at in the beta where someone sees "oh
it won't get in the next release.. ok I will see if I can get time to fix
things"

It is a complicated problem and doing the basic hand-waving of "it is
because Fedora markets specifically GNOME they suck" just makes people
pissed off and entrenched versus coming up with a workable solution.

-- 
Stephen J Smoogen.


Editions vs. Spins (was: Re: F30: System-Wide Change proposal: DNF UUID)

2019-01-12 Thread Kevin Kofler
John Harris wrote:
> Really, the issue there is specifically that it isn't "just another spin",

+1

This pointless artificial distinction between "Editions" and "Spins" needs 
to stop (because there is no technical difference whatsoever between the 2 
concepts), as does the unfair advertising ("Editions" as shiny logos above 
the scrolling horizon and with one-click links directly to the ISO vs. 
"Spins" hidden beyond the scrolling horizon, grayed out, with no names and 
no description, and requiring at least 2 clicks to get them). But the people 
in power still refuse to do anything about it.

The grayed out logos are particularly outrageous because they are doing to 
the upstream logos exactly the kind of things explicitly forbidden in the 
Fedora logo guidelines (changing the colors and even reducing them to 2). 
(It so happens that the new Fedora logo will likely allow this kind of 
usage, but have you ever asked the upstreams whether THEY are OK with those 
unilateral changes to their logos?) I really don't see why, whereas all 
other icons on get.fp.o are colored, the ones for the Spins (and ONLY those) 
have to be grayed out. Yet https://pagure.io/design/issue/411 was closed as 
"fixed" without any actual fix having been deployed, ever.

I also find it funny that the argument for the one-click direct ISO download 
for GNOME "Workstation" (or formerly "Desktop") has always been that choices 
confuse users. But now there is a "Workstation"/"Server"/"Atomic" choice. 
While "Workstation" vs. "Server" is something that makes sense to most 
users, "Atomic" is definitely not (and the description full of technical 
jargon such as "Docker" and "Kubernetes" won't help either). Yet, Fedora 
still refuses to show the full list of choices there and shows only those 3 
arbitrarily picked ones.

Kevin Kofler


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-12 Thread John Harris
On Saturday, January 12, 2019 2:27:33 AM EST Adam Williamson wrote:
> Just as a note, Workstation isn't a spin, it's a Fedora Edition:
> 
> https://fedoraproject.org/wiki/Editions
> 
> framing it as if it's "just another spin" is a bit off. Its prominence
> is quite intentional and the whole Fedora.next / editions thing was
> precisely about picking some specific 'flavors' of Fedora and giving
> them prominence over the others.

Really, the issue there is specifically that it isn't "just another spin", but 
I'm sure you knew that's what I was getting at. Fedora's aggressive marketing 
of specifically GNOME, while hiding other Spins, would be an interesting 
factor in review of metrics of spins.

-- 
John M. Harris, Jr. 
Splentity
https://splentity.com/



Re: F30: System-Wide Change proposal: DNF UUID

2019-01-11 Thread Adam Williamson
On Fri, 2019-01-11 at 18:48 -0500, John Harris wrote:
> On Friday, January 11, 2019 4:36:54 PM EST Roberto Ragusa wrote:
> > That is, apply the logic above only if(hash(machine_id)%1000==0)
> > (this becomes a poll instead of a referendum, results must then be
> > multiplied by 1000)
> 
> If this is done, the likelihood of invalid data for the given Spin is pretty 
> high. For example, Workstation could show as being more popular than all of 
> the other spins combined, just because it's more popular than any given spin 
> (likely because it's advertised prominently, while other spins are hidden 
> behind a link at the middle of the download page).

Just as a note, Workstation isn't a spin, it's a Fedora Edition:

https://fedoraproject.org/wiki/Editions

framing it as if it's "just another spin" is a bit off. Its prominence
is quite intentional and the whole Fedora.next / editions thing was
precisely about picking some specific 'flavors' of Fedora and giving
them prominence over the others.
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-11 Thread John Harris
On Friday, January 11, 2019 4:36:54 PM EST Roberto Ragusa wrote:
> That is, apply the logic above only if(hash(machine_id)%1000==0)
> (this becomes a poll instead of a referendum, results must then be
> multiplied by 1000)

If this is done, the likelihood of invalid data for the given Spin is pretty 
high. For example, Workstation could show as being more popular than all of 
the other spins combined, just because it's more popular than any given spin 
(likely because it's advertised prominently, while other spins are hidden 
behind a link at the middle of the download page).
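
One way to put rough numbers on this concern, as a sketch only: if each machine is selected independently with probability 1/1000 (an idealisation of the hash rule quoted above), the scaled-up count is unbiased but noisy, and the noise grows as the true install base shrinks. The function name and the install counts below are made up for illustration and are not measured data.

```python
import math


def relative_error(true_installs: int, sample_rate: float = 1 / 1000) -> float:
    """Relative standard deviation of the estimate (reports / sample_rate).

    Assumes every machine reports independently with probability
    sample_rate, so reports ~ Binomial(true_installs, sample_rate).
    """
    expected_reports = true_installs * sample_rate
    return math.sqrt((1 - sample_rate) / expected_reports)


# Hypothetical install counts, chosen only to show the effect of scale.
for name, installs in [("large edition", 1_000_000), ("small spin", 10_000)]:
    print(f"{name}: about {relative_error(installs):.0%} relative error")
```

Under these assumptions the large install base is estimated to within about 3%, while the small one is off by about 32% in a typical week, so comparisons between small Spins would be the least reliable.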

-- 
John M. Harris, Jr. 
Splentity
https://splentity.com/



Re: F30: System-Wide Change proposal: DNF UUID

2019-01-11 Thread Roberto Ragusa
On 1/8/19 4:22 PM, Lennart Poettering wrote:

> If all you want to do is count, then it should be entirely sufficient
> to do it like this:
> 
>GET /metalink?repo=fedora-28=x86_64==1 HTTP/1.1
> 
> the first time within each one-week window and a simple
> 
>GET /metalink?repo=fedora-28=x86_64= HTTP/1.1
> 
> all other times.

As an additional improvement, is it really needed to count every machine?
We can subsample a lot, and only let some specific machines to show
up for counting.

That is, apply the logic above only if(hash(machine_id)%1000==0)
(this becomes a poll instead of a referendum, results must then be multiplied 
by 1000)

Or, to avoid having somebody constantly be counted and others constantly ignored,
the rule could be if(hash(machine_id)%1000==hash(weekofthecentury)%1000)

With this setup I know that 99.9% of the weeks I'm not reporting anything at 
all.

Of course 1000 is a constant that may be tuned, but looks a good choice
to me if the expected total number is on the order of 1 million.
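
A minimal sketch of the two rules above, under some assumptions: SHA-256 stands in for hash() (Python's built-in hash() of a string changes between runs, so it would not give a stable decision), and week_of_century() is a hypothetical helper that counts whole weeks since Monday 2001-01-01. None of these names come from dnf.

```python
import hashlib
from datetime import date

SAMPLE_RATE = 1000  # the tunable constant discussed above


def stable_hash(text: str) -> int:
    """Deterministic integer hash (assumption: SHA-256, truncated to 64 bits)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def week_of_century(today: date) -> int:
    """Hypothetical helper: whole weeks elapsed since Monday 2001-01-01."""
    return (today - date(2001, 1, 1)).days // 7


def should_report_fixed(machine_id: str) -> bool:
    """First rule: the same ~0.1% of machines report every week."""
    return stable_hash(machine_id) % SAMPLE_RATE == 0


def should_report_rotating(machine_id: str, today: date) -> bool:
    """Second rule: the selected residue class changes from week to week."""
    week_residue = stable_hash(str(week_of_century(today))) % SAMPLE_RATE
    return stable_hash(machine_id) % SAMPLE_RATE == week_residue


def estimate_total(reports: int) -> int:
    """Scale the number of reports back up to an estimated population."""
    return reports * SAMPLE_RATE
```

With the rotating rule a given machine still reports in roughly 0.1% of weeks on average, but which weeks those are depends on the hash of the week number rather than on a fixed property of the machine.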

Regards.

-- 
   Roberto Ragusa    mail at robertoragusa.it


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-10 Thread Richard W.M. Jones
On Mon, Jan 07, 2019 at 01:29:42PM -0500, Matthew Miller wrote:
> On Mon, Jan 07, 2019 at 12:30:53PM -0500, John Harris wrote:
> > > The Fedora community cares about privacy and is adverse to tracking
> > > measures. We don't want to track; just count.
> > If this is ever implemented, we should probably notify end users and
> > provide an easy way to disable this. If you pass an identifier, that
> > enables client tracking.
> 
> I agree -- it'll go in the release notes, docs, and probably also dnf
> configuration files as a comment.

It might need to be opt-in to comply with GDPR, although of
course IANAL etc.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-10 Thread Miroslav Lichvar
On Tue, Jan 08, 2019 at 09:43:01AM +0100, Lennart Poettering wrote:
> Moreover, afair we install and enable NTP clients by default on all
> our installations, no? just like pretty much any other OS these days
> does... counting by NTP mostly just means switching from NTP pool
> servers to fedora's own servers.

I think it would be difficult/expensive to provide the same quality of
service as the pool with thousands of servers distributed around the
globe.

Switching completely would probably be a bad idea. A better approach
would be to configure the clients to use a mix of the pool servers and
our servers. I think that's what Ubuntu does.

> > 3. Logging NTP does not cover the problem the UUID is trying to help
> > solve.. there are two places where we undercount and overcount
> > systems.
> >  a. systems behind nat firewalls all show up as 1 ip address. ntp or
> > yum or gnome-hotspot ask multiple times during a day.. but not a set
> > number. Just looking at my 3 home systems I see around 1 to 80
> > connections depending on what i have done that day.
> 
> The amount of traffic within a time window is linear to the number of
> hosts behind that IP address. It's relatively easy to estimate that
> there are 5 clients behind an IP address if you get 5 NTP request
> datagrams within one protocol iteration instead of just one...

That would work if the "tracking" NTP server was configured with a
fixed polling interval and disabled bursts, and the systems were always
running. In our default configuration we use a variable polling
interval and bursts. Tracking individual clients behind one IP address
is possible if their number is not very large, but it's a bit more
complicated (it depends also on the client's implementation), and it
can count only systems that are running at the same time.
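
As a rough illustration of the idealised case only (fixed polling interval, no bursts, all systems running for the whole window), the estimate reduces to a division. The function name and the numbers are made up for this sketch; they do not describe how any NTP server actually counts clients.

```python
def estimate_clients(request_count: int, window_seconds: float,
                     poll_interval_seconds: float) -> int:
    """Estimate the number of clients behind one IP address.

    Assumes every client polls at exactly poll_interval_seconds, sends no
    bursts, and is running for the whole observation window.
    """
    requests_per_client = window_seconds / poll_interval_seconds
    return round(request_count / requests_per_client)


# Hypothetical numbers: 80 requests from one address in 1024 s, 64 s polling.
print(estimate_clients(80, 1024, 64))
```

With a variable polling interval or bursts, requests_per_client is no longer a known constant, which is why the simple division stops working in the default configuration.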

> > 4. NTP is a high security problem when you concentrate it to a set of
> > servers. These become servers that everyone wants to hack even more
> > than build systems. These problems range from DDOS to active hacks.
> 
> Uh, well, the major NTP servers tend to be pretty well tested and
> fuzzed these days, and they can be sandboxed efficiently, since they
> involve no big stack but only trivial SOCK_DGRAM traffic. I see no
> reason whatsoever for them to be less secure than a hand-written HTTP
> service that only Fedora runs and doesn't get all the validation love
> the NTP servers get...

The problem is DoS attacks. If the number of servers was small, it'd
be easy (cheap) to take them all out. The pool has thousands of
servers. The weak point is rather in their monitoring.

-- 
Miroslav Lichvar


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-09 Thread Matthew Miller
On Wed, Jan 09, 2019 at 01:38:07PM +0100, Tomasz Torcz wrote:
>   Nb. “UUID” sounds terribly technical. Can we use some term which
> is already known and understood by users, e.g. Advertising ID?

Well, it very much is not an "advertising ID", so not that.

But I think we're going to explore the non-uuid "countme" flag option
instead, which makes that irrelevant.



-- 
Matthew Miller

Fedora Project Leader


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-09 Thread Robert Marcano

On 1/9/19 4:45 AM, Nicolas Mailhot wrote:
> On 2019-01-08 18:13, Robert Marcano wrote:
>> On 1/7/19 2:28 PM, Matthew Miller wrote:
>>> On Mon, Jan 07, 2019 at 06:24:14PM +0100, Lennart Poettering wrote:
>>>>> * The Fedora community cares about privacy and is adverse to tracking
>>>>> measures. We don't want to track; just count.
>>>>
>>>> Uh, so what's the story there? i mean, if you pass over the uuid you
>>>> make clients trackable, regardless if you want to make use of that or
>>>> not...
>>>
>>> Not if we don't keep them for long. One idea is to rotate them fairly
>>> frequently. But this is mostly a statement of intent and might be more
>>> about how we build the backend than about what we force in the client.
>>
>> If the client generate a new UUID every month (for example), or use
>> the current month in the UUID generation algorithm, There is no need
>> for the users to trust that the server is removing the logs is true.
>
> Of course there is. It's rather trivial to correlate the previous UUID
> to the new one when you also have access to the corresponding IP addresses.

Then implement some kind of ping service that send that frequently 
changed UUID over an anonymizing network, maybe Tor.


It doesn't have to run all time, it could be a monthly timer that start 
a small instance of a Tor client and send the ping with the monthly 
UUID. I am elucubrating here, but this could be refined.


Now, how to avoid fake pings? the same can occur with fake updates 
requests used for approximating current installation counts.
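
For illustration, the monthly identifier mentioned in the quoted text could be derived on the client roughly like this. This is a sketch under assumptions: local_secret is a hypothetical random value that never leaves the machine, and HMAC-SHA256 is just one possible derivation; nothing here is taken from dnf or from the Change proposal.

```python
import hashlib
import hmac
import uuid
from datetime import date


def monthly_id(local_secret: bytes, today: date) -> uuid.UUID:
    """Derive an identifier that changes every calendar month.

    The month label is mixed with a machine-local secret, so identifiers
    from different months cannot be linked from the values alone.
    """
    month_label = today.strftime("%Y-%m").encode("ascii")
    digest = hmac.new(local_secret, month_label, hashlib.sha256).digest()
    # version=4 only sets the version/variant bits so the value is well-formed.
    return uuid.UUID(bytes=digest[:16], version=4)


secret = b"example-secret-kept-on-this-machine"  # placeholder value
print(monthly_id(secret, date(2019, 1, 9)))
print(monthly_id(secret, date(2019, 2, 9)))
```

This only removes the link between the identifiers themselves. It does nothing about the IP address correlation raised above, which is why the ping would still need to travel over an anonymizing network.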




> You need to be serious about data collection and approach it with a 
> security mindset “how could I hijack the system and betray users trust” 
> not “of course my data users are good they will never try anything evil 
> I can collect everything I get my hands on and think later” (the kind of 
> credulous US thinking that gave us Cambridge Analytica).
>
> That’s what the GDPR is about. It’s *your* responsibility as data 
> collector to think about how data could be used, it’s *your* problem to 
> protect it, it’s *your* problem if it’s misused, you can not make it 
> available on a platter for others to do evil things with and claim it’s 
> those people’s problem.





Re: F30: System-Wide Change proposal: DNF UUID

2019-01-09 Thread Florian Weimer
* Peter Robinson:

>> > Not if we don't keep them for long. One idea is to rotate them fairly
>> > frequently. But this is mostly a statement of intent and might be more
>> > about how we build the backend than about what we force in the client.
>>
>> My understanding is that the Fedora project does not control how much
>> network logging Red Hat does on its behalf, so rotating the UUID might
>> well not bring back the old anonymity.
>
> The mirror manager bits run all over the place, not just in Red Hat
> hosted locations and all the mirror manager bits run over https where
> Fedora infra controls the server end points so from that perspective
> it's mostly irrelevant

Doesn't this make it worse?

Thanks,
Florian


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-09 Thread John Harris
On Wednesday, January 9, 2019 3:45:16 AM EST Nicolas Mailhot wrote:
> On 2019-01-08 18:13, Robert Marcano wrote:
> 
> > On 1/7/19 2:28 PM, Matthew Miller wrote:
> > 
> >> On Mon, Jan 07, 2019 at 06:24:14PM +0100, Lennart Poettering wrote:
> >> 
> >>>> * The Fedora community cares about privacy and is adverse to
> >>>> tracking measures. We don't want to track; just count.
> >>> 
> >>> Uh, so what's the story there? i mean, if you pass over the uuid you
> >>> make clients trackable, regardless if you want to make use of that or
> >>> not...
> >> 
> >> 
> >> Not if we don't keep them for long. One idea is to rotate them fairly
> >> frequently. But this is mostly a statement of intent and might be more
> >> about how we build the backend than about what we force in the client.
> > 
> > 
> > If the client generate a new UUID every month (for example), or use
> > the current month in the UUID generation algorithm, There is no need
> > for the users to trust that the server is removing the logs is true.
> 
> 
> Of course there is. It's rather trivial to correlate the previous UUID 
> to the new one when you also have access to the corresponding IP 
> addresses.
> 
> You need to be serious about data collection and approach it with a 
> security mindset “how could I hijack the system and betray users trust” 
> not “of course my data users are good they will never try anything evil 
> I can collect everything I get my hands on and think later” (the kind of 
> credulous US thinking that gave us Cambridge Analytica).
> 
> That’s what the GDPR is about. It’s *your* responsibility as data 
> collector to think about how data could be used, it’s *your* problem to 
> protect it, it’s *your* problem if it’s misused, you can not make it 
> available on a platter for others to do evil things with and claim it’s 
> those people’s problem.
> 
> -- 
> Nicolas Mailhot

One other major issue I personally have with this is that using a UUID would 
make it very easy for anyone with the data to find what IP addresses a 
particular user frequents. This would allow for some pretty horrific physical 
location tracking.
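
To make the concern concrete, here is a small sketch of what anyone holding (identifier, IP address) log pairs could compute. The records are invented and use documentation-only IP ranges; the function names are hypothetical.

```python
from collections import defaultdict


def ips_seen_per_id(records):
    """Map each identifier to the distinct IP addresses it was seen from."""
    seen = defaultdict(list)
    for identifier, ip in records:
        if ip not in seen[identifier]:
            seen[identifier].append(ip)
    return dict(seen)


def link_rotated_ids(old_records, new_records):
    """Pair old and new identifiers that were both seen from the same IP."""
    old_by_ip = defaultdict(set)
    for identifier, ip in old_records:
        old_by_ip[ip].add(identifier)
    links = set()
    for identifier, ip in new_records:
        for old_identifier in old_by_ip.get(ip, ()):
            links.add((old_identifier, identifier))
    return links


# Invented log entries; the addresses come from ranges reserved for examples.
january = [("uuid-A", "192.0.2.10"), ("uuid-A", "198.51.100.7"),
           ("uuid-B", "203.0.113.5")]
february = [("uuid-C", "192.0.2.10"), ("uuid-D", "203.0.113.5")]

print(ips_seen_per_id(january))             # every network uuid-A used
print(link_rotated_ids(january, february))  # rotation undone via shared IPs
```

The first function is the location-tracking case; the second shows why rotating the identifier does not help much when the IP addresses are logged alongside it. Shared addresses (NAT, VPNs) make the links ambiguous, but they do not remove them.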

-- 
John M. Harris, Jr. 
Splentity
https://splentity.com/



Re: F30: System-Wide Change proposal: DNF UUID

2019-01-09 Thread Tomasz Torcz
On Tue, Jan 08, 2019 at 08:38:01PM +0100, Benjamin Berg wrote:
> > We can certainly implement a setup that does not collect or store the
> > UUID together with the IP address or timestamp. Send the UUID as a
> > HTTP header, don't log it, send the UUID off to a counting service
> > (*). If we make sure the UUID is protected in transit, sent only to
> > our own servers (or servers configured by the user), and not collected
> > or stored in a personally identifiable way, I suspect that we're
> > meeting our obligations under the GDPR, though we'd need to
> > double-check any selected solution carefully.
> 
> You are right that it is possible to immediately discard or obfuscate
> the information.
> 
> But, as Nicolas pointed out, the argument here is that the UUID itself
> likely needs to be considered "personal data" in the GDPR sense. And
> even doing something as minimal as that seems to imply "processing"[1]
> the data in the GDPR sense.

  Nb. “UUID” sounds terribly technical. Can we use some term which
is already known and understood by users, e.g. Advertising ID?
-- 
Tomasz Torcz   72->|   80->|
xmpp: zdzich...@chrome.pl  72->|   80->|


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-09 Thread Nicolas Mailhot

On 2019-01-08 18:13, Robert Marcano wrote:
> On 1/7/19 2:28 PM, Matthew Miller wrote:
>> On Mon, Jan 07, 2019 at 06:24:14PM +0100, Lennart Poettering wrote:
>>>> * The Fedora community cares about privacy and is adverse to tracking
>>>> measures. We don't want to track; just count.
>>>
>>> Uh, so what's the story there? i mean, if you pass over the uuid you
>>> make clients trackable, regardless if you want to make use of that or
>>> not...
>>
>> Not if we don't keep them for long. One idea is to rotate them fairly
>> frequently. But this is mostly a statement of intent and might be more
>> about how we build the backend than about what we force in the client.
>
> If the client generate a new UUID every month (for example), or use
> the current month in the UUID generation algorithm, There is no need
> for the users to trust that the server is removing the logs is true.

Of course there is. It's rather trivial to correlate the previous UUID 
to the new one when you also have access to the corresponding IP 
addresses.

You need to be serious about data collection and approach it with a 
security mindset “how could I hijack the system and betray users trust” 
not “of course my data users are good they will never try anything evil 
I can collect everything I get my hands on and think later” (the kind of 
credulous US thinking that gave us Cambridge Analytica).

That’s what the GDPR is about. It’s *your* responsibility as data 
collector to think about how data could be used, it’s *your* problem to 
protect it, it’s *your* problem if it’s misused, you can not make it 
available on a platter for others to do evil things with and claim it’s 
those people’s problem.


--
Nicolas Mailhot


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Chris Murphy
On Tue, Jan 8, 2019 at 8:50 AM Matthew Miller  wrote:
>
> On Tue, Jan 08, 2019 at 04:22:39PM +0100, Lennart Poettering wrote:
> > > The additional information could be
> > > 10.5.124.209 - - [31/Dec/2018:09:07:21 +] "GET
> > > /metalink?repo=fedora-28=x86_64==
> > > HTTP/1.1" 200 62200 "-" "dnf/2.7.5"
> > If all you want to do is count, then it should be entirely sufficient
> > to do it like this:
> >GET /metalink?repo=fedora-28=x86_64==1 
> > HTTP/1.1
> > the first time within each one-week window and a simple
> >GET /metalink?repo=fedora-28=x86_64= HTTP/1.1
> > all other times.
> > Then, sum up how many "countme=1" GET requests we get per week, and
> > you have a good count, without tracking individual clients, without
> > inventing new uuids¹.
>
> I do like this idea!
>
> And, if there's not an associated UUID, it's more comfortable to do
> "countme=2" the second week and onward -- this would make it easy to
> distinguish systems which are short-lived. (Or "countme=new" and
> "countme=ongoing" or something?)
>
> H. How comfortable would people be with reporting an incrementing count
> *every* week (again, without a UUID attached)? That'd give a new axis into
> the data which I can imagine being quite useful.

I would opt in, and would not be bothered if it were opt out.

-- 
Chris Murphy


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Benjamin Berg
On Tue, 2019-01-08 at 09:59 -0500, Owen Taylor wrote:
> On Tue, Jan 8, 2019 at 7:17 AM Benjamin Berg  wrote:
> > On Tue, 2019-01-08 at 12:33 +0100, Miroslav Suchý wrote:
> > > On 08. 01. 19 at 11:35, Nicolas Mailhot wrote:
> > > > *which* *do* *not* *permit* *or* *no* *longer* *permit* *the*
> > > > *identification* *of* *data* *subjects*
> > > 
> > > How do you identify data subject solely on UUID?
> > 
> > You also inherently collect information such as the IP and the
> > timestamp of the request which in principle permits identification. You
> > could for example collect the IP from Fedora account logins and one of
> > these pings. This way you can de-anonymise the data collected for the
> > UUID.
> 
> We can certainly implement a setup that does not collect or store the
> UUID together with the IP address or timestamp. Send the UUID as a
> HTTP header, don't log it, send the UUID off to a counting service
> (*). If we make sure the UUID is protected in transit, sent only to
> our own servers (or servers configured by the user), and not collected
> or stored in a personally identifiable way, I suspect that we're
> meeting our obligations under the GDPR, though we'd need to
> double-check any selected solution carefully.

You are right that it is possible to immediately discard or obfuscate
the information.

But, as Nicolas pointed out, the argument here is that the UUID itself
likely needs to be considered "personal data" in the GDPR sense. And
even doing something as minimal as that seems to imply "processing"[1]
the data in the GDPR sense.

Benjamin

[1] The definition of "processing" reads:
"""
‘processing’ means any operation or set of operations which is
performed on personal data or on sets of personal data, whether or not
by automated means, such as collection, recording, organisation,
structuring, storage, adaptation or alteration, retrieval,
consultation, use, disclosure by transmission, dissemination or
otherwise making available, alignment or combination, restriction,
erasure or destruction;
"""


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread James Cassell
On Tue, Jan 8, 2019, at 11:15 AM, Stephen Gallagher wrote:
> On Tue, Jan 8, 2019 at 10:50 AM Matthew Miller  
> wrote:
> >
> > On Tue, Jan 08, 2019 at 04:22:39PM +0100, Lennart Poettering wrote:
> > > > The additional information could be
> > > > 10.5.124.209 - - [31/Dec/2018:09:07:21 +] "GET
> > > > /metalink?repo=fedora-28=x86_64==
> > > > HTTP/1.1" 200 62200 "-" "dnf/2.7.5"
> > > If all you want to do is count, then it should be entirely sufficient
> > > to do it like this:
> > >GET /metalink?repo=fedora-28=x86_64==1 
> > > HTTP/1.1
> > > the first time within each one-week window and a simple
> > >GET /metalink?repo=fedora-28=x86_64= HTTP/1.1
> > > all other times.
> > > Then, sum up how many "countme=1" GET requests we get per week, and
> > > you have a good count, without tracking individual clients, without
> > > inventing new uuids¹.
> >
> > I do like this idea!
> >
> > And, if there's not an associated UUID, it's more comfortable to do
> > "countme=2" the second week and onward -- this would make it easy to
> > distinguish systems which are short-lived. (Or "countme=new" and
> > "countme=ongoing" or something?)
> >
> > H. How comfortable would people be with reporting an incrementing count
> > *every* week (again, without a UUID attached)? That'd give a new axis into
> > the data which I can imagine being quite useful.
> >
> 
> 
> I like this idea and I think it's generally less likely to set off
> alarm bells about privacy. I do think we probably want to avoid an
> *incrementing* count, though to avoid questions around using
> time-of-install as a vector into identifying the owner. So the
> "new-vs-ongoing" differentiator seems reasonable to me. I *would*
> suggest that we probably want to have it send "countme=new" every time
> it tries to reach the mirrorlink until the first time it gets a proper
> response. After that, sending "countme=ongoing" once a week would be
> good additional information.

I'd propose countme=new the first time, then countme=thirty, or whatever
version of Fedora was originally installed on this machine, so you could also
track upgrades over time vs. new installs.

V/r,
James Cassell


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Robert Marcano

On 1/7/19 2:28 PM, Matthew Miller wrote:
> On Mon, Jan 07, 2019 at 06:24:14PM +0100, Lennart Poettering wrote:
> > > * The Fedora community cares about privacy and is averse to tracking
> > > measures. We don't want to track; just count.
> >
> > Uh, so what's the story there? i mean, if you pass over the uuid you
> > make clients trackable, regardless if you want to make use of that or
> > not...
>
> Not if we don't keep them for long. One idea is to rotate them fairly
> frequently. But this is mostly a statement of intent and might be more about
> how we build the backend than about what we force in the client.

If the client generates a new UUID every month (for example), or uses the
current month in the UUID generation algorithm, users no longer need to
trust that the server really is removing the logs. You can still get an
approximation of how many active users Fedora has, though not in real time
and with some inaccuracies at the start of each period (months, in this
example).
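
As a rough sketch of "use the current month in the UUID generation algorithm": the client could keep a random secret that never leaves the machine and derive a fresh ID from it each month. The function name, the secret handling and the choice of HMAC-SHA256 are my assumptions for illustration, not part of the proposal.

```python
import hashlib
import hmac
import uuid
from datetime import date


def monthly_id(local_secret: bytes, today: date) -> uuid.UUID:
    """Derive an ID that changes every calendar month.

    local_secret is assumed to be generated once on the client (for
    example with os.urandom) and never sent anywhere.
    """
    period = today.strftime("%Y-%m").encode("ascii")
    digest = hmac.new(local_secret, period, hashlib.sha256).digest()
    # Truncate the 32-byte digest to the 16 bytes a UUID holds.
    return uuid.UUID(bytes=digest[:16])


secret = b"example-secret-kept-on-the-client"
print(monthly_id(secret, date(2019, 1, 8)))
print(monthly_id(secret, date(2019, 2, 1)))
```

Two requests in the same month carry the same ID, while IDs from different months cannot be linked through the ID alone.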







> > > * For this reason, we don’t want to use any identifier like
> > > /etc/machine-id which may be used for other purposes.
> >
> > For purposes like this we have "application-specific machine
> > IDs". This is exposed in the sd_id128_get_machine_app_specific() API:
> > https://www.freedesktop.org/software/systemd/man/sd_id128_get_machine.html
> >
> > [...]
> >
> > It appears to me that this concept is what you might want to use
> > here. You could either use our C API for that, but you can easily
> > reimplement it in a fully compatible way in any programming language
> > you like without using our C API too, after all HMAC-SHA256 is pretty
> > commonly available and not fancy in any way.
>
> Thanks, that makes sense.
>
> > BTW, afaik Ubuntu counts installations through NTP: they provide their
> >
> > [..]
> >
> > Of course, doing it that way would mean fedora would have to host NTP
> > servers...
>
> Hmmm. We have fedora.pool.ntp.org, in fact. I'm not sure who actually runs
> that!
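
The application-specific machine ID idea quoted above can be sketched in a few lines. This follows my understanding of the systemd documentation (HMAC-SHA256 keyed with the machine ID, computed over the application ID, truncated to 128 bits and marked as a v4 UUID), but I have not checked it for bit-compatibility with sd_id128_get_machine_app_specific(), so treat it as an illustration only. The two hex IDs below are made-up sample values.

```python
import hashlib
import hmac
import uuid


def app_specific_id(machine_id_hex: str, app_id_hex: str) -> uuid.UUID:
    """Derive a per-application ID from a machine ID.

    Assumption: key = machine ID, message = application ID. The machine
    ID cannot be recovered from the result without inverting the HMAC.
    """
    machine_id = bytes.fromhex(machine_id_hex)
    app_id = bytes.fromhex(app_id_hex)
    digest = hmac.new(machine_id, app_id, hashlib.sha256).digest()
    raw = bytearray(digest[:16])
    # Set the UUID version (4) and RFC 4122 variant bits.
    raw[6] = (raw[6] & 0x0F) | 0x40
    raw[8] = (raw[8] & 0x3F) | 0x80
    return uuid.UUID(bytes=bytes(raw))


machine = "0123456789abcdef0123456789abcdef"   # sample value
dnf_app = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"   # hypothetical app ID
other_app = "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb"  # hypothetical app ID
print(app_specific_id(machine, dnf_app))
print(app_specific_id(machine, other_app))
```

Each application gets a stable ID of its own, and two applications cannot compare notes to discover they run on the same machine.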




Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Stephen Gallagher
On Tue, Jan 8, 2019 at 11:42 AM Benjamin Berg  wrote:
>
> Hi,
>
> On Tue, 2019-01-08 at 10:49 -0500, Matthew Miller wrote:
> > On Tue, Jan 08, 2019 at 04:22:39PM +0100, Lennart Poettering wrote:
> > > > The additional information could be
> > > > 10.5.124.209 - - [31/Dec/2018:09:07:21 +] "GET
> > > > /metalink?repo=fedora-28=x86_64==
> > > > HTTP/1.1" 200 62200 "-" "dnf/2.7.5"
> > > If all you want to do is count, then it should be entirely
> > > sufficient
> > > to do it like this:
> > >GET /metalink?repo=fedora-
> > > 28=x86_64==1 HTTP/1.1
> > > the first time within each one-week window and a simple
> > >GET /metalink?repo=fedora-28=x86_64= HTTP/1.1
> > > all other times.
> > > Then, sum up how many "countme=1" GET requests we get per week, and
> > > you have a good count, without tracking individual clients, without
> > > inventing new uuids¹.
> >
> > I do like this idea!
> >
> > And, if there's not an associated UUID, it's more comfortable to do
> > "countme=2" the second week and onward -- this would make it easy to
> > distinguish systems which are short-lived. (Or "countme=new" and
> > "countme=ongoing" or something?)
>
> Wouldn't it be easiest to only send the ping for machines that exist
> longer than a week? All that is needed would be to suppress the ping
> the first time while still storing the timestamp after which the next
> ping should happen.

Knowing which machines *don't* last more than a week is still
valuable information. It helps us learn whether Fedora is being used
for quick testing environments, short-lived VMs in the cloud, etc.


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Benjamin Berg
Hi,

On Tue, 2019-01-08 at 10:49 -0500, Matthew Miller wrote:
> On Tue, Jan 08, 2019 at 04:22:39PM +0100, Lennart Poettering wrote:
> > > The additional information could be
> > > 10.5.124.209 - - [31/Dec/2018:09:07:21 +] "GET
> > > /metalink?repo=fedora-28=x86_64==
> > > HTTP/1.1" 200 62200 "-" "dnf/2.7.5"
> > If all you want to do is count, then it should be entirely
> > sufficient
> > to do it like this:
> >GET /metalink?repo=fedora-
> > 28=x86_64==1 HTTP/1.1
> > the first time within each one-week window and a simple
> >GET /metalink?repo=fedora-28=x86_64= HTTP/1.1
> > all other times.
> > Then, sum up how many "countme=1" GET requests we get per week, and
> > you have a good count, without tracking individual clients, without
> > inventing new uuids¹.
> 
> I do like this idea!
> 
> And, if there's not an associated UUID, it's more comfortable to do
> "countme=2" the second week and onward -- this would make it easy to
> distinguish systems which are short-lived. (Or "countme=new" and
> "countme=ongoing" or something?)

Wouldn't it be easiest to only send the ping for machines that exist
longer than a week? All that is needed would be to suppress the ping
the first time while still storing the timestamp after which the next
ping should happen.

Benjamin


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Stephen John Smoogen
On Tue, 8 Jan 2019 at 00:30, Christopher Tubbs 
wrote:

> A few concerns/comments (inline):
>
> > === The problem ===
> >
> > * A. Currently, we can only count Fedora OS use by observing IP
> > addresses. This is subject to undercounting due to NAT — and to
> > overcounting due to short DHCP leases and laptops moving between work
> > or school and home or coffee shop.
>
> "Counts are estimates" is not necessarily a problem. Please explain why
> this is a problem. Also, why not use statistical modeling to try to improve
> the estimates based on these known behaviors?
>
>
In the past when I looked at this, it was always a problem of choosing
which model best fit your data. You can come up with all kinds of models to
prove whatever you want, but you need some sort of 'accurate' count at some
point to test those models against. There is also the fact that different
installations fit different 'models'. The IoT systems will have a
different representational set than the laptop versus the livecd versus...
because how they are installed and how they look on the network are
different. Using the same model for all of them seemed questionable.

Currently the statistics are done off of the http logs from the proxies
which just see a basic set of information. Due to the fact that the
proxy/cache boxes are remote we wait for the rsync to take N days to
complete, then merge all the logs and then do a simple processing on an 8
year old 24 GB server.

The data in this merged log file is noisy because dnf/yum tries to be as
resilient as possible. A single 'dnf update' or 'yum update' may show up as
multiple requests for the same data on different proxies because something
didn't look right, or it might show up just once. However, I don't know
whether I have 10 systems behind a firewall or just 1. At the moment I assume
that I have 1, by saving the tuple (date,ip=x,arch=x,rel=y) only once per day.

Counting the number of times that tuple occurred instead was very, very
noisy: specific IP addresses that I knew had N systems behind them would show
up as either N hundred systems or none, depending on the vagaries of the
internet and whatever the systems decided to do that day.
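
For readers who want to picture the once-per-day tuple approach, here is a minimal sketch. The record layout and function name are invented for the example and are not the actual log-processing scripts.

```python
from collections import Counter


def systems_per_day(requests):
    """Count each (date, ip, arch, release) tuple at most once per day."""
    unique = set(requests)  # repeated requests collapse into one entry
    return Counter(day for day, _ip, _arch, _rel in unique)


requests = [
    ("2019-01-08", "192.0.2.10", "x86_64", "28"),
    ("2019-01-08", "192.0.2.10", "x86_64", "28"),  # retry via another proxy
    ("2019-01-08", "192.0.2.10", "x86_64", "29"),
    ("2019-01-09", "192.0.2.10", "x86_64", "28"),
]
print(systems_per_day(requests))
```

The retry is absorbed, but so are any additional Fedora 28 x86_64 machines behind the same address, which is the NAT undercount described in the proposal.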

-- 
Stephen J Smoogen.


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Stephen Gallagher
On Tue, Jan 8, 2019 at 10:50 AM Matthew Miller  wrote:
>
> On Tue, Jan 08, 2019 at 04:22:39PM +0100, Lennart Poettering wrote:
> > > The additional information could be
> > > 10.5.124.209 - - [31/Dec/2018:09:07:21 +] "GET
> > > /metalink?repo=fedora-28=x86_64==
> > > HTTP/1.1" 200 62200 "-" "dnf/2.7.5"
> > If all you want to do is count, then it should be entirely sufficient
> > to do it like this:
> >GET /metalink?repo=fedora-28=x86_64==1 
> > HTTP/1.1
> > the first time within each one-week window and a simple
> >GET /metalink?repo=fedora-28=x86_64= HTTP/1.1
> > all other times.
> > Then, sum up how many "countme=1" GET requests we get per week, and
> > you have a good count, without tracking individual clients, without
> > inventing new uuids¹.
>
> I do like this idea!
>
> And, if there's not an associated UUID, it's more comfortable to do
> "countme=2" the second week and onward -- this would make it easy to
> distinguish systems which are short-lived. (Or "countme=new" and
> "countme=ongoing" or something?)
>
> H. How comfortable would people be with reporting an incrementing count
> *every* week (again, without a UUID attached)? That'd give a new axis into
> the data which I can imagine being quite useful.
>


I like this idea and I think it's generally less likely to set off
alarm bells about privacy. I do think we probably want to avoid an
*incrementing* count, though to avoid questions around using
time-of-install as a vector into identifying the owner. So the
"new-vs-ongoing" differentiator seems reasonable to me. I *would*
suggest that we probably want to have it send "countme=new" every time
it tries to reach the mirrorlink until the first time it gets a proper
response. After that, sending "countme=ongoing" once a week would be
good additional information.
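
A minimal sketch of that client-side behaviour, assuming the client keeps a small piece of local state. The state layout, the function names and the rolling one-week interval are illustrative assumptions; nothing here reflects how dnf is actually implemented.

```python
WEEK_SECONDS = 7 * 24 * 3600


def next_countme(state, now):
    """Return the countme value to send with this request, or None."""
    if not state["acknowledged"]:
        # Keep announcing "new" until one request has succeeded.
        return "new"
    if now - state["last_sent"] >= WEEK_SECONDS:
        return "ongoing"
    return None


def record_success(state, now):
    """Call after the server answered a request that carried countme."""
    state["acknowledged"] = True
    state["last_sent"] = now


state = {"acknowledged": False, "last_sent": None}
print(next_countme(state, 0))    # first attempt
print(next_countme(state, 10))   # retry after a failed first attempt
record_success(state, 10)
print(next_countme(state, 20))   # same week, nothing extra is sent
print(next_countme(state, 10 + WEEK_SECONDS))
```

Only two facts are stored locally (whether "new" was acknowledged and when the last report went out), and neither is sent to the server.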


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Matthew Miller
On Tue, Jan 08, 2019 at 04:25:25PM +0100, Lennart Poettering wrote:
> And let me also stress that if you do it this way there's a better
> chance that people will leave this on, since you won't raise red flags
> all over the place that you can track individual users with this.

Yeah, absolutely!

-- 
Matthew Miller

Fedora Project Leader


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Matthew Miller
On Tue, Jan 08, 2019 at 04:22:39PM +0100, Lennart Poettering wrote:
> > The additional information could be
> > 10.5.124.209 - - [31/Dec/2018:09:07:21 +] "GET
> > /metalink?repo=fedora-28=x86_64==
> > HTTP/1.1" 200 62200 "-" "dnf/2.7.5"
> If all you want to do is count, then it should be entirely sufficient
> to do it like this:
>GET /metalink?repo=fedora-28=x86_64==1 HTTP/1.1
> the first time within each one-week window and a simple
>GET /metalink?repo=fedora-28=x86_64= HTTP/1.1
> all other times.
> Then, sum up how many "countme=1" GET requests we get per week, and
> you have a good count, without tracking individual clients, without
> inventing new uuids¹.

I do like this idea!

And, if there's not an associated UUID, it's more comfortable to do
"countme=2" the second week and onward -- this would make it easy to
distinguish systems which are short-lived. (Or "countme=new" and
"countme=ongoing" or something?)

Hmmm. How comfortable would people be with reporting an incrementing count
*every* week (again, without a UUID attached)? That'd give a new axis into
the data which I can imagine being quite useful.

-- 
Matthew Miller

Fedora Project Leader


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Lennart Poettering
On Tue, 08.01.19 16:22, Lennart Poettering (mzerq...@0pointer.de) wrote:

> On Tue, 08.01.19 07:49, Stephen John Smoogen (smo...@gmail.com) wrote:
>
> > The additional information could be
> >
> > 10.5.124.209 - - [31/Dec/2018:09:07:21 +] "GET
> > /metalink?repo=fedora-28=x86_64==
> > HTTP/1.1" 200 62200 "-" "dnf/2.7.5"
>
> If all you want to do is count, then it should be entirely sufficient
> to do it like this:
>
>GET /metalink?repo=fedora-28=x86_64==1 HTTP/1.1
>
> the first time within each one-week window and a simple
>
>GET /metalink?repo=fedora-28=x86_64= HTTP/1.1
>
> all other times.
>
> Then, sum up how many "countme=1" GET requests we get per week, and
> you have a good count, without tracking individual clients, without
> inventing new uuids¹.
>
> Such a form of counting is so minimal that I think you don't even have
> to ask the user whether he agrees to it in the installer UI. And
> the user knows that with the one additional bit of info he grants you
> every week there's very little you can do that you couldn't already do in
> the status quo ante.
>
> Moreover, accumulating counts as proposed also makes things
> extremely simple to account for, as you don't have to store per-client
> info in a huge database on the server. Instead it's entirely
> sufficient to have a single counter for each subset of the distro you want
> to count.
>
> In the interest of privacy the valid desire to have statistics
> about the use of our distro needs to be implemented with data
> frugality in mind. Keeping a full database of all uuids of all clients
> on a Fedora server somewhere is definitely not data frugality if all
> you want is count. Even if Fedora wouldn't misuse the data, somebody
> might exploit the server and steal the database and there you go. Not
> even having the database is hence the much better approach, and you
> really need neither the database nor the uuid concept to do proper
> counting.
>
> So yeah, in the interest of privacy and simplicity, please don't go
> the uuid way; there are simpler and better approaches.

And let me also stress that if you do it this way there's a better
chance that people will leave this on, since you won't raise red flags
all over the place that you can track individual users with this.

Lennart

--
Lennart Poettering, Red Hat


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Lennart Poettering
On Tue, 08.01.19 07:49, Stephen John Smoogen (smo...@gmail.com) wrote:

> The additional information could be
>
> 10.5.124.209 - - [31/Dec/2018:09:07:21 +] "GET
> /metalink?repo=fedora-28=x86_64==
> HTTP/1.1" 200 62200 "-" "dnf/2.7.5"

If all you want to do is count, then it should be entirely sufficient
to do it like this:

   GET /metalink?repo=fedora-28=x86_64==1 HTTP/1.1

the first time within each one-week window and a simple

   GET /metalink?repo=fedora-28=x86_64= HTTP/1.1

all other times.

Then, sum up how many "countme=1" GET requests we get per week, and
you have a good count, without tracking individual clients, without
inventing new uuids¹.
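
On the client, the scheme above might look roughly like the sketch below. It reads "one-week window" as fixed windows counted from the Unix epoch, which is only one possible interpretation, and the function name and the way the query string is assembled are assumptions for illustration.

```python
WEEK_SECONDS = 7 * 24 * 3600


def build_query(base_query, last_window, now):
    """Append countme=1 on the first request of each one-week window.

    Returns (query, window); the caller stores window locally and
    passes it back in as last_window on the next request.
    """
    window = int(now // WEEK_SECONDS)
    if window != last_window:
        return base_query + "&countme=1", window
    return base_query, last_window


base = "repo=fedora-28&arch=x86_64"  # hypothetical parameter names
start = 1_546_300_800                # 2019-01-01 00:00:00 UTC
query, window = build_query(base, None, start)
print(query)
query, window = build_query(base, window, start + 86_400)
print(query)
query, window = build_query(base, window, start + 7 * 86_400)
print(query)
```

The only thing stored is a window number on the client itself; the server never receives anything that distinguishes one machine from another.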

Such a form of counting is so minimal that I think you don't even have
to ask the user whether he agrees to it in the installer UI. And
the user knows that with the one additional bit of info he grants you
every week there's very little you can do that you couldn't already do in
the status quo ante.

Moreover, accumulating counts as proposed also makes things
extremely simple to account for, as you don't have to store per-client
info in a huge database on the server. Instead it's entirely
sufficient to have a single counter for each subset of the distro you want
to count.
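
A sketch of the server side, to show how little state that needs: one counter per (repo, arch) pair and nothing per client. The parameter names "repo", "arch" and "countme" and the sample request paths are assumptions made for this example.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit


def count_requests(request_paths):
    """Sum countme=1 requests per (repo, arch) for one week of requests."""
    counters = Counter()
    for path in request_paths:
        query = parse_qs(urlsplit(path).query)
        if query.get("countme") == ["1"]:
            repo = query.get("repo", ["unknown"])[0]
            arch = query.get("arch", ["unknown"])[0]
            counters[(repo, arch)] += 1
    return counters


week = [
    "/metalink?repo=fedora-28&arch=x86_64&countme=1",
    "/metalink?repo=fedora-28&arch=x86_64",
    "/metalink?repo=fedora-28&arch=x86_64&countme=1",
    "/metalink?repo=fedora-29&arch=aarch64&countme=1",
]
print(count_requests(week))
```

Requests without the flag are simply ignored, so repeated metadata fetches within a week do not inflate the count.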

In the interest of privacy the valid desire to have statistics
about the use of our distro needs to be implemented with data
frugality in mind. Keeping a full database of all uuids of all clients
on a Fedora server somewhere is definitely not data frugality if all
you want is count. Even if Fedora wouldn't misuse the data, somebody
might exploit the server and steal the database and there you go. Not
even having the database is hence the much better approach, and you
really need neither the database nor the uuid concept to do proper
counting.

So yeah, in the interest of privacy and simplicity, please don't go
the uuid way; there are simpler and better approaches.

Lennart


(Footnote: ¹ if you are concerned that not every client is updated
every week, then you could even slightly extend this and maybe submit
countme=2 the first time within each 4 week period, and countme=3
within each 52 week period, so that you catch even those clients, though
it will take a bit longer for their counts to accumulate)

--
Lennart Poettering, Red Hat


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Owen Taylor
On Tue, Jan 8, 2019 at 7:17 AM Benjamin Berg  wrote:
>
> On Tue, 2019-01-08 at 12:33 +0100, Miroslav Suchý wrote:
> > > On 08. 01. 19 at 11:35, Nicolas Mailhot wrote:
> > > *which* *do* *not* *permit* *or* *no* *longer* *permit* *the*
> > > *identification* *of* *data* *subjects*
> >
> > How do you identify data subject solely on UUID?
>
> You also inherently collect information such as the IP and the
> timestamp of the request which in principle permits identification. You
> could for example collect the IP from Fedora account logins and one of
> these pings. This way you can de-anonymise the data collected for the
> UUID.

We can certainly implement a setup that does not collect or store the
UUID together with the IP address or timestamp. Send the UUID as a
HTTP header, don't log it, send the UUID off to a counting service
(*). If we make sure the UUID is protected in transit, sent only to
our own servers (or servers configured by the user), and not collected
or stored in a personally identifiable way, I suspect that we're
meeting our obligations under the GDPR, though we'd need to
double-check any selected solution carefully.

That being said, certainly some users might still have an issue with
having a UUID sent to Fedora servers even if we are meeting our legal
obligations. What we say we are doing with the data might not
correspond to reality in case of a security breach or court order. For
this reason, the first_time_this_week=1 option that Lennart and
Benjamin mentioned has some appeal to me - it would avoid the need for
extra opt-in/out screens, confusing text, etc. It would also allow any
yum repository to do counting the same way - not just our own
repositories.

Owen

(*) implementation left to your imagination. Store a hash of the UUID
for a week then discard. Use HyperLogLog. Etc.
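
One way to picture the "store a hash of the UUID for a week then discard" option is the sketch below. The class name and the per-week random salt are my own additions for illustration (the salt is there so that hashes from different weeks cannot be compared); HyperLogLog would replace the set with a fixed-size approximate structure.

```python
import hashlib
import os


class WeeklyCounter:
    """Count distinct client IDs for one week without keeping the IDs."""

    def __init__(self):
        self._salt = os.urandom(16)
        self._seen = set()

    def observe(self, client_uuid: str) -> None:
        data = self._salt + client_uuid.encode("utf-8")
        self._seen.add(hashlib.sha256(data).digest())

    def close_week(self) -> int:
        """Return this week's count, then discard hashes and salt."""
        total = len(self._seen)
        self._salt = os.urandom(16)
        self._seen = set()
        return total


counter = WeeklyCounter()
for seen_id in ["uuid-a", "uuid-a", "uuid-b"]:
    counter.observe(seen_id)
print(counter.close_week())
```

Only the weekly total survives; whether that is enough to satisfy the GDPR concerns raised in this thread is a separate question the sketch does not answer.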


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Stephen John Smoogen
On Tue, 8 Jan 2019 at 06:23, Miroslav Suchý  wrote:
>
> On 08. 01. 19 at 11:14, Reindl Harald wrote:
> > but the UUID is sent over TCP and so you receive the current IP at the
> > same time with the UUID and that way you can even profile how that
> > machine is moved around the country
>
> But it is not - or to be precise - will not be stored as a pair, so this
> information cannot be retrieved.

Unless you are somehow going to not use IP to get the data from client
to server, it will be retrievable in multiple ways. The simplest is:

Log A (http app) will have timestamp and uuid. Log B
(apache/nginx/etc) will have timestamp and ip. Log merge is possible.
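
To show how little effort that merge takes, here is a toy version. The record layouts, the exact-timestamp match and the sample values are simplifications made up for this example; real logs would need a tolerance window on the timestamps.

```python
from collections import defaultdict


def merge_logs(app_log, web_log):
    """Join Log A (timestamp, uuid) with Log B (timestamp, ip).

    Returns a dict mapping each uuid to the set of IPs seen at the
    same timestamps.
    """
    ips_at = defaultdict(set)
    for timestamp, ip in web_log:
        ips_at[timestamp].add(ip)

    linked = defaultdict(set)
    for timestamp, client_uuid in app_log:
        linked[client_uuid] |= ips_at.get(timestamp, set())
    return dict(linked)


app_log = [(100, "uuid-a"), (105, "uuid-b")]
web_log = [(100, "192.0.2.10"), (105, "198.51.100.7"), (200, "203.0.113.5")]
print(merge_logs(app_log, web_log))
```

Storing the two fields in separate logs therefore only helps if one of the logs also drops its timestamps.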


Turning off Log B sounds like an option, but you turn off the ability
to troubleshoot if Log A is working or not. And if you aren't sure if
you have accurate data in Log A.. you are back where you started.

Turning off timestamp in log A is also problematic because we still
need to figure out things like 'when was the last time we saw it',
have we recorded it already within a timeframe, etc. Keeping it
ephemeral in memory is an idea but the usual 'oh we had to restart
this service because of a leak in the stack' makes that impractical.

Currently I have a hard time not seeing us needing the IP address. If
all I record is uuid.. what if we have 10k ips all with the same uuid?
How will we know? Plus if a sizable percentage don't use this.. you
will still need to do the old method of counting via ip address to
know that. Again merge is possible and could be probable.

I would like to think these are 'solvable' problems.. but I also am
quite aware that pretty much every pseudo-anonymization method falls
over pretty darn quickly due to yet one more thing that has to be
recorded somewhere.


-- 
Stephen J Smoogen.


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Stephen John Smoogen
On Tue, 8 Jan 2019 at 03:43, Lennart Poettering  wrote:
>
> On Mon, 07.01.19 16:58, Stephen John Smoogen (smo...@gmail.com) wrote:
>
> > > I wonder if it is worth introducing an entirely new tracking concept
> > > here if you actually don't want to track but just count. The NTP
> > > approach has the benefit that you introduce no new tracking concept at
> > > all, but you just use the data that is pretty much generated
> > > anyway. It also makes this all feel less one-sided, after all you
> > > provide them with a deal: fedora gives the user correct time, the user
> > > is therefore counted.
> >
> > The problems with NTP are the following:
> > 1. The administrative headaches of regular
> > blocks/takedowns/ill-advised security emails because our servers are
> > attacking someone's box on port 123. [As funny as this sounds, getting
> > regular angry emails from some site whose security tool has decided
> > that 123/{tcp,udp} is a major threat still occurs. ]
>
> This is not realistic. NTP is not really just an option, it's pretty
> much a must-have in today's Internet. You cannot properly validate
> SSL certs if you don't have correct time, which shuts you out of a
> good part of the Internet, including typical download servers that do
> https.

I don't disagree that it should be unrealistic. I also know that it
still gets reported regularly as an attack and sites get dropped
off/spam blocked because some antivirus/firewall router reported it to
. As such it needs to be listed
as a possible problem. [This is just the usual risk management CYA so
that when it does happen we can say 'we considered it a risk and
mitigated it with X'. I just didn't have an X when I wrote that list
and don't want it to be 'spend 2 hours a day calling clients who can't
connect to RH data-centers because someone thought 123 was a hacker
attack.' ]

>
> If a system doesn't do NTP then it will certainly encounter a lot more
> problems than are created by switching from NTP pool servers to Fedora
> servers by default.
>
> Moreover, afair we install and enable NTP clients by default on all
> our installations, no? just like pretty much any other OS these days
> does... counting by NTP mostly just means switching from NTP pool
> servers to fedora's own servers.
>
> > 2. NTP bandwidth while small per system grows a lot as you rack up
> > servers randomly checking in. Having a pool of servers around the
> > world would require us to get NTP GPS clocks, getting the datacenters
> > to put the antennae out and a bunch of other items. [The budget for
> > this is non-zero.]
>
> Nah, that's not how NTP works, you don't have to have a "GPS clock",
> you can simply replicate the time of a set of upstream servers, that's
> totally OK.

I expect I am running off of old knowledge as I got out of running NTP
servers 10 years ago. At that time it was 'required' that a stratum 1
clock was supposed to have a GPS, atomic or similar solid time. If you
are just relaying you are supposed to be a stratum 2 clock. If you are
the main clock for a large set of systems you should be a stratum 1
system. If you ran a dedicated system you wanted to have 3-5 systems
spread out.

For a description of the old way of doing things, see
https://www.endruntechnologies.com/stratum1.htm

I expect that with a large pool of stratum 2 servers, the noise between
them is averaged out over time, lowering the need for a reference clock.
I will go look at current best practices and revise.



-- 
Stephen J Smoogen.


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Matthew Miller
On Tue, Jan 08, 2019 at 09:30:56AM +0100, Miroslav Suchý wrote:
> How should Mock handle this? DNF executed by Mock cannot send VERSION_ID
> and VARIANT_ID of chroot(ed) environment because they are not known yet.
> I think the question in general is - how to put tracking of build systems 
> aside?

Possibly it could use a special VARIANT_ID reserved for this case?


-- 
Matthew Miller

Fedora Project Leader


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Nicolas Mailhot

On 2019-01-08 13:42, Kevin Kofler wrote:
> Nicolas Mailhot wrote:
> > 1. it needs to be opt-out not opt-in (ie an explicit question in the
> > installer, with no tracking unless the user says yes)
>
> I think you mean "opt-in not opt-out". (At least, that's what your
> explanation in the parentheses describes.)

Yes, sorry about that

Regards,

--
Nicolas Mailhot


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Stephen John Smoogen
On Mon, 7 Jan 2019 at 22:47, Kevin Kofler  wrote:
>
> Matthew Miller wrote:
> > Since there is no personal information attached, I don't see how on the
> > face of it this is a privacy violation. I want to take this concern
> > seriously, but I need more to go on than "this is inherent". Can you
> > elaborate?
>
> I detailed it further down my message: my concern is that the UUID can
> theoretically be used to track users, to build personas out of them from the
> packages downloaded by the UUID, and in the extreme case even to identify
> the person owning the UUID by name (e.g., if a package downloaded by the
> UUID is downloaded only by 1 person and you find some bug report for it in
> Bugzilla). I don't care that you promise that you won't do it, the fact is
> that you *can*. And possibly others can too, depending on how exactly this
> is implemented.
>

Currently we can't see what packages a client requested. All the
Fedora mirror proxies see is

10.5.124.209 - - [31/Dec/2018:09:07:21 +] "GET
/metalink?repo=fedora-28=x86_64 HTTP/1.1" 200 62200 "-"
"dnf/2.7.5"

The additional information could be

10.5.124.209 - - [31/Dec/2018:09:07:21 +] "GET
/metalink?repo=fedora-28=x86_64==
HTTP/1.1" 200 62200 "-" "dnf/2.7.5"

Individual mirrors do see what packages the person requested but do
not see the uuid=, edition= data

10.5.124.209 - - [31/Dec/2018:06:44:46 +] "GET
/pub/fedora/linux/updates/28/Everything/x86_64/repodata/repomd.xml
HTTP/1.1" 200 3312 "-" "dnf/2.7.5"
10.5.124.209 - - [31/Dec/2018:06:44:46 +] "GET
/pub/fedora/linux/updates/28/Everything/x86_64/repodata/5ca6bd7f4a9e8b0bc75e6c9f3d239549cfb627f34a5aa5d949c99fedf1a39ab7-comps-Everything.x86_64.xml.gz
HTTP/1.1" 200 448854 "-" "dnf/2.7.5"
10.5.124.209 - - [31/Dec/2018:06:45:21 +] "GET
/pub/fedora/linux/releases/28/Everything/x86_64/os/Packages/p/python3-rpmdeplint-1.4-2.fc28.noarch.rpm
HTTP/1.1" 404 299 "-" "dnf/2.7.5"


> > Like I said, tracking is a non-goal. And, we want a design that is
> > resistant to tracking -- but I don't think we need to go overboard.
>
> If you take privacy seriously, you have to assume the worst. It is always
> safer to send less data rather than more.
>
> Kevin Kofler



-- 
Stephen J Smoogen.


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Kevin Kofler
Nicolas Mailhot wrote:
> 1. it needs to be opt-out not opt-in (ie an explicit question in the
> installer, with no tracking unless the user says yes)

I think you mean "opt-in not opt-out". (At least, that's what your 
explanation in the parentheses describes.)

Other than that apparent typo, I entirely agree with your message.

Kevin Kofler


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Nicolas Mailhot

On 2019-01-08 12:33, Miroslav Suchý wrote:
> On 08. 01. 19 at 11:35, Nicolas Mailhot wrote:
> > *which* *do* *not* *permit* *or* *no* *longer* *permit* *the*
> > *identification* *of* *data* *subjects*
>
> How do you identify data subject solely on UUID?

Recital 26 makes it pretty clear that reversing must take into account all 
the other data that can be associated with the pseudonymisation (either 
because it is available at the same time or can be associated with it 
some other way). So you don’t get to play the "solely" card.


Regards,

--
Nicolas Mailhot


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Benjamin Berg
On Tue, 2019-01-08 at 12:33 +0100, Miroslav Suchý wrote:
> On 08. 01. 19 at 11:35, Nicolas Mailhot wrote:
> > *which* *do* *not* *permit* *or* *no* *longer* *permit* *the*
> > *identification* *of* *data* *subjects*
> 
> How do you identify data subject solely on UUID?

You also inherently collect information such as the IP and the
timestamp of the request which in principle permits identification. You
could for example collect the IP from Fedora account logins and one of
these pings. This way you can de-anonymise the data collected for the
UUID.
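
To make the correlation risk concrete, here is a minimal sketch of such a join. It assumes two hypothetical record sets, pings as (ip, unix_time, uuid) and account logins as (ip, unix_time, username); the field layout, the function name and the one-hour matching window are all made up for illustration and do not describe any existing Fedora log format.

```python
def link_uuids_to_accounts(pings, logins, max_gap_seconds=3600):
    """Map each UUID to usernames seen from the same IP at about the same time.

    pings:  iterable of (ip, unix_time, uuid) tuples      (hypothetical format)
    logins: iterable of (ip, unix_time, username) tuples  (hypothetical format)
    """
    linked = {}
    for ping_ip, ping_time, ping_uuid in pings:
        for login_ip, login_time, username in logins:
            same_ip = ping_ip == login_ip
            close_in_time = abs(ping_time - login_time) <= max_gap_seconds
            if same_ip and close_in_time:
                # The UUID is no longer anonymous once it shares an IP
                # and a time window with an authenticated login.
                linked.setdefault(ping_uuid, set()).add(username)
    return linked
```

A single match is enough to attach a name to every later request that carries the same UUID, even after the IP changes.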

Benjamin


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Miroslav Suchý
On 08. 01. 19 at 11:35, Nicolas Mailhot wrote:
> *which* *do* *not* *permit* *or* *no* *longer* *permit* *the* 
> *identification* *of* *data* *subjects*

How do you identify data subject solely on UUID?

Miroslav


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Panu Matilainen

On 1/8/19 1:06 PM, Peter Robinson wrote:
> > On 08. 01. 19 at 10:10, Zbigniew Jędrzejewski-Szmek wrote:
> > > If an IP address qualifies as "personal data", then an installation UUID
> > > does too.
> >
> > IANAL but I disagree. With IP address, I can very easily guess your
> > town/village. With more effort I can track you to
> > individual house and individual device.
> > You cannot say the same about UUID.
>
> I agree with this, I think even the machine ID would be more anonymous
> than an IP address in most cases.

In *isolation*, yes. The problem is that here it'll be associated with 
an IP address and together they reveal things that neither of them do alone.


- Panu -


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Miroslav Suchý
On 08. 01. 19 at 11:14, Reindl Harald wrote:
> but the UUID is sent over TCP and so you receive the current IP at the
> same time with the UUID and that way you can even profile how that
> machine is moved around the country

But it is not - or to be precise - will not be stored as a pair, so this 
information cannot be retrieved.

Miroslav


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Peter Robinson
> On 08. 01. 19 at 10:10, Zbigniew Jędrzejewski-Szmek wrote:
> > If an IP address qualifies as "personal data", then an installation UUID 
> > does too.
>
> IANAL but I disagree. With IP address, I can very easily guess your 
> town/village. With more effort I can track you to
> individual house and individual device.
> You cannot say the same about UUID.

I agree with this, I think even the machine ID would be more anonymous
than an IP address in most cases.

P


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Tom Hughes

On 08/01/2019 10:38, Lennart Poettering wrote:

> Also, you want to use standard primitives, and a HMAC is one that is
> designed for purposes like this. For the reasons why a HMAC is
> constructed the way it is, read the wikipedia page.


Well it's constructed the way it is (as wikipedia explains) to
stop you being able to add data to a message and still produce a
valid MAC without the key, which makes perfect sense when you are using
it as a signature to check that the input hasn't been modified.

That's not what is happening here though - here the hash is just
to disguise the input not to verify that it hasn't changed, so the
property that we are interested in is whether the algorithm can
be reversed to recover plain text not whether an alternate plain
text can be found to give the same cipher text.

So HMAC probably isn't strictly necessary in this case but it's
not going to do any harm either.

Tom

--
Tom Hughes (t...@compton.nu)
http://compton.nu/


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Lennart Poettering
On Di, 08.01.19 10:11, Richard Hughes (hughsi...@gmail.com) wrote:

> On Tue, 8 Jan 2019 at 08:57, Lennart Poettering  wrote:
> > Yes, Tom's proposal makes sense. Calculate the UUID you submit as
> >   HMAC(machine_id, CONCAT(fixedappuuid, unixtime/432000))
>
> Out of interest, how is using a HMAC different to just using the
> machine-id appended with a salt, sha256'd?

I am not sure how you'd define "salt" in this case. Randomly generated
and stored somewhere? I mean, storing something somewhere is what
should really be avoided I think.

Also, you want to use standard primitives, and a HMAC is one that is
designed for purposes like this. For the reasons why a HMAC is
constructed the way it is, read the wikipedia page.

Lennart

--
Lennart Poettering, Red Hat


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Nicolas Mailhot

On 2019-01-08 11:17, Miroslav Suchý wrote:
> On 08. 01. 19 at 11:04, Miroslav Suchý wrote:
> > IANAL but I disagree. With IP address, I can very easily guess your
> > town/village. With more effort I can track you to
> > individual house and individual device.
> > You cannot say the same about UUID.
>
> I just checked and UUID is definitely under ‘pseudonymisation’ - see
> Article 4 - Definitions:
>
> https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32016R0679
>
> And that pseudonymisation is actually encouraged to be used for
> statistical purposes - (156) in the preamble.

Only if it cannot be correlated back (Recital 156):

“The further processing of personal data for archiving purposes in the 
public interest, scientific or historical research purposes or 
statistical purposes is to be carried out when the controller has 
assessed the feasibility to fulfil those purposes by processing data 
*which* *do* *not* *permit* *or* *no* *longer* *permit* *the* 
*identification* *of* *data* *subjects*, provided that appropriate 
safeguards exist (such as, for instance, pseudonymisation of the data)”


Otherwise it is considered personal data (Recital 26):

“Personal data which have undergone pseudonymisation, which could be 
attributed to a natural person by the use of additional information 
should be considered to be information on an identifiable natural 
person. ”


--
Nicolas Mailhot


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Miroslav Suchý
On 08. 01. 19 at 11:04, Miroslav Suchý wrote:
> IANAL but I disagree. With IP address, I can very easily guess your 
> town/village. With more effort I can track you to
> individual house and individual device.
> You cannot say the same about UUID.

I just checked and UUID is definitely under ‘pseudonymisation’ - see Article 4
- Definitions:

https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32016R0679

And that pseudonymisation is actually encouraged to be used for statistical
purposes - (156) in the preamble.

Miroslav


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Richard Hughes
On Tue, 8 Jan 2019 at 08:57, Lennart Poettering  wrote:
> Yes, Tom's proposal makes sense. Calculate the UUID you submit as
>   HMAC(machine_id, CONCAT(fixedappuuid, unixtime/432000))

Out of interest, how is using a HMAC different to just using the
machine-id appended with a salt, sha256'd?

Richard.


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Benjamin Berg
Hi,

On Tue, 2019-01-08 at 10:06 +0100, Nicolas Mailhot wrote:
> You can turn it all the way you like getting accurate counts means 
> disambiguating systems which means tracking, regardless if you do it
> in a central way or via system agents.

No, you do not need to track individual machines.

As Lennart pointed out elsewhere already, there are other methods like
setting a boolean from 0 to 1 in a request that happens more regularly
anyway. If each machine does a request with the boolean true once a
week, then you can trivially calculate the number of installations from
that.

Obviously you need to rely on the sending machines to do such a request
at reasonably regular intervals (e.g. once a week on average). But, we
need to trust the machines anyway. After all, you could easily generate
UUIDs and manipulate the statistics that way. All you are doing is
pushing some of the counting logic to the client, thereby removing most
of the privacy issues.
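
A minimal sketch of that client-side counting, assuming the client remembers locally when it last set the flag and that the server only sums the flags it receives within a week. The class name, the "count" field and the one-week interval are illustrative assumptions, not an existing DNF interface.

```python
WEEK_SECONDS = 7 * 24 * 3600  # assumed reporting interval


class Client:
    """Hypothetical client that flags at most one request per week."""

    def __init__(self):
        # Stored only on the client; never sent to the server.
        self.last_flagged = None

    def make_request(self, now):
        due = self.last_flagged is None or now - self.last_flagged >= WEEK_SECONDS
        if due:
            self.last_flagged = now
        # The request carries a bare boolean, no identifier.
        return {"count": due}


def count_installations(requests_in_one_week):
    """Server side: the estimate is simply the number of flagged requests."""
    return sum(1 for request in requests_in_one_week if request["count"])
```

The server never sees anything that distinguishes one machine from another; it only learns how many flagged requests arrived in the week.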

Benjamin


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Miroslav Suchý
On 08. 01. 19 at 10:10, Zbigniew Jędrzejewski-Szmek wrote:
> If an IP address qualifies as "personal data", then an installation UUID does 
> too.

IANAL but I disagree. With IP address, I can very easily guess your 
town/village. With more effort I can track you to
individual house and individual device.
You cannot say the same about UUID.

Miroslav


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Nicolas Mailhot

On 2019-01-08 04:00, Matthew Miller wrote:
> On Mon, Jan 07, 2019 at 11:09:48PM +0100, Kevin Kofler wrote:
> > Please no! This is an inherent privacy violation. I hate software doing
> > this and I always opt out of it. I find it especially worrying that Free
> > Software is now doing this more and more often, this used to be something
> > only privacy-violating proprietary software would do.
>
> Since there is no personal information attached, I don't see how on the face
> of it this is a privacy violation. I want to take this concern seriously,
> but I need more to go on than "this is inherent". Can you elaborate?

That's not how you need to think of it.

Basically, the European definition is that if it can be correlated to 
personal information, it *is* personal information, regardless of what 
the original info is in isolation.


That means that if you want it not to be personal info, you need to make 
bloody sure it is not shared with data aggregators, you protect against 
leaking to systems that allow correlation, and every time there is an 
advance in big data processing that enables more kinds of correlation, 
that automatically restricts what you can safely collect.


Regards,

--
Nicolas Mailhot


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Zbigniew Jędrzejewski-Szmek
On Mon, Jan 07, 2019 at 10:00:25PM -0500, Matthew Miller wrote:
> On Mon, Jan 07, 2019 at 11:09:48PM +0100, Kevin Kofler wrote:
> > Please no! This is an inherent privacy violation. I hate software doing 
> > this 
> > and I always opt out of it. I find it especially worrying that Free 
> > Software 
> > is now doing this more and more often, this used to be something only 
> > privacy-violating proprietary software would do.
> 
> Since there is no personal information attached, I don't see how on the face
> of it this is a privacy violation. I want to take this concern seriously,
> but I need more to go on than "this is inherent". Can you elaborate?

I'm not a lawyer, but GDPR is something that affects all of us. Going
by the wiki page and GDPR announcements from the European Commission:

Scope:
> The regulation applies if ... the data subject (person) is based in the EU
So Fedora obviously falls under the scope of GDPR.

> personal data is any information relating to an individual ... a computer's 
> IP address.
If an IP address qualifies as "personal data", then an installation UUID does 
too.

Lawful basis for processing:
> Unless a data subject has provided informed consent to data
> processing for one or more purposes, personal data may not be
> processed unless there is at least one legal basis to do
> so. According to Article 6, the lawful purposes are:
> (a) If the data subject has given consent to the processing of his
> or her personal data;

(b)-(e) obviously don't apply

> (f) For the legitimate interests of a data controller or a third
> party, unless these interests are overridden by interests of the
> data subject

We could argue [1] that reliably collecting the number of individual
installations is a "legitimate interest", for example because it
allows us to decide what parts of Fedora are most used and direct our
efforts there. I think it's pretty obvious that knowing the number of
users is a valid interest for any software project. Then we could use
point (f).

Otherwise, we have to use point (a) which is only satisfied by a clearly
worded, and specific, opt-*in* dialogue.

[1] 
https://ico.org.uk/for-organisations/guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr/legitimate-interests/what-is-the-legitimate-interests-basis/

Zbyszek


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Nicolas Mailhot

On 2019-01-07 23:44, John Harris wrote:
> On Monday, January 7, 2019 5:20:55 PM EST Bruno Wolff III wrote:
> > On Mon, Jan 07, 2019 at 22:54:46 +0100, Tom Gundersen wrote:
> >
> > So this allows better tracking than if you just had to go by IP, time and
> > other information in the requests.
>
> Keep in mind that we do not want tracking, at all. Just counting. That said,
> your message certainly does highlight some of the risks of using this
> proposed UUID system.
>
> We don't need to be thinking of more things to track about the user, but
> ways to prevent tracking and still get the counts the Council wants.


Pretty much everything that has been described so far is the usual 
sleazeware tracking, with the usual sleazeware justifications (“everyone 
else doing it is evil but I’m not” “I know I’m good so I don’t need to 
take care” “hijacking of benign mechanisms to ambush users, they’re not 
aware I'm using them this way that makes it good” “I know I’m using it 
for good, not my problem if others can reuse it for evil” and so on), 
and the usual focus on getting accurate counts over caring about side 
effects.


You can turn it any way you like: getting accurate counts means 
disambiguating systems, which means tracking, regardless of whether you 
do it in a central way or via system agents.


And yes, we know it makes it easier for marketing people to know pretty 
much everything about the user base. Getting a few billion in free 
money would make *my* life easier; that does not mean I’m going to get 
it.


If you want to be trusted:

1. it needs to be opt-out not opt-in (ie an explicit question in the 
installer, with no tracking unless the user says yes)
2. it needs to be easily audited and disabled post-install (ie a 
separate explicitly named and described package, not a setting or a 
built-in hidden in a mass of other things)
3. there needs to be a lot of thought on how the collection process or the 
collected data could be misappropriated and how to make sure to protect 
against it
4. how it is used and how it can be audited and disabled needs to be 
described in a stable public legally binding and easy to find document


And I strongly suggest a review by European privacy experts, since the 
level of awareness on this kind of things in the USA is pretty low.


Regards,

--
Nicolas Mailhot


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Lennart Poettering
On Mo, 07.01.19 22:54, Tom Gundersen (t...@jklm.no) wrote:

> On Mon, Jan 7, 2019, 7:31 PM Matthew Miller 
> wrote:
>
> > On Mon, Jan 07, 2019 at 06:24:14PM +0100, Lennart Poettering wrote:
> > > > * The Fedora community cares about privacy and is averse to tracking
> > > > measures. We don't want to track; just count.
> > > Uh, so what's the story there? i mean, if you pass over the uuid you
> > > make clients trackable, regardless if you want to make use of that or
> > > not...
> >
> > Not if we don't keep them for long. One idea is to rotate them fairly
> > frequently. But this is mostly a statement of intent and might be more
> > about how we build the backend than about what we force in the client.
>
> You could move the rotation to the client by hashing the UUID with a
> timestamp of sufficiently coarse granularity (a week?) before submitting it.
>
> Then you make sure that all UUIDs submitted by a given machine during a
> given time window are the same, but UUIDs submitted in different windows
> are not related, and you don't have to trust the server to respect your
> privacy.

Yes, Tom's proposal makes sense. Calculate the UUID you submit as

  HMAC(machine_id, CONCAT(fixedappuuid, unixtime/432000))

where:

  machine_id = the id from /etc/machine-id
  fixedappuuid = some fixed compiled-in uuid you make up for dnf
  unixtime = UNIX time, seconds since 1970

(432000 is the seconds in 5 days, just as an example)

This way the uuid submitted is changed automatically both when the
machine ID is reset and every 5 days.
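
A minimal sketch of that calculation, using only Python's standard library. The thread does not name a hash function, so SHA-256 is an assumption here, and FIXED_APP_UUID is a made-up placeholder for whatever value would actually be compiled in. The division is taken as integer division so that the value only changes once per window.

```python
import hashlib
import hmac
import uuid

# Placeholder value; a real client would compile in its own fixed UUID.
FIXED_APP_UUID = uuid.UUID("00000000-0000-4000-8000-000000000001")
WINDOW_SECONDS = 432000  # 5 days, as in the example above


def rotating_id(machine_id: str, unixtime: int) -> str:
    """Return HMAC(machine_id, CONCAT(fixedappuuid, unixtime/432000)) as hex."""
    window = unixtime // WINDOW_SECONDS  # changes every 5 days
    message = FIXED_APP_UUID.bytes + str(window).encode("ascii")
    key = machine_id.strip().encode("ascii")
    # SHA-256 is an assumed choice of digest, not something the thread specifies.
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def read_machine_id(path: str = "/etc/machine-id") -> str:
    """Read the machine ID that serves as the HMAC key."""
    with open(path, encoding="ascii") as handle:
        return handle.read().strip()
```

Calling rotating_id(read_machine_id(), int(time.time())) (with the time module imported) would give the value to submit. Nothing extra needs to be stored: the same inputs reproduce the same ID within a window, and without the machine ID the values from two different windows cannot be related.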

Of course, I still think the NTP (or http ping check) approach is
nicer overall, since it doesn't smell so awfully like "we track users".

Lennart

--
Lennart Poettering, Red Hat


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Lennart Poettering
On Mo, 07.01.19 16:58, Stephen John Smoogen (smo...@gmail.com) wrote:

> > I wonder if it is worth introducing an entirely new tracking concept
> > here if you actually don't want to track but just count. The NTP
> > approach has the benefit that you introduce no new tracking concept at
> > all, but you just use the data that is pretty much generated
> > anyway. It also makes this all feel less one-sided, after all you
> > provide them with a deal: fedora gives the user correct time, the user
> > is therefore counted.
>
> The problems with NTP are the following:
> 1. The administrative headaches of regular
> blocks/takedowns/ill-advised security emails because our servers are
> attacking someone's box on port 123. [As funny as this sounds, getting
> regular angry emails from some site whose security tool has decided
> that 123/{tcp,udp} is a major threat still occurs. ]

This is not realistic. NTP is not really just an option, it's pretty
much a must-have in today's Internet. You cannot properly validate
SSL certs if you don't have correct time, which shuts you out of a
good part of the Internet, including typical download servers that do
https.

If a system doesn't do NTP then it will certainly encounter a lot more
problems than are created by switching from NTP pool servers to Fedora
servers by default.

Moreover, afair we install and enable NTP clients by default on all
our installations, no? just like pretty much any other OS these days
does... counting by NTP mostly just means switching from NTP pool
servers to fedora's own servers.

> 2. NTP bandwidth while small per system grows a lot as you rack up
> servers randomly checking in. Having a pool of servers around the
> world would require us to get NTP GPS clocks, getting the datacenters
> to put the antennae out and a bunch of other items. [The budget for
> this is non-zero.]

Nah, that's not how NTP works, you don't have to have a "GPS clock",
you can simply replicate the time of a set of upstream servers, that's
totally OK.

I am pretty sure Ubuntu doesn't have any fancy hw for this either,
they just provide some servers that propagate NTP pool time I figure.

> 3. Logging NTP does not cover the problem the UUID is trying to help
> solve.. there are two places where we undercount and overcount
> systems.
>  a. systems behind nat firewalls all show up as 1 ip address. ntp or
> yum or gnome-hotspot ask multiple times during a day.. but not a set
> number. Just looking at my 3 home systems I see around 1 to 80
> connections depending on what i have done that day.

The amount of traffic within a time window is linear to the number of
hosts behind that IP address. It's relatively easy to estimate that
there are 5 clients behind an IP address if you get 5 NTP request
datagrams within one protocol iteration instead of just one...
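
To make that estimate concrete, here is a small Python sketch. It assumes
the server log can be reduced to (source IP, UNIX timestamp) pairs and that
clients poll roughly once per fixed interval; the 1024-second interval and
the function name are assumptions for illustration only.

```python
from collections import Counter, defaultdict
from statistics import median

POLL_INTERVAL = 1024  # seconds; assumed steady-state client poll interval


def estimate_clients(requests, poll_interval=POLL_INTERVAL):
    """Estimate the number of clients behind each source IP.

    `requests` is an iterable of (source_ip, unix_timestamp) pairs.
    Requests are bucketed into poll-interval windows per IP, and the
    median request count over the non-empty windows is the estimate.
    """
    per_ip = defaultdict(Counter)
    for ip, ts in requests:
        per_ip[ip][int(ts) // poll_interval] += 1
    return {
        ip: round(median(windows.values()))
        for ip, windows in per_ip.items()
    }
```

Using the median rather than the maximum keeps a single burst (say, a
client restarting) from inflating the estimate for that address.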

>  b. systems on short lived dhcp ranges. multiple major isps use
> various methods which make a system look like multiple boxes. The
> system will show up as 123.45.67.89 and 2 minutes later the same
> system will be 89.76.54.123 [made up ip addresses.. but various
> carriers seem to do this.]

Well, this breaks TCP, hence sure systems will do that, but not
constantly. And all Fedora needs are estimates, and if you break
things down to some time window granularity you should be able to deal
with such IP renumbering games just fine.

> 4. NTP is a high security problem when you concentrate it to a set of
> servers. These become servers that everyone wants to hack even more
> than build systems. These problems range from DDOS to active hacks.

Uh, well, the major NTP servers tend to be pretty well tested and
fuzzed these days, and they can be sandboxed efficiently, since they
involve no big stack but only trivial SOCK_DGRAM traffic. I see no
reason whatsoever for them to be less secure than a hand-written HTTP
service that only Fedora runs and doesn't get all the validation love
the NTP servers get...

> 5. Which leads to us being in charge of the security of every kerberos
> and SSL session which relies on our clocks to be available and in
> sync. That leads to other administrative headaches where sites will
> complain that our servers broke one of those because the services was
> DDOS'd, ASN rerouted, off by N amount, UDP replayed etc.

Well, people generally already rely on entirely random people
participating in the NTP pool project to run the servers for them. If
people can depend on that they should easily be willing to depend on
Fedora for this too... I mean, they have to trust Fedora a lot more
*anyway*, since we provide them with their frickin compiled
programs...

I mean, I can see reasons why doing the NTP thing is not a good idea
for Fedora (for example: nobody willing to maintain NTP servers but
enough people commit to maintaining some other solution), but I doubt
the technical points you raise above are really valid...

> > BTW, iirc intel used to count installations through the http ping
> > check in their captive 

Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Peter Robinson
> > Not if we don't keep them for long. One idea is to rotate them fairly
> > frequently. But this is mostly a statement of intent and might be more about
> > how we build the backend than about what we force in the client.
>
> My understanding is that the Fedora project does not control how much
> network logging Red Hat does on its behalf, so rotating the UUID might
> well not bring back the old anonymity.

The mirror manager bits run all over the place, not just in Red Hat
hosted locations and all the mirror manager bits run over https where
Fedora infra controls the server end points so from that perspective
it's mostly irrelevant


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Miroslav Suchý
On 07. 01. 19 at 17:34 Ben Cotton wrote:
> * We need to be able to distinguish between short-lived instances
> (like temporary containers or test machines) and actual installations.

How should Mock handle this? DNF executed by Mock cannot send the VERSION_ID
and VARIANT_ID of the chroot(ed) environment because they are not known yet.

I think the question in general is - how to put tracking of build systems aside?

Miroslav


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-08 Thread Florian Weimer
* Matthew Miller:

> Not if we don't keep them for long. One idea is to rotate them fairly
> frequently. But this is mostly a statement of intent and might be more about
> how we build the backend than about what we force in the client.

My understanding is that the Fedora project does not control how much
network logging Red Hat does on its behalf, so rotating the UUID might
well not bring back the old anonymity.

My concern here is that the UUID would allow implementation of a service
that gives the current IP address for a system, based on a past
timestamp and observed IP address.

Thanks,
Florian


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Bruno Wolff III

On Mon, Jan 07, 2019 at 22:00:25 -0500, Matthew Miller wrote:
> Since there is no personal information attached, I don't see how on the face
> of it this is a privacy violation. I want to take this concern seriously,
> but I need more to go on than "this is inherent". Can you elaborate?


From the user's point of view, they can't tell if IP addresses are tracked 
along with UUIDs. Some IP addresses can be tied to specific users, and now 
with UUIDs, the same machine can be seen to use different IP addresses so 
that a person can now be seen to be using multiple IP addresses that couldn't 
be as easily correlated before. Some of these IP addresses may have been 
hard to associate with the person previously.
Users can defend against this by being selective when they do updates 
relatively easily as long as updates are the only thing using this UUID.


If you care about that level of not revealing usage, Fedora is probably not 
the best distribution in the first place. A number of packages do not make 
a priority of limiting networking requests. For example it is common for 
web browsers in Fedora to refer to a network version of a Fedora web page as 
their default start page rather than using a local copy of this page that 
might be a bit out of date. So I don't know if IP address correlation is 
likely to be of big concern to many Fedora users. I would prefer that Fedora 
make different privacy / convenience trade offs than it does, but I'm pretty 
sure I'm in a small minority and I'm able to do work arounds on my end for 
this for cases where I want to spend the effort.



Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Bruno Wolff III

On Tue, Jan 08, 2019 at 00:44:26 -0500, John Harris wrote:
> On Tuesday, January 8, 2019 12:32:45 AM EST Bruno Wolff III wrote:
> > The cost for pretending to be lots of machines is also reduced a lot in
> > this scheme over having to connect from lots of different IP addresses.
> > Though at some point spoofing too many would probably be considered
> > a denial of service attack and might get the perpetrator in legal trouble,
> > which might discourage people from doing that. If such an attack wasn't
> > noticed because of the request volume from a small number of IP addresses,
> > it might be possible to have a significant effect on the aggregate stats.
> > So it might be worth having some filters watching out for this kind of
> > attack.
>
> I definitely don't think it's best to start considering legal action against
> Fedora users in a thread about invading user privacy. This will only scare
> folks.


I think it is reasonable to discuss mitigations to attacks on the proposed 
system for counting unique users before implementation starts as that might 
affect the design. The new system greatly reduces the cost for pretending to 
be unique systems and someone mad at Fedora or just for laughs, might try to 
spoof a very large number of systems. Legal risk is one thing that might 
encourage people not to do this (possibly to the point where no one tries to 
do an attack spoofing say multiple unique machines per second). Another 
mitigation is proactively looking for lots of unique machines on a small 
number of IP addresses and flagging this for evaluation by a human.
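
A filter along those lines could look something like the following Python
sketch. The input format, the threshold of 50 and the function name are
all assumptions picked for illustration; a real threshold would have to be
tuned against large NAT deployments, which legitimately show many machines
behind one address.

```python
from collections import defaultdict


def suspicious_sources(events, max_ids_per_ip=50):
    """Return {ip: unique_id_count} for IPs exceeding the threshold.

    `events` is an iterable of (source_ip, client_uuid) pairs taken from
    one reporting window. The result is meant for review by a human,
    not for automatic blocking.
    """
    ids_by_ip = defaultdict(set)
    for ip, client_id in events:
        ids_by_ip[ip].add(client_id)
    return {
        ip: len(ids)
        for ip, ids in ids_by_ip.items()
        if len(ids) > max_ids_per_ip
    }
```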



Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread John Harris
On Tuesday, January 8, 2019 12:32:45 AM EST Bruno Wolff III wrote:
> The cost for pretending to be lots of machines is also reduced a lot in 
> this scheme over having to connect from lots of different IP addresses. 
> Though at some point spoofing too many would probably be considered 
> a denial of service attack and might get the perpetrator in legal trouble, 
> which might discourage people from doing that. If such an attack wasn't
> noticed because of the request volume from a small number of IP addresses,
> it might be possible to have a significant effect on the aggregate stats.
> So it might be worth having some filters watching out for this kind of
> attack.

I definitely don't think it's best to start considering legal action against 
Fedora users in a thread about invading user privacy. This will only scare 
folks.

-- 
John M. Harris, Jr. 
Splentity
https://splentity.com/



Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Bruno Wolff III

On Mon, Jan 07, 2019 at 21:43:46 -0500, Matthew Miller wrote:
> On Mon, Jan 07, 2019 at 02:27:39PM -0600, Bruno Wolff III wrote:
> > Is this going to happen on install or upgrade before there is a
> > chance to turn it off?
>
> Maybe? Keep in mind that you are _already_ contacting the mirror systems
> when installing or upgrading. Sending a random number once (or a few times,
> even) does not seem particularly invasive.

I keep local mirrors of the particular versions and arches I use, so I 
generally don't connect to Fedora repos on a per machine basis. But I 
have only a few machines. I imagine there are some organizations where 
this might also be the case. Probably not enough to care about from a 
stats perspective and they probably aren't doing it for privacy reasons. 
But it isn't guaranteed that installs and upgrades will need to connect 
to Fedora infrastructure to access repos.



> > Are the UUIDs going to be sanity checked so that NSFW UUIDs don't
> > show up in reports?
>
> You mean if someone sends a fake UUID rather than a genuine one? I don't
> expect we'll actually present the UUIDs directly in reports. It does seem
> reasonable to check that UUIDs actually match the expected format, which
> should cut out most of that.


Yes I was thinking of fake ones. They might be ones intended to be disruptive 
visually or someone may change their UUID every hour so that each dnf 
contact is likely to have a different UUID. I don't know that this would 
change the aggregate stats enough to care about.


The cost for pretending to be lots of machines is also reduced a lot in 
this scheme over having to connect from lots of different IP addresses. 
Though at some point spoofing too many would probably be considered 
a denial of service attack and might get the perpetrator in legal trouble, 
which might discourage people from doing that. If such an attack wasn't 
noticed because of the request volume from a small number of IP addresses, 
it might be possible to have a significant effect on the aggregate stats. So 
it might be worth having some filters watching out for this kind of attack.



Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Christopher Tubbs
A few concerns/comments (inline):

> === The problem ===
> 
> * A. Currently, we can only count Fedora OS use by observing IP
> addresses. This is subject to undercounting due to NAT — and to
> overcounting due to short DHCP leases and laptops moving between work
> or school and home or coffee shop.

"Counts are estimates" is not necessarily a problem. Please explain why this is 
a problem. Also, why not use statistical modeling to try to improve the 
estimates based on these known behaviors?

> * B. We can count what releases are observed, but we can’t distinguish 
> variants.

Distinguishing between variants is reasonable, I think.

> * C. We can’t count quickly because various logs are copied back to a
> central server and data is not consistent for several days.

Why is eventual consistency not sufficient? What's wrong with waiting several 
days? Why would better counting be so time-sensitive/urgent?

> === Constraints ===
> * The Fedora community cares about privacy and is adverse to tracking
> measures. We don't want to track; just count.

I think you mean "averse", and yes, I think you're right about privacy. Who is 
"we"? I don't think you're speaking for the entire Fedora community.

> * For this reason, we don’t want to use any identifier like
> /etc/machine-id which may be used for other purposes.
> * And, also for that reason, there needs to be a relatively easy way to opt 
> out.

If this must happen, please make it opt-in. You can use the opt-in data as a 
sample set to statistically model the rest of the data (those who didn't 
opt in), so you don't need to have many people opt in.

At the very least, the installer should have a dedicated full screen page 
explaining the feature with the ability to opt-out. The opt-out should not be 
buried in a menu, or some post-install step. dnf system-upgrades should be 
opted-out by default.

> * We need to be able to distinguish between short-lived instances
> (like temporary containers or test machines) and actual installations.

I think you're using the word "need" here, when "want" is more accurate. Either 
way, why do you want/need to do this?

> * Being able to see how systems are upgraded over time might be
> interesting but isn’t as important as privacy concerns.

A lot of the reasoning for this proposal seems to be based on "interesting", 
which I agree isn't as important as privacy concerns. Perhaps this is just my 
opinion, but I don't think a case has adequately been made for getting more 
accurate counts of Fedora installs.

> == Benefit to Fedora ==
> 
> * Better metrics overall
> * Public stats page updated automatically
> * Better knowledge of relative use of different variants
> * Insight into Fedora's use in short-lived test systems and temporary
> containers vs. longer-term installations

It's not clear how these things benefit Fedora. Are they benefits for their own 
sake or do they serve some larger purpose? Who looks at these metrics or stats 
pages and needs them to be more accurate or more regularly/automatically 
updated? What benefit do these insights bring to the Fedora community?

> == Documentation ==
> Release notes need to be written, and documentation describing how to opt out.

Documentation would certainly be necessary, but not sufficient. A good 
(prominent) UI for opting out is needed, or make it opt-in.

--
Christopher


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Peter Robinson
> > Since there is no personal information attached, I don't see how on the
> > face of it this is a privacy violation. I want to take this concern
> > seriously, but I need more to go on than "this is inherent". Can you
> > elaborate?
>
> I detailed it further down my message: my concern is that the UUID can
> theoretically be used to track users, to build personas out of them from the
> packages downloaded by the UUID, and in the extreme case even to identify

My understanding is the UUID would be just used at the MirrorManager
level, the packages get pulled from the mirrors directly, even with IP
address data we have now we can't correlate the packages installed by
an IP.

> the person owning the UUID by name (e.g., if a package downloaded by the
> UUID is downloaded only by 1 person and you find some bug report for it in
> Bugzilla). I don't care that you promise that you won't do it, the fact is
> that you *can*. And possibly others can too, depending on how exactly this
> is implemented.

If we had the data from the mirrors it would be possible to do that
now especially with devices that aren't behind CG-NAT or running on
IPv6, but given we don't get the logs from mirrors it's not possible
in the current state and I don't see this changing by adding a UUID.

> > Like I said, tracking is a non-goal. And, we want a design that is
> > resistant to tracking -- but I don't think we need to go overboard.
>
> If you take privacy seriously, you have to assume the worst. It is always
> safer to send less data rather than more.
>
> Kevin Kofler


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Kevin Kofler
Matthew Miller wrote:
> Since there is no personal information attached, I don't see how on the
> face of it this is a privacy violation. I want to take this concern
> seriously, but I need more to go on than "this is inherent". Can you
> elaborate?

I detailed it further down my message: my concern is that the UUID can 
theoretically be used to track users, to build personas out of them from the 
packages downloaded by the UUID, and in the extreme case even to identify 
the person owning the UUID by name (e.g., if a package downloaded by the 
UUID is downloaded only by 1 person and you find some bug report for it in 
Bugzilla). I don't care that you promise that you won't do it, the fact is 
that you *can*. And possibly others can too, depending on how exactly this 
is implemented.

> Like I said, tracking is a non-goal. And, we want a design that is
> resistant to tracking -- but I don't think we need to go overboard.

If you take privacy seriously, you have to assume the worst. It is always 
safer to send less data rather than more.

Kevin Kofler


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Matthew Miller
On Mon, Jan 07, 2019 at 11:09:48PM +0100, Kevin Kofler wrote:
> Please no! This is an inherent privacy violation. I hate software doing this 
> and I always opt out of it. I find it especially worrying that Free Software 
> is now doing this more and more often, this used to be something only 
> privacy-violating proprietary software would do.

Since there is no personal information attached, I don't see how on the face
of it this is a privacy violation. I want to take this concern seriously,
but I need more to go on than "this is inherent". Can you elaborate?


> You will never be able to reliably count all Fedora installations. Any UUID 
> you introduce can be opted out of, bypassed, etc. Installations using local 
> mirrors for updates will never send you a UUID to begin with. All numbers 
> will always be estimates, no matter how deeply you invade our privacy in an 
> attempt to get a supposedly better count.

It's true that it will always be an estimate. I think this scheme gives a
reasonably better estimate.


> I also don't see why it is so important to have an absolute count of Fedora 
> users. IMHO, data like the relative download frequency of the different 
> Fedora deliverables is much more interesting (though you have to keep in 
> mind that the download count does not necessarily reflect the true user 
> preferences because deliverables that you advertise more prominently will 
> necessarily get downloaded more often than those hidden behind several 
> clicks from the download page).


The download count is *really* noisy. There are an order of magnitude more
bot and automatic downloads than there are ones that seem initiated by a
human. Maybe this is due to automated systems, but I suspect it is basically
just the horrible nature of the internet. Unless we were to gate downloads
with a captcha or registration (which, uh, we don't want, just to be clear),
I don't see any way to make those numbers useful.


> But sending a UUID inherently also allows to track the machine. There is no 
> way for the user to be sure that the UUID will not be used to track them. 
> Even if the software on the Fedora infrastructure is completely open and 
> audited, there might still be some proxy in the middle, some mirror 
> operator, etc. abusing the UUID for tracking purposes. And besides, the user 
> would in all cases have to trust that Fedora really runs the published code 
> and only the published code on the infrastructure servers.

Like I said, tracking is a non-goal. And, we want a design that is resistant
to tracking -- but I don't think we need to go overboard.


> Such a tracking feature must be opt-in, not opt-out! See also the EU GDPR.

This will be reviewed by lawyers. And, I do note that what I am proposing is
nothing more than what openSUSE already does.



> > * We need to be able to distinguish between short-lived instances
> > (like temporary containers or test machines) and actual installations.
> And how would you accomplish that? Other than an "I am a test installation" 
> checkbox in the installer, I don't see at all how it could be done.

One method: separate UUIDs which only show up on a single day. (This is why
a UUID is better than just a ping.)
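
A minimal Python sketch of that single-day heuristic, assuming the backend
can produce (uuid, UNIX timestamp) pairs for a reporting period. The
two-day threshold and the function name are illustrative assumptions, not
a settled design.

```python
from collections import defaultdict
from datetime import datetime, timezone


def classify(sightings, min_days=2):
    """Split client IDs into (short_lived, long_lived) sets.

    `sightings` is an iterable of (client_uuid, unix_timestamp) pairs.
    An ID seen on fewer than `min_days` distinct UTC calendar days is
    treated as a short-lived instance (container, test machine, ...).
    """
    days_seen = defaultdict(set)
    for client_id, ts in sightings:
        day = datetime.fromtimestamp(ts, tz=timezone.utc).date()
        days_seen[client_id].add(day)
    short = {cid for cid, days in days_seen.items() if len(days) < min_days}
    return short, set(days_seen) - short
```

Raising min_days trades a longer wait before an installation counts as
permanent for fewer test systems being miscounted.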

[...]
> The installation would also only end up recognized as permanent after the 24 
> hours pass. And who says a test installation cannot last more than 24 hours? 
> I think it can last at least a week, but that also means that it would take 
> a whole week until you can reasonably assume that an installation is 
> probably permanent.

Sure, it's a threshold and we'd have to set a balance.





-- 
Matthew Miller

Fedora Project Leader


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Matthew Miller
On Mon, Jan 07, 2019 at 04:06:52PM -0600, Bruno Wolff III wrote:
> >I have to say that I actually disagree with this. It is possible that Fedora
> >Remixes could send the variant as being the name of their Remix. While my
> >Remix wouldn't do this (it is privacy oriented, and ensures only free
> >software), I can see the case for others.
> Presumably groups that wanted these counts could let Fedora know the
> variant name to expect for counting.

That seems reasonable.

Also, I or other administrators of the system can look and see. If a common
variant is "Qubes Fedora Remix", I might add it to the report, and if it's
"Fedora Suxxx!" I would not.

-- 
Matthew Miller

Fedora Project Leader
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Matthew Miller
On Mon, Jan 07, 2019 at 02:27:39PM -0600, Bruno Wolff III wrote:
> Is this data only going to be sent to the metalink or do the mirrors
> actually used, get the data?

That's a good question.

> Is the data going to be sent along with requests to non-Fedora repos
> (e.g. rpmfusion)?

Also a good question. The intent in any case is to share aggregate
information (in a way we don't currently), so third parties will benefit
from that even if they don't have their own counting.

> This will make it much easier to spoof being lots of systems. Is
> there some plan to mitigate this risk?

I guess I'm not super worried about that. We could probably add some
server-side heuristics to detect suspicious activity.


> Is this going to turn on automatically for rawhide users?

I propose that it will at some point, yes. Perhaps a devel-announce post is
appropriate.


> Is this going to happen on install or upgrade before there is a
> chance to turn it off?

Maybe? Keep in mind that you are _already_ contacting the mirror systems
when installing or upgrading. Sending a random number once (or a few times,
even) does not seem particularly invasive.

For what it's worth, with Ubuntu's new opt-out info collection, they send a
pingback _when someone opts out_, so that they can get a sense of % of
people who make that choice. I'm not proposing that here, but... I think we
can be privacy sensitive without needing to over-design beyond reasonable
expectations. Again, especially given that software installed by default on
most Fedora installations does not have any strong restrictions.


> Are the UUIDs going to be sanity checked so that NSFW UUIDs don't
> show up in reports?

You mean if someone sends a fake UUID rather than a genuine one? I don't
expect we'll actually present the UUIDs directly in reports. It does seem
reasonable to check that UUIDs actually match the expected format, which
should cut out most of that.
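
For illustration, such a format check could be as small as the Python
sketch below, assuming the client sends the UUID in the canonical
8-4-4-4-12 hexadecimal form. The function name is made up for the example.

```python
import uuid


def is_well_formed_uuid(value):
    """Return True only for a UUID string in canonical hyphenated form."""
    try:
        parsed = uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False
    # uuid.UUID() also accepts braces, a urn:uuid: prefix and unhyphenated
    # hex, so compare against the canonical rendering to reject those.
    return str(parsed) == value.lower()
```

This only confirms the shape of the value. A well-formed but made-up UUID
still passes, so it does nothing against spoofed counts.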

-- 
Matthew Miller

Fedora Project Leader


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Matthew Miller
On Mon, Jan 07, 2019 at 04:36:38PM -0500, John Harris wrote:
> My suggestion was not because of some fear that the machine ID would be 
> leaked, but rather my personal opinion that this UUID should not be derived
> in any way from the machine ID.

John, what's the concern there? I agree that it's a little more complicated
story to tell, but I think it's pretty reasonable to trust that it can't be
reversed in a useful way. Particularly, there are a lot easier ways for an
adversary to track a (still anonymous!) Fedora installation than this attack
vector, and the advantage (cleaned by standard image prep) is clear.


> We need to first decide whether or not we want 
> containers and other declarative environments to be considered separate 
> machines.

Sorry, I'm not seeing the connection. Maybe it's just too late in the day.
Can you spell it out for me?


-- 
Matthew Miller

Fedora Project Leader


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Bruno Wolff III

On Mon, Jan 07, 2019 at 17:44:59 -0500, John Harris wrote:
> We don't need to be thinking of more things to track about the user, but ways
> to prevent tracking and still get the counts the Council wants.


There are two mutually opposed sides here. The users need to consider how they 
might be attacked by Fedora infrastructure or by people with access to the 
information collected by Fedora infrastructure. And Fedora needs to be 
concerned about people supplying bogus data to their logging related to 
getting data on system counts. So it is useful to consider what might be 
done with data available to Fedora infrastructure even if that isn't the 
plan.



Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread John Harris
On Monday, January 7, 2019 5:20:55 PM EST Bruno Wolff III wrote:
> On Mon, Jan 07, 2019 at 22:54:46 +0100, Tom Gundersen  wrote:
>
> So this allows better tracking than if you just had to go by IP, time and
> other information in the requests. 

Keep in mind that we do not want tracking, at all. Just counting. That said, 
your message certainly does highlight some of the risks of using this proposed 
UUID system.

We don't need to be thinking of more things to track about the user, but ways 
to prevent tracking and still get the counts the Council wants.

-- 
John M. Harris, Jr. 
Splentity
https://splentity.com/



Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Bruno Wolff III

On Mon, Jan 07, 2019 at 22:54:46 +0100, Tom Gundersen wrote:
> You could move the rotation to the client by hashing the UUID with a
> timestamp of sufficiently coarse granularity (a week?) before submitting it.
>
> Then you make sure that all UUIDs submitted by a given machine during a
> given time window are the same, but UUIDs submitted in different windows
> are not related, and you don't have to trust the server to respect your
> privacy.

There are ways to link the new UUIDs to the old ones in many cases. This
could be by looking at IP addresses in common, times of the requests,
variants, repo(s) and possibly other characteristics of the requests. While
a UUID is in use, it could be used to link IP and time information with
more certainty than you would otherwise have. So this allows better tracking
than if you just had to go by IP, time and other information in the requests.



Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Bruno Wolff III

On Mon, Jan 07, 2019 at 17:04:11 -0500, John Harris wrote:
> On Monday, January 7, 2019 5:00:48 PM EST Stephen Gallagher wrote:
> > I think the only useful data we could get from unknown variants would
> > be "the number of times we see an unknown variant". So I think
> > throwing it away and just incrementing a counter of "the number of
> > times people have tried to poison the data" is probably reasonable.
>
> I have to say that I actually disagree with this. It is possible that Fedora
> Remixes could send the variant as being the name of their Remix. While my
> Remix wouldn't do this (it is privacy oriented, and ensures only free
> software), I can see the case for others.

Presumably groups that wanted these counts could let Fedora know the
variant name to expect for counting.



Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Kevin Kofler
Ben Cotton wrote:
> systems, but a quick thing we can do is implement a per-system UUID
> (unique identifier) and count that instead of IP addresses.

Please no! This is an inherent privacy violation. I hate software doing this
and I always opt out of it. I find it especially worrying that Free Software
is now doing this more and more often; this used to be something only
privacy-violating proprietary software would do.

> === The problem === 
> * A. Currently, we can only count Fedora OS use by observing IP
> addresses. This is subject to undercounting due to NAT — and to
> overcounting due to short DHCP leases and laptops moving between work
> or school and home or coffee shop.

You will never be able to reliably count all Fedora installations. Any UUID 
you introduce can be opted out of, bypassed, etc. Installations using local 
mirrors for updates will never send you a UUID to begin with. All numbers 
will always be estimates, no matter how deeply you invade our privacy in an 
attempt to get a supposedly better count.

I also don't see why it is so important to have an absolute count of Fedora 
users. IMHO, data like the relative download frequency of the different 
Fedora deliverables is much more interesting (though you have to keep in 
mind that the download count does not necessarily reflect the true user 
preferences because deliverables that you advertise more prominently will 
necessarily get downloaded more often than those hidden behind several 
clicks from the download page).

> === Constraints ===
> * The Fedora community cares about privacy and is averse to tracking
> measures. We don't want to track; just count.

But sending a UUID inherently also allows tracking the machine. There is no
way for the user to be sure that the UUID will not be used to track them. 
Even if the software on the Fedora infrastructure is completely open and 
audited, there might still be some proxy in the middle, some mirror 
operator, etc. abusing the UUID for tracking purposes. And besides, the user 
would in all cases have to trust that Fedora really runs the published code 
and only the published code on the infrastructure servers.

The only reliable way to ensure that users will not be tracked by a UUID is 
to not send a UUID to begin with!

> * For this reason, we don’t want to use any identifier like
> /etc/machine-id which may be used for other purposes.

I don't think using an identifier different from /etc/machine-id will really 
help all that much. Whatever identifier you use can be abused for tracking.

> * And, also for that reason, there needs to be a relatively easy way to
> opt out.

Such a tracking feature must be opt-in, not opt-out! See also the EU GDPR.

But I think that such a privacy invasion is incompatible with the Fedora 
project's goals to begin with, even if it is opt-in.

> * This needs to work with Yum/DNF, MicroDNF, PackageKit, Cockpit,
> rpm-ostree, GNOME Software, Muon, Apper, and software update
> mechanisms used in other spins.

Apper is no longer shipped in Fedora. The KDE Spin uses plasma-pk-updates as 
its official updater, but Discover and Dnfdragora (which are both shipped 
for different purposes) can also be used to update the system.

But if you require an explicit opt out in more than one place (e.g., once 
for DNF and once for PackageKit), that makes this feature all the more 
dangerous and scary.

> * We need to be able to distinguish between short-lived instances
> (like temporary containers or test machines) and actual installations.

And how would you accomplish that? Other than an "I am a test installation" 
checkbox in the installer, I don't see at all how it could be done.

> === Non-Goals ===
> * We don’t want to track users, just count systems.

Again, there is no way you can guarantee that. We would have to take your 
word for it. This is not acceptable.

> === Other Elements ===
> * We may also want each report to contain a boolean flag showing
> whether the system has been in use for at least 24 hours to help
> separately categorize test and other throw-away instances.

So this is even more data that you would be collecting behind the user's 
back.

The installation would also only end up recognized as permanent after the 24 
hours pass. And who says a test installation cannot last more than 24 hours? 
I think it can last at least a week, but that also means that it would take 
a whole week until you can reasonably assume that an installation is 
probably permanent.

Without being able to magically predict the future and without asking the
user, I don't think you will ever be able to make this distinction reliably.

Kevin Kofler

Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread John Harris
On Monday, January 7, 2019 5:00:48 PM EST Stephen Gallagher wrote:
> I think the only useful data we could get from unknown variants would
> be "the number of times we see an unknown variant". So I think
> throwing it away and just incrementing a counter of "the number of
> times people have tried to poison the data" is probably reasonable.

I have to say that I actually disagree with this. It is possible that Fedora 
Remixes could send the variant as being the name of their Remix. While my 
Remix wouldn't do this (it is privacy oriented, and ensures only free 
software), I can see the case for others.

-- 
John M. Harris, Jr. 
Splentity
https://splentity.com/



Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Stephen Gallagher
On Mon, Jan 7, 2019 at 4:55 PM Bruno Wolff III  wrote:
>
> On Mon, Jan 07, 2019 at 16:41:46 -0500,
>   John Harris  wrote:
> >On Monday, January 7, 2019 4:31:29 PM EST Bruno Wolff III wrote:
> >> If the strings aren't checked when they are received, they could be
> >> anything. The system variant also has the same issue. You shouldn't
> >> trust the clients supplying this information.
> >
> >If we are just using this UUID to count machines, it doesn't matter what the
> >UUID is. Just that it's different between machines.
>
> Yes, if they are not so long as to break the software and no public report
> has the actual strings so the project doesn't get embarrassed and no one who
> has to look at the strings is easily offended, then it isn't a problem.
>
> The system variant is probably a bit of a different case. Unexpected variants
> could end up in public reports depending on how things are designed. It might
> be good to throw out any data which has unexpected variants in it.

I think the only useful data we could get from unknown variants would
be "the number of times we see an unknown variant". So I think
throwing it away and just incrementing a counter of "the number of
times people have tried to poison the data" is probably reasonable.
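
To make that concrete, here is a rough Python sketch of the idea: known variants are tallied per name, and anything else only bumps a single "unknown" counter without the string itself being stored. The variant names in KNOWN_VARIANTS and the function name tally_variants are made up for illustration and are not taken from any Fedora code.

```python
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical allowlist; the real set of expected variants would be
# maintained by whoever runs the counting backend.
KNOWN_VARIANTS = frozenset({"workstation", "server", "cloud"})


def tally_variants(reported: Iterable[str]) -> Tuple[Counter, int]:
    """Count known variants by name and unknown ones only in aggregate."""
    counts: Counter = Counter()
    unknown = 0
    for variant in reported:
        if variant in KNOWN_VARIANTS:
            counts[variant] += 1
        else:
            # The unexpected string is dropped; only the fact that we saw
            # one is recorded, so it can never reach a public report.
            unknown += 1
    return counts, unknown
```

A Remix that wanted to be counted would then only need its name added to the allowlist.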


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Stephen John Smoogen
On Mon, 7 Jan 2019 at 15:34, Lennart Poettering  wrote:
>
> On Mo, 07.01.19 13:28, Matthew Miller (mat...@fedoraproject.org) wrote:
>
> > On Mon, Jan 07, 2019 at 06:24:14PM +0100, Lennart Poettering wrote:
> > > > * The Fedora community cares about privacy and is averse to tracking
> > > > measures. We don't want to track; just count.
> > > Uh, so what's the story there? i mean, if you pass over the uuid you
> > > make clients trackable, regardless if you want to make use of that or
> > > not...
> >
> > Not if we don't keep them for long. One idea is to rotate them fairly
> > frequently. But this is mostly a statement of intent and might be more about
> > how we build the backend than about what we force in the client.
>
> Well, that's entirely intransparent to users, what fedora does with
> the uuid is entirely a blackbox for clients if you do it this way.
>
> I wonder if it is worth introducing an entirely new tracking concept
> here if you actually don't want to track but just count. The NTP
> approach has the benefit that you introduce no new tracking concept at
> all, but you just use the data that is pretty much generated
> anyway. It also makes this all feel less one-sided, after all you
> provide them with a deal: fedora gives the user correct time, the user
> is therefore counted.

The problems with NTP are the following:
1. The administrative headaches of regular blocks, takedowns, and
   ill-advised security emails because our servers are attacking
   someone's box on port 123. [As funny as this sounds, getting
   regular angry emails from some site whose security tool has decided
   that 123/{tcp,udp} is a major threat still occurs.]
2. NTP bandwidth, while small per system, grows a lot as you rack up
   servers randomly checking in. Having a pool of servers around the
   world would require us to get NTP GPS clocks, get the datacenters
   to put the antennae out, and a bunch of other items. [The budget for
   this is non-zero.]
3. Logging NTP does not cover the problem the UUID is trying to help
   solve. There are two places where we undercount and overcount
   systems:
   a. Systems behind NAT firewalls all show up as 1 IP address. ntp,
      yum, or gnome-hotspot ask multiple times during a day, but not a
      set number of times. Just looking at my 3 home systems, I see
      around 1 to 80 connections depending on what I have done that day.
   b. Systems on short-lived DHCP ranges. Multiple major ISPs use
      various methods which make a system look like multiple boxes. The
      system will show up as 123.45.67.89 and 2 minutes later the same
      system will be 89.76.54.123 [made-up IP addresses, but various
      carriers seem to do this].
   c. When IPs are reused and NAT'd (the NAT over NAT), you combine
      a and b.
4. NTP is a high security problem when you concentrate it on a set of
   servers. These become servers that everyone wants to hack even more
   than build systems. These problems range from DDoS to active hacks.
5. Which leads to us being in charge of the security of every Kerberos
   and SSL session which relies on our clocks being available and in
   sync. That leads to other administrative headaches where sites will
   complain that our servers broke one of those because the service was
   DDoS'd, ASN rerouted, off by N amount, UDP replayed, etc.

Most of the above except for 3 are solvable but would require
additional resources which have been hard to get in the past.

> > > BTW, afaik Ubuntu counts installations through NTP: they provide their
> > [..]
> > > Of course, doing it that way would mean fedora would have to host NTP
> > > servers...
> >
> > Hmmm. We have fedora.pool.ntp.org, in fact. I'm not sure who actually runs
> > that!
>
> That's fedora's allocation of the public NTP pool project, see
> https://www.ntppool.org/. That's hosted by all kinds of people
> voluntarily.
>
> I guess the question is if hosting an NTP server is more or less work
> than hosting a uuid counting server, and whether the privacy issues
> the uuid solution brings are worth it.
>
> BTW, iirc intel used to count installations through the http ping
> check in their captive portal detection. Fedora runs a similar service
> which is used by NM, no? maybe that's a nicer solution too: add a http
> header field to the ping check that each client sets to "1" on one of
> these ping checks a day, and "0" all other times. Then you count how
> many non-zero ping checks you get within a 24h window and you have a
> really good idea how many users you have. All without any explicit
> tracking. And again this appears to me to be a much better deal
> than the uuid/dnf check that has been proposed, as you can say "we
> provide you with ping check functionality therefore we count you":
> both sides get something out of it.
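
For readers following along, the ping-check scheme quoted above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names header_value and count_systems are hypothetical, the flag is passed around as a plain string rather than a real HTTP header, and nothing here reflects how NetworkManager or the Fedora connectivity service is actually implemented.

```python
import datetime
from typing import Iterable, Optional, Tuple


def header_value(
    today: datetime.date, last_counted: Optional[datetime.date]
) -> Tuple[str, datetime.date]:
    """Client side: send "1" on the first check of a day, "0" afterwards.

    Returns the flag to send and the day to remember locally.
    """
    if last_counted != today:
        return "1", today
    return "0", today


def count_systems(flags: Iterable[str]) -> int:
    """Server side: the number of "1" flags seen in a 24h window."""
    return sum(1 for flag in flags if flag == "1")
```

The only state is the date kept on the client; the server never sees an identifier, just a stream of flags.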

We do this, but I have found it to have problems with the NAT over
NAT: we know a system should show up 288 times in a day, but
have seen multiple class C networks where every IP address shows up 1-8 times,
spread over a day. Are these groups of 255 systems only on for a

Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Tom Gundersen
On Mon, Jan 7, 2019, 7:31 PM Matthew Miller wrote:

> On Mon, Jan 07, 2019 at 06:24:14PM +0100, Lennart Poettering wrote:
> > > * The Fedora community cares about privacy and is averse to tracking
> > > measures. We don't want to track; just count.
> > Uh, so what's the story there? i mean, if you pass over the uuid you
> > make clients trackable, regardless if you want to make use of that or
> > not...
>
> Not if we don't keep them for long. One idea is to rotate them fairly
> frequently. But this is mostly a statement of intent and might be more
> about how we build the backend than about what we force in the client.

You could move the rotation to the client by hashing the UUID with a
timestamp of sufficiently coarse granularity (a week?) before submitting it.

Then you make sure that all UUIDs submitted by a given machine during a
given time window are the same, but UUIDs submitted in different windows
are not related, and you don't have to trust the server to respect your
privacy.
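
A minimal Python sketch of that client-side rotation, under stated assumptions: SHA-256 as the hash, a one-week window counted from the Unix epoch, and a hypothetical function name rotating_id. None of these choices come from an actual DNF implementation.

```python
import hashlib
import time
import uuid
from typing import Optional

WINDOW_SECONDS = 7 * 24 * 60 * 60  # one week, the granularity floated above


def rotating_id(base_uuid: uuid.UUID, now: Optional[float] = None) -> str:
    """Return an ID that is stable within one time window and changes after it."""
    if now is None:
        now = time.time()
    window = int(now // WINDOW_SECONDS)
    # Hash the local UUID together with the window number; only the
    # truncated digest is ever submitted, never the UUID itself.
    digest = hashlib.sha256(base_uuid.bytes + window.to_bytes(8, "big")).digest()
    return digest[:16].hex()
```

The base UUID never leaves the machine, so linking two windows would require other request data such as IP addresses or timing.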

Cheers,

Tom



Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Bruno Wolff III

On Mon, Jan 07, 2019 at 16:41:46 -0500, John Harris wrote:
> On Monday, January 7, 2019 4:31:29 PM EST Bruno Wolff III wrote:
> > If the strings aren't checked when they are received, they could be
> > anything. The system variant also has the same issue. You shouldn't
> > trust the clients supplying this information.
>
> If we are just using this UUID to count machines, it doesn't matter what the
> UUID is. Just that it's different between machines.

Yes, if they are not so long as to break the software and no public report
has the actual strings so the project doesn't get embarrassed and no one who
has to look at the strings is easily offended, then it isn't a problem.


The system variant is probably a bit of a different case. Unexpected variants
could end up in public reports depending on how things are designed. It might
be good to throw out any data which has unexpected variants in it.



Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread John Harris
On Monday, January 7, 2019 4:31:29 PM EST Bruno Wolff III wrote:
> If the strings aren't checked when they are received, they could be
> anything. The system variant also has the same issue. You shouldn't
> trust the clients supplying this information.

If we are just using this UUID to count machines, it doesn't matter what the 
UUID is. Just that it's different between machines.

-- 
John M. Harris, Jr. 
Splentity
https://splentity.com/



Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread John Harris
On Monday, January 7, 2019 4:29:02 PM EST Zbigniew Jędrzejewski-Szmek wrote:
> If the sd_id128_get_machine_app_specific/... mechanism is used, this
> could be added to previous releases in a dnf update. This is an additional
> advantage over having an independent uuid for this.

If we do go forward with this, it would be best not to backport it, as we have 
no clean way to inform these users of the new privacy concerns that this 
brings with it.

-- 
John M. Harris, Jr. 
Splentity
https://splentity.com/



Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Bruno Wolff III

On Mon, Jan 07, 2019 at 16:00:46 -0500, John Harris wrote:
> On Monday, January 7, 2019 3:27:39 PM EST Bruno Wolff III wrote:
> > Are the UUIDs going to be sanity checked so that NSFW UUIDs don't show up
> > in reports?
>
> I don't see how a UUID could possibly be NSFW, or why UUIDs would ever be
> included in reports regardless. The point is supposedly counting, not
> tracking.

If the strings aren't checked when they are received, they could be anything.
The system variant also has the same issue. You shouldn't trust the clients
supplying this information.
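
As an illustration of the kind of check meant here, a receiving service could accept only strings in the canonical 36-character UUID form and drop everything else. The sketch below is in Python; the function name parse_reported_uuid is hypothetical and this is not code from any Fedora service.

```python
import uuid
from typing import Optional


def parse_reported_uuid(raw: str) -> Optional[uuid.UUID]:
    """Return a UUID if `raw` is in canonical hyphenated form, else None."""
    if len(raw) != 36:
        return None
    try:
        parsed = uuid.UUID(raw)
    except ValueError:
        return None
    # uuid.UUID() tolerates some alternative spellings, so compare against
    # the canonical rendering to reject anything that is not plain hex groups.
    return parsed if str(parsed) == raw.lower() else None
```

Anything that fails the check can be counted as "malformed" without the string ever being stored or shown.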



Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Kevin Kofler
Lennart Poettering wrote:
> BTW, iirc intel used to count installations through the http ping
> check in their captive portal detection. Fedora runs a similar service
> which is used by NM, no? maybe that's a nicer solution too: add a http
> header field to the ping check that each client sets to "1" on one of
> these ping checks a day, and "0" all other times. Then you count how
> many non-zero ping checks you get within a 24h window and you have a
> really good idea how many users you have. All without any explicit
> tracking. And again this appears to me to be a much better deal
> than the uuid/dnf check that has been proposed, as you can say "we
> provide you with ping check functionality therefore we count you":
> both sides get something out of it.

And this is why I have always been and am still opposed to the 
NetworkManager-config-connectivity-fedora spyware and uninstalled it (or did 
not install it in the first place on upgrades) on my computers.

Kevin Kofler


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread John Harris
On Monday, January 7, 2019 4:32:04 PM EST Lennart Poettering wrote:
> On Mo, 07.01.19 16:04, John Harris (joh...@splentity.com) wrote:
>
> > On Monday, January 7, 2019 3:18:10 PM EST Lennart Poettering wrote:
> > > hence my recommendation to derive any uuid for purposes like this
> > > from /etc/machine-id, by using a HMAC of some kind (see other mail).
> >
> > I really don't think that this should be derived in any way from a
> > machine id, if it really is meant to be used for counting users,
> > rather than tracking.
>
> Please read up on what I wrote above, and what an HMAC does. Deriving
> some identifier from the machine ID doesn't mean you leak the machine
> ID, but it means resetting the machine ID will also reset that
> identifier, which is a useful property in this case.
>
> Lennart
>
> --
> Lennart Poettering, Red Hat

My suggestion was not because of some fear that the machine ID would be 
leaked, but rather my personal opinion that this UUID should not be derived in 
any way from the machine ID. We need to first decide whether or not we want 
containers and other declarative environments to be considered separate 
machines.

-- 
John M. Harris, Jr. 
Splentity
https://splentity.com/



Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Lennart Poettering
On Mo, 07.01.19 16:04, John Harris (joh...@splentity.com) wrote:

> On Monday, January 7, 2019 3:18:10 PM EST Lennart Poettering wrote:
> > hence my recommendation to derive any uuid for purposes like this
> > from /etc/machine-id, by using a HMAC of some kind (see other mail).
>
> I really don't think that this should be derived in any way from a machine id,
> if it really is meant to be used for counting users, rather than tracking.

Please read up on what I wrote above, and what an HMAC does. Deriving
some identifier from the machine ID doesn't mean you leak the machine
ID, but it means resetting the machine ID will also reset that
identifier, which is a useful property in this case.
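
To illustrate that property, here is a rough Python sketch that derives an application-specific ID from a machine ID with HMAC-SHA256. The function name app_specific_id, the APP_ID constant and the sample machine IDs are made up for the example. It follows the general idea behind sd_id128_get_machine_app_specific(3) but is not claimed to reproduce systemd's output bit for bit.

```python
import hashlib
import hmac
import uuid

# Hypothetical application ID; a real consumer would pick its own constant.
APP_ID = uuid.UUID("4d0f9e0a-5b1c-4c3e-9a7e-2f6b8c1d3e5f")


def app_specific_id(machine_id_hex: str, app_id: uuid.UUID = APP_ID) -> uuid.UUID:
    """Derive a per-application ID; the machine ID is only used as the HMAC key."""
    key = bytes.fromhex(machine_id_hex.strip())
    digest = hmac.new(key, app_id.bytes, hashlib.sha256).digest()
    # Truncate to 128 bits and stamp the version/variant bits of a v4 UUID.
    return uuid.UUID(bytes=digest[:16], version=4)


if __name__ == "__main__":
    before = app_specific_id("0123456789abcdef0123456789abcdef")
    after = app_specific_id("fedcba9876543210fedcba9876543210")  # after a reset
    print(before, after, before != after)
```

The derived value is stable for a given machine ID, changes when the machine ID is reset, and does not reveal the key it was derived from.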

Lennart

--
Lennart Poettering, Red Hat


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Zbigniew Jędrzejewski-Szmek
On Mon, Jan 07, 2019 at 11:34:47AM -0500, Ben Cotton wrote:
> == Summary ==
> Right now, we estimate installed Fedora systems by counting unique IP
> addresses which show up in our updates mirror statistics. We need
> better data than that. There are some proposals for more complicated
> systems, but a quick thing we can do is implement a per-system UUID
> (unique identifier) and count that instead of IP addresses.

FWIW, I think that doing this is a great idea. We can really use more
reliable information about the number of installations.

As Lennart wrote, using either sd_id128_get_machine_app_specific(3)
or `systemd-id128 new --app-specific=...` (new in F30) or an independent
reimplementation seems to be the right way to implement this.

> == Upgrade/compatibility impact ==
> Older versions will not have the UUID counting enabled; we will keep
> collecting stats in the traditional way for those systems.

If the sd_id128_get_machine_app_specific/... mechanism is used, this
could be added to previous releases in a dnf update. This is an additional
advantage over having an independent uuid for this.

Zbyszek


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Zbigniew Jędrzejewski-Szmek
On Mon, Jan 07, 2019 at 04:04:24PM -0500, John Harris wrote:
> On Monday, January 7, 2019 3:18:10 PM EST Lennart Poettering wrote:
> > hence my recommendation to derive any uuid for purposes like this
> > from /etc/machine-id, by using a HMAC of some kind (see other mail).
>
> I really don't think that this should be derived in any way from a machine
> id, if it really is meant to be used for counting users, rather than tracking.

Please read the man page [1]. The machine id cannot be derived from
this number. On the other hand, using a derived UUID for this makes it
easier to wipe identifying information from an image, because there is
still just the one /etc/machine-id file that needs to be wiped.

[1] 
https://www.freedesktop.org/software/systemd/man/sd_id128_get_machine_app_specific.html

Zbyszek


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread Stephen John Smoogen
On Mon, 7 Jan 2019 at 15:19, Lennart Poettering  wrote:
>
> On Mo, 07.01.19 13:32, Stephen John Smoogen (smo...@gmail.com) wrote:
>
> > On Mon, 7 Jan 2019 at 12:32, John Harris  wrote:
> > >
> > > On Monday, January 7, 2019 11:34:47 AM EST Ben Cotton wrote:
> > > > The Fedora community cares about privacy and is averse to tracking
> > > > measures. We don't want to track; just count.
> > >
> > > If this is ever implemented, we should probably notify end users and
> > > provide an easy way to disable this. If you pass an identifier, that
> > > enables client tracking.
> >
> > The original proposal was looking at something similar to what yum has had
> > built into it for a while. Every yum installation has a file
> > /var/lib/yum/uuid which contains whatever was pulled from
> > /proc/sys/kernel/random/uuid when yum was installed. Here is one
> > example: 34cb9496-a62c-496e-8935-22f550247262
>
> Uh, please don't do it this way. People build reusable images of

As I said, that was the original proposal, which came from a proof of
concept based on what already existed. It has a bunch of problems,
as you said. The reason for using the /proc/sys file was that it could
be cron'd and rebuilt regularly if needed, and the kernel item was
available for EL5 (when I started working on this) through EL7. Using
systemd's /etc/machine-id with a regular HMAC regeneration
process would be equally useful.

> Fedora that are then run unmodified in many instances. If you invent a
> new file for a new uuid like this then it's highly unlikely people
> will reset it when building such images, and hence your counting will
> count all such instances as one, which you probably don't want.
>
> Hence, any such uuid should be keyed off /etc/machine-id, as that file
> exists for purposes like this, and the chance that it is reset during
> image building is higher, and doesn't require people to reset uuids
> all over the place.
>
> Hence my recommendation to derive any uuid for purposes like this
> from /etc/machine-id, by using a HMAC of some kind (see other mail).
>
> Lennart
>
> --
> Lennart Poettering, Red Hat

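The HMAC derivation recommended above could look something like the
sketch below. It is a minimal illustration under stated assumptions, not
what DNF or systemd actually do: APP_KEY, the `period` parameter (a
label that changes on a schedule, to model regular regeneration), and
the function names are all hypothetical, and using the machine ID as the
HMAC key with the application constant as the message is one possible
arrangement.

```python
import hashlib
import hmac
import uuid
from pathlib import Path

# Hypothetical application-specific constant; not a real DNF value.
APP_KEY = b"dnf-count-example"


def read_machine_id(path: Path = Path("/etc/machine-id")) -> bytes:
    """Read the 32-hex-digit machine ID and return its 16 raw bytes."""
    return bytes.fromhex(path.read_text().strip())


def derive_uuid(machine_id: bytes, app_key: bytes = APP_KEY,
                period: str = "") -> uuid.UUID:
    """Derive a UUID from the machine ID without exposing the ID itself.

    The machine ID is the HMAC key; the message is the application
    constant plus an optional period label, so changing the label
    yields a new, unrelated identifier.
    """
    message = app_key + b"\0" + period.encode("utf-8")
    digest = hmac.new(machine_id, message, hashlib.sha256).digest()
    # Truncate to 128 bits and mark the result as a version 4 UUID.
    return uuid.UUID(bytes=digest[:16], version=4)
```

Because the result depends only on /etc/machine-id and the constants,
resetting the machine ID during image building also resets the derived
value, with no separate file to clean up.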


-- 
Stephen J Smoogen.


Re: F30: System-Wide Change proposal: DNF UUID

2019-01-07 Thread John Harris
On Monday, January 7, 2019 3:18:10 PM EST Lennart Poettering wrote:
> Hence my recommendation to derive any uuid for purposes like this
> from /etc/machine-id, by using a HMAC of some kind (see other mail).

I really don't think that this should be derived in any way from a machine id, 
if it really is meant to be used for counting users, rather than tracking.

-- 
John M. Harris, Jr. 
Splentity
https://splentity.com/


